Inside the Black Box of AI Reasoning

2,381 views

code_your_own_AI

17 days ago

See inside the "thought process" of an AI/LLM as it tackles complex questions. Based on new research, we look at a simple but elegant way of probing how multiple open-source LLMs reason across levels of complexity.
Cyber Thoughts: A Peek Inside AI's Mind.
Inside the Thought Process of an AI.
This video primarily investigates the complex reasoning capabilities of large language models (LLMs) using a novel graph-based framework designed to assess the depth and accuracy of model reasoning across different knowledge levels. The study introduces DEPTHQA, a dataset that decomposes complex, real-world questions into a hierarchy of simpler sub-questions categorized into three distinct depths: factual and conceptual knowledge (D1), procedural knowledge (D2), and strategic knowledge (D3). This hierarchical structure allows a detailed evaluation of how well LLMs can escalate their reasoning from basic facts to intricate, analytical problem-solving. Performance at each level is measured quantitatively with two metrics. Forward discrepancy captures cases where the model handles the simpler sub-questions but struggles with the more complex question built on them. Backward discrepancy captures the reverse: the model handles a complex question yet falls short on its simpler constituents. This dual assessment helps identify specific areas where LLMs fail to integrate or transition between different types of knowledge, exposing gaps in both foundational understanding and advanced reasoning.
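To make the two metrics concrete, here is a minimal sketch of how forward and backward discrepancy could be computed on a small question graph. It is an illustration under stated assumptions, not the paper's exact formulation: the `Question` class, the `discrepancies` function, the scores in [0, 1], and the use of a plain average over sub-question scores are all hypothetical choices made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Question:
    """One node in the question graph (hypothetical structure)."""
    qid: str
    depth: int      # 1 = conceptual, 2 = procedural, 3 = strategic
    score: float    # assumed correctness score in [0, 1]
    children: list[Question] = field(default_factory=list)  # simpler sub-questions


def discrepancies(q: Question) -> tuple[float, float]:
    """Return (forward, backward) discrepancy for one question.

    Simplified assumption: compare the question's score with the
    average score of its direct sub-questions.
    """
    if not q.children:
        return 0.0, 0.0  # depth-1 questions have nothing to compare against
    child_avg = mean(c.score for c in q.children)
    forward = max(0.0, child_avg - q.score)   # sub-questions fine, harder one worse
    backward = max(0.0, q.score - child_avg)  # harder one fine, sub-questions worse
    return forward, backward


# Toy graph: two D1 questions feed one D2 question, which feeds one D3 question.
d1a = Question("d1a", 1, 1.0)
d1b = Question("d1b", 1, 1.0)
d2 = Question("d2", 2, 0.5, [d1a, d1b])
d3 = Question("d3", 3, 0.75, [d2])

fwd_d2, bwd_d2 = discrepancies(d2)
fwd_d3, bwd_d3 = discrepancies(d3)
```

In this toy graph, `d2` shows a forward discrepancy (the model knows the facts but stumbles on the procedure), while `d3` shows a backward discrepancy (the model scores higher on the strategic question than on the procedural step beneath it).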
Through rigorous experimentation with several state-of-the-art LLMs, the study explores the relationship between model capacity and reasoning discrepancies, revealing that larger models generally exhibit fewer discrepancies in both forward and backward dimensions, suggesting a better overall integration of knowledge layers. The research further delves into the impact of model training and architecture on discrepancy outcomes, indicating that models with more extensive training data and sophisticated architectures are more adept at bridging the gap between different knowledge depths.
Additionally, the study introduces a "predict solution" strategy, in which the model's own predictions are used as inputs for subsequent questions, as a way to improve self-referential consistency and depth in reasoning. This approach tests not only the model's ability to use its previously generated outputs but also its capacity for self-correction and adaptation over multiple turns.
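The multi-turn idea can be sketched roughly as follows. This is only an illustration of feeding a model's earlier answers back into later prompts. The `Q:`/`A:` prompt format, the function names, and the `model` callable are assumptions made for the example, and a stub stands in for a real LLM call.

```python
from typing import Callable

History = list[tuple[str, str]]  # (question, model's own answer) pairs


def build_prompt(history: History, question: str) -> str:
    """Join earlier question/answer turns with the new question (assumed format)."""
    turns = [f"Q: {q}\nA: {a}" for q, a in history]
    turns.append(f"Q: {question}\nA:")
    return "\n".join(turns)


def answer_with_own_predictions(
    sub_questions: list[str],
    final_question: str,
    model: Callable[[str], str],
) -> tuple[str, History]:
    """Ask the simpler questions first, then the complex one.

    Each answer the model predicts is appended to the context that the
    next prompt sees, so errors and insights both carry forward.
    """
    history: History = []
    for sq in sub_questions:
        prediction = model(build_prompt(history, sq))
        history.append((sq, prediction))
    final_answer = model(build_prompt(history, final_question))
    return final_answer, history


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM: reports how many earlier turns it was shown."""
    return f"ans{prompt.count('Q:') - 1}"


final, seen = answer_with_own_predictions(
    ["What is a derivative?", "How do you apply the chain rule?"],
    "Differentiate sin(x^2).",
    stub_model,
)
```

Swapping `stub_model` for a function that calls an actual model would turn this into a working multi-turn loop. Replacing the predicted answers in `history` with reference answers would give a baseline to compare against.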
The insights garnered from this study highlight the critical importance of structured reasoning paths and the potential benefits of iterative, context-aware processing in improving the overall effectiveness and reliability of LLMs in complex problem-solving scenarios.
All rights with the authors:
Investigating How Large Language Models Leverage Internal Knowledge to Perform Complex Reasoning
arxiv.org/pdf/2406.19502
#airesearch
#reasoning
#ai

Comments: 1
@algoritm3034, 15 days ago
It's a great video, but I still haven't figured out how to apply it... Do you need to use their work to evaluate the model, see what types of questions it is wrong on, and, for example, tune it to improve it for them, or how? Can anyone clarify?