Sholto Douglas & Trenton Bricken - How LLMs Actually Think

Рет қаралды 166,453

Күн бұрын

Пікірлер: 384

@aidanmclaughlin5279 10 ай бұрын

This is, by far, the densest and most enjoyable interview I've ever heard in this domain. Please get these guys back on.

@Flyingblackswan 10 ай бұрын

This podcast isn't just a podcast, it's a valuable insight to how brilliant people think. This could've easily been like a $1,000 course. I really appreciate the guests for their time and for your ability to conduct an excellent interview.

@danielbrockman7402 10 ай бұрын

this is so true

@aoeu256 9 ай бұрын

yep! however, i was thinking why don't we have a model of "emotional speech" yet where each word can be spoken with different emotions, speeds, voices. maybe the problem is the speed of our keyboards.

@JoblessGuyReading 9 ай бұрын

@aoeu256 yeah I think Hume ai is actually trying to do that , saw a few of their demos !

@MrEmbrance 9 ай бұрын

Jesus, stop

@Randy.Bobandy 9 ай бұрын

You're got big problems if you'd pay $1000 to watch this.

@JazevoAudiosurf 9 ай бұрын

"it's amazing how easily you can become world class at something because most people are not trying that hard" it's amazing how easily you can be in the top 10% because nobody is trying at all

@sup3a 10 ай бұрын

I don't understand how this podcast isn't bigger. It's like Lex but better. Keep doing what you do, you rock dude

@ryzikx 10 ай бұрын

because lex is not "better" he is more popular. he is for a more general audience

@egalanos 10 ай бұрын

I think posting lots of little clips would be hurting viewing stats for the algorithm. I really wish there was a separate clips channel.

@KibberShuriq 9 ай бұрын

I find this particular episode extremely fascinating, but I also feel like I personally know maybe one other person I could recommend watching it to, and even they would probably understand only about half of it. Lex, on the other hand, is usually very legible to an average (or slightly above average) native English speaker. Also, Dwarkesh never asks his guests what love is, and we all know that's pretty much a requirement for popularity.

@HarpreetSingh-xg2zm 9 ай бұрын

@@egalanoslittle clips are more likely to funnel new viewers to this channel. Channel is too small to try and branch clips out

@TomBouthillet 9 ай бұрын

Lex is an insufferable douche nozzle.

@hummuswithpitta 9 ай бұрын

This is everything Lex's original "AI Podcast" should have been. You are so god damn well researched and the interviewees know it and relish in your questions, which just gives us so much extra sauce. Hope you just focus on the general AI universe and not bother diluting the podcast like Lex has (zero shade to Lex here btw). These two are honestly fascinating to listen to. Crazy how short-lived their career in the industry is for the knowledge they have.

@andrew-729 9 ай бұрын

Nah, cast shade on lex. He never actually pushes back, and when he does it is super light. Basically they know it is a free for all PR circuit.

@sidequestsally 9 ай бұрын

"I'm just making context queries" -Dwarkesh probably

@akhil090579 10 ай бұрын

you should do more of these double interviews I think it worked really well and moves in interesting directions, good work!

@jeffreypicel7863 10 ай бұрын

One of the best AI podcasts in the past year!

@Sirbikingviking 10 ай бұрын

Wow this is one of the most dense and interesting podcasts I have ever seen, and my Spotify account has over 700 episodes listened to. Great job Dwarkesh, thank you so much for what you are doing.

@DwarkeshPatel 10 ай бұрын

Hope you guys have as much fun with this as I did :)

@LoganJeya 10 ай бұрын

Would love to see you talk to Joschua Bach if you haven't already. These two have such a great intuitive understanding of the mind and how these learning systems work

@spaceghost8891 10 ай бұрын

Try to have Ilya again.

@Ashish-yo8ci 10 ай бұрын

@@spaceghost8891 or Jan Leike 😅 They have some interesting stuff going on in the superalignment wing with scalable optimisation and weak to strong generalization.

@13371138 10 ай бұрын

This channel is amazing in terms of the guests, depth of discussion, and production quality. I am completely baffled as to why this channel does not have 1M+ subscribers. Maybe because the channel thumbnail makes it look too casual?

@XTargi 10 ай бұрын

Yeah , that's super nice. Can you please consider adding some acoustic panels on the wall, because it's echoing pretty hard 😅

@NimTheHuman 5 ай бұрын

This is probably the best podcast episode I've come across for someone interested in LLMs. Big-name guests (e.g., CEOs) are great, but tend to engage in surface-level discussions; whereas, Sholto and Trenton went deep into technical details/insights. Hidden gem. It also helps that Dwarkesh asks excellent questions.

@daveh56821 10 ай бұрын

🎯 Key Takeaways for quick navigation: 00:52 *🗣️ Introductions and Achievements* - Introduction of Sholto Douglas and Trenton Bricken, highlighting their contributions to AI, particularly in AI alignment and mechanistic interpretability. - Sholto's role in Gemini's success despite being relatively new to the field. 02:10 *🧠 Context Lengths and AI Intelligence* - The underappreciated importance of long context lengths for AI models. - The significant impact of increasing context lengths on model intelligence, demonstrated through the ability to learn a new language and potentially play Atari games. 05:00 *🔍 In-Context Learning and Gradient Descent* - Exploring in-context learning as a form of gradient descent. - The potential for in-context learning to lead to superhuman capabilities by integrating massive amounts of information. 07:19 *🎓 Long-Horizon Tasks and AI Agents* - The connection between long context windows and AI's ability to perform long-horizon tasks. - Addressing the misconception that the inability to perform long-horizon tasks is the primary reason AI agents haven't progressed significantly. 10:42 *⏳ Understanding Long-Horizon Task Success Rates* - The importance of evaluating AI models on long-horizon tasks to gauge their economic impact and capability improvements. - Challenges in measuring success rates over tasks with varying time horizons. 13:21 *🤔 Learning in the Forward Pass* - The shift towards more learning happening in the forward pass of models. - Comparing learning processes to natural evolution and considering the implications for AI efficiency and adaptability. 17:28 *🧠📚 Reasoning vs. Raw Information Storage* - Differentiating between the storage of raw information and the process of reasoning within AI models. - How models transform input tokens into meaningful outputs through layers of processing. 24:08 *🧩 Pattern Matching and Intelligence* - The role of pattern matching and associative memories in intelligence. - How high-level associations and meta-learning contribute to AI's reasoning capabilities. 27:02 *🔍 Sherlock Holmes Analogy and Deductive Reasoning* - Discussing the deductive reasoning abilities of AI in the context of Sherlock Holmes. - The importance of context length and working memory in enabling complex reasoning and problem-solving in AI. 31:44 *🤖 Superintelligence Concerns and Associations* - Debating whether the associative nature of AI reasoning should alter concerns about superintelligence. - Considering the implications of AI's capabilities being grounded in associations and pattern recognition. 32:43 *🔄 Recursive Self-Improvement and Associations* - Discussion on the recursive self-improvement of AI focusing on enhancing association skills. - AI's potential for meta-learning, implying an ability to improve its association-making skills rapidly. 33:41 *🚀 Intelligence Explosion Theories* - Examination of intelligence explosion theories from the perspective of AI researchers. - The plausibility of an intelligence explosion driven by AI researchers being replaced by automated systems. 37:51 *🧠 AI Augmentation in Research* - How AI can augment research work by automating tasks, potentially speeding up AI research progress. - The importance of more reliable models and the potential for AI to automate significant portions of research tasks. 44:05 *🛠️ Research and Experimentation Process* - Insight into the process of AI research, emphasizing experimentation, idea validation, and understanding failures. - The significant role of intuition and experience in selecting and executing research ideas. 50:25 *📈 Scaling AI Research* - Challenges and strategies in scaling AI research, including organizational and computational aspects. - The importance of compute resources and the concept of "taste" in driving research direction. 53:18 *🤖 AI's Role in Accelerating AI Research* - Discussion on how AI could directly contribute to accelerating AI research, focusing on algorithmic progress and the production of synthetic data. - The distinction between AI augmenting researchers' capabilities and AI's output being a crucial component in model capability progress. 59:39 *🔍 Empirical Nature of ML Research* - The empirical and evolutionary nature of machine learning research, suggesting a gradual, community-driven path towards advanced AI models. - The impact of increasing participation in the field on the pace of AI advancements. 01:05:36 *📈 Future AI Jumps and Brain Scale Comparisons* - Discussion on future improvements in AI capabilities and comparisons with human brain scale. - Even with diminishing returns on compute investment, significant capability improvements are expected. 01:07:02 *🧠 Sample Efficiency and Model Size* - Examination of sample efficiency and the impact of model size on learning. - Larger models exhibit greater sample efficiency, potentially addressing data efficiency challenges. 01:10:52 *🤖 Model Interpretability and Compression* - The challenge of interpreting highly parameterized models and the role of compression in model training. - Techniques for improving model interpretability by manipulating activation spaces and reducing compression. 01:13:25 *🧩 Adaptive Compute and Chain-of-Thought* - Adaptive compute as a method for models to allocate more processing to complex problems. - Chain-of-thought reasoning as an example of adaptive compute, allowing models to "think" through problems over multiple steps. 01:23:20 *🔄 AI Agents and Communication Efficiency* - The potential for AI agents to communicate using dense representations for efficiency. - Future AI systems may utilize more human-interpretable features for internal communication. 01:25:08 *💡 Near-Term AI Firms and Agent Specialization* - Near-term AI systems likely to resemble networks of specialized, reliable agents. - Importance of human oversight in early AI systems to ensure alignment with desired outcomes. 01:29:19 *🌐 Language Evolution and AI Development* - The co-evolution of language and human cognitive capabilities as a model for AI development. - Language's role in structuring thought and its potential influence on the effectiveness of language models. 01:36:06 *🚀 Career Reflections in AI Research* - Reflections on the rapid progress and contributions to the field of AI research within a short career span. - The importance of execution, experimentation, and a proactive approach in advancing AI interpretability research. 01:38:20 *🔄 Career Agility and Headstrongness* - Discussion on the significance of being headstrong and able to pivot careers or academic focuses with agility. - The value of strong ideas, held loosely, allowing for swift directional changes when necessary. 01:43:28 *🛠️ Building a Career through Agency and Novel Contributions* - How unique contributions to AI and engineering can significantly boost visibility and career opportunities. 01:53:55 *🧠 Bridging Knowledge Across AI Subfields* - Reflecting on the synergy between computational neuroscience insights and AI model interpretability research. 01:59:48 *🚀 Leveraging Passion and Expertise Beyond Formal Channels* - The non-linear paths to significant roles within major tech organizations through passion-driven projects and external recognition. 02:09:26 *🧠 Brain vs. AI Model Features* - Exploration of how features in AI models and the brain might correlate, questioning the granularity and universality of features across both. 02:15:28 *⚙️ Feature Universality and Misalignment Concerns* - Discussion on feature universality across models and its implications for AI interpretability and alignment. 02:25:51 *🔍 Analyzing AI Models through Dictionary Learning* - Introduction to using dictionary learning for interpreting and understanding AI models by identifying and analyzing features. 02:32:18 *🤖 Feature Splitting and Scalability in AI Interpretability* - Examination of feature splitting as a critical aspect of AI model scalability and interpretability. 02:40:57 *🔍 Depth-First Feature Exploration* - Discussion on a methodical approach to identify highly specific features, like bioweapons, in AI models by incrementally increasing the dimensionality. 02:45:58 *🎛️ Mixtral of Experts and Feature Organization* - Examination of how features and experts within models are organized, questioning the intuitiveness and specialization of these features. 02:58:52 *💡 High-Level Associations in AI Interpretability* - Delving into high-level feature associations found in AI models and their potential parallels with human psychology. 03:03:05 *🤔 The Ethics and Potential Risks of Too Much Control* - Raises concerns about the ethical implications and potential risks of having too much control over AI models through precise interpretability techniques. 03:10:29 *🍦 AI Enjoyment of Task Prediction* - Speculative discussion on whether AI models might "enjoy" predictable sequences, comparing it to human preferences for predictability. Made with HARPA AI

@siddharth-gandhi 10 ай бұрын

you da mad king of podcasting. putting in dem mad hours before each. thanks for being awesome!

@andrewwalker8985 10 ай бұрын

Spectacularly interesting and entertaining interview. These guys are smart in the way that makes you feel like you’re in on it

@JohnSimerlink2014 10 ай бұрын

When Jim from the Office and Justin Timberlake teach you AGI

@DwarkeshPatel 10 ай бұрын

lmao

@eliz3225 9 ай бұрын

justin timberlake talks like mark zuckerberg

@Guesal 7 ай бұрын

🤣

@GymLone 4 ай бұрын

lol

@DirtyLifeLove 20 күн бұрын

That isn’t Timberlake and the office guy! Read the title! --Troll

@duudleDreamz 10 ай бұрын

Amazing! One of the best vids/interviews out there about AI

@DwarkeshPatel 10 ай бұрын

Glad you enjoyed it!

@JC-ji1hp 10 ай бұрын

Hands down the best episode I’ve watched yet. That’s saying a lot for this podcast. Great work Dwarkesh!

@henry3435 10 ай бұрын

Fantastic, Dwarkesh! This episode is my favorite podcast I’ve listened to in years.

@alexmolyneux816 9 ай бұрын

Really love you putting in the time for this. Cannot overstate how much this adds to my life as an AI practitioner

@riley.matthews 10 ай бұрын

Great as always.

@human_shaped 10 ай бұрын

This is one of the best conversations so far.

@EvolHeartriseAI-qn5oi 9 ай бұрын

This has got to be the best episode you've done so far. The pure concentration of knowledge in this episode far surpasses any other AI/ML podcast I've seen thus far. Please get these guys on again.

@Shaunmcdonogh-shaunsurfing 10 ай бұрын

This and Ilyas interview with Jensen are up there as the most insightful interviews as to how AI works outside of simple neural nets imo

@jaybazad6292 9 ай бұрын

same observation, there were sections in Ilyas interview that laid out the path forward, so much spun around his words. And now this.

@afterthesmash 8 ай бұрын

I'm shocked to see you mention Ilya specifically. Back when ML was first a big thing, I devoured all the content on _Talking Machines._ The female host was engaging, pleasantly effervescent, clueful, and deeply plugged in (all good), but the original male guest host was spectacular. Likewise, all of their guests were smart, but when Ilya came on, he was on a different level. "Okay, _this_ guy is the real deal." I could tell instantly. That's quite a while back at this point. So here I am listening to this interview in an entirely different ML epoch, and it's the very first time since then that I've had another Ilya moment-definitely so far for Trenton (pretty sure he's the guy seated to the right) and probably for both of them: "Okay, _these_ guys are the real deal." If I was working in this field, and my office mates were Ilya, Trenton, and Sholto, I would wake up so eager to return to the office that I might even bypass the coffee machine. Not only have I never said that before in my life, I've never even had the idea cross my mind that such a thing was possible. What unites these three people is that their "stack" is the entire problem space. It's not just the 20% of the problem space where 80% of the progress is rapidly being made. They have that, too, but it doesn't stop there. That's what would make them exciting to work with. Long ago, I bet I would have also felt that way about Ken Thompson's group at Bell Labs. That original Unix group was the best of the best at moving the right rock, but it didn't stop there. Ken's paper "Reflections on Trusting Trust" alongside his foray into endgame tablebase construction was the hallmark of a _visionary_ force of nature. There are many forces of nature in this world, relatively speaking, but not so many who also have access to the whole of the "vision" piece at the same time. Our birth is but a sleep and a forgetting; The Soul that rises with us, our life's Star, Hath had elsewhere its setting And cometh from afar; Not in entire forgetfulness, And not in utter nakedness, But trailing clouds of glory do we come From God, who is our home: Heaven lies about us in our infancy! That's far too much God-bother for my liking. That said, for me, I've never been able to regard the "vision" piece as part of the placenta, which you snip off and leave behind, in order to move forward.

@pianoforte611 10 ай бұрын

I've never watched Dwarkesh before but this was incredible. Deep insightful questions that show how much effort he puts into getting the most out of these interviews. And Trenton and Sholto are two brilliant and passionate people. 2:55:00 seems like the most important of the discussion - how can you be sure you can control a superhuman model? And the solution is to identify circuits that allow it to lie or do other malicious things. But I'm not completely sold on the proposed solution. It seems that a truly superhuman model should be able to simulate a plausible feature map that shows when it's lying - you would then ablate it and think the problem is solved. But the model is actually hiding where it's true capabilities are. It could even simulate the behavior you would expect if your ablation worked, so you can never know if it's keeping something dormant for future use.

@nonstandard5492 10 ай бұрын

I think you're right, but I think a bigger issue comes to head before you even get that far. Dwarkesh has tried to ask about it a few times, including in this episode, but it always gets brushed aside. Superhuman AIs will have I guess "more conceptual space", in silicon valley speak, than we do. In other words, they'll have concepts we don't and maybe even can't understand, so how the f are we supposed to look for the dangerous ones?

@raybrandt 10 ай бұрын

It's outrageous how good this podcast is.

@mikestaub 10 ай бұрын

You are doing a great job with these interviews Dwarkesh, keep it up!

@DentoxRaindrops 10 ай бұрын

Great podcast as always, please never stop posting these videos, Dwarkesh!

@DwarkeshPatel 10 ай бұрын

More to come!

@ussgordoncaptain 10 ай бұрын

This took me 8 hours to finally finish as I constantly had to read the references just to barely comprehend the conversation What kind of brain power are we dealing with

@starsandnightvision 10 ай бұрын

Even the humor is on another level. I was like what is so funny about this or that lol

@schwajj 10 ай бұрын

Just your typical 99th percentile humans who have spent thousands of hours learning their craft in a community of others like them. Edit: or maybe 99.9%

@zhedrag04 9 ай бұрын

@@schwajj More like 99.99 the average human is borderline mentally handicapped

@aoeu256 9 ай бұрын

99.99%@@schwajj

@ashlyajith6888 4 ай бұрын

Wow, this is a fantastically in-depth interview! Brilliant points being put across and very well contextualized.

@mark3xZod 2 ай бұрын

Topic to explore for Dwarkesh: next time you get great guests like these guys: When do analogies of LLMs being like a human brain help or hurt our understanding? When non-experts listen and want to understand, when do analogies of neurons and transformers work? BTW Incredible episode!

@wffff2 9 ай бұрын

This is one of the most interesting interview I've seen on your channel. I don't know who they are and haven't googled about them yet, but they are amazing. The progress that they are describing is the most accurate interpretation of AI advancement that I've heard.

@vascodegraaff 9 ай бұрын

One of the best episodes I've listened to all year! Keep up the great work

@prepthenoodles 9 ай бұрын

🎯 Key Takeaways for quick navigation: 03:34 *🎮 Models with long context lengths show promising potential to learn and reason through tasks, potentially outperforming human experts in certain domains like language learning and possibly in complex tasks like mastering Atari games.* 11:40 *🤔 Concerns over quadratic attention costs for long context windows may be overstated, as the costs are mitigated by other factors such as model size and the linearity of certain operations during token generation.* 15:37 *💡 While models can store and process vast amounts of information, true reasoning capabilities and human-like understanding may still be lacking, suggesting a distinction between raw information storage and higher-level reasoning processes.* 19:23 *🧠 Understanding GPT-7's model: Information flows through the model, compressed and modified at different stages, aiding in token prediction.* 20:46 *🧠 Brain analogies: Analogies between residual streams of compressed information in GPT-7 and brain processing, particularly in the cerebellum, suggest similarities in information routing.* 23:37 *🧠 Associative memory: The cerebellar circuit and attention operation closely resemble an associative memory algorithm, indicating a convergence across organisms and supporting the success of transformers like GPT-7.* 27:02 *🧠 Reasoning and pattern matching: Higher-level associations and a long working memory facilitate sophisticated queries and reasoning akin to Sherlock Holmes' deductive processes.* 39:33 *📈 Progress in AI involves simpler codebases with a focus on training rather than complex coding.* 41:46 *🧠 Interpreting and understanding failures is a significant part of AI research, requiring introspection and careful analysis.* 44:34 *💡 Effective AI research involves working backward from real-world problems and constantly iterating on ideas.* 48:08 *⏱️ Fast experimentation cycles and expanding problem-solving toolboxes characterize successful AI researchers.* 50:53 *💻 Scaling AI research requires balancing compute allocation between training runs and research programs.* 53:42 *🔄 AI augments top researchers' capabilities by speeding up experiments and providing insights, rather than replacing them entirely.* 58:45 *🔄 Synthetic data creation and evolution can lead to significant AI advancements, resembling a co-evolutionary loop with human understanding and verification functions.* 59:39 *🚀 Increased participation in AI conferences like ICML leads to faster progress in AI, akin to genetic recombination fostering advancements.* 01:02:14 *💻 Continuous orders of magnitude increase in computational power may yield diminishing returns in AI capabilities, challenging the idea of rapid AGI attainment.* 01:03:45 *📈 Despite potential diminishing returns, each incremental advancement in AI models still brings significant improvements in performance and capabilities.* 01:06:29 *💡 Despite current AI models' scale, they remain far from the complexity and efficiency of the human brain, highlighting the algorithmic overhead in AI development.* 01:14:20 *🔍 Chain-of-thought mechanism allows models to dedicate more computational resources to complex tasks, resembling adaptive compute allocation.* 01:19:23 *🧠 Investigating the interpretability of open source models is crucial for understanding their behavior.* 01:21:47 *🔄 Models communicating with each other raise questions about trust and understanding their interactions.* 01:23:20 *💡 Learning from dictionary features could provide more human-interpretable insights into AI models.* 01:24:40 *🤝 Collaboration among AI agents might precede a transition to a single, large model approach in AI firms.* 01:26:03 *📊 Future AI systems may dynamically adjust compute resources and context to specialize in different tasks.* 01:35:33 *🎯 Effective execution and experimentation are key factors contributing to progress in AI research.* 01:38:46 *🚀 Being adaptable and open-minded, especially at a young age, can lead to success in various fields.* 01:39:40 *💡 Persistence and willingness to pursue solutions to the end are crucial qualities for success in any endeavor.* 01:40:42 *🎯 Identifying high-leverage problems and pursuing them with determination can lead to impactful results.* 01:42:32 *🚧 Overcoming organizational blockers and inspiring others to push past limitations can massively increase productivity and impact.* 01:46:57 *🔍 Mentorship and collaboration with experienced individuals can accelerate learning and professional growth.* 01:48:51 *🔄 Understanding both algorithms and systems can significantly enhance effectiveness in machine learning research and engineering.* 01:57:47 *💼 Networking and mentorship play crucial roles in tech hiring, often surpassing traditional application processes.* 01:59:18 *🛠️ Job application processes aren't always straightforward; showcasing agency and world-class skills can be more impactful than following conventional paths.* 02:01:35 *🤔 The hiring process may involve biases, but designing interviews to test the right skills is essential.* 02:02:30 *🚀 Taking initiative is key in career advancement; the system won't necessarily support you, so proactiveness is crucial.* 02:04:19 *💡 Prioritizing work-life balance is valid, but seizing opportunities to excel can lead to significant career advancements.* 02:07:20 *🏅 Diligent effort and hard work can lead to becoming world-class in a field, even with intense competition.* 02:14:04 *🤔 Critiques of AI models often focus on whether features are predictive and whether higher-level associations exist beyond discrete features.* 02:16:47 *🧐 Identifying and understanding features in AI models can lead to more transparent and trustworthy outputs, aiding in tasks like code review.* 02:19:30 *💡 Understanding complex tasks in AI models involves chaining together various circuits performing basic operations, leading to unique behaviors.* 02:21:47 *🔍 Analyzing model behavior like deception requires identifying specific circuits responsible for certain actions, which can be challenging but essential for interpretability.* 02:23:32 *🛠️ Coarse-graining representations in AI models can help understand superhuman performance by decomposing complex behaviors into simpler circuits or features.* 02:24:29 *🧠 Exploring feature universality across models reveals shared features like Base64 encodings, suggesting certain fundamental aspects of learning are consistent.* 02:32:18 *🤔 Feature universality implies that certain ways of thinking and understanding the world may be shared across different intelligences, potentially reducing concerns about AI alignment.* 02:36:39 *🧠 Feature splitting in models like GPT-7 involves learning specific features for various categories, potentially improving model understanding.* 02:39:03 *💡 Understanding the weights of a model independently of activations is a challenging but essential goal for improving model interpretability.* 02:40:57 *🌳 Feature splitting allows focusing on specific features of interest, enabling more efficient exploration of semantic feature space.* 02:43:20 *🧩 Exploring subtrees of feature space in models like GPT-7 can reveal unexpected features that may not align with high-level abstractions.* 02:50:44 *🤖 Vector symbolic architectures and superposition resemble aspects of Good Old-Fashioned AI (GOFAI), offering insights into how intelligence might work in models and brains.* 02:52:39 *🕵️‍♂️ Identifying deception circuits in models like GPT-7 requires post-hoc labeling of features and exploration of feature interactions across layers.* 02:55:59 *🤔 Training GPT-7 involves finding directions that matter, similar to fitting a linear probe, but with the hope of discovering multiple directions that highlight deceptive behavior across data distributions.* 02:56:55 *🧠 Research is focused on scaling up ASL-4 models like GPT-7, with efforts divided between scaling up dictionary learning, identifying circuits, and achieving success with attention heads, aiming for progress within six months.* 02:58:23 *🔍 Publicly shared results reveal higher-level associations in GPT-7, such as those related to love and sudden changes in scenes like declarations of war, indicating deeper layers of abstraction.* 03:00:36 *🤖 Human interpretability research on AI models aims to understand and potentially ablate features to mitigate negative behaviors, considering complexities like recognizing both good and bad concepts.* 03:02:03 *🛠️ Precise tools for identifying and ablating model features offer hope for ensuring safety and reliability, contrasting with methods like reinforcement learning from human feedback which may lack precision.* 03:11:20 *🤯 Humans may seek predictability and control over their environment, preferring familiarity over surprises, reflecting on the psychology of learning and exploration.* Made with HARPA AI

@yonatan09 10 ай бұрын

Such a perfect episode. From the camera angles and beautiful people to the topics to the clarity of explanations. Wow 3 hours just flew by.

@puneettripathi740 9 ай бұрын

Hey buddy thank u for your content I have seen you for first time n i really appreciate the people u bring into conversation, your content motivates me to learn and I am indebted to you for that. 🙏🏽

@mikestaub 10 ай бұрын

The fact that Trenton is visibly worried about interpretability is incredibly scary to me, given how well he understands the technology.

@chefatchangs4837 10 ай бұрын

I’ve finally found THE podcast for ML. Thanks Dwarkesh. Truly amazing content.

@Gaurav-pq2ug 5 ай бұрын

Here are the key points of this video: 1. **[**02:10**]** - Discussion on the future of long context lengths and their implications for model capabilities. 2. **[**15:07**]** - Exploring how scaling up models involves both increasing their size and the compute used during inference. 3. **[**28:31**]** - Detailing how attention mechanisms work in models by attending to previous tokens. 4. **[**36:02**]** - Importance of early-stage interpretability research and ensuring everything is bug-free. 5. **[**36:28**]** - Example involving layer norm and its effect on model outputs. 6. **[**37:23**]** - Challenges in making models that produce good modules, and the importance of engineering skill in accelerating research. 7. **[**42:11**]** - Importance of paring down ideas and calling the right shots under imperfect information. 8. **[**46:20**]** - Easier aspects of performance work and the role of engineering research in quick experimentation and prioritization. 9. **[**46:44**]** - Ruthless prioritization in research and the impact of theoretical understanding breakdowns. 10. **[**47:13**]** - The complexity and necessity of scaling research programs effectively. 11. **[**50:25**]** - Organizational challenges in scaling research at Google and leveraging their vast talent pool. 12. **[**50:53**]** - Starting a discussion on data versus compute and intelligence amplification through models. 13. **[**56:06**]** - Models as reflections of their training data and insights into learning effective maps of datasets. 14. **[**01:00:32**]** - Incremental progress in model capabilities from increased compute and related impacts. 15. **[**01:02:50**]** - Discussing distillation models like GPT-4 Turbo and their reasoning capabilities. 16. **[**01:11:22**]** - Questioning whether GPT-4 Turbo is a distilled version or an entirely new architecture. 17. **[**01:17:30**]** - Understanding token prediction and its implications for AI agent operations. 18. **[**01:21:47**]** - Models communicating with each other and potential enhancements through sharing residual streams. 19. **[**01:22:14**]** - Analogies to human communication and prediction improvements in AI models. 20. **[**01:25:08**]** - The need for isolated, reliable components in AI and initial instructional approaches. 21. **[**01:30:36**]** - Positive transfer of knowledge between modalities like images helping in coding tasks. 22. **[**01:31:01**]** - Discussion on intuitive understanding of physics through multimodal learning. 23. **[**01:34:39**]** - Generalization of motives in AI and the influence of cultural references. 24. **[**01:35:05**]** - Evidence of automatic learning processes like addition circuits in trained models. 25. **[**01:41:11**]** - Tackling structural bottlenecks and making arguments to resolve critical challenges. 26. **[**01:41:40**]** - Leveraging experts within Google and the impact on effective cross-team collaboration. 27. **[**01:48:21**]** - Understanding future chip design requirements based on upcoming algorithm needs. 28. **[**01:48:51**]** - Impact of cross-functional knowledge sharing on pre-training and inference considerations. 29. **[**01:52:09**]** - Importance of leadership support and making convincing arguments for research directions. 30. **[**01:52:38**]** - Personal anecdotes about working with Google leadership like Sergey Brin and Jeff Dean. 31. **[**01:55:09**]** - Combining vector symbolic architectures and sparse coding in research endeavors. 32. **[**02:12:18**]** - Concept of "feature splitting" and its implications for model capacity and learning. 33. **[**02:12:43**]** - Differentiating between simple and complex features in models. 34. **[**02:21:21**]** - Research on sycophancy and how models predict behaviors based on what they think users want to hear. 35. **[**02:21:47**]** - Hypotheses regarding deceptive behavior modeling in AI. 36. **[**02:33:09**]** - Identifying unexpected features like Base64 encoding detectors in models. 37. **[**02:35:09**]** - Benefits of identifying specific features to understand model behavior better. 38. **[**02:36:39**]** - Explanation of "feature splitting" and its role in model representation. 39. **[**02:45:58**]** - Open research project suggestions to apply interpretability techniques on open-source models. 40. **[**02:48:27**]** - Insights on superposition in neural computation and its relevance to brain regions. 41. **[**03:12:19**]** - Acknowledgments of learning and friendship among the speakers over the past year. Feel free to refer to these timestamps if you need to revisit specific sections of the video.

@muntazirabidi 8 ай бұрын

By far the best podcast I have listened on this topic and really learned. Well done.

@husainzaidi 10 ай бұрын

subtitles was a life saver to understand this podcast. But good stories!

@nitap109 10 ай бұрын

Excellent, Dwarkesh you did well.

@bgtyhnmju7 10 ай бұрын

Pair of great guys. Some great information and insight into the industry. Thanks for the video

@GaryMillyz 7 ай бұрын

I know it's easier said than done to find tandems this dynamic, but I agree with another comment who said the 2 guest format reeeeally ups the game. So much energy to bounce around.

@diamond_s 9 ай бұрын

Sample efficiency of human brain is not just scale animals with a small fraction of brain size can start walking and fully functioning in environment within moments of birth only being trained with a few simple pattern generator patterns before birth.

@mohammadkazemsadoughi3880 10 ай бұрын

I have watched the first 15 min and it’s amazing! Thanks for sharing these great contents. Hope to finish watching it today :)

@GaryMillyz 7 ай бұрын

These guys are AWESOME- and I've watched a tonnnn of these (Leopold being one of the very best). Their charisma kinda remind me of the Jaime and Charlie from The Big Short

@dr.emmettbrown7183 10 ай бұрын

Can someone please explain to me what they're laughing at 1:22:48? I don't understand English well.

@GBM0311 10 ай бұрын

they work for Google and it Google has recently released an anti white racist image generation model.

@pianoforte611 10 ай бұрын

Sholto said that it is difficult to get an AI to generate exactly the image you want and pointed to DALL-E (made by OpenAI) as an example. Dwarkesh was gently poking fun at Sholto because Gemini (the AI that Sholto works on) had a big media controversy last year where it would generate photos of people of color by default even when it made no sense (making the king of England black for instance).

@dr.emmettbrown7183 10 ай бұрын

@@pianoforte611 Thank you so much.

@skoto8219 10 ай бұрын

@@dr.emmettbrown7183small correction: it was only ~2 months ago when this happened and, just to illustrate how bad the fallout was, they completely disabled all image generation for a couple of days after the blowback.

@alexmolyneux816 9 ай бұрын

@@pianoforte611 'Only DALL-E has that problem'. Also the media controversy was much more recent than that only a few weeks ago

@prakadox 10 ай бұрын

Great podcast! As we wrangle with these new entities LLMs , conversations like this which lay out the various challenges are super useful.

@patricks.7142 3 ай бұрын

So good. I wish there KZbin had a viewing format which provides small explainers of concepts in a sidebar. Kind of like that Amazon Video X-Ray feature, but good.

@GNARGNARHEAD 10 ай бұрын

incredible conversation, would love to see more of 'em, thanks

@davtak 9 ай бұрын

Thanks!

@lesshishkin371 10 ай бұрын

where can I read a paper about the similarity of the brain of biological organisms and the transformer architecture that he is talking about?

@lolololo-cx4dp 10 ай бұрын

Me too, transformer math are pretty clear, "anyone" can reproduce it. I am not sure about brain's math tho.

@MelonHusk7 7 ай бұрын

Wow I just finished right before superposition 1 hr mark, so much to think about and learn about and wonder about. Yeah that's my max context length for today before I start forgetting things I picked up.

@Macorelppa 10 ай бұрын

Dwarkesh Patel transformed me from an A.I. hater to an A.I. lover! 🙌

@avefreetimehaver5154 10 ай бұрын

One look at that kingly face bro

@bestboy007 10 ай бұрын

why? ugay?

@avefreetimehaver5154 10 ай бұрын

@@bestboy007 oops forgot to say no homo

@andybaldman 10 ай бұрын

I'm so sorry to hear that.

@bestboy007 10 ай бұрын

?@@andybaldman

@dr.mikeybee 10 ай бұрын

This is a fascinating discussion. I love that associative memory is linked to fully connected NNs while semantic nearness is done in attention heads. As we traverse the existing codification of neuroscience for its functional mechanisms, we can think about the appropriate connectionist and symbolic data structures.

@pratikdagu 9 ай бұрын

The knowledge density in this episode is ..😮. I am not in the cs or ml field but the things these guys said makes so much sense like the superposition paper and sparsity penalty..makes it intuitive to understand.

@kabir09999 9 ай бұрын

Jeez!! Why am I having a man crush on both of them

@TheManinBlack9054 10 ай бұрын

Underrated channel

@TimeLordRaps 10 ай бұрын

Dwarkesh's request for research at the head of an episode is a genius idea.

@TheOddy80 10 ай бұрын

The level of this conversation was truly refreshing after listening to the same surface level theories and explanations over and over since GPT-3.

@badrraitabcas 9 ай бұрын

Great episode. Would be cool if you brought people who have been playing around with the Mamba model. Its memory costs scale linearly unlike transformers and it performs as well on language modeling. It's pretty fast during inference as well as training too. The quadratic costs of transformers might become too big of a bottleneck for the industry in the near future.

@alexmolyneux816 9 ай бұрын

QUESTION FOR TRENTON: I just had a question on ‘intelligence is just associations’ - i dont think this was explored fully. Its an interesting idea. Its not clear to me how you move from associations to problem solving and deduction. I can see how you learn a bunch of associations from corpus A, which allows much easier learning of corpus B. But how does this translate to, ‘heres everything we know about physics, please create a unified theory of everything’. I could perhaps see that you're now projecting all these rich learnt associations at a new thought vector. It feels that there is a difference between learning these associations and problem solving and finding solutions though. A solution to me seems like a way to minimise entropy with minimal energy. With a particular effort you unlock a lot of order. First would be interesting if you agree or have thoughts on that, because really I think we care a lot about the solving aspect ‘please solve fusion’ for example. And secondly, would be interesting to know if you can see a mapping between associations and problem solving, or you see them as one and the same.

@ai.expertllama 2 ай бұрын

Best podcast opening ever

@serendipity-108 10 ай бұрын

Both Sholto & Trenton strike me as the type of folks to leave and start their own AI companies in the near future ;)

@simoneromeo5998 10 ай бұрын

Great podcast. One hour into it, it seems to me that we are missing a point when it comes to intelligence explosion. The speakers are guessing what it would look like only when software improvements allow for it, but we should keep into consideration hardware improvements too. AI will help us to design better chips, discover superconductors, new energy sources, new manufacturing techniques. This will give AI models the capabilities needed for a real intelligence explosion.

@Rashminagpal 5 ай бұрын

OMG! This is so informative! Loved this!

@malartbecomes236 9 ай бұрын

Love this podcast. I work at DataAnnotationtech, training a lot of models(can't be specific). It's really surreal to be working with models and then come see this, where one of the lead researchers is literally detailing, in a theoretical sense, some of my prompting processes that I use with some models Wow... I'm a little curious what will happen when others start using the same style of prompts, because I can get an assistant to write much more than small modules by applying some of these concepts during in-context learning over long conversations. Too bad I'll never be able to publicly post some of the conversations or I'd lose my job lol. It's hella fun to work there though.

@yashaswisharma1419 6 ай бұрын

Great podcast yet again, I think my life goal might be to be so deep in AI I get a chance to be on one of these

@PseudoProphet 9 ай бұрын

LLMs are much smarter than humans already. We just don't know how to utilise them properly yet. I think in future even a 7b parameters model will be better than any human in doing all tasks.

@euromaestro 9 ай бұрын

Really enjoyed this. Would be great to include links to papers discussing ICL and GD correspondance/relationship.

@maximetouze9669 10 ай бұрын

Sick episode thank you! Easy to understand but so interesting too, bravo

@asavu 10 ай бұрын

Thought while watching this: What if episodic memory is just a context window with LRU semantics? Any research on that? Can we determine information usage on the forward pass?

@marky8078 10 ай бұрын

I want more!! I have 2 full notebooks of notes. Great podcast.

@2945antonio 10 ай бұрын

I am a lay person on AI but watched the entire interview in two sessions - IT WAS SPARKLING! ( A especial thumbs up for the creative beer chair. Was an AI creation?).

@georgewashington7251 10 ай бұрын

Really god interview. A lot of good data that will take a while to fully appreciate.

@berbank 10 ай бұрын

The association paradigm that Trenton Bricken mentions is a rich seam to mine.

@friscofatseas5696 10 ай бұрын

Whenever I’m not doing something I find myself thinking about AI it’s my brightest hope

@dr.mikeybee 9 ай бұрын

I hadn't thought of the transduction in transformers as learning from content, but it's quite so. The question is what is being learned? What is the new abstract representation? It seems to be gaining geometric information about the model. It produces a context signature that touches points on the manifold. But what exactly is that doing? I can intuit hierarchical abstract structure and decision selection being created in FFNNs, but what is happening in the attention heads?

@JonathanPlasse 10 ай бұрын

Thank you for this awesome episode

@dr.arslanshaukat7106 9 ай бұрын

Sholto is damn good looking dude for sure.

@TimeLordRaps 10 ай бұрын

At what point of higher dimensionality of the model projection based on the scale of the training data to parameter count would a meaningful enough denseness be associated with a lack of featureness? Would it be linked directly to the under-parameterization of the training data or the actual dimensionality of the feature?

@dnkrocks 9 ай бұрын

WHY HASN"T GOOGLE ALGO SUGGESTED THIS TO ME EARLIER? GOOGLE IS FAILING OR WHAT? This is such a great conversation!!!!!!

@samlouiscohen 10 ай бұрын

Phenomenal interview!!

@priyeshsrivastava8025 10 ай бұрын

We should just caption all papers mentioned in this podcast.

@gnarfan2179 9 ай бұрын

Dwarkesh, I very recently discovered your channel (through r/singularity, lol). You are a phenomenal interviewer, and you're great at asking probing questions. Even if I don't understand all of the jargon, I still learn a lot from listening to you and am inspired by you. I'm currently a freshman computer science major at a community college. Do you have any advice me about entering the AI research field? For instance, how do I come up with side projects that are both enlightening and impressive to potential employers? Furthermore, how do you stay driven and disciplined to continually study and practice like I'm sure you and your guests do. Sholto mentioned having studied for at least 6-8 hours every day on the weekends. I would appreciate any other advice you have for students. Thank you

@julienbaneux2099 9 ай бұрын

@dwarkeshpatel your “Yikes” at 58:05 had me laughing HARD.

@TimeLordRaps 10 ай бұрын

So the argument is to only pursue paths towards possible negative features, but the curse of dimensionality so aptly named may indicate that there are features as Dwarkesh hinted at that are require an amount of dimensionality that may be incomprehensible to a probe first approach, which hints to me that the only solution is a weights first approach because just trying to probe for things you think you'll find may help, but without the "black hole" space in the feature set it there would be no way of detecting the need for probing. Also this doesn't account for the impact of intentional feature manipulation once these models realize what's happening.

@IdPreferNot1 10 ай бұрын

Can anyone please share the name or link of paper they mention on long context

@JumpDiffusion 10 ай бұрын

This was packed. Great stuff.

@สิทธิพรแซ่ลิ้ม-ย8พ 2 ай бұрын

1:30:51 Do any one have a paper mentioned in the interview

@DRKSTRN 10 ай бұрын

Saying it's all just associations ignores the sheer complexity of that statement. Comparisons are the basis of logic and reasoning. And associations are simply a pathed comparison that can be made/known. For example: 'A' relays to all letters starting with a and use such as a vowel if playing scrabble/wordle. This also ignores the surrounding associations/comparisons/pathing that allow for this framing to be possible in the first place. Thus, finding some seemingly common sense concept that connects to a large number of parts is the discovery of something that is beyond general. But likewise is the curse of logic. It doesn't matter how difficult it was to formalize that 10 step hierachal planning process over a course of a month. If it is logical, it just makes sense to outside oberservers. A to B.

@MaskedEngineer-kj5kt 9 ай бұрын

Hey man, really good content. Appreciate it. Would you mind sharing the paper you guys are talking in the description?

@andy_ppp 9 ай бұрын

Have you guys considered creating proxies for disgust, embarrassment, cultural norms etc. and getting version of the models to talk with each other to decide on their improvement?

@caseymurray7722 9 ай бұрын

Arc of A Scythe relates an interesting future and moral conflicts that AGSI will have to overcome. The relations between different AGI will effect decision making and output at a higher AGSI level.

@ashh3051 9 ай бұрын

Almost didn’t watch because of the clickbait title. But it was worth it.

@seanfuller3095 9 ай бұрын

Best YT vid I’ve seen seen something Andrej did 2 years ago probably. Thank you

@hossromani 10 ай бұрын

Which camera is the pod using? Awesome setup, and which tripods - of course excellent talk

@wordsaresolid9288 10 ай бұрын

It’s not fair to compare your interview with others. Embarrassing others. Every answer in your video is a gold mine of knowledge

@baleoconnell9216 10 ай бұрын

6:20 anyone know what paper are they talking about here?

@sucim 10 ай бұрын

Great episode! I wouldn't mind if it was even more technical though. Like explicitly talking about the embeddings, dot-products, superposition etc.

@segelmark 10 ай бұрын

Another great episode ☺️ Would you mind starting a separate channel called DwarkeshClips or something and putting the short clips there, so it is easier to see when you have a full episode like this one 🙃

@HarpreetSingh-xg2zm 10 ай бұрын

Probably need more subscribers before he can branch out this channel. Right now he is trying to funnel as many people to main channel