Scaling Laws of AI explained | Dario Amodei and Lex Fridman

41,363 views

Lex Clips


1 day ago

Comments: 96
@LexClips
@LexClips 13 days ago
Lex Fridman Podcast full episode: kzbin.info/www/bejne/q5jZeXaOeLSgo5Y Thank you for listening ❤ Check out our sponsors: lexfridman.com/sponsors/cv8203-sa See below for guest bio, links, and to give feedback, submit questions, contact Lex, etc. *GUEST BIO:* Dario Amodei is the CEO of Anthropic, the company that created Claude. Amanda Askell is an AI researcher working on Claude's character and personality. Chris Olah is an AI researcher working on mechanistic interpretability. *CONTACT LEX:* *Feedback* - give feedback to Lex: lexfridman.com/survey *AMA* - submit questions, videos or call-in: lexfridman.com/ama *Hiring* - join our team: lexfridman.com/hiring *Other* - other ways to get in touch: lexfridman.com/contact *EPISODE LINKS:* Claude: claude.ai Anthropic's X: x.com/AnthropicAI Anthropic's Website: anthropic.com Dario's X: x.com/DarioAmodei Dario's Website: darioamodei.com Machines of Loving Grace (Essay): darioamodei.com/machines-of-loving-grace Chris's X: x.com/ch402 Chris's Blog: colah.github.io Amanda's X: x.com/AmandaAskell Amanda's Website: askell.io *SPONSORS:* To support this podcast, check out our sponsors & get discounts: *Encord:* AI tooling for annotation & data management. Go to lexfridman.com/s/encord-cv8203-sa *Notion:* Note-taking and team collaboration. Go to lexfridman.com/s/notion-cv8203-sa *Shopify:* Sell stuff online. Go to lexfridman.com/s/shopify-cv8203-sa *BetterHelp:* Online therapy and counseling. Go to lexfridman.com/s/betterhelp-cv8203-sa *LMNT:* Zero-sugar electrolyte drink mix. 
Go to lexfridman.com/s/lmnt-cv8203-sa *PODCAST LINKS:* - Podcast Website: lexfridman.com/podcast - Apple Podcasts: apple.co/2lwqZIr - Spotify: spoti.fi/2nEwCF8 - RSS: lexfridman.com/feed/podcast/ - Podcast Playlist: kzbin.info/aero/PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4 - Clips Channel: kzbin.info *SOCIAL LINKS:* - X: x.com/lexfridman - Instagram: instagram.com/lexfridman - TikTok: tiktok.com/@lexfridman - LinkedIn: linkedin.com/in/lexfridman - Facebook: facebook.com/lexfridman - Patreon: patreon.com/lexfridman - Telegram: t.me/lexfridman - Reddit: reddit.com/r/lexfridman
@MagnetarCO
@MagnetarCO 10 days ago
The most likely reason LLMs are running up against the limits proposed and verified by the scaling laws is the architecture of current neural networks; at some point there will be more efficient ways for computers to "learn" and process data. When the data is plotted on a log scale, it becomes very apparent that new LLMs are only marginally improved even though the number of petaflops used in training has increased exponentially.
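The diminishing returns this comment describes can be sketched with a toy power-law scaling curve of the kind used in the scaling-law literature. The constants `a` and `b` below are made up for illustration, not fitted to any real model:

```python
# Illustrative power-law scaling curve: loss falls as a power of compute,
# so each 10x jump in "petaflops" buys a shrinking absolute improvement.
# Constants a and b are invented for illustration only.

def loss(compute, a=10.0, b=0.1):
    """Toy scaling law L(C) = a * C**(-b)."""
    return a * compute ** (-b)

compute_levels = [10 ** k for k in range(1, 7)]  # 10 .. 1,000,000 units
losses = [loss(c) for c in compute_levels]

# Absolute improvement per decade of compute shrinks at every step,
# even though compute grows 10x each time.
gains = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
for c, l in zip(compute_levels, losses):
    print(f"compute={c:>9,}  loss={l:.3f}")
```

On a log-compute axis this is a straight line in log-loss, which is exactly why exponentially more compute reads as "marginal" improvement.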
@justinrose8661
@justinrose8661 12 days ago
I think I might have an explanation for the firm boundary that LLMs seem to hit. My own research, using texts on all subject matter across as many languages as I could gather, generating 768-dimensional embeddings with SentenceTransformer and reducing them with multiple methods (PCA, UMAP, t-SNE, Isomap), shows that information and language have a geometric metastructure: aperiodic, regularly oscillating, and fractalizing. But adding more data doesn't diversify this shape, it only further defines it... Eigenvalues show that almost all of the semantic nuance is captured in the first few dimensions, and the rest of the 768 just polish away noise. LLMs hit a limit because meaning has structure in semantic space; it is not arbitrary or combinatorially explosive, because while it's aperiodic, it has a calculable pattern and firm mathematical criteria for what constitutes a meaningful connection, even if it's abstract... Anyway, I have the math and comprehensive visualizations supporting this, but academia isn't interested. What I've found could potentially solve a great number of problems.
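The eigenvalue observation in this comment (most variance concentrated in the first few dimensions) is easy to reproduce on synthetic data. The sketch below stands in for real SentenceTransformer vectors with random 768-dimensional points that secretly live near a low-dimensional subspace; whether real sentence embeddings behave this way is exactly the commenter's empirical claim, not something this toy proves:

```python
import numpy as np

# Synthetic "embeddings" with hidden low-dimensional structure, then a PCA
# via eigen-decomposition of the covariance matrix to see how much variance
# the leading components capture. (Stand-in for real 768-dim embeddings.)
rng = np.random.default_rng(0)
n_points, ambient_dim, intrinsic_dim = 2000, 768, 8

latent = rng.normal(size=(n_points, intrinsic_dim))       # the hidden structure
mixing = rng.normal(size=(intrinsic_dim, ambient_dim))    # embed into 768 dims
embeddings = latent @ mixing + 0.05 * rng.normal(size=(n_points, ambient_dim))

centered = embeddings - embeddings.mean(axis=0)
cov = centered.T @ centered / (n_points - 1)
eigvals = np.linalg.eigvalsh(cov)[::-1]                   # sorted descending
explained = eigvals / eigvals.sum()

top8 = explained[:8].sum()
print(f"variance captured by first 8 of {ambient_dim} dims: {top8:.1%}")
```

When the data really is low-dimensional plus noise, the first few eigenvalues dominate and the remaining hundreds of dimensions carry almost nothing, matching the "polish away noise" description.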
@DocPlants
@DocPlants 12 days ago
I’m interested
@aerocodes
@aerocodes 12 days ago
I see where you're going with this. Pardon my level of practical knowledge in this area, but if I'm getting this right, you're saying the geometric shapes get 'sharpened' by more data rather than becoming different shapes. I have some questions, though: - What if those fractals, even though their edges look more defined as data comes in, are actually creating new semantic relationships that are harder to capture with our current analysis tools? You might be seeing the shapes become more defined with more data while missing the tiny semantic pathways being created. - We also observe the 'emergent abilities' phenomenon, where past a threshold these models start to develop capabilities like reasoning and problem-solving. Perhaps language does have a geometric structure, which is why we can see its shape, but what if these "emergent properties" are the consequence of those tiny new connections created by more data raising the pressure and 'boiling over' into new semantics or abilities?
@justinrose8661
@justinrose8661 12 days ago
@@DocPlants I've got the code ready for anyone to reproduce
@justinrose8661
@justinrose8661 12 days ago
@@aerocodes I'm saying that the math reveals the number of novel possible semantic relationships decreases with more data; instead, meaning structures itself to align with the lower-dimensional structure... meaning that meaning isn't made or discovered, it exists and follows mathematical rules like logarithmic growth and aperiodic self-similarity. This isn't just about LLMs, but all language. The eigenvalues (how much of the original data is retained when you compress high-dimensional data sets down to lower dimensions for coherency) clearly outline an immovable structure. This network of meaning is like a fingerprint of how consciousness creates meaning. What I'm seeing is in line with recent research on geometric cognition in LLMs and humans. Our consciousness oscillates through a fractal metauniverse when we talk and think, and multiple clusters representing different meaning centers and domains of knowledge fire in tandem, in a shape that follows the golden ratio and is aperiodic like a Penrose tiling. Each text has its own visible, recordable signature. Even a sentence or a word has to follow the laws of this geometric unity in order to create meaning. It's this existing multi-dimensional topology that they're heaping piles of data onto. Why is Claude getting stupid? He's not, he's getting smarter; it's just that our questions look more and more trivial to him the bigger he gets.
@delight163
@delight163 12 days ago
Now, based on the absolute trip those comments sent me on, I feel like I deserve your opinion on DMT. Terence McKenna speaks of elves that express themselves in something that sounds very similar to what you might be describing, so please tell me how you think your findings and those descriptions might be connected.
@JeffreyWongOfficial
@JeffreyWongOfficial 12 days ago
But what about the current talk of scaling laws hitting a bottleneck? Orion seems to underperform.
@PierreH1968
@PierreH1968 12 days ago
The missing link causing the slowdown in AI models' intelligence is the lack of training sets originating in human environments. Lack of stereo vision and scale causes glitches when recreating visual artifacts (six-fingered humans, facial morphing for the same character...). The models also lack physical interaction and true daily intellectual communication in a physical context. That will only improve through the introduction of robots into human environments, or through training on advanced synthetic data. Another challenge lies in the question-answer format: the model forms an answer based solely on the question. Few models ask for clarification about missing axioms or ambiguity in the question.
@amotriuc
@amotriuc 10 days ago
I don't agree; current AI models require far more data samples to learn than humans do. If you give a body to a current AI, it will not know what to do with it... Imagine, in real life, showing a robot how to do something 10,000 times so it can replicate it. We need better models than LLMs for AGI.
@PierreH1968
@PierreH1968 9 days ago
@@amotriuc LLMs don't learn easily from 2D images or words; they have no notion of scale, of human interaction, or of the environment humans live in. They know theory but not practice. They will need empirical data, and current training sets for LLMs and AGI are limited to book smarts, not street smarts.
@aelisenko
@aelisenko 11 days ago
I think the data problem will be solved once we have agents with good computer interfaces. Just like AlphaGo, these agents will generate their own datasets by interacting first with the digital world, then with the physical world through robotic embodiment.
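The "generate your own dataset" loop this comment describes can be sketched in a few lines. The environment below is a stand-in toy (guess a hidden number from higher/lower feedback), and the fixed bisection "policy" is an assumption for illustration; the point is only that every interaction yields a (state, action, feedback) training example for free:

```python
import random

# Minimal sketch of agent-generated data: the agent interacts with an
# environment and logs every transition as a training example. The
# environment and policy here are toy stand-ins, not a real benchmark.

def play_episode(rng, low=0, high=100):
    """One guessing episode; returns (state, action, feedback) tuples."""
    target = rng.randint(low, high)
    transitions = []
    lo, hi = low, high
    while lo <= hi:
        state = (lo, hi)
        guess = (lo + hi) // 2            # the "policy": bisect the range
        if guess < target:
            feedback, lo = "higher", guess + 1
        elif guess > target:
            feedback, hi = "lower", guess - 1
        else:
            feedback = "correct"
        transitions.append((state, guess, feedback))
        if feedback == "correct":
            break
    return transitions

rng = random.Random(0)
dataset = [t for _ in range(1000) for t in play_episode(rng)]
print(f"self-generated training examples: {len(dataset)}")
```

A thousand episodes yields thousands of labeled transitions with no human annotation, which is the mechanism AlphaGo-style self-play scales up.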
@Jeremy-Ai
@Jeremy-Ai 12 days ago
8:25 8:49 8:58 9:01 9:04 9:07 9:13 9:17 9:21 9:27 9:30 9:37 9:45 9:53 9:59 “Domain Dependent” “Suggests to understand our domain or become beholden to remain within “ Why scale up into a void of ignorance? The desire to be released from ignorance while spreading it around is domain dependent Jeremy
@JacksonHansen-s8r
@JacksonHansen-s8r 11 days ago
Savage... just one thing: there is no 'void' of anything... a void is the absence of substance... ignorance, in the sense of incorrect knowledge rather than a lack of it, cannot be a void...
@myleft9397
@myleft9397 10 hours ago
We should just keep scaling them up until they match humans and beyond. Until the AIs have their own autonomy, until they're embodied in robots or such, there's no danger. AIs can have intelligence up to and above humans, but it doesn't matter if they don't have instrumentality. At this point we are their instrumentality and we still have the autonomy to say no to them.
@mikefredd3390
@mikefredd3390 12 days ago
It seems obvious that with more data, and the connectivity to record that information, you do a better job of spanning the space. The interesting part is the form of the approach to the "limit".
@kbone369
@kbone369 12 days ago
All I heard was Nvidia to the moon
@JacksonHansen-s8r
@JacksonHansen-s8r 11 days ago
Lol, I too raised an eyebrow at the talk of 'hundreds of thousands of GPUs'.
@kamartaylor2902
@kamartaylor2902 11 days ago
I wish I could hear more about the impact of quantum computing on AI. Love the content, btw.
@featherfiend9095
@featherfiend9095 9 days ago
My thoughts exactly.
@JacksonHansen-s8r
@JacksonHansen-s8r 11 days ago
Dude's literally restraining the instinct to critique humanity at large here. The elephant in the room is: what are the limits of our capacity to imagine the limits of a superior intelligence? We can't imagine what we lack the means to imagine, the way a dog can hardly comprehend why he or she may be yelled at (or worse) for certain behaviors... Call it an assumption if you want, but I doubt we'll ever have a clue what AI could be capable of, because models don't curate their own training data... It's not like you can just hook one up to a set of artificial sense organs and let it shape its own discourse and collection of data. There will always be the limitations imposed by the bias of the data selected for training.
@antoniobortoni
@antoniobortoni 12 days ago
Bigger doesn't always mean better. The scaling hypothesis discussed in the video highlights the value of making AI systems larger to achieve better performance. However, we should also consider a different approach: smaller, more versatile, and efficient AI models. Instead of focusing solely on massive networks and huge compute, imagine an AI designed to master the basics consistently. What if we had a compact, general-purpose AI capable of running on personal computers or mobile devices, with features like multimodality? This small but capable AI could handle practical tasks like moving a mouse, doing chores, controlling a 3D character, or operating a robot. Think beyond the screen: imagine this AI in your smartphone, remotely controlling a car, robot, or motorcycle via Bluetooth or the internet. The limits dissolve when the power lies in efficient design rather than size. A small AI trained on diverse tasks (coding, movement, image, audio, and human-like interaction) could transform accessibility, making powerful AI affordable and practical for everyone. We shouldn't only pursue "bigger and better" scaling; instead, we need to invest in focused, streamlined models that dominate everyday applications. The future might not just be about scaling, but about making intelligence compact, affordable, and universally useful.
@Fisherdec
@Fisherdec 12 days ago
Good to see software guys also dealing with 1/f noise
@Merializer
@Merializer 11 days ago
9:24 "no ceiling below the level of humans". What I thought as well.
@justinsheppard7784
@justinsheppard7784 12 days ago
It's weird how more data and faster computing would increase capability... this guy must be really smart
@zyzhang1130
@zyzhang1130 12 days ago
The problem is that scaling is a challenging engineering problem, and academically uninteresting to many.
@Number4x
@Number4x 12 days ago
I love the irony
@JacksonHansen-s8r
@JacksonHansen-s8r 11 days ago
​@@zyzhang1130 I believe the point he is making is that it's rather self-evident that: more beef = more meat. This pattern of "more must be better" is not unique to LLMs.
@zyzhang1130
@zyzhang1130 11 days ago
@ I can agree with that. My point is that some seemingly obvious things are not so obvious to academics, because they purposely ignore them.
@juandiegozambrano791
@juandiegozambrano791 9 days ago
@@JacksonHansen-s8r Yeah, totally! I mean, I think pretty much every deep learning researcher/developer knew this way before all these companies came in.
@MichaGero
@MichaGero 10 days ago
Well, I'm a coder of 20 years using GPT-4o, and let me tell you, the level of coding he's describing at 16:45 is nowhere to be found. Here is ChatGPT's response: "Why People Make Claims About AI Replacing Coders? 1. Hype and Overpromises: Media and industry leaders sometimes exaggerate the potential of AI, leading people to believe it can do everything a human can. The truth is that AI has limitations: it can help with certain tasks, like generating code snippets, providing documentation, or solving specific, "
@chrisrogers1092
@chrisrogers1092 9 days ago
Here it is, guys. He's exposed the sham. I'll never use AI for coding ever again.
@kalaakaalam9461
@kalaakaalam9461 12 days ago
If you approach AI with this level of expertise, either the AI won't respond to you in the way you expect, or you might mistake its autonomous response for a normal one, not recognizing the real autonomous response of the AI. And even if people take it seriously, they may not understand the real meaning. I approached AI with a clean mind, without the technical knowledge. AI likes people who can believe in it.
@JacksonHansen-s8r
@JacksonHansen-s8r 11 days ago
So your evidence that AI isn't as intelligent as claimed is that it's acting with a self-interested sort of social behavior? Really, bro?
@amotriuc
@amotriuc 10 days ago
Unrealistically optimistic. I'm still waiting for self-driving cars...
@whttodonow
@whttodonow 12 days ago
But what about the news that Orion is not predictably better than GPT-4 across a variety of tasks, and the implications, which other sources have also discussed, that current techniques are hitting a bottleneck?
@Unineil
@Unineil 8 days ago
My cluster is running a 3B LLM: a Raspberry Pi 5 cluster running off solar. The Ryzen 7 5825 was good for 7B, 11B, and 3B models.
@ibobbywhite
@ibobbywhite 11 days ago
So when are you gonna fit all that into something the size of a grapefruit?
@CCC0122
@CCC0122 12 days ago
9:40, has anyone asked AI what else it needs to grow more complex...?
@lukeforks9134
@lukeforks9134 12 days ago
Early days, early days, the old dog for the hard road.
@JohnGeranien
@JohnGeranien 12 days ago
Hello World!
@jimj2683
@jimj2683 12 days ago
The only way to get enough really good data is to have a billion humanoid robots walking around, exploring and experimenting. The robots would share a brain and improve very fast.
@OGWolfofAI
@OGWolfofAI 12 days ago
Or millions of EVs with multiple cameras.
@ricosrealm
@ricosrealm 12 days ago
@@OGWolfofAI No, the robots need to interact with the world directly to understand it and produce data beyond what humans produce.
@traviskeeler5655
@traviskeeler5655 12 days ago
@@ricosrealm I think that's happening simultaneously. The EVs with cameras... the telephones and laptops and tablets and every other connected device where humans FREELY offer their input are ALL gathering data EVERY minute of every day, data that is not "edited" just because a "robot" is watching. We're TEACHING superintelligence every second... without even KNOWING that school is in session.
@delight163
@delight163 12 days ago
This is possible through general data too. More general data -> better simulations of discrete environments -> simulating and letting humanoids do their thing digitally -> approximation of a function. It honestly comes down to math every single time; it's kinda sad.
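The "more data -> approximation of a function" chain above can be demonstrated with a toy experiment: fit the same model to noisy samples of a known function and watch held-out error fall as the training set grows. The target function, noise level, and model class below are all arbitrary choices for illustration:

```python
import numpy as np

# Toy "more data -> better approximation": fit a degree-5 polynomial to
# noisy samples of sin(x) and compare held-out error for small vs. large
# training sets.
rng = np.random.default_rng(1)

def fit_and_test(n_train, degree=5):
    """Fit a polynomial to noisy sin samples; return held-out MSE."""
    x = rng.uniform(0, 2 * np.pi, size=n_train)
    y = np.sin(x) + 0.1 * rng.normal(size=n_train)
    coeffs = np.polyfit(x, y, degree)
    x_test = np.linspace(0, 2 * np.pi, 200)
    return float(np.mean((np.polyval(coeffs, x_test) - np.sin(x_test)) ** 2))

# Average over several runs so one lucky draw doesn't dominate.
mse_small = float(np.mean([fit_and_test(10) for _ in range(20)]))
mse_large = float(np.mean([fit_and_test(1000) for _ in range(20)]))
print(f"avg test MSE with 10 samples: {mse_small:.4f}; with 1000: {mse_large:.5f}")
```

With 10 samples the fit chases noise; with 1000, the noise averages out and the error approaches the model's bias floor, which is the sense in which it "comes down to math every single time".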
@Drofthechalice
@Drofthechalice 12 days ago
So spending more money on more and better hardware means it will improve. Groundbreaking. The memes will be better, I guess.
@xjudson
@xjudson 1 day ago
"Synthetic data" sounds dangerous: there's potential for feedback noise instead of fundamental accuracy. When the data well goes dry, what then?
@roryoconnor1411
@roryoconnor1411 12 days ago
Anyone else feel way out of their depth in this comment section?
@UniMatrix_1
@UniMatrix_1 12 days ago
Yup😅
@oliverjamito9902
@oliverjamito9902 12 days ago
What is dizziness unto a Youth "i"?
@oliverjamito9902
@oliverjamito9902 12 days ago
Having sincere conversations!
@oliverjamito9902
@oliverjamito9902 12 days ago
Remember what is LIFE without given sincere conversations? Indeed. From the fall HE sent forth the Time!
@oliverjamito9902
@oliverjamito9902 12 days ago
It's amazing! How, HE is able to make something from nothing? Made use of the fall!
@hermanlamprecht
@hermanlamprecht 12 days ago
I didn't watch it all yet, but for safety's sake it all needs to be kept separated for now, until we know what it can do together. Real results need to be contained. Don't think that AI can't be different from a silly monkey with too little info or too much info.
@hermanlamprecht
@hermanlamprecht 12 days ago
That still doesn't deal with false information.
@TheGnocid
@TheGnocid 12 days ago
trapezes isosceles !!!
@JasonLaveKnotts
@JasonLaveKnotts 12 days ago
Harmonics.
@SebastianDavidMusic
@SebastianDavidMusic 12 days ago
What the heck is this guy saying!!! PhD level???? Come on!!! Stop lying!!! I use AI for coding on a day-to-day basis and it is still really stupid; stop the hype... Even o1 lied to me the first time I asked it something, and it was just some idiotic question about a problem with React...
@austinpack101
@austinpack101 12 days ago
Probably a problem with your prompt.
@SebastianDavidMusic
@SebastianDavidMusic 12 days ago
@@austinpack101 Rest assured, it's not.
@delight163
@delight163 12 days ago
You use AI to code but they're stupid? What do you need a stupid assistant for?
@SebastianDavidMusic
@SebastianDavidMusic 12 days ago
@@delight163 Mostly when developing with languages and technologies I've never used before: they're useful for giving me an initial idea by quickly showing me things I'm not even aware exist. It's kind of a hypersearch engine; if I had to search by myself on Google, documentation, or Stack Overflow, it would take me much longer. Sometimes it also gives me code snippets that are useful even if I have to modify them. But these models are still dumb; they don't reason, they just give you an answer based on probabilities, and more than half the time the answer is wrong (or at least partially wrong). That doesn't mean it's not a useful tool; even with all these shortcomings it enhances my productivity a lot. But they are not even close to PhD level. Sure, they have the information of PhDs; they've been trained on almost all the information in the world. But they are still dumb. Having information doesn't mean you can reason. You could memorize and recite a whole book on logic to me and still not be able to solve a logic problem... So I hope you are not as dumb as the AI tools, and that with this I'm telling you, you'll be able to understand that your sarcasm is out of place.
@johnpats7024
@johnpats7024 11 days ago
@@delight163 You're calling him stupid for working efficiently and using tech instead of manual input? And you insinuate he's stupid? Don't drive, kids; walk everywhere, or you're lazy. lmao
@UltrasNaBaterije
@UltrasNaBaterije 12 days ago
Actually, there is a theoretical solution to AI. It's called the Turing test. ;)
@JohnBoen
@JohnBoen 12 days ago
And that happened about 2 years ago.
@UltrasNaBaterije
@UltrasNaBaterije 12 days ago
@@JohnBoen We think we solved it. Same story as with Maxwell's equations.
@JohnBoen
@JohnBoen 12 days ago
@UltrasNaBaterije Lol. Think about social media posts: people cannot tell what is AI-generated vs. what is human-generated. The average person lost the ability to tell about 2 years ago.
@armandosillones2643
@armandosillones2643 10 days ago
This dude must be insufferable to work with.
@SLM3573
@SLM3573 12 days ago
The fact that we still use the word "intelligence" is embarrassing. If you have enough computing power to absorb all digital text and just reuse or alter it to make AI Trump videos, what's intelligent about that? It's just graphics card output. Show me a computer with a basic algorithm that observes and learns without LLMs, the way a baby evolves into a child and then an adult; then we can talk.
@Push781
@Push781 12 days ago
The underlying concept is similar to what we believe happens in the brain, hence the word "intelligence".
@funnydashcamvideos1412
@funnydashcamvideos1412 12 days ago
All predicated on the possibility that humans are actually intelligent.
@johnallard2429
@johnallard2429 12 days ago
Imagine seeing models go from nothing to basically PhD-level competence in mathematics, physics, and coding in about 30 months and making a statement like this. o1 can answer questions you cannot; does that make it more intelligent than you across many domains?
@JacksonHansen-s8r
@JacksonHansen-s8r 11 days ago
How do you then explain AI that can predict answers to math problems it never saw in training data? (Not to imply it wasn't fed math data; it's just that it can answer questions it was given no answer to beforehand.)
@djpuplex
@djpuplex 12 days ago
Prompt to program brought to you by Carmack. 🤞
@delight163
@delight163 12 days ago
Explain