The ChatGPT Paradox: Impressive Yet Incomplete

37,938 views

Machine Learning Street Talk

1 day ago

Comments: 105
@MachineLearningStreetTalk · 1 month ago
MLST is sponsored by Tufa Labs: Are you interested in working on ARC and cutting-edge AI research with the MindsAI team (current ARC winners)? Focus: ARC, LLMs, test-time compute, active inference, System 2 reasoning, and more. Future plans: expanding to complex environments like Warcraft 2 and Starcraft 2. Interested? Apply for an ML research position: benjamin@tufa.ai
@Rezidentghost997 · 1 month ago
Who wouldn't be? 😊
@NaanFungibull · 1 month ago
This is absolute gold. I love the format of just letting a brilliant mind explore a topic deeply. What a gifted speaker. So much knowledge transmission in so little time. I feel like I'm much closer to understanding what the state of the art is currently, and it sparked some new ideas for me.
@BryanBortz · 1 month ago
This guy is great. You should bring him back in the future.
@ginogarcia8730 · 1 month ago
I dunno why, but my brain is like - future future... but it just means a later episode lmao
@AIroboticOverlord · 1 month ago
Yes, we will put him in cryostasis so we can bring him back in the future. How far into the future should we wake him up? Too far and you're dead, unless you also want to go into technological hibernation for the time this man is gone.
@AIroboticOverlord · 1 month ago
@@ginogarcia8730 I was more focused on the "future" combo with bringing him back :)
@EddieKMusic · 1 month ago
@@AIroboticOverlord I think let's wake him 100 years after his hibernation starts.
@AIroboticOverlord · 1 month ago
@@EddieKMusic We can also just hibernate the commenter dude, because it will be recorded anyway. Assuming YouTube is still online and the channel didn't become a victim of YT's low-threshold cancel culture (for the peeps who have no clue and think "wtf is he talking about", I'll give one clear example, of which I've got endless: Russell Brand!), he should be all good to go. It will be ready and waiting to keep him company for his first futuristic breakfast ;) YAW. P.S. Make sure your brain freeze is over before checking it hehe.
@kongd · 1 month ago
This is excellent, thank you so much for sharing, Professor Dietterich.
@tonystarkagi · 1 month ago
the best channel on AI rn. my fav. love all your videos.
@palimondo · 1 month ago
Thanks for including the references to the mentioned papers (and with timestamps in the video!). Could you please also always include in the description the date when the interview was recorded? Many of the interviews you are releasing now predate the o1 model preview release, so it is possible that some of your guests have since somewhat updated their assessments of LLMs' (in)capability to reason in light of o1-preview. This is not to say they would have completely walked back their fundamental objections - but I would love to see how much more nuanced these have become after the 12th of September 2024.
@MachineLearningStreetTalk · 1 month ago
This was filmed at ICML 2024 - and if I had a pound for every person who said "this was before o1 came out"... It doesn't substantially change anything said; I'm pretty sure Thomas would agree. Perhaps he will respond to your comment.
@bujin5455 · 1 month ago
42:23. Actually it's not surprising, and it's not complicated. The reason society has such a low tolerance for airline fatalities versus automotive boils down to agency. When you get in a car and drive yourself, you are taking the risk on yourself: you're deciding whether the situation is safe enough to drive, you're trusting your own skill and execution, and it's up to the individual to make that assessment case by case. When you entrust yourself to an airline, you are trusting the pilot and the airline's maintenance; there are many more failure modes in an aircraft than in a car, and the cost of failure is much higher. So if you are going to surrender your agency to another, you want to believe that other is more capable than you, especially where the nominal failure mode is much more extreme. 42:41. Absolutely, automated cars will be held to a MUCH higher standard than human-operated cars. No doubt about it.
@authenticallysuperficial9874 · 1 month ago
Yes
@calmhorizons · 1 month ago
Or it could be as simple as recency bias and availability heuristics. A car accident is only international news if a former princess is in the vehicle. But plane crashes are reported globally and their investigation plays out for days like a whodunnit.
@mkhex87 · 1 month ago
This guy is shockingly broad and deep
@littleones-yeahh · 1 month ago
girthy
@miniboulanger0079 · 1 month ago
On the ROT-13 topic, it's interesting to note that Claude 3 Opus (I haven't tested Claude 3.5 Sonnet) is quite good at not only arbitrary ROT-X but also arbitrary alphabetic substitution. There are too many possibilities for any particular key to have likely been in the training data, which implies it has learned the underlying algorithm.
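For context, ROT-N is just a fixed alphabet shift, and an arbitrary substitution replaces each letter via a permutation key (26! possible keys). A minimal Python sketch of both, purely illustrative:

```python
import random
import string

def rot_n(text: str, n: int) -> str:
    """Shift each letter n places, wrapping around the alphabet (ROT-13 is n=13)."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + n) % 26 + base))
        else:
            out.append(ch)  # leave spaces/punctuation untouched
    return ''.join(out)

def substitution(text: str, key: dict[str, str]) -> str:
    """Apply an arbitrary letter-for-letter substitution cipher."""
    return ''.join(key.get(ch, ch) for ch in text.lower())

# ROT-13 is its own inverse: applying it twice returns the original text.
assert rot_n(rot_n("Hello, World!", 13), 13) == "Hello, World!"

# A random substitution key - 26! (about 4e26) possibilities, far too many
# for any particular key to appear verbatim in training data.
shuffled = random.sample(string.ascii_lowercase, 26)
key = dict(zip(string.ascii_lowercase, shuffled))
print(substitution("attack at dawn", key))
```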
@steve_jabz · 1 month ago
I think one day Anthropic will just train a model to directly call the circuit that performs the operation, instead of trying to intervene without being asked. I thought that's where they were going with the Scaling Monosemanticity paper.
@palimondo · 1 month ago
Did you test Opus with base64 decoding, by any chance? Claude 3.5 Sonnet, as well as other models (4o), suffers from the probabilistic-correction issue mentioned in the interview. Is Opus different?
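As context for why exactness matters here: base64 is a deterministic encoding, so any "approximate" decode is simply wrong. A minimal Python sketch of an exactness test, where `ask_model` is a hypothetical stand-in for whichever model API you probe:

```python
import base64

def exact_decode_test(plaintext: str, ask_model) -> bool:
    """Encode plaintext, ask a model to decode it, and demand an exact match.
    Models that probabilistically 'autocorrect' toward plausible text will
    fail on deliberately implausible strings."""
    encoded = base64.b64encode(plaintext.encode()).decode()
    answer = ask_model(f"Decode this base64 string exactly: {encoded}")
    return answer.strip() == plaintext

# A sentence unlikely to appear in training data exposes the failure mode:
probe = "the purple calculator swam through seventeen quiet doorknobs"
```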
@yurona5155 · 1 month ago
Up to some reasonable (sub)word length, 26^n isn't really all that much data, i.e. synthetic data will likely go a long way - at least with ROT-X.
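For illustration, generating synthetic ROT-X training pairs is trivial - a sketch reusing the `rot_n` helper from the ROT sketch above (the word list and prompt format here are made up):

```python
import random

WORDS = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]

def make_rot_pair() -> tuple[str, str]:
    """Build one synthetic (prompt, target) example for an arbitrary ROT-X."""
    n = random.randrange(1, 26)                    # 25 non-trivial shifts
    plain = " ".join(random.choices(WORDS, k=5))
    prompt = f"Decode ROT-{n}: {rot_n(plain, n)}"  # rot_n from the sketch above
    return prompt, plain

# Millions of distinct pairs are trivially enumerable: 26 shifts times even a
# modest vocabulary dwarfs what a model would ever see naturally in web text.
for _ in range(3):
    print(make_rot_pair())
```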
@ClydeWright · 1 month ago
o1-preview is even better, and can tackle even more complicated ciphers.
@bujin5455 · 1 month ago
It's like we have all the pieces of AGI, but we don't know how to orchestrate them. Humans can decide when it's time to rely on "gut" System 1 thinking; we can decide when to pursue System 2 thinking, and to refine our skills with System 2, which then tends to fine-tune and reorient our System 1 thinking. We can decide when to override our gut, because we understand that despite our "training data" (instinctive sense), the logic shows something very different. We can look at our instincts in a subject, then pursue refining and formalizing our understanding of those intuitions. But to do all of this there is a meta-cognition mechanism, which we tend to refer to as consciousness, that directs these efforts. The term "understanding" tends to speak not to System 1 or System 2, but to the mechanism mediating all of these efforts. So far, we don't seem to have a theory of how to create this mechanism; we're hoping it will emerge out of scale, but that seems exceedingly unlikely. I think we clearly have runway to seriously scale up the tools we currently have, but true human-like intelligence seems to require a breakthrough we haven't yet made. Until then, we're just building ever more powerful digital golems without actually breathing real living intelligence into them. And perhaps that's for the best.
@gk10101 · 1 month ago
Human-like intelligence will be an illusion. As soon as you buy into it, you'll have it.
@yachalupson · 1 month ago
'Bridging the left / right hemisphere divide' is the analogy I hear here: "The real opportunity is to mix.. formal [reasoning] with the intuitive, experience based, rich contextual knowledge.." Such a striking parallel to the call to rebalance 'Master' and 'Emissary' (à la Iain McGilchrist), facing humanity at present.
@yachalupson · 1 month ago
I listen to these talks with deep interest for the same reasons you seem to engage in them: the mirror they hold to neurology, perception, meaning, metaphysics etc is exquisite. Thanks for sharing the richness.
@MLDawn · 1 month ago
Years ago I had an interview with him about anomaly detection. He is a world-renowned expert on anomaly detection.
@Corteum · 1 month ago
You mean like anomalous UAP phenomena?
@for-ever-22 · 1 month ago
Great interview as usual. Somehow it keeps getting better. Appreciate your hard work contributing to open education 🎉
@jmirodg7094 · 1 month ago
Best talk I've seen on AI in a while! I have a lot of hope for the use of graphs and theorem provers in reasoning, but graphs need to evolve to capture more subtleties; they are a blunt tool for now.
@XOPOIIIO · 1 month ago
To solve the problem of truthfulness, the model has to have a world model, so it can understand what does and doesn't fit into one. I don't think it's possible to have a large, complex, and consistent world model that is wrong at the same time. Current LLMs don't have a world model; they can simulate the world models of different people.
@greatestone4eva · 1 month ago
I'm glad people finally brought up expert systems. They're the basis of building a proper AI and a proper supervised dataset. I've been explaining this since 2017; glad to see a fellow who gets it.
@thedededeity888 · 1 month ago
Yooooo, so glad y'all finally brought Thomas on the show. Also shoutout to Gurobi! :)
@Lowlin85 · 1 month ago
Great interview, good work you guys!
@similarvideo · 26 days ago
Great video 😊 thanks for sharing
@steve_jabz · 1 month ago
Shouldn't o1 be better at quantifying uncertainty if it's trained the way we think it is? Hopefully we get an open-source version of this, so we can try training it to give a confidence value based on similar trajectories in the RL that lead to unverifiable answers.
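A crude approximation of that idea is already possible with any model via self-consistency: sample several answers and use agreement as a confidence proxy. A minimal sketch, assuming a hypothetical `sample_answer(prompt)` function:

```python
from collections import Counter

def self_consistency(prompt: str, sample_answer, k: int = 10) -> tuple[str, float]:
    """Sample k answers at nonzero temperature; report the majority answer and
    the fraction of samples agreeing with it as a rough confidence proxy.
    This is agreement-based calibration, not the RL-trajectory statistic
    speculated about in the comment above."""
    answers = [sample_answer(prompt) for _ in range(k)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / k
```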
@warsin8641 · 1 month ago
Bro AI is already smarter than most humans even with hallucinations
@VR_Wizard · 1 month ago
Please rotate the video in post-processing to get it straight if the camera was not set up correctly. For me this is a huge distraction. Most might not care, but for me it matters.
@anoojpatel5428 · 1 month ago
Osband's (formerly DeepMind, now OpenAI) Epi(stemic) Nets and Prior Nets work is extremely effective and efficient to implement on top of most large models, or as a surrogate/extension of existing models, to get joint probabilities and thus measure epistemic uncertainty quite well. He built an LLM training loop that improved sample efficiency based on uncertainty. Definitely worth the read!
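For readers who want the gist: an epinet adds a small head that consumes a random index z alongside the input, and disagreement across draws of z serves as an epistemic-uncertainty signal. A heavily simplified PyTorch sketch, not the published architecture (layer sizes and the training loop here are invented):

```python
import torch
import torch.nn as nn

class TinyEpinet(nn.Module):
    """Simplified sketch of an epistemic neural network ('epinet'): a base net
    plus a small head that also consumes a random index z. Disagreement across
    draws of z estimates epistemic uncertainty. Illustrative only."""

    def __init__(self, d_in: int, d_out: int, d_z: int = 8, width: int = 32):
        super().__init__()
        self.base = nn.Sequential(nn.Linear(d_in, width), nn.ReLU(),
                                  nn.Linear(width, d_out))
        self.epi = nn.Sequential(nn.Linear(d_in + d_z, width), nn.ReLU(),
                                 nn.Linear(width, d_out))
        self.d_z = d_z

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.epi(torch.cat([x, z], dim=-1))

    def epistemic_std(self, x: torch.Tensor, n_samples: int = 20) -> torch.Tensor:
        # Standard deviation of predictions across random indices z.
        zs = torch.randn(n_samples, x.shape[0], self.d_z)
        preds = torch.stack([self(x, z) for z in zs])
        return preds.std(dim=0)

model = TinyEpinet(d_in=4, d_out=1)
x = torch.randn(16, 4)
print(model.epistemic_std(x).mean())  # after training: large off-distribution
```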
@hozcarhz · 1 month ago
What a wealth of valuable information 😮. Thanks for sharing 😊
@markleydon4118 · 1 month ago
Terrific interview. One question: why isn’t this available on your podcast feed? I subscribe to MLST on the Apple podcast app but have not seen this particular interview there.
@MatthewPendleton-kh3vj · 1 month ago
Nice to see a comment section that isn't full of "LLMs are exactly like human brains! Just scale up and you'll get generalization at a human level!" Very good talk!
@memegazer · 1 month ago
I also want to push back on the "single-author papers" narrative. There is a well-established history of citing preceding work; collaborative effort has always been part of good research, in my view. The only difference now is that more people are willing to accept collaborative responsibility, which I agree is a boon to all science, not just computer science, because it incentivizes communal responsibility and shared credit and accountability. But mostly because it works against the zero-sum monopoly that has plagued scientific research with perverse incentives for millennia. Good vid
@Aedonius · 1 month ago
OpenCog AGI has a hypergraph. I think what is needed is exactly the technique mentioned around 22:30 of extracting the knowledge within an LLM into a formal language. I think filling this graph is basically the missing part of OpenCog.
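As a rough sketch of what "extracting LLM knowledge into a formal structure" could look like mechanically - the `extract_triples` stub is hypothetical; a real pipeline would prompt the model and validate its output:

```python
import networkx as nx

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    """Hypothetical stub for prompting an LLM to emit
    (subject, relation, object) triples from a passage."""
    return [("water", "boils_at", "100C"), ("100C", "equals", "212F")]

graph = nx.DiGraph()
for subj, rel, obj in extract_triples("some source passage"):
    graph.add_edge(subj, obj, relation=rel)

# Once knowledge lives in an explicit graph, symbolic tools (consistency
# checkers, theorem provers) can operate on it - the step LLMs lack natively.
print(list(graph.edges(data=True)))
```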
@wonseoklee80 · 1 month ago
Human System 1 thinking is also probabilistic - you tend to lean toward what you have experienced before. Reciting the alphabet backward is always challenging for humans. LLMs have effectively mastered human System 1 thinking. Adding reasoning and agency to LLMs will result in behaviors surprisingly similar to human behavior.
@jeunjetta · 1 month ago
Marimo interactive notebooks can be shared as a self-contained HTML page or as markdown - a good alternative to PDF papers.
@arunkumarchithanar · 26 days ago
Thanks!
@jeff__w · 1 month ago
6:30 *THOMAS G. DIETTERICH:* "We often read into what it's written more than the model necessarily knows or understands. And you could ask, 'Well, does that really matter? If it outputs the right answer and we interpret it correctly, that's the right answer.'" Well, if we happen to look at a broken clock during one of the two minutes in 24 hours that happen to coincide with the actual time, and we interpret the clock as showing the right time, I guess it doesn't really matter whether the clock is broken or not. _Edit:_ Just to be clear, I realize that Prof. Dietterich is _not_ endorsing that position.
@asage5801 · 24 days ago
Many have already said that a new “architecture” is needed to create actual “thinking”.
@theK594 · 1 month ago
I learned a lot, great thoughts!
@wwkk4964 · 1 month ago
This was great!
@memegazer · 1 month ago
I love arguments of the form "these models are statistical parrots that correlate to numbers that reflect reality", because they demonstrate that if the metrics were that simple, there would be no alignment, jailbreak, or hallucination issues. Clearly this is wild speculation about how "correlation to reality" should be defined, rather than a valid metric for why the models are not really predicting things from something deeper than humans can measure consistently.
@memegazer · 1 month ago
It shows me that the results transformer models produce can easily be exempted from being "autocorrected" when there is a shortcut that allows some equilibrium between epistemology and intelligence. I don't necessarily object... just take note of how the goalposts are being shifted.
@memegazer · 1 month ago
I guess my conclusion is: if AI is framed as just a statistical parrot simply because it was trained exclusively on human data, that would force humanity to examine an unpleasant truth about what we consider intelligence - force us to consider that epistemology was more of a collaborative effort than this narrative that individual creativity is supreme.
@memegazer · 1 month ago
Shielding general intelligence from a statistical metric is a great way to avoid that sort of conclusion. But I can't help growing skeptical that this is valid, given that the vast majority of human discovery is the result of statistical anomaly.
@memegazer · 1 month ago
Kind of makes it seem like an argument about the most efficient way to put monkeys at typewriters, and only count wins when the output aligns with current consensus.
@Bakobiibizo · 1 month ago
Do we have anything like a definitive set of papers that makes up the base of human knowledge anywhere?
@shinkurt · 1 month ago
Learned a lot
@kowboy702 · 1 month ago
I suspect we have more tolerance for car accidents because they're highly individually determined, meaning you're more implicated in your own accidents.
@HUEHUEUHEPony · 19 days ago
Or because you live in the car-dependent USA, where having no car means death.
@WyrdieBeardie · 1 month ago
Many models can converse in ROT-13. And as the conversation goes on, it gets really weird... More "honest" in some ways, but it will speak more in metaphors and analogies. 🤔
@h.c4898 · 1 month ago
Finally, someone who knows what he is talking about - not doomers, "AGI" evangelists, or corporate preachers. 😁😁
@noway8233 · 1 month ago
Wow, that's a lot of good info, cool 😅
@yurona5155 · 1 month ago
1:04:39 Quantize all the scientists! 🥳
@panzrok8701 · 1 month ago
Why should a model know everything? Just give it a search bar plus Wikipedia. A model should be valued based on its intelligence, not its memory or knowledge.
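A minimal sketch of that "search bar plus Wikipedia" idea - retrieval-augmented answering - where both helper functions are hypothetical stand-ins:

```python
def search_wikipedia(query: str) -> str:
    # Hypothetical stub - swap in a real search call (e.g. the MediaWiki API).
    return "top article snippets for: " + query

def answer_with_lookup(question: str, ask_model) -> str:
    """Retrieval-augmented answering: the model reasons over fetched context
    instead of relying on memorized facts."""
    context = search_wikipedia(question)
    prompt = (f"Context:\n{context}\n\n"
              f"Answer using only the context above:\n{question}")
    return ask_model(prompt)
```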
@Rockyzach88 · 1 month ago
ABI - Artificial Broad Intelligence :D
@goranmajic4943 · 1 month ago
Great human being. We need more rational people around AI and fewer prophets.
@mhm2908 · 1 month ago
Just have to jump in around 23:49 to express horror that he suggested that journalists are part of the small set of 'ground truthers'.
@HUEHUEUHEPony · 19 days ago
And you believe that old scientists with perverse economic and political incentives are ground truthers as well?
@mhm2908 · 18 days ago
@ I did not say that
@googleyoutubechannel8554 · 1 month ago
Uh... no, everyone always wanted breadth - from before digital computers even existed. We just never knew how; we still don't. But we learned, like a billion monkeys building a billion different models, that if you throw enough data at it and stir it with linear algebra long enough, even with the dumbest loss functions, eventually you get ChatGPT et al.
@MarceloTeixeiraa · 1 month ago
It's impressive what I can do with o1-preview, but it's just as striking how it can give you stupid answers to complex prompts.
@angloland4539 · 1 month ago
❤️🍓☺️
@GoodBaleada · 1 month ago
The interviewer's final concern was whether humans would still be able to screw up science by being the final arbiters of first principles. YOU DON'T HAVE THE TOOLS TO CONTEXTUALIZE LIKE LLMS DO, GET OVER IT!!!!
@europa_bambaataa · 27 days ago
Why is the host not on camera? It's weirding me out
@memegazer · 1 month ago
"There is a distinction between simulating something and acutally doing it" Perhaps this so, but not unless you can introduce real metrics the simulation neglects. Otherwise you are left simply with the speculation "perhaps the simulations fails to account for thing that are real and omitted from simulated measurements" I mean that is very liberal and agnostic interpetation. But hardly an account for how and why a given simulation has failed.
@tylermoore4429 · 1 month ago
I don't get this. Obviously LLMs have no concept of ground truth, and all their knowledge exists at the same ontological level - just tokens with some internal relationships. So the only way for them to have anything more than a probabilistic kind of epistemic certainty/uncertainty is to train in the sources of the knowledge we are feeding the model, and the level of confidence it can attach to the different sources - Wikipedia versus Reddit, say. Over and above this, certain other practices of epistemic hygiene that humans adopt, such as self-consistency, logical soundness, and citing your sources, seem like they should be implementable in LLMs.
@Decocoa · 1 month ago
Is that basically RAG?
@oncedidactic · 1 month ago
Not to take away from your point, but you would think the data and the training would impart some level of epistemic ranking and hygiene. I.e., discussion of the dependability (or not) of Wikipedia is abundant, so it would reflect on Wikipedia content in the weights.
@demetriusmichael · 1 month ago
This is already done. Reddit content, for example, is trained on based on upvotes.
@ckq · 1 month ago
Pretty sure they already do that to some extent. Not sure about the specifics tho
@uber_l · 1 month ago
I wonder if it's worth it to LLM-ize reasoning. You could gather quality data from smart people such as scientists and Mensa members: "What was a difficult problem that you solved? Describe it step by step in high detail and provide context." Problem-solution pairs.
@domalec · 28 days ago
Can any human claim to be complete? If not, then why would any human invention be complete? Incompletion cannot produce completion.
@memegazer · 1 month ago
"Playing chess is not statistically differen than using language" Yes using langauge is more complex than playing chess. But that simple fact does not entail the logical conclusion that "LLMs can only arrive at superhuman levels of using language based on occurence of language learned from breaking it down into sequential tokens" Anybody should see why this argument immediately fails. If frequency of occurance of how tokens statistically follow as a probility was the problem space, then with more compute anybody would be far more efficient to stack a frequency search ontop of a data base than it would be to ask a machine learning algorithin to find some better optimum, assuming the data cannot lie. One method is far more efficient than the other. Most people, especially in those with computer science degrees, cannot accept or grasp why.
@memegazer · 1 month ago
I don't think these "experts" mean to adopt bad-faith arguments, but I will criticize them for not knowing better: the latent space of an LLM is not accurately described by appealing only to the statistical frequency of token occurrence in its training data. Those two maps of prediction are not one-to-one - and that is more interesting than dismissing the deviation because we can imagine some computational overlap that could be labeled "simulation, synthetic, or mere emulation".
@Rockyzach88 · 1 month ago
Wait, what about reddit? Did I actually contribute to something?
@bodethoms8014 · 1 month ago
No. You don’t matter in the grand scheme of things
@Rockyzach88 · 1 month ago
@@bodethoms8014 BS, I'll be addressed as "the AI whisperer" going forward.
@bodethoms8014 · 1 month ago
@@Rockyzach88 AI Whisperer? More like Hallucination Whisperer. Whisper all you want, the AI still isn’t listening
@Rockyzach88 · 1 month ago
@@bodethoms8014 lol so angry
@andykjm · 1 month ago
This dude is speaking gibberish! LLMs don't "understand" or "execute" ROT - that's why they don't give the correct decoding.
@space999-gf6uq · 1 month ago
First
@iamr0b0tx · 1 month ago
arrrg **shakes fist**
@IPutFishInAWashingMachine · 1 month ago
We got ChatGPT before GTA 6.
@ClydeWright · 1 month ago
How are the newest models doing on TruthfulQA? I can't find any recent evals on this. Why?
@asunhug · 16 days ago
Thank you for the great food for thought 🫡🫡🫡🫡🫡