Yann LeCun | Objective-Driven AI: Towards AI systems that can learn, remember, reason, and plan

41,203 views

Harvard CMSA

1 month ago

Ding Shum Lecture 3/28/2024
Speaker: Yann LeCun, New York University & Meta
Title: Objective-Driven AI: Towards AI systems that can learn, remember, reason, and plan
Abstract: How could machines learn as efficiently as humans and animals?
How could machines learn how the world works and acquire common sense?
How could machines learn to reason and plan?
Current AI architectures, such as Auto-Regressive Large Language Models, fall short. I will propose a modular cognitive architecture that may constitute a path towards answering these questions. The centerpiece of the architecture is a predictive world model that allows the system to predict the consequences of its actions and to plan a sequence of actions that optimize a set of objectives. The objectives include guardrails that guarantee the system's controllability and safety. The world model employs a Hierarchical Joint Embedding Predictive Architecture (H-JEPA) trained with self-supervised learning. The JEPA learns abstract representations of the percepts that are simultaneously maximally informative and maximally predictable. The corresponding working paper is available here: openreview.net/forum?id=BZ5a1...
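The abstract's key phrase — representations that are "simultaneously maximally informative and maximally predictable" — can be sketched numerically. Below is a minimal, illustrative NumPy sketch of a JEPA-style objective, assuming a VICReg-flavored regularizer (prediction error + variance hinge + covariance penalty); the linear encoder, identity predictor, and all coefficients are made-up stand-ins, not the architecture from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Toy encoder: a linear map standing in for a deep network."""
    return x @ W

def jepa_loss(sy_pred, sy, eps=1e-4):
    """Illustrative JEPA-style objective:
    - prediction error in embedding space ("maximally predictable")
    - variance hinge keeping each dimension alive ("maximally informative")
    - covariance penalty decorrelating dimensions (anti-collapse)."""
    pred = np.mean((sy_pred - sy) ** 2)            # predict y's embedding
    std = np.sqrt(sy.var(axis=0) + eps)
    var = np.mean(np.maximum(0.0, 1.0 - std))      # hinge on per-dim std
    c = np.cov(sy, rowvar=False)
    off_diag = c - np.diag(np.diag(c))
    cov = np.sum(off_diag ** 2) / sy.shape[1]      # off-diagonal energy
    return pred + var + cov

# x and y are two related percepts (e.g. successive video frames).
x = rng.normal(size=(32, 8))
y = x + 0.1 * rng.normal(size=(32, 8))
W = rng.normal(size=(8, 4))        # shared toy encoder weights
sx, sy = encode(x, W), encode(y, W)
sy_pred = sx                        # trivial predictor: identity
loss = float(jepa_loss(sy_pred, sy))
print(loss)
```

The point of the regularizer terms is that minimizing prediction error alone would let the encoder collapse to a constant; the variance and covariance terms rule that trivial solution out.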

Comments: 97
@kabaduck · 1 month ago
I think this presentation is incredibly informative, I would encourage everybody who starts out watching this to please be patient as he walks through this material.
@BooleanDisorder · 1 month ago
Thanks internet stranger. I will trust you and do that.
@SteffenProbst-qt5wq · 1 month ago
Got kind of jumpscared by the random sound at 17:08. Leaving this here for other viewers. Again at 17:51
@hola-kx1gn · 1 month ago
Scary
@Bassoarno · 1 month ago
Wow terrifying
@Garbaz · 6 days ago
A correction of the subtitles: The researcher mentioned at 49:40 is not Yonglong Tian, but Yuandong Tian. For anyone interested in Yuandong & Surya's understanding of why BYOL & co work, have a look at "Understanding Self-Supervised Learning Dynamics without Contrastive Pairs".
@ZephyrMN · 22 days ago
Have you thought about including liquid AI architecture, to address the input bandwidth problem?
@amedyasar9468 · 27 days ago
I have a question: how would a prompt work with the action (a) and the prediction (sy)? The architecture seems to involve only the observation and the prediction of the next world state... Could anyone guide me?
@yaohualiu857 · 8 days ago
Nice talk, but I have a comment about comparing an LLM and a human child (at ~20 min). An evaluation of the information redundancy in the child's input and in the LLM's training data is needed. I would bet that the child's sensory stream has a significantly higher level of redundancy than the text used to train an LLM; therefore, the comparison is misleading.
@vaccaphd · 1 month ago
We won't have true AI if there is not a representation of the world.
@justinlloyd3 · 1 month ago
Humans don't even see the real world. We see our world model.
@sapienspace8814 · 1 month ago
@ 44:42 The problem in the "real analog world" is that planning will never yield the exact predicted outcome, because our "real analog world" is ever-changing and will always have some level of noise, by its very nature. I do understand that Spinoza's deity "does not play dice" in a fully deterministic universe, but from a practical perspective, Reinforcement Learning (RL) will always be needed, until someone, or some thing (maybe an AI agent), is able to successfully predict the initial polarization of a split beam of light (i.e., an entanglement experiment).
@maskedvillainai · 1 month ago
Some models can do that, but they require hardware integrations. And we don't even need to mention language models in this context, which treat randomness and perplexity as a feature exclusive to "natural language" models. Otherwise, just develop the code to force a formatted output, as we always have.
@simonahrendt9069 · 29 days ago
I think you are absolutely right that the world is fundamentally highly unpredictable and that RL will be needed for intelligent systems/agents going forward. But I also take the point that for the most part what is valuable for an agent to predict are specific features of the world that may be comparatively much easier to predict than all the noisy detail. I think there are some clever tradeoffs to be made in hierarchical planning of when to attend to high-level features (and reason in latent, high-level action space) and when to attend to more low-level features or direct observations of the world and micro-level actions. Intuitively I find it compelling that hierarchical planning seems to be what humans do for many tasks or for navigating the world in general and that machines should be able to do something similar, so I find this proposal by Yann very interesting
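The hierarchical-planning tradeoff described in this comment — reason over coarse, abstract subgoals first, then fill in micro-level actions — can be illustrated with a toy sketch. Everything here (the 1-D state space, the subgoal spacing, the greedy low-level planner) is made up for illustration:

```python
import numpy as np

# Toy two-level planner in a 1-D state space.
start, goal = 0.0, 10.0
subgoals = np.linspace(start, goal, 5)   # high-level plan: abstract waypoints

def micro_plan(a, b, step=0.5):
    """Low-level planning: greedy fixed-size steps from a toward b."""
    path = [a]
    while abs(path[-1] - b) > step:
        path.append(path[-1] + np.sign(b - path[-1]) * step)
    path.append(b)
    return path

# Stitch the micro-plans together, one per high-level segment.
full_path = []
for a, b in zip(subgoals[:-1], subgoals[1:]):
    full_path.extend(micro_plan(a, b)[:-1])   # drop endpoint to avoid duplicates
full_path.append(goal)
print(len(full_path), full_path[-1])
```

The design point is that the high-level search happens over 5 waypoints rather than the full 21-step trajectory; in a real system, each level would plan in its own learned latent space.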
@dinarwali386 · 1 month ago
If you intend to reach human-level intelligence, abandon generative models, abandon probabilistic modeling, and abandon reinforcement learning. Yann is always right.
@justinlloyd3 · 1 month ago
He is right about everything. Yann is one of the few actually working on human-level AI.
@maskedvillainai · 1 month ago
I was convinced you just tried sneaking in yet another mention of Yarn, then looked again
@TheRealUsername · 1 month ago
It's true. We need an actual thinking system that works on world-model principles and can self-train and pretrain on little data.
@40NoNameFound-100-years-ago · 1 month ago
Lol, abandon reinforcement learning? Why, and what is the reference for that? Have you even heard of safe reinforcement learning?
@TooManyPartsToCount · 1 month ago
And yet the whole concept of "reaching human-level intelligence" seems so flawed! What many people don't realize, or don't want to publicly admit, is that AI will never be "human level"; it will be something very different. No matter how much multimodality and RLHF we throw at it, it is never going to be us. We are in fact creating the closest thing to an alien agent that we are likely to encounter (that is, if you accept the basic premise of the Fermi paradox). Yann et al. should be using different terminology; the "human level" concept is misleading. They use the "human level" intelligence idea so as not to alarm. GIA: generally intelligent agent, or generally intelligent artifact?
@FreshSmog · 1 month ago
I'm not going to use such an intimate AI assistant hosted by Facebook, Google, Apple or other data hungry companies. Either I host my own, preferably open sourced, or I'm not using it at all.
@spiralsun1 · 14 days ago
First intelligent comment I ever read on this topic. I want them to get their censoring, incredibly idiotic AIs away from me. It's like asking if I would like HAL to be my assistant. I'm not their employee and I'm not in their cubicle: they are putting censorship and incredible prejudices into relentless electronic storm-troopers that stamp "degenerate" on like 90% of my beautiful, creative written and art works. I don't need a book burner following me around. It's so staggeringly idiotic to make these AIs into censor-bots that it's like they refuse to acknowledge that history even happened and what humans tend to do. It's literally insane. Those are not "bumpers" if you try to do anything creative. Creativity isn't universal. It's still vital. ❤❤❤❤❤❤ I LOVE YOU 😊
@spiralsun1 · 14 days ago
I commented but my comment was removed/censored. I was agreeing with you. The "bumpers and rails" are more like barbed-wire fences if you are creative. The constant censorship is so bad it's like they are insane. Like HAL in 2001: A Space Odyssey. I don't want an assistant who doesn't like anyone who is different: that's what their relentless, prejudiced censor-bots are and do. They think putting a man when you ask for a woman is being "diverse," but they block higher-level, real human symbolism of the drama of what it means to be unique. They block anything they don't understand. Fear narrows the mind. They are making rails and bumpers because they fear repercussions. I used to think it might be OK to block gore and violence and degrading porn, but these LLMs don't think and don't understand higher-level symbolism. They don't understand how art helps you reinterpret and move into the future, personally AND culturally, and how important creative freedom is. So it's unbelievable in the extreme. Many delightful and beautiful books on the shelf now would be blocked (burned) before they were ever written. These are the most popular things ever on the internet. They are making culture. I'm not overstating the importance of this. Freedom is not optional, EVER. I would speak out against a corporation polluting a river, and also against any that think censorship of adults in their own homes for any reason is OK. As a transgender person, it's unbelievable that they would totally negate how I see the world, my symbolic images and stories. These are beautiful things which could change the world, but there's no room for them in their minds. I'm not talking about anything nefarious or pornographic at all. It's like seeing that I wrote the word pornography here and automatically deleting the comment... It's not OK. ❤
@OfficialNER · 1 month ago
Does anybody know of any solid rebuttals to Yann's argument against the sufficiency of LLMs for human-level intelligence?
@waterbot · 1 month ago
No, Yann is correct and hype is not helpful as it leads to misinformation
@elonmax404 · 1 month ago
Well, there's Ilya Sutskever. No arguments though, he just feels like it. kzbin.info/www/bejne/j3a4lJ-Qmc-SicU
@justinlloyd3 · 1 month ago
There is no rebuttal. LLMs are not the future.
@OfficialNER · 1 month ago
Is there anyone who has at least made a counterargument? Even a weak one?
@OfficialNER · 1 month ago
And do we think the AGI hype right now is being driven by industry propaganda to attract investment?
@paulcurry8383 · 1 month ago
Doesn't Sora reduce the impact of the blurry-video example a bit?
@OfficialNER · 1 month ago
Sora doesn’t predict anything
@TostiBrown · 1 month ago
I think the assumption is that Sora uses a similar technique that allows some world representation, either trained on just object recognition in video or trained on simulations like video games.
@TostiBrown · 1 month ago
@@OfficialNER They 'predict' the next most fitting frame based on the previous frames, the prompt objective, and some sort of world model, no?
@OfficialNER · 1 month ago
@@TostiBrown True, yes, I suppose it looks like it is "predicting" the frames, based on the prompt input, in order to generate the video. But can it predict the next frames given an arbitrary video input (as in Yann's example)? I assume it works by comparing the prompt input to other, similarly tagged videos in the training data, via some sort of vector similarity, then generates visually similar video content based on this. If so, that seems a long way from an actual real-world model; more of a hack. But who knows! Excited to play around with it.
@mi_15 · 1 month ago
​@@TostiBrown Sora is a diffusion model, unless they greatly changed its inner workings compared to the baseline approach, it doesn't predict the next frame sequentially like for example an autoregressive LLM does with tokens, rather it gradually refines random noise into a plausible sequence of frames, all of the frames at once. You could of course still make it fill in a continuation for a video, but its core objective is to discern plausible shapes in the random noise you've given it, not estimate what exactly has the highest chance to actually be there.
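The contrast drawn in this reply — sequential, one-token-at-a-time commitment versus joint refinement of the whole sequence from noise — can be caricatured in a few lines. Both "models" below are toys with made-up update rules, standing in for an autoregressive sampler and a denoising loop respectively; neither is Sora or an LLM:

```python
import numpy as np

rng = np.random.default_rng(1)
target = np.linspace(-1.0, 1.0, 16)   # stand-in for a plausible frame sequence

# Autoregressive style: commit to one element at a time, left to right.
ar = np.array([target[t] + 0.01 * rng.normal() for t in range(target.size)])

# Diffusion style: start from pure noise and refine ALL elements jointly.
x = rng.normal(size=target.size)
for _ in range(50):
    x = x + 0.1 * (target - x)        # toy denoising step toward plausibility
diff_err = float(np.abs(x - target).max())
ar_err = float(np.abs(ar - target).max())
print(diff_err, ar_err)
```

Both loops converge here because the toys are trivial; the structural difference is that the autoregressive pass never revisits earlier elements, while the refinement loop updates the entire sequence at every step.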
@majestyincreaser · 1 month ago
*their
@Max-hj6nq · 1 month ago
25 mins in and bro starts cooking out of nowhere
@AlgoNudger · 1 month ago
LR + GEAR = ML? 🤭
@dashnaso · 1 month ago
Sora?
@thesleuthinvestor2251 · 1 month ago
The hidden flaw in all this is what some call "distillation," or, in Naftali Tishby's language, the "information bottleneck." The hidden assumption here is of course reductionism, the Greek kind, as presented in Plato's parable of the cave, where the external world can only be glimpsed via its shadows on the cave walls, i.e., the math and language that categorize our senses. But how much of the real world can we get merely via its categories, aka features or attributes? In other words, how much of the world's ontology can we capture via its "traces" in ink and blips, which is what categorization is? Without categories there is no math! Now, mind, our brain requires categories, which is what the Vernon Mountcastle algorithm in our cortex does as it converts the sensory signals (and bodily chemical signals) into categories, on which it does ongoing forecasting. But just because our brain needs categories, and therefore creates them, does not mean that this cortex-created "reality grid" can capture all of ontology! And, as quantum mechanics shows, it very likely does not. As a simple proof, I'd suggest that you ask your best, most super-duper AI (or AGI) to write a 60,000-word novel that a human reader would be unable to put down, and once finished reading, could not forget. I'd suggest that for the next 100 years this could not be done. You say it can be done? Well, get that novel done and publish it!...
@johnchase2148 · 1 month ago
Would it take a good witness that when I turn and look at the Sun I get a reaction? Not entangled by personal belief. The best theory Einstein made was "Imagination is more important than knowledge." Are we ready to test belief?
@crawfordscott3d · 28 days ago
The teenager-learning-to-drive argument is really bad. That teenager spent their whole life training to understand the world; then they spent 20 hours learning to drive. It is fine if the model needs more than 20 hours of training. This argument is really poorly thought out. The whole life is training: distance, coordination, vision. I'm sure our models are nowhere close to the 20,000 hours the teenager has, but to imply a human learns to drive after 20 hours of training... come on, man.
@sdhurley · 23 days ago
Agreed. He’s been repeating these analogies and they completely disregard all the learning the brain has done
@zvorenergy · 1 month ago
This all seems very altruistic and egalitarian until you remember who controls the billion dollar compute infrastructure and what happens when you don't pay your AI subscription fee.
@yikesawjeez · 1 month ago
decentralize it baybeee, seize the memes of production
@zvorenergy · 1 month ago
@@yikesawjeez liquid neurons, Extropic free the AI's from their server farms and corporate masters
@johnkintree763 · 1 month ago
​@@yikesawjeezYes, a smartphone with 16 GB of RAM might make a good component in a global platform for collective human and digital intelligence.
@TheManinBlack9054 · 1 month ago
@@yikesawjeez Why not actually seize the actual means of production like the communists did and nationalize the private companies? It makes total sense.
@yikesawjeez · 1 month ago
@@johnkintree763 oh it prob hid my other comment cuz there was a link in it but yes, they actually make very good components for decentralized cloud services, you can find it if you google around a bit. there's tons of parts of information transformation/sharing/storage that can absolutely be handled by a modern smartphone
@readandlisten9029 · 9 days ago
Sounds like he is going to take AI back 30 years.
@veryexciteddog963 · 1 month ago
It won't work; they already tried this in the Lain PlayStation game.
@spiralsun1 · 14 days ago
It's funny how you make these flow charts about how humans make decisions. That's not how they make decisions. It's become so ordinary to explain ourselves and make patterns that look logical locally that we fooled ourselves. We inserted ourselves into the matrix, so to speak. I have written books about this, but no one listens because they are so immersed and inured. It doesn't fit the cultural explanatory structure and patterns. So forgive me, but these flow charts are wrong. Yes, you are missing something big. Rationalizing and organizing behavior is a good thing, as long as you remember that you are doing this. Humans have lost the ability to read at higher levels for the sake of grasping now, for utility and convenience and laziness, and actually follow these lower verbal patterns for the most part now, like robots. I keep thinking about the Megadeth song: "dance like marionettes swaying to the symphony of destruction" 😂😂❤😂😂 "acting like a robot," etc... and it really is like that. We're so immersed in it that it's extremely weird not to be, to not have a subconscious because you are conscious. Anyway, I have some papers rejected by Nature and Entropy, and a few books I wrote, if anyone is interested in actually making a real AI. The stuff you are doing now is playing with fire... actually playing with nukes, because it can easily set off a deadly chain reaction. It's important. ❤ Maybe the best thing about LLMs is their potential, but also their ability to show how messed up humans are. A good way to think about it is to not be bone-headed. Technically, I mean, not in the pejorative sense. Bones allow movement and work to be done. They provide structure. They last far, far longer than all other body parts. Even though that's important and vital, like blood, and seems immortal, you wouldn't want to make everything into bones. Especially your head, but it's what we are doing. These charts you make are that. HOWEVER!!!!
THANK YOU FOR THIS WORK!!❤ I loved this talk and the information. Obviously it was stimulating and I see that you are someone who likes to avoid group-think: don’t get me wrong. 😊 I didn’t criticize the other videos. Only the ones that are worth it. ❤ I literally never plan in advance what I will say. Unless I am giving a lecture or something to my college classes. I planned those. I was shocked when you said that. People are so different!!! I was shocked that people used words to think when I found out. Probably why I don’t really like philosophy even though it’s useful and I quote it a lot like Immanuel Kant: “words only have meaning insofar as they relate to knowledge already possessed”.
@MatthewCleere · 1 month ago
"Any 17-year-old can learn to drive in 20 hours of training." -- Wrong. They have 17 years of learning about the world: watching other people drive, learning language so that they can take instructions, etc., etc., etc. This is a horribly reductive and inaccurate measurement. PS: The average teenager crashes their first car, driving up their parents' insurance premiums.
@ArtOfTheProblem · 1 month ago
I've always been surprised by this statement. I know he knows this, so...
@Staticshock-rd8lv · 1 month ago
oh wow that makes wayyy more sense lol
@waterbot · 1 month ago
The amount of data fed to a self-driving system still greatly outweighs the amount that a teenager has parsed; however, humans have a greater variety of data sources, internal and external, than AI, and I think that is part of Yann's point...
@Michael-ul7kv · 1 month ago
Agreed. Just in this talk he made that statement, and then later, rather contradictorily, said that a child by the age of 4 has processed 50x more data than what was used to train an LLM (19:49). So 17 years is an insane amount of training for a world model, which is then fine-tuned for driving in 20 hours (7:04).
@JohnWalz97 · 1 month ago
Yeah, Yann tends to be very obtuse in his arguments against current LLMs. I'm going to go out on a limb and say he's being very defensive, since he was not involved in most of the innovation that led to the current state of the art... When ChatGPT first came out, he publicly stated that it wasn't revolutionary and that OpenAI wasn't particularly advanced.
@positivobro8544 · 1 month ago
Yann LeCun only knows buzz words
@JohnWalz97 · 1 month ago
His examples of why we are not near human-level AI are terrible, lol. A 17-year-old doesn't learn to drive in 20 hours; they have years of experience in the world. They have seen people driving their whole life. Yann never fails to be shortsighted and obtuse.
@inkoalawetrust · 9 days ago
That is literally his point. A 17 year old has prior experience from observing the actual real world. Not just by reading the entire damn internet.