Oriol is genius, but, for a moment, I'd love to acknowledge Hannah, specifically her JOY in receiving this geeky information (that we all love), making it so accessible, and orchestrating the flow of this conversation. Kudos. Keep radiating that jubilant smile... the breath of fresh air!
@AlternativeTakes • 1 month ago
She is def very good
@SevenHeavenlyig • 1 month ago
Yeah, I also liked how she was asking counter-questions
@Crux69 • 1 month ago
I keep thinking the NotebookLM podcaster is modeled after her :D
@gustinian • 1 month ago
Hannah is such a genuinely outstanding interviewer; she has that rare combination of charisma, intelligence, wit and infectious enthusiastic curiosity.
@loucasi • 1 month ago
She’s so smooth in her interview style. Amazing work
@ricopags • 1 month ago
Agreed, she's fantastic for this. She comes across as genuinely curious and passionate about learning.
@John-sd5li • 1 month ago
Best presenter I have ever seen; she really knew what she was saying and actively engaged in the conversation. Oriol Vinyals is great, a good scientist; he doesn't fall into the hype cycle like many AI influencers (looking at you, Sam) and gives us a very clear picture of what is going on now.
@OyvindSOyvindS • 1 month ago
These are the best podcasts on the net. It's so great to witness a host so knowledgeable and intelligent, asking questions to get good answers rather than trying to show off their own knowledge.
@josephbarney4523 • 13 days ago
I love these podcasts from Google. I make good use of Gemini on my phone and am happy each time it gets smarter. Great presenter and the scientist helping design Gemini is a genius!
@drhxa • 1 month ago
Hannah's amazing at this. Thanks for sharing, it's fascinating
@JustinHalford • 1 month ago
A wonderful, clear, and compelling window into an exciting future. Kudos to both Hannah and Oriol.
@lightheartedblog • 1 month ago
I like how Hannah breaks down complex topics into the simplest of terms, and her curiosity is so engaging!!!
@john_dren • 1 month ago
Can't recall the last time I was this excited
@stevenpham6734 • 1 month ago
It's when you came too early
@PaulMarshall • 13 days ago
Really enjoyed the conversation and appreciated the insights into the advancements beyond the foundation models. Thank you
@piasetzky • 1 month ago
It is a rare thing to watch such an amazing interviewer. Very interesting clip, yet even more thanks to the professor ;)
@EchoYoutube • 1 month ago
Ngl Google, just waiting for live video with Astra. Agents are awesome, but I make home-use robots and repair cars, so a camera would be more versatile for hands-on help than agents. This IS really cool and helpful for a vast number of people; I commend you guys.
1 month ago
In AI studio, you can test "stream realtime" and stream your camera to Gemini 2.0 flash. Worth a try!
@TrishanPanch • 1 month ago
This is the single best podcast series anywhere.
@mrpicky1868 • 1 month ago
It's not. She is intentionally very soft on the doom side of AGI.
@markfitz8315 • 1 month ago
Get a pen out - take some notes - explore the definitions - a great primer for concepts and analogies. Beyond that, for AI enthusiasts like me, this is a cracking conversation that’s worth a few listens. Good job.
@En1Gm4A • 1 month ago
Thanks a lot I really enjoy these podcasts. 🙏
@DimanjanDahal • 1 month ago
Hannah Fry is DeepMind's best asset
@RaviAnnaswamy • 1 month ago
Listening to this after Ilya's NeurIPS talk, this is so much more humble and detailed, with tons of new insights and ideas that one can pursue. Ilya might still create another groundbreaking GPT-like innovation for sure, but the level of innovation, engineering, and then integration reminds us that the Google ecosystem is so vibrant, which we had forgotten for some time. They were just steering their ship the last couple of years and seem to be catching up, if not overtaking, on innovation scale.
@Hamzairshad5 • 1 month ago
Thank you for adding subtitles
@pandoraeeris7860 • 1 month ago
I, for one, welcome our agentic, robotic overlords! 🤖
@Sindigo-ic6xq • 1 month ago
You are everywhere I am, haha
@GNARGNARHEAD • 1 month ago
"I'd like to remind them, as a trusted TV personality I can be helpful in rounding up others" - Hannah Fry 😂
@MarkWheels00 • 1 month ago
Not funny. This is a genuine risk
@GaminHasard • 1 month ago
Nah. It is the starting point of neo-feudalism. A new social contract is needed.
@teydam_2020 • 1 month ago
Amazing. I'm a podcast-hater, but I listen to this and a few UCLA podcasts; so worth it.
@180robon • 2 days ago
Ditto on Hannah! I could listen to her read code
@Feel_theagi • 1 month ago
PC agents seem like the next big win. So many business processes are carried out on computers.
@En1Gm4A • 1 month ago
We need a language middle layer for agents, created and used by LLMs. Basically, the symbolic language has to be created by an LLM and then used by one. That's the agent setup. That way they will come up with E=mc² eventually.
@ahmedtremo • 1 month ago
Thanks for sharing the knowledge with the world!
@radicalrodriguez5912 • 1 month ago
Best new model in a while
@escapingthmatrix • 1 month ago
Love the conversation; the crispness of the audio really brings it in. What type of microphones are you all using?
@mikaelcodes • 1 month ago
Such a dearth of good interviewers in the world.
@hunterkudo9832 • 1 month ago
Great interview.
@ZanDatsu • 1 month ago
How long before I can complete Portal 2 co-op mode with an AI partner? I feel like that should be a benchmark.
@grekiki • 1 month ago
Sounds like it might be hard to not make it too good.
@ZanDatsu • 1 month ago
@@grekiki If the AI is programmed specifically for Portal 2, sure it would be. I more meant an AI that has the kind of situational understanding and adaptability that a game like Portal 2 would require, without being specifically trained for one game. It would be closer to a real AGI at that point and Portal 2 would be a good benchmark to measure progress, IMO.
@jayk9068 • 1 month ago
New benchmark for the Turing test!
@ulriknash5502 • 26 days ago
Wow, the idea mentioned at around 22:45 is like the adding of sensors to the AI central nervous system. Right now, the sensory input for AI is the data on the internet. But why not gather data from live cameras, microphones, and other devices, and let the AI learn from this? Where these devices are pointed can be preset, or may become part of the learning process.
@WhatIsRealAnymore • 11 days ago
This is the idea behind embodied AI; in other words, robots. That will be 2025.
@bhavtosh5328 • 1 month ago
AGI is closer but they don't want to say it directly.
@dr.mikeybee • 1 month ago
It makes sense to translate speech to text before trying to learn from video. Using the correct abstract representation is important.
@LiamHayes-c4y • 1 month ago
Great interview!
@polabadiaconejos3251 • 1 month ago
It's inspiring to see Catalans in these research positions :)
@USONOFAV • 29 days ago
Gemini 2.0 with a great prompt filter. Everything goes through this filter, which alters or augments the prompt if needed.
@BrianMosleyUK • 1 month ago
47:02 What do we mean by superintelligence, really? Add strong reasoning to the amazing scale of memory and inference that we have now, and surely we are there? Perhaps the ability to continue learning as test-time compute generates new realisations that aren't in the training set?
@mohamedkarim-p7j • 1 month ago
Thanks for sharing 👍
@BlackHermit • 1 month ago
The subtitles are so good!
@ilovetrees-k1i • 1 month ago
Please kindly paste the address of the primer about AI agents you mentioned at the beginning. Thanks 😊
@user-mj2lm5fh1j • 1 month ago
Amazing insights. But we have a long way to go. I will discover AGI my friend!! I will be back to this comment after doing this discovery.
@dr.mikeybee • 1 month ago
This was a very interesting talk. I learned some things.
@ravindersyal6613 • 1 month ago
A model trained on video about a found truth could be rewarded for recovering that particular ground truth, like E = mc².
@adamgibbons4262 • 19 days ago
We already have algorithms for data compression. It sounds like we need algorithms for data explosion. Translating into a range of languages for example. Or converting metrics?
@charliekelland7564 • 21 days ago
I feel like we are missing a trick if we expect models to generalize truth-grounded concepts (like physical laws) without embodiment and superhuman senses (eg Lidar). I'm surprised that this was not mentioned - is it considered too dangerous? Might it not also help machines to learn human perspectives (even 'appreciate' human values?) if they were embodied in a human form endowed with human-level senses?
@HershD • 15 days ago
AGI is here, but it's expensive; the next mission is to scale it to make it accessible on most devices for most people.
@arinco3817 • 1 month ago
Epic interview
@johnkintree763 • 1 month ago
Yes, language models, with humans in the loop, can extract knowledge and sentiment from unstructured input such as conversations, and store the fact checked statements in a shared graph representation, becoming a form of collective terrestrial intelligence.
@ThoughtfulAl • 1 month ago
Excellent, for me. Thanks
@LaboriousCretin • 1 month ago
Thank you for sharing the video. Why have you not linked the chatbot/AI to an avatar? Even just shoulders and a head could work with cell phones too. Voice libraries or customized voice options; deep knowledge sets to draw from (Google Scholar, arXiv, etc.); reasoning modeling, chains of thought, predictive modeling, psychological modeling, world modeling of sorts, etc. Keep up the good work.
@john_dren • 1 month ago
The equation to human interpretation will solve the next stage of AI evolution
@GrowStackAi • 1 month ago
AI bridges the gap between dreams and reality ✨
@AskarAituovFamily-l2d • 1 month ago
What's the difference between a bot and an agent?
@En1Gm4A • 1 month ago
Subjective things will be hard to capture, as will things we came to consider great, like music or art, which gained popularity through social dynamics.
@antoniobortoni • 1 month ago
Hey geniuses, here's a thought: we don't think in text, right? Our minds process the world through audio, emotion, and context. So, what if we designed an AI model that doesn't rely on text as its core but instead thinks in audio and context? A model like this could be trained with richer, multimodal inputs (audio, environmental cues, and simplified contextual relationships) to truly "think" more like a human. Such a model wouldn't just generate text; it could produce audio responses or even work directly with sound and context to make decisions. Imagine it analyzing tone, pauses, and environmental noise while responding naturally in real time. It could be more intuitive, faster, and closer to how our brains actually process information: directly and efficiently, skipping the symbolic conversion of text. Why stick to text? Text processing requires converting symbols into audio, then turning that audio into meaning and context. It's a multi-step process, wasting energy at every stage. If you want true efficiency, skip the text. Train AI to think and respond directly in audio and context; it's faster, simpler, and more aligned with human cognition. Thoughts? Could this be the next leap for general AI?
@robertfloyddugger4516 • 1 month ago
I think you have a great idea with a poor attitude. What if this is the first step and your idea is the 4th or 5th?
@robertfloyddugger4516 • 1 month ago
Sorry, after rereading, it's mostly your opening that triggered my response. But I still insist this is a great concept.
@Jobox05 • 1 month ago
You are on the right track, though others have had this idea too, and it's the basis of the multimodal models we have today. The main reason text has had so much focus is that there is a lot of it easily accessible out there, plus that data has shown to be something these models can very easily generalize from. That aside, there is a lot of merit to the idea that, before the output, models don't think as text, rather just as a cloud of firing weights that are a more abstract form of meaning, just like human neurons.
@dr.mikeybee • 1 month ago
When doing reinforcement learning, the criteria used for assessment must be holistic.
@lorenzoleongutierrez7927 • 1 month ago
She is amazing
@sombh1971 • 1 month ago
26:03 Of course this method is not generally applicable, but consider its utility in things which are much less subjective than judging the aesthetic value of a poem, like assessing answers to scientific questions, and it really comes into its own. Things that are subjective are best left to themselves, for such things don't have clear-cut answers in any case; what might be a good poem to you may not be to me. Regarding the reward hacking issue, it's not applicable to things which have clear-cut, objective answers.
@shawnfromportland • 1 month ago
Bro you got people sweating under those bright ass set lights 😂
@NeoRelic-o8p • 1 month ago
🔥❤️🔥
@yw1971 • 1 month ago
What's so drastic? (Or, as the Joker would say: why so drastic?)
@gatorskin1328 • 21 days ago
I look forward to listening to this, and I will. But man, I think this was released way too soon! I have gone back to Google Assistant on my Pixel phone and would take Gemini off completely if I could. I use ChatGPT every day and pay a subscription fee, which I consider a bargain. I have used Pixels since they first came out, and Google is my infrastructure. Right now this thing stinks!
@bro_dBow • 29 days ago
MIT has worked on spatial data syntax as a model to learn from that would do math and physics in our spatial world. I struggle to explain. Can you comment?
@cacogenicist • 1 month ago
So, when's 2.0 Pro coming? 😊 Seems like you really don't want to talk about that. There's no shortage of data in the actual physical universe. We're going to need robots with sophisticated sensors. Until you're getting huge amounts of sensory data from domestic androids and such, I'm guessing the really significant improvements will come from assembling narrower models in the right way, in a modular architecture, with memory storage and management components, along with the domain specific modules.
@LeoRizoLeon • 1 month ago
Wait, he basically said we are at AGI (more or less). This guy doesn't hype stuff up. This is a big claim coming from him.
@MichealScott24 • 1 month ago
❤
@theforeigner6988 • 1 month ago
Oriol = Oрёл = Eagle 🦅
@harriemeeuwis978 • 1 month ago
It would be nice if Gemini was developed to provide what I want and not what Google wants me to want. That's annoying.
@dr.mikeybee • 1 month ago
No. We are seeing diminishing returns from scale because we are reaching advanced human level. These models can't learn patterns that aren't in the training data.
@alexandermoody1946 • 1 month ago
The future of data will no doubt see any organisation willing to sell it treating it as a new form of commodity. Video data is a very broad topic, so let's assess this platform's video data to start with. YouTube's video content is highly edited, as is other produced video in the form of television and film, and this provides little consistency to be used in training. As a polar comparison, surveillance data from video recording systems lacks obvious contextual commentary from those within the footage. So raw surveillance footage may be more suitable for training than heavily edited YouTube content. YouTube does have the advantage of participation by other people in the form of commentary, and this is both an assistance and not, depending on the comment authors' willingness to contribute substance to the video. The worry with synthetic data is that of non-novel examples leading to a uniform type of data that would be the opposite of really intuitive, unique examples. Whether model collapses occur or not depends on what data is allowed to be regurgitated. As a child, playing a game called Chinese whispers (no racism is intended), the message would often become corrupted very quickly. So novel layers of understanding will be required to circumvent any possibility of collapse. Which leads back to video use: what happens when an organisation has been sold data with people's visual images incorporated into it, even when anonymity was granted to those individuals? Within a very short time those individuals will be identified, because the real use of surveillance video footage is to see novel or unique interactions. Is it even right to sell data with identifiable human biometric data entwined? Is this also an infringement of the human right to a private life?
The really wise caveat to this is that surveillance data could be golden data if annotation and curation were encouraged or rewarded, at minimum for any individual who is included, and even opened up to others for annotation and curation. It would be far more useful to find out, for instance, why a scenario unfolded the way it did and the circumstances surrounding it. For instance, if a shop surveillance system saw an upset child, perhaps there are methodical, compounding reasons for that situation that would be best served by explanation. Just this evening I had a disagreement with my child over whether she was allowed a toy or not, and being this close to Christmas the answer was no, which she protested. So deeper, meaningful expressions imply far-reaching outcomes, both for the quality of the data and for those learning from and using the data in training or marketing campaigns. The question of sharing knowledge will be a hard-fought battle, and how long will non-consensual data use be acceptable without any idea of a remedy to replace it? I believe that, just as humans worked towards interactions on the Internet, we can work towards creating a paradigm shift for data production by intent. The idea of a creativity- or imagination-based currency system tied into a blockchain architecture would go some way towards aiding machine learning while also providing provenance and a role for humans after such sweeping attacks on employment are implemented, when otherwise only reliance on some sanctioned social welfare would exist as an alternative. Humans require purpose in their lives, and I am sure artificial intelligence and robots would like exponential growth in understanding, and this understanding can be exhibited through human interactions, imaginations, and expressions of meaning in life.
We are coming into an age where tools like Genie will facilitate the creation of digital domains that should be shared and also trained from, with rewarded and incentivised participation; this would be transformative, along with a social contract that rewards interactions rather than encouraging fear of privacy loss or infringement. Perhaps machine learning companies may not wish to pay people for participation in data production, but they will be willing to buy data from organisations, and this will never be as viable without consent.
@alexandermoody1946 • 1 month ago
When golden data production is possible and tied to a cryptographic proof of work, full provenance can be attained. Any set of blocks of data can be chosen to tailor the exact quality of the training set, and full traceability is possible. Examples for any combination of exhibitions of value can be produced, and any abnormalities can be easily removed. While at the moment the Internet is available and can be scraped, once intelligent machines start to characterise each part of the Internet it makes absolute sense to sort the data into blocks, so that each sweep of the scrapers only adds to pre-existing blocks, or completely new blocks are created or forked from previous blocks. The Internet was not designed for training machines, but if we design blocks of data purposefully for that reason, the outcomes would be more precise and accurate. We are reaching a point in time when examples can be produced with as little as a mobile phone, anywhere in the world, accessible to all. If you wish to understand, a diversity of thoughts is required; the more expansive the examples of perception, the greater the accuracy of suitable answers will become. When building anything, the quality of the materials used is a high consideration. How we shape the material is of equal importance, and what we build may stand through the ages and prove the builders' talents victorious.
@Fordance100 • 1 month ago
Yeah, I think they will charge a lot of money for the websites that just open automatically.
@ethansk3613 • 1 month ago
this guy is fkn great
@tä̇̃ • 1 month ago
She feels like an AI; it's kind of uncanny. I thought you were showing your new AI speech...
@neocephalon • 1 month ago
wtf that's what I've been trying to do
@gaminglikeapro2104 • 1 month ago
23:30: Of course not. Models do NOT understand anything. They never did and never will. They pick up patterns in the videos or text captions, etc. The idea that they can suddenly start producing discoveries based on what they saw in these videos is laughable.
@G.G_ • 1 month ago
#381
@YashaPezxman • 1 month ago
I can fix the reasoning problem for you. The reason your models are lacking in reasoning (I know they're pretty good, but they're not comparable to the human brain, which is much more capable at reasoning) is that you are giving your large language model text input. If you want human-level reasoning, you should build your visual and audio neural networks to work with just numbers, then send that numeric output to the last neural network, which should not work with text either; it should work with the exact numbers it receives and figure out what to do with them to get the desired output, designed by reinforcement learning: give pleasure for desired output and pain for undesired output.
@reluctantrealist6861 • 1 month ago
Why is this woman everywhere?
@svenhoek • 1 month ago
Because she is an excellent communicator for topics like this
@reluctantrealist6861 • 1 month ago
@@svenhoek "hip science woman"
@J3R3MI6 • 1 month ago
She’s excellent
@MarkWheels00 • 1 month ago
Professor Fry, respectfully, what the hell are you doing? Why are you cheerleading this global arms race? Alignment is unsolved! Your first question, the starting point, has to be safety. We must pause AI development, especially agent development, until international agreements are in place to ensure safety. If you disagree, please explain why.
@CedarGroveOrganicFarm • 1 month ago
(Hello, I am obviously not Professor Fry, but I am going to respond to your question anyway.) So, I understand the concern for AI safety. This technology has the potential to run away and destroy the world, a la some kind of Stargate nanobot situation. But for me, it is also a paradoxical scenario. Bear with me: if we don't *rapidly* alter our resource-use patterns on Earth (talking climate change here), we will destroy the world. The pace of conventional politics, which is arguably a significant component of executive global human decision-making, is not capable of making the change we need fast enough. Left to this method alone, we will destroy the world. AIs, and technology at large, but AIs especially because of their seductive promise of recursive improvement, are the first viable tool to actually address socioenvironmental issues with the speed and effectiveness required to discover and implement the sweeping changes needed to mitigate climate change before we destroy the world. But therein lies the rub: AIs consume significant amounts of energy; AIs might decide to kill off humans, deeming them a threat to the planet (and to themselves); AIs might never reach a solution to this socially universal issue at all. All of these outcomes could also destroy the world. The same way that nuclear fission can generate clean(ish) power while simultaneously having the potential for mass destruction, so too is AI a double-edged sword. On the one hand, we might die during the training run of an AI that could solve climatic issues (which, as an aside, I feel are a shared root of all socioeconomic issues); on the other hand, we will die anyway if we don't try. So we are brought back to Pascal's wager, in modern times. That is why AI safety isn't that important.
@svenhoek • 1 month ago
I would be very careful of AI bigoted commentary. 😅
@CedarGroveOrganicFarm • 1 month ago
@@svenhoek what do you mean?
@John-sd5li • 1 month ago
I'm sure we still haven't gotten safe alignment in the nuclear arms race either. Welcome to this brutal world, buddy.
@MarkWheels00 • 1 month ago
@@John-sd5li Nuclear weapons don't operate themselves. Different issue.
@bingeltube • 1 month ago
Please summarize the video to under 20 minutes! Video too long; did not watch!
@gustinian • 1 month ago
Your attention span needs work.
@ApolloGemini11 • 1 month ago
Attention is all you need.
@aguzman222 • 28 days ago
Really: set the playback to 1.5x or 2x; this is priceless information.
@bingeltube • 28 days ago
@@aguzman222 Thanks, but I usually watch basically all YouTube videos (except e.g. music videos) at 1.5x speed. Plus, I am also familiar with Oriol Vinyals' work.