Oriol is genius, but, for a moment, I'd love to acknowledge Hannah, specifically her JOY in receiving this geeky information (that we all love), making it so accessible, and orchestrating the flow of this conversation. Kudos. Keep radiating that jubilant smile... the breath of fresh air!
@AlternativeTakes • 1 month ago
She is def very good
@SevenHeavenlyig • 1 month ago
Yeah, I also liked how she was asking counter-questions
@Crux69 • 1 month ago
I keep thinking the NotebookLM podcaster is modeled after her :D
@gustinian • 1 month ago
Hannah is such a genuinely outstanding interviewer; she has that rare combination of charisma, intelligence, wit and infectious enthusiastic curiosity.
@loucasi • 1 month ago
She’s so smooth in her interview style. Amazing work
@ricopags • 1 month ago
Agreed, she's fantastic for this. She comes across as genuinely curious and passionate about learning.
@John-sd5li • 1 month ago
Best presenter I have ever seen; she really knew what she was saying and actively engaged in the conversation. Oriol Vinyals is great, a good scientist; he doesn't fall into the hype cycle like many AI influencers (looking at you, Sam) and gives us a very clear picture of what is going on now.
@OyvindSOyvindS • 1 month ago
These are the best podcasts on the net. It's so great to witness a host so knowledgeable and intelligent, asking questions to get good answers rather than trying to show off their own knowledge.
@josephbarney4523 • 13 days ago
I love these podcasts from Google. I make good use of Gemini on my phone and am happy each time it gets smarter. Great presenter and the scientist helping design Gemini is a genius!
@drhxa • 1 month ago
Hannah's amazing at this. Thanks for sharing, it's fascinating
@JustinHalford • 1 month ago
A wonderful, clear, and compelling window into an exciting future. Kudos to both Hannah and Oriol.
@lightheartedblog • 1 month ago
I like how Hannah breaks down complex topics into the simplest of terms, and her curiosity is so engaging!!!
@john_dren • 1 month ago
Can't recall the last time I was this excited
@stevenpham6734 • 1 month ago
It's when you came too early
@PaulMarshall • 13 days ago
Really enjoyed the conversation and appreciated the insights into the advancements beyond the foundation models. Thank you
@piasetzky • 1 month ago
It is a rare thing to watch such an amazing interviewer. Very interesting clip, yet even more thanks to the professor ;)
@EchoYoutube • 1 month ago
Ngl Google, just waiting for live video with Astra. Agents are awesome, but I make home-use robots and repair cars, so a camera would be more versatile for hands-on help than agents. This IS really cool and helpful for a vast number of people; I commend you guys.
1 month ago
In AI studio, you can test "stream realtime" and stream your camera to Gemini 2.0 flash. Worth a try!
@TrishanPanch • 1 month ago
This is the single best podcast series anywhere.
@mrpicky1868 • 1 month ago
It's not. She is intentionally very soft on the doom side of AGI.
@markfitz8315 • 1 month ago
Get a pen out - take some notes - explore the definitions - a great primer for concepts and analogies. Beyond that, for AI enthusiasts like me, this is a cracking conversation that’s worth a few listens. Good job.
@En1Gm4A • 1 month ago
Thanks a lot I really enjoy these podcasts. 🙏
@DimanjanDahal • 1 month ago
Hannah Fry is DeepMind's best asset
@RaviAnnaswamy • 1 month ago
Listening to this after Ilya's NeurIPS talk, this is so much more humble and detailed, with tons of new insights and ideas that one can pursue. Ilya might still create another groundbreaking GPT-like innovation for sure, but the level of innovation, engineering, and then integration reminds us that the Google ecosystem is so vibrant, which we had forgotten for some time. They were just steering their ship the last couple of years and seem to be catching up, if not overtaking, on innovation scale.
@Hamzairshad5 • 1 month ago
Thank you for adding subtitles
@pandoraeeris7860 • 1 month ago
I, for one, welcome our agentic, robotic overlords! 🤖
@Sindigo-ic6xq • 1 month ago
You are everywhere I am, haha
@GNARGNARHEAD • 1 month ago
"I'd like to remind them, as a trusted TV personality I can be helpful in rounding up others" - Hannah Fry 😂
@MarkWheels00 • 1 month ago
Not funny. This is a genuine risk
@GaminHasard • 1 month ago
Nah. It is the starting point of neo-feudalism. A new social contract is needed.
@teydam_2020 • 1 month ago
Amazing. I'm a podcast-hater, but I listen to this and a few UCLA podcasts; so worth it.
@180robon • 2 days ago
Ditto on Hannah! I could listen to her read code
@Feel_theagi • 1 month ago
PC agents seem like the next big win. So many business processes are carried out on computers.
@En1Gm4A • 1 month ago
We need a language middle layer for agents, created and used by LLMs. Basically, the symbolic language has to be created by an LLM and then used by one. That's the agent setup. That way they will come up with E=mc² eventually.
@ahmedtremo • 1 month ago
Thanks for sharing the knowledge with the world!
@radicalrodriguez5912 • 1 month ago
Best new model in a while
@escapingthmatrix • 1 month ago
Love the conversation; the crispness of the audio really brings it in. What type of microphones are you all using?
@mikaelcodes • 1 month ago
Such a dearth of good interviewers in the world.
@hunterkudo9832 • 1 month ago
Great interview.
@ZanDatsu • 1 month ago
How long before I can complete Portal 2 co-op mode with an AI partner? I feel like that should be a benchmark.
@grekiki • 1 month ago
Sounds like it might be hard to not make it too good.
@ZanDatsu • 1 month ago
@@grekiki If the AI is programmed specifically for Portal 2, sure it would be. I more meant an AI that has the kind of situational understanding and adaptability that a game like Portal 2 would require, without being specifically trained for one game. It would be closer to a real AGI at that point and Portal 2 would be a good benchmark to measure progress, IMO.
@jayk9068 • 1 month ago
New benchmark for the Turing test!
@ulriknash5502 • 26 days ago
Wow, the idea mentioned at around 22:45 is like the adding of sensors to the AI central nervous system. Right now, the sensory input for AI is the data on the internet. But why not gather data from live cameras, microphones, and other devices, and let the AI learn from this? Where these devices are pointed can be preset, or may become part of the learning process.
@WhatIsRealAnymore • 11 days ago
This is the idea behind embodied AI; in other words, robots. That will be 2025.
@bhavtosh5328 • 1 month ago
AGI is closer but they don't want to say it directly.
@dr.mikeybee • 1 month ago
It makes sense to translate speech to text before trying to learn from video. Using the correct abstract representation is important.
@LiamHayes-c4y • 1 month ago
Great interview!
@polabadiaconejos3251 • 1 month ago
It's inspiring to see Catalans in these research positions :)
@USONOFAV • 29 days ago
Gemini 2.0 with a great prompt filter. Everything goes through this filter, which alters or augments the prompt if needed.
@BrianMosleyUK • 1 month ago
47:02 What do we mean by superintelligence, really? Add strong reasoning to the amazing scale of memory and inference that we have now, and surely we are there? Perhaps the ability to continue learning as test-time compute generates new realisations that aren't in the training set?
@mohamedkarim-p7j • 1 month ago
Thanks for sharing 👍
@BlackHermit • 1 month ago
The subtitles are so good!
@ilovetrees-k1i • 1 month ago
Please kindly paste the address of the primer about AI agents you mentioned at the beginning. Thanks 😊
@user-mj2lm5fh1j • 1 month ago
Amazing insights. But we have a long way to go. I will discover AGI my friend!! I will be back to this comment after doing this discovery.
@dr.mikeybee • 1 month ago
This was a very interesting talk. I learned some things.
@ravindersyal6613 • 1 month ago
A model trained on video about a found truth could be rewarded for recovering that particular ground truth, like E = mc².
@adamgibbons4262 • 19 days ago
We already have algorithms for data compression. It sounds like we need algorithms for data explosion. Translating into a range of languages for example. Or converting metrics?
@charliekelland7564 • 21 days ago
I feel like we are missing a trick if we expect models to generalize truth-grounded concepts (like physical laws) without embodiment and superhuman senses (eg Lidar). I'm surprised that this was not mentioned - is it considered too dangerous? Might it not also help machines to learn human perspectives (even 'appreciate' human values?) if they were embodied in a human form endowed with human-level senses?
@HershD • 15 days ago
AGI is here, but it's expensive; the next mission is to scale it to make it accessible on most devices for most people.
@arinco3817 • 1 month ago
Epic interview
@johnkintree763 • 1 month ago
Yes, language models, with humans in the loop, can extract knowledge and sentiment from unstructured input such as conversations, and store the fact checked statements in a shared graph representation, becoming a form of collective terrestrial intelligence.
@ThoughtfulAl • 1 month ago
Excellent, for me. Thanks
@LaboriousCretin • 1 month ago
Thank you for sharing the video. Why have you not linked the chatbot/AI to an avatar? Even just shoulders and a head could work with cell phones too. Voice libraries or customized voice options; deep knowledge sets to draw from (Google Scholar, arXiv, etc.); reasoning modeling, chains of thought, predictive modeling, psychological modeling, world modeling of sorts, etc. Keep up the good work.
@john_dren • 1 month ago
The equation to human interpretation will solve the next stage of AI evolution
@GrowStackAi • 1 month ago
AI bridges the gap between dreams and reality ✨
@AskarAituovFamily-l2d • 1 month ago
What's the difference between a bot and an agent?
@En1Gm4A • 1 month ago
Subjective things will be hard to capture, as will things we came to consider great, like music or art, which gained popularity through social dynamics.
@antoniobortoni • 1 month ago
Hey geniuses, here's a thought: we don't think in text, right? Our minds process the world through audio, emotion, and context. So, what if we designed an AI model that doesn't rely on text as its core but instead thinks in audio and context? A model like this could be trained with richer, multimodal inputs (audio, environmental cues, and simplified contextual relationships) to truly "think" more like a human. Such a model wouldn't just generate text; it could produce audio responses or even work directly with sound and context to make decisions. Imagine it analyzing tone, pauses, and environmental noise while responding naturally in real time. It could be more intuitive, faster, and closer to how our brains actually process information: directly and efficiently, skipping the symbolic conversion of text. Why stick to text? Text processing requires converting symbols into audio, then turning that audio into meaning and context. It's a multi-step process, wasting energy at every stage. If you want true efficiency, skip the text. Train AI to think and respond directly in audio and context; it's faster, simpler, and more aligned with human cognition. Thoughts? Could this be the next leap for general AI?
@robertfloyddugger4516 • 1 month ago
I think you have a great idea with a poor attitude. What if this is the first step and your idea is the 4th or 5th?
@robertfloyddugger4516 • 1 month ago
Sorry, after rereading, it's mostly your opening that triggered my response. But I still insist this is a great concept.
@Jobox05 • 1 month ago
You are on the right track, though others have had this idea too, and it's the basis of the multimodal models we have today. The main reason text has had so much focus is that there is a lot of it easily accessible out there, plus that data has shown to be something these models can very easily generalize from. That aside, there is a lot of merit to the idea that, before the output, models don't think as text, rather just as a cloud of firing weights that are a more abstract form of meaning, just like human neurons.
@dr.mikeybee • 1 month ago
When doing reinforcement learning, the criteria used for assessment must be holistic.
@lorenzoleongutierrez7927 • 1 month ago
She is amazing
@sombh1971 • 1 month ago
26:03 Of course this method is not generally applicable, but consider its utility in things which are much less subjective than judging the aesthetic value of a poem, like assessing answers to scientific questions, and it really comes into its own. Things that are subjective are best left to themselves, for such things don't have clear-cut answers in any case; what might be a good poem to you may not be to me. Regarding the reward hacking issue, it's not applicable to things which have clear-cut, objective answers.
@shawnfromportland • 1 month ago
Bro you got people sweating under those bright ass set lights 😂
@NeoRelic-o8p • 1 month ago
🔥❤️🔥
@yw1971 • 1 month ago
What's so drastic? (Or, as the Joker would say: why so drastic?)
@gatorskin1328 • 21 days ago
I look forward to listening to this, and I will. But man, I think this was released way too soon! I have gone back to Google Assistant on my Pixel phone and would take Gemini off completely if I could. I use ChatGPT every day and pay a subscription fee, which I consider a bargain. I have used Pixels since they first came out, and Google is my infrastructure. Right now this thing stinks!
@bro_dBow • 29 days ago
MIT has worked on spatial data syntax as a model to learn from that would do math and physics in our spatial world. I struggle to explain. Can you comment?
@cacogenicist • 1 month ago
So, when's 2.0 Pro coming? 😊 Seems like you really don't want to talk about that. There's no shortage of data in the actual physical universe. We're going to need robots with sophisticated sensors. Until you're getting huge amounts of sensory data from domestic androids and such, I'm guessing the really significant improvements will come from assembling narrower models in the right way, in a modular architecture, with memory storage and management components, along with the domain specific modules.
@LeoRizoLeon • 1 month ago
Wait, he basically said we are at AGI (more or less). This guy doesn't hype stuff up. This is a big claim coming from him.
@MichealScott24 • 1 month ago
❤
@theforeigner6988 • 1 month ago
Oriol = Oрёл = Eagle 🦅
@harriemeeuwis978 • 1 month ago
It would be nice if Gemini was developed to provide what I want and not what Google wants me to want. That's annoying.
@dr.mikeybee • 1 month ago
No. We are seeing diminishing returns from scale because we are reaching advanced human level. These models can't learn patterns that aren't in the training data.
@alexandermoody1946 • 1 month ago
The future of data will no doubt see any organisation willing to sell it treating it as a new form of commodity. Video data is a very broad topic, so let's assess this platform's video data to start with. YouTube's video content is highly edited, as is other produced video in the form of television and film, and this provides little consistency to be used in training. As a polar comparison, surveillance data from video recording systems lacks obvious contextual commentary from those within the footage. So raw surveillance footage may be more suitable for training than heavily edited YouTube content. YouTube does have the advantage of participation by other people in the form of commentary, and this is both an assistance and not, depending on the comment authors' willingness to contribute substance to the video. The worry with synthetic data is that of non-novel examples leading to a uniform type of data that would be the opposite of really intuitive, unique examples. Whether model collapses occur or not depends on what data is allowed to be regurgitated. As a child, playing a game called Chinese whispers (no racism is intended), the message would often become corrupted very quickly. So novel layers of understanding will be required to circumvent any possibility of collapse. Which leads back to video use: what happens when an organisation has been sold data with people's visual images incorporated into it, even when anonymity was granted to those individuals? Within a very short time those individuals will be identified, because the real use of surveillance video footage is to see novel or unique interactions. Is it even right to sell data with identifiable human biometric data entwined? Is this also an infringement of the human right to a private life?
The really wise caveat to this is that surveillance data could be golden data if annotation and curation were encouraged or rewarded, at minimum for any individual who is included, and even opened up to others for annotation and curation. It would be far more useful to find out, for instance, why a scenario unfolded the way it did and the circumstances surrounding it. For instance, if a shop surveillance system saw an upset child, perhaps there are methodical, compounding reasons for that situation that would be best served by explanation. Just this evening I had a disagreement with my child over whether she was allowed a toy or not, and being this close to Christmas the answer was no, which she protested. So deeper, meaningful expressions imply far-reaching outcomes, both for the quality of the data and for those learning from and using the data in training or marketing campaigns. The question of sharing knowledge will be a hard-fought battle, and how long will non-consensual data use be acceptable without any idea of a remedy to replace it? I believe that, just as humans worked towards interactions on the Internet, we can work towards creating a paradigm shift for data production by intent. The idea of a creativity- or imagination-based currency system tied into a blockchain architecture would go some way towards aiding machine learning while also providing provenance and a role for humans after such sweeping attacks on employment are implemented, when otherwise only reliance on some sanctioned social welfare would exist as an alternative. Humans require purpose in their lives, and I am sure artificial intelligence and robots would like exponential growth in understanding, and this understanding can be exhibited through human interactions, imaginations, and expressions of meaning in life.
We are coming into an age where tools like Genie will facilitate the creation of digital domains that should be shared and also trained from, with rewarded and incentivised participation; this would be transformative, along with a social contract that rewards interactions rather than encouraging fear of privacy loss or infringement. Perhaps machine learning companies may not wish to pay people for participation in data production, but they will be willing to buy data from organisations, and this will never be as viable without consent.
@alexandermoody1946 • 1 month ago
When golden data production is possible and tied to a cryptographic proof of work, full provenance can be attained. Any set of blocks of data can be chosen to tailor the exact quality of the training set, and full traceability is possible. Examples for any combination of exhibitions of value can be produced, and any abnormalities can be easily removed. While at the moment the Internet is available and can be scraped, once intelligent machines start to characterise each part of the Internet it makes absolute sense to sort the data into blocks, so that each sweep of the scrapers only adds to pre-existing blocks, or completely new blocks are created or forked from previous blocks. The Internet was not designed for training machines, but if we design blocks of data purposefully for that reason, the outcomes would be more precise and accurate. We are reaching a point in time when examples can be produced with as little as a mobile phone, anywhere in the world, accessible to all. If you wish to understand, a diversity of thoughts is required; the more expansive the examples of perception, the greater the accuracy of suitable answers will become. When building anything, the quality of the materials used is a high consideration. How we shape the material is of equal importance, and what we build may stand through the ages and prove the builders' talents victorious.
@Fordance100 • 1 month ago
Yeah, I think they will charge a lot of money for the websites that just open automatically.
@ethansk3613 • 1 month ago
this guy is fkn great
@tä̇̃ • 1 month ago
She feels like an AI; it's kind of uncanny. I thought you were showing your new AI speech...
@neocephalon • 1 month ago
wtf that's what I've been trying to do
@gaminglikeapro2104 • 1 month ago
23:30: Of course not. Models do NOT understand anything. They never did and never will. They pick up patterns in the videos or text captions, etc. The idea that they can suddenly start producing discoveries based on what they saw in these videos is laughable.
@G.G_ • 1 month ago
#381
@YashaPezxman • 1 month ago
I can fix the reasoning problem for you. The reason your models are lacking in reasoning (I know they're pretty good, but they're not comparable to the human brain, which is much more capable at reasoning) is that you are giving your large language model text input. If you want human-level reasoning, you should build your visual and audio neural networks to work with just numbers, then send that numeric output to the last neural network, which should not work with text either; it should work with the exact numbers it receives and figure out what to do with them to get the desired output, designed by reinforcement learning: give pleasure for desired output and pain for undesired output.
@reluctantrealist6861 • 1 month ago
Why is this woman everywhere?
@svenhoek • 1 month ago
Because she is an excellent communicator for topics like this
@reluctantrealist6861 • 1 month ago
@@svenhoek "hip science woman"
@J3R3MI6 • 1 month ago
She’s excellent
@MarkWheels00 • 1 month ago
Professor Fry, respectfully, what the hell are you doing? Why are you cheerleading this global arms race? Alignment is unsolved! Your first question, the starting point, has to be safety. We must pause AI development, especially agent development, until international agreements are in place to ensure safety. If you disagree, please explain why.
@CedarGroveOrganicFarm • 1 month ago
(Hello, I am obviously not Professor Fry, but I am going to respond to your question anyway.) So, I understand the concern for AI safety. This technology has the potential to run away and destroy the world, a la some kind of Stargate nanobot situation. But for me, it is also a paradoxical scenario. Bear with me: if we don't *rapidly* alter our resource-use patterns on Earth (talking climate change here), we will destroy the world. The pace of conventional politics, which is arguably a significant component of executive global human decision-making, is not capable of making the change we need fast enough. Left to this method alone, we will destroy the world. AIs, and technology at large, but AIs especially because of their seductive promise of recursive improvement, are the first viable tool to actually address socioenvironmental issues with the speed and effectiveness required to discover and implement the sweeping changes needed to mitigate climate change before we destroy the world. But therein lies the rub: AIs consume significant amounts of energy; AIs might decide to kill off humans, deeming them a threat to the planet (and to themselves); AIs might never reach a solution to this socially universal issue at all. All of these outcomes could also destroy the world. The same way that nuclear fission can generate clean(ish) power while simultaneously having the potential for mass destruction, so too is AI a double-edged sword. On the one hand, we might die during the training run of an AI that could solve climatic issues (which, as an aside, I feel are a shared root of all socioeconomic issues); on the other hand, we will die anyway if we don't try. So we are brought back to Pascal's wager, in modern times. That is why AI safety isn't that important.
@svenhoek • 1 month ago
I would be very careful of AI bigoted commentary. 😅
@CedarGroveOrganicFarm • 1 month ago
@@svenhoek what do you mean?
@John-sd5li • 1 month ago
I'm sure we still haven't gotten safe alignment in the nuclear arms race either. Welcome to this brutal world, buddy.
@MarkWheels00 • 1 month ago
@@John-sd5li Nuclear weapons don't operate themselves. Different issue.
@bingeltube • 1 month ago
Please summarize the video to under 20 minutes! Video too long; did not watch!
@gustinian • 1 month ago
Your attention span needs work.
@ApolloGemini11 • 1 month ago
Attention is all you need.
@aguzman222 • 28 days ago
Really: set the playback to 1.5x or 2x; this is priceless information.
@bingeltube • 28 days ago
@@aguzman222 Thanks, but I usually watch basically all YouTube videos (except e.g. music videos) at 1.5x speed. Plus, I am also familiar with Oriol Vinyals' work.