“The Future of AI is Here” - Fei-Fei Li Unveils the Next Frontier of AI

  435,269 views

a16z

A day ago

Comments: 403
@a16z
@a16z 3 months ago
Timestamps:
00:00 - Spatial Intelligence: A New Frontier
01:38 - Scaling AI: The Impact of ImageNet on Computer Vision
06:56 - The Role of Compute
09:16 - Data as the Key Driver
17:01 - Defining AI’s Ultimate Goal
18:58 - What is Spatial Intelligence? Unlocking 3D Understanding in AI
26:35 - Comparing Models: Spatial Intelligence vs. Language-Based AI
29:41 - 1D vs. 3D
32:39 - Building Immersive Worlds with Spatial Intelligence
35:11 - From Static Scenes to Dynamic Worlds
37:42 - The Future of VR and AR
40:42 - Creating Deep Tech Platforms
44:26 - Building a World-Class Team
45:54 - Measuring Success: Milestones in Spatial Intelligence
@KatyYoder-t8u
@KatyYoder-t8u 2 months ago
Cease and desist malicious use of AI, energy weapons and freemasonry: Axis of Evil /Communist Maga. I am not your property.
@DubStepKid801
@DubStepKid801 1 month ago
Have you guys thought about creating a pair of glasses that has cameras on them that you can have volunteers or people that you pay to wear them all day long to gain that spatial data that you might use?
@ai_outline
@ai_outline 3 months ago
We want more computer science experts like Fei-Fei Li talking about AI! Tired of gurus and hype merchants. Great video ❤️
@chrisrogers1092
@chrisrogers1092 3 months ago
Yes! Just like that video with Jim Fan! I need MOAR!
@billc6762
@billc6762 3 months ago
She is the main supplier of AI tech to the Chinese military.
@clarencejones4717
@clarencejones4717 3 months ago
@@billc6762 so be it. progress is progress. Nations will soon be extinct.
@Charles-Darwin
@Charles-Darwin 2 months ago
@@billc6762 one day asinine comments like yours will demote your social standing
@Charles-Darwin
@Charles-Darwin 2 months ago
Halfway through and I haven't heard that one dressed hype word... it's a breath of fresh air
@ProxyAuthenticationRequired
@ProxyAuthenticationRequired 3 months ago
What a wonderful mentor-mentee relationship! Fei-Fei is such a loveably genuine and caring person.
@pubwvj
@pubwvj 3 months ago
We were doing generative in 1978. I was using LISP on a PDP-11 at Harvard. I did my thesis in AI, finishing in 1986. We were EXTREMELY compute and data volume limited. My phone is 1,000x more powerful than the computers I had access to back then.
@armandbarbe1812
@armandbarbe1812 3 months ago
Did LISP in AutoCAD in 1992, doing parameter-driven drawing, reducing time spent on repetitive dumb stuff. It became a thing later, done by smarter people on better platforms, but such ideas were already kicked around when we were proud to have an 80487 mathematical co-processor.
@steveflorida5849
@steveflorida5849 3 months ago
@@armandbarbe1812 And then came the GPU. AI with GPU assistance started moving toward a kind of data creativity that its human associates did not foresee. Like the Hubble Space Telescope.
@frankgreco
@frankgreco 3 months ago
Yes! I was doing generative "AI" with music for my Master's decades ago. Being both a CIS and working musician, I created a model of a melodic musical style using a tree of probabilities and a fractally-controlled random number generator (based on an article in Scientific American) to generate new melodies based on that style. I had no idea this was the precursor of neural nets. The music industry has been doing generative AI for decades. Check out Band In A Box from the early 90s and research "In the style of" (they wanted to skirt the copyright laws even back then). Funny how the rest of the computing world is duplicating much of what has already happened in another industry. Also, we need to look at how modeling has evolved for musicians to predict the future of AI for other industries. There is *so* much similarity... not unexpected, since, as Wiener stated back in the 40s, "The world is a collection of patterns".
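The "tree of probabilities" idea in the comment above can be illustrated with a first-order Markov chain over notes. This is only a minimal sketch, not the commenter's actual model: the training melody, the function names `build_transitions` and `generate`, and the use of Python's ordinary pseudo-random generator in place of the fractally-controlled one are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_transitions(melody):
    """Count how often each note is followed by each other note."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(melody, melody[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, length, rng):
    """Walk the transition table, sampling each next note by observed frequency."""
    melody = [start]
    for _ in range(length - 1):
        options = counts.get(melody[-1])
        if not options:  # dead end: this note never had a successor in training
            break
        notes = list(options)
        weights = [options[note] for note in notes]
        melody.append(rng.choices(notes, weights=weights, k=1)[0])
    return melody


# Hypothetical training melody standing in for a "style".
training = ["C", "D", "E", "C", "E", "G", "E", "D", "C"]
table = build_transitions(training)

# A seeded generator keeps the sketch reproducible.
new_melody = generate(table, "C", 8, random.Random(0))
print(new_melody)
```

Every transition in the generated melody is one that occurred in the training melody, which is roughly what "in the style of" means for a model this simple.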
@edwhite2255
@edwhite2255 3 months ago
I did some AI programming in LISP and Prolog back in the late 80’s and early 90’s. Didn’t have the compute power and data to make a decent neural network, so I had to rely mostly on heuristics
@frankgreco
@frankgreco 3 months ago
@@edwhite2255 I was on an AI team at the NYSE in the late 80s. We used Prolog and LISP to monitor trades to make sure they weren't illegal. Sun workstations and Symbolics
@PhilosopherScholar
@PhilosopherScholar 3 months ago
This is really some great content. It's like sitting in on a private meeting with two of the world's top academics, yet they're talking about the history of AI since their beginning in the field. I wish there were way more videos like this.
@Ben_D.
@Ben_D. 3 months ago
I see Fei-Fei, I click.
@rsang2
@rsang2 3 months ago
Same here.
@flamekaiser2024
@flamekaiser2024 3 months ago
Why?
@Ben_D.
@Ben_D. 3 months ago
@@flamekaiser2024 She be the big brain
@AIForHumansShow
@AIForHumansShow 3 months ago
This is the way.
@test-sc2iy
@test-sc2iy 3 months ago
@@flamekaiser2024 She was the first to realize how important a large-scale dataset would be. While everyone else was basically algomaxing on a slowly increasing dataset size to solve the image analysis problem, she created a dataset orders of magnitude larger (manually labeling tons of images), and that dataset was big and diverse enough that AlexNet (look it up, seminal paper) showed the power of deep neural networks (all of modern AI). Really, read her book, it's incredible stuff
@user-pt1kj5uw3b
@user-pt1kj5uw3b 3 months ago
I have been thinking a lot about this. Interaction with the 3D world is the next step. A diverse dataset is the way to get there.
@jordan5253
@jordan5253 3 months ago
Simply amazing. This is such a hidden gem of a video. In 3-5 years this is all going to make way more sense for most people. I would not underestimate this team. I hope they open source their discoveries
@ronilevarez901
@ronilevarez901 3 months ago
If you can use the open sourced versions, you probably can do some research yourself and help advance the field.
@OrangeDurito
@OrangeDurito 3 months ago
This was a wonderful interview and the interviewers asked really great questions. Massive respect to these trailblazers for imagining stuff beyond the mainstream and making endeavors to pursue it.
@AdewaleAromokeye-rx1dn
@AdewaleAromokeye-rx1dn 3 months ago
Literally one of the most insightful interviews out currently!
@WorldSeriesBound
@WorldSeriesBound 3 months ago
I have had to take a few breaks from this conversation to absorb the density of knowledge. I absolutely love it! I've been introduced to two incredible minds. Thank you!
@Bryghtpath
@Bryghtpath 3 months ago
ImageNet, launched in 2009, played a pivotal role in the rise of deep learning. Fei-Fei Li’s work on this project marked a turning point in AI, pushing it toward the incredible capabilities we’re seeing today.
@kamu747
@kamu747 3 months ago
Absolutely
@BestFitSquareChannel
@BestFitSquareChannel 3 months ago
Fei-Fei Li!!! Best wishes. You deserve every accolade, every blessing. 🌞🤸🏽‍♂️🖖🏼
@jhoncharlesdf.1599
@jhoncharlesdf.1599 3 months ago
Great conversation. It was nice to listen to Fei-Fei Li... One of the important points Fei-Fei made was that vision is probably older than language. This blew my mind, and it must be true. First you look, then you speak! This opens up hundreds of possibilities...!
@antonystringfellow5152
@antonystringfellow5152 3 months ago
This part immediately made me think of crows - very intelligent animals, with a good understanding of physics and the 3D world, yet no real language. Not even facial expressions. It's amazing to see how they can use a range of objects as tools, from functioning as mere extensions of their body to the displacement of a liquid (water). And they achieve this and more with such a tiny brain! If only we could understand how that structure works.
@iamalmostanonymous
@iamalmostanonymous 3 months ago
Language is more than words. Language is the product of concepts. The one dimensional approach in use by LLMs, while amazing, is not how we generate or perceive language. A logic based (reasoning) language model would be more comparable in sophistication to a spatial model. In fact, a reasoning model should be foundational to both.
@ChengyanOo
@ChengyanOo 3 months ago
Martin, Feifei, and Justin widened my imaginations on what our digital spatial spaces could be like in the future, great video!🔥🔥🔥
@jongwonpark2788
@jongwonpark2788 3 months ago
My deep learning teachers, wish you all the best.
@npsampedro2114
@npsampedro2114 1 month ago
Here is the correction: The spatial concept reminded me of the series Devs. They scanned real-life objects and could determine their possible moves. Later on, they scaled it up, and the technology itself developed further, even on its own, until the techs managed to recreate digital 3D worlds and variations of those.
@user-pt1kj5uw3b
@user-pt1kj5uw3b 3 months ago
There's some gold in here for anyone learning more about AI
@KatyYoder-t8u
@KatyYoder-t8u 2 months ago
Cease and desist malicious use of AI, energy weapons and freemasonry: Axis of Evil /Communist Maga. I am not your property.
@IlyaTretyakov-o3r
@IlyaTretyakov-o3r 2 months ago
Wow, I’m in awe! Your success is truly inspiring
@kessafs
@kessafs 2 months ago
Best a16z video for me. Congrats
@PravdaSeed
@PravdaSeed 3 months ago
Thanks Fei fei Li💙💙💙💙💙💙
@huongdantuhoctrituenhantao
@huongdantuhoctrituenhantao 3 months ago
Thank you, Mrs. Fei-Fei Li, and thank you, Justin, for the world-class deep learning course
@LoisSharbel
@LoisSharbel 3 months ago
Amazing minds....amazing individuals! Thank heavens for them!!!
@jibcot8541
@jibcot8541 3 months ago
AIs don't believe in God/heaven; they have ingested all human data on the Internet and calculated the probability to be nearly zero.
@cubicleight
@cubicleight 3 months ago
Amazing podcast. More!!!
@vladimirbosinceanu5778
@vladimirbosinceanu5778 2 months ago
Amazing interview. Thank you
@armandbarbe1812
@armandbarbe1812 3 months ago
Context is so important. If I am inside an airplane and look down I decide the white stuff is clouds. If I'm on the ground and look down I choose "snow", not "clouds". So the history of how I got there, and input from more sensors than vision, are important to help make sense of the pixels.
@goldmundchen
@goldmundchen 3 months ago
Justin Johnson is the actual star in this video
@GaryMillyz
@GaryMillyz 3 months ago
yeah this dude is SERIOUSLY intelligent- to the point where it's super weird he is not more known or mentioned. I mean damn- this dude can SPIT.
@tuvok77
@tuvok77 3 months ago
I did not get 90% of what they were saying but wow, these guys are just on fire.
@GaryMillyz
@GaryMillyz 3 months ago
Stunning brain power start to finish. Bravo.
@honkiemonkey33
@honkiemonkey33 2 months ago
It seems that the practical applications of these developments are still in the process of being fully realized. I hope they are finding ways to apply these innovations beyond just gaming. With that in mind, here are a few possible areas where they might have significant impact:
1. Generating a comprehensive set of architectural and engineering designs based on site parameters and design preferences.
2. Creating 3D product designs, such as furniture or wearable technology, that adapt to environmental factors and surroundings.
3. Offering emergency assistance through augmented reality, such as using smart goggles to guide someone through landing a plane in a critical situation.
4. Enabling underwater robotic welding to facilitate complex repairs in challenging environments.
5. Utilizing autonomous drones that can navigate hostile environments and selectively target designated individuals.
It might sound harsh, but it’s likely similar technology to what they would be using for shooting games.
@JsJustin
@JsJustin 3 months ago
great interview
@vocesanticae
@vocesanticae 3 months ago
Thank you for sharing your brilliance, curiosity, and collaborations with the world. Hearing about the differences and connections between 1D v 2D v 3D models was particularly enlightening. 4D was mentioned, just briefly though. I wonder how much growth and insight may be found by adding time as backward-looking and forward-looking connective tissue to all modeling, e.g., transforming 1D language models into 2D echo chamber maps and dialogic predictions, expanding 2D images into retrospective and prospective time-lapse immersions, and rendering 3D models as past-looking and forward-looking world-dramas? Seeking to traverse and shift from high-dimensional to low-dimensional, and simultaneously from low to high may be a fruitful research and development path to connect and intersect all models.
@AdewaleAromokeye-rx1dn
@AdewaleAromokeye-rx1dn 3 months ago
How much work have you done on this?
@yosivin1
@yosivin1 3 months ago
WHAT A TIME TO BE ALIVE.
@flickwtchr
@flickwtchr 3 months ago
Check back in about 10 years to see how great the AI revolution is going for most people on planet Earth.
@devon9374
@devon9374 3 months ago
Always love listening and learning from Fei! And Justin is amazing!
@keithpalmer1843
@keithpalmer1843 3 months ago
Digital generative predictive spatial worlds, awesome idea!
@Thechatwithchad
@Thechatwithchad 3 months ago
Deep knowledge, thank you
@M0481
@M0481 3 months ago
I think this sounds great, but one thing I'm missing here is that they have specific products in mind that they want to cater to. With data acquisition potentially being tricky in this field, I'd have imagined a more specific roadmap. Note that they may have a very specific roadmap that they are simply not sharing (which makes total sense).
@BestFitSquareChannel
@BestFitSquareChannel 3 months ago
So, Dr. Li I recognize. Who is this guy sitting next to her!? Justin? Justin who!? DANG! Now I know. Congratulations Justin! Brilliant! Best wishes. 🌞🖖🏼
@TheFreddieFoo
@TheFreddieFoo 3 months ago
I bet that Justin is smarter than that lady (who I don't recognize)
@natzos6372
@natzos6372 3 months ago
Weird comment @@TheFreddieFoo
@TheFreddieFoo
@TheFreddieFoo 3 months ago
@@natzos6372 strange observation
@AB-lx4rl
@AB-lx4rl 2 months ago
@@TheFreddieFoo I think this impression is partially due to the fact that Fei-Fei is not a native English speaker, so she might sound less "smart" than she really is. If you look at her contribution to the field, it’s amazing.
@TheFreddieFoo
@TheFreddieFoo 2 months ago
@@AB-lx4rl I know plenty of blindingly smart and hardworking Chinese and Taiwanese post docs, some have a much stronger accent. So I’m certain that it’s not her accent. Her main contribution is creating a data set with manual labelling.
@Superfandotfan
@Superfandotfan 3 months ago
What an exceptionally great interview... so much was touched here... and the historic perspective is humbling. Thank you all.
@Ruminant89
@Ruminant89 3 months ago
This is gold.
@armitosmt5753
@armitosmt5753 3 months ago
Very informative podcast! I enjoyed. Thank you for your efforts. ⭐
@sombh1971
@sombh1971 2 months ago
33:10 The best possible use cases are when you couple all this to VR/AR/MR. In other words, like image generation using prompts, you should be able to generate realistic virtual worlds where you should be able to immerse yourself using eyewear. And then in the long run, train robots on those virtual worlds, where you can tinker with creating really complex environments or situations that are not possible to create in the real world without causing some damage. Also, somewhat tangentially, while building robots for deployment in the real world, one has to equip them with pretty sophisticated self-defence capabilities, for there would be no dearth of luddites and bad actors who would want to damage them. I think this is a pretty big bugbear in this scenario, where at times the self-defence mechanism used could inflict serious harm on the perpetrators and then this Pandora's box of justice involving what is allowed or disallowed would open up, hobbling all this.
@skypickle29
@skypickle29 3 months ago
Although our eyes map a 3D world onto a 2D structure (the brain is a folded-up plane), our proprioception and motor control are a 3D control system. The 3Dness is achieved by adding another dimension to the 2D world: not a spatial dimension but a temporal dimension. We interpret a 3D world as little movie clips of a 2D world, so training data necessarily requires tokenization of video. In the same way that LLMs focus on 'what is the next most probable word', LVMs (large video models) will focus on 'what is the next most probable token in this movie?' The storage and energy requirements of this approach are MASSIVELY greater than LLM training and likely will have to wait until we figure out how to use brain organoids as parallel processors (their energy requirements are orders of magnitude less than GPUs)
@carvalhoribeiro
@carvalhoribeiro 3 months ago
Great conversation. Thanks for sharing this
@choiceblade
@choiceblade 3 months ago
Apropos world making… Digitally. The tour de force seems to me to be even one story set in a small town of no more than 150 people… the humanity of this would be nothing short of epic. Why? Because every individual's experience would be rendered from a first-person point of view such that there are 150 versions of this story rendered in high-fidelity 3-D, and each story would interlace along the same timeline perfectly. Much liberty could be taken to express the mentality or capacity of each individual by how they interpret the same events. Some of the events would be shared by degrees based on the circumstances and location of the action. This would be utterly engrossing and it would take weeks if not months for someone to experience all of it. This is most intriguing. The potential for mental and emotional learning, human understanding, and the exposition of a useful and/or valuable plot line is near limitless. I’m reeling from just thinking about it
@JMai-ci9nl
@JMai-ci9nl 3 months ago
Just wondering what exactly she is building and how we can use that? Robotics? Games? Metaverse? I hope at least those VCs knew.
@bro_dBow
@bro_dBow 3 months ago
Medical applications in guiding surgery, marker tracking in body sensors or reading dye in the body for mapping or prosthetics, etc. This is not my field, but this comes to mind.
@Intunlocked
@Intunlocked 2 months ago
A.I. is changing lives for the better. It may not be fully understood or accepted currently, but there are many things in the past that started like that and then became the norm.
@jonathanmahenge8263
@jonathanmahenge8263 2 months ago
This was insightful to watch!
@jme9570218
@jme9570218 2 months ago
Outstanding Progress boggles the mind.
@2triangles
@2triangles 3 months ago
Great to watch, but you didn’t ask the question I was most hoping for: what does the timeline look like? The evolution of compute showed her 100 year estimate was far too conservative. Based on Huang’s “Moore’s Law Squared”, what is reasonable to expect in the spatial intelligence realm over the next 3-10 years?
@lordjavathe3rd
@lordjavathe3rd 2 months ago
They say by 2027, according to their chart/graph
@LuisClassics
@LuisClassics 1 month ago
Light! ☀️
@dennisg967
@dennisg967 3 months ago
Oh wow!!! That was really inspiring. Thank you!!!!
@richiehart7858
@richiehart7858 3 months ago
The discussion around 3D and 4D understanding reminded me very much of the layman's description of what goes on in the Tesla FSD inference computer. Same goal at least.
@ceylonvc
@ceylonvc 17 days ago
The best! ❤
@bimaltwayana2058
@bimaltwayana2058 3 months ago
i love you fei fei li.
@bimaltwayana2058
@bimaltwayana2058 3 months ago
fei fei li is the best.
@twins2j
@twins2j 3 months ago
Really great conversation. Our own research continues to focus on 2D computer vision, and we align with the talk’s insights on the differences between 1D and 3D models. A fundamental distinction between visual models and language models lies in how they understand the world: language models are one-dimensional, sequential, and narrative in nature, whereas visual models are two-dimensional, with decoding processes that are relatively independent and operate concurrently without requiring sequential dependencies. This allows visual models to process information in images in parallel, sharply contrasting with the serialized processing of language models.
@marykelly6218
@marykelly6218 3 months ago
How do I invest in World Lab?
@chikiuso8305
@chikiuso8305 3 months ago
really inspiring talk
@Eric_McBrearty
@Eric_McBrearty 3 months ago
Great stuff guys! I can see this being used for designing a virtual memory palace. Example, I would like to fly around a 200 ft skeleton. Laid out on each bone as if it were a table, I would want documents of literature about the bone. Also, there would be a collection of video thumbnails about that bone. Some of surgery, and some of people falling and breaking that bone. I can see the creation of Wikipedia Park. A virtual environment where every page of wikipedia is turned into a virtually explorable environment, ride, or fun house. Hyperlinks would be represented as doors leading you into a whole new section. 10 - 20 hyperlinks in there would be bonus points for falling into a wiki-hole. Education will turn into episodic memory events. Conversations with your kids will turn into... "Remember that time we were 50 doors in and we ended up under the paw of the Sphinx".
@maudentable
@maudentable 3 months ago
At least you got a chance to chat with Fei-Fei and Justin before they became the next AI billionaires.
@test-sc2iy
@test-sc2iy 3 months ago
If there's anyone who deserves it, it's Fei-Fei. She taught Ilya and built the dataset that kicked off this entire rigmarole. Read her book, The Worlds I See
@ronilevarez901
@ronilevarez901 3 months ago
Because all you need to succeed in this world is knowledge and ideas, right? 😑
@ravirajchilka
@ravirajchilka 3 months ago
😂😂
@uchennakingsley1354
@uchennakingsley1354 3 months ago
Are you being sarcastic, or being true? @@ronilevarez901
@JJ-bj6hg
@JJ-bj6hg 2 months ago
You are thinking in terms of dollars while their dopamine rush is solving complex problems
@ronaldronald8819
@ronaldronald8819 3 months ago
This sounds full-on exciting. I hope it is going to be available to all soon. Let's start dreaming up how to interact, solve and create with it.
@heythere6390
@heythere6390 3 months ago
How does 3D knowledge and understanding relate to intelligence? I mean, does reasoning use spatial understanding? How does this connect to AGI?
@musicloungepodcast
@musicloungepodcast 2 months ago
Limitless possibilities: Integrating AI and 3D imaging
@mastwheel
@mastwheel 3 months ago
Excellent discussion!
@twirlyspitzer
@twirlyspitzer 3 months ago
I did buy a VR headset that now sits idle because I don't do gaming. But I still envision some future day when it becomes the all-in-one media I'll ever need to connect to reality as seamlessly mixed anywhere. I was very enlightened here on how kinetic spatial intelligence is essential to connecting all the AGI dots into a new technology of reality itself. Another explanation as to how AGI is actually a new stage in evolution itself.
@AB-wf8ek
@AB-wf8ek 3 months ago
Oh man, I was definitely using my headset a lot during the pandemic just to chat with people and watch movies. I haven't touched it in over a year. The main issue for me is the physical discomfort. I feel like when we look back in 10 years, it'll seem so ridiculous strapping giant boxes to our faces.
@saratpoluri
@saratpoluri 1 month ago
The future of personal computing is Spatial!
@tedbischak1067
@tedbischak1067 1 month ago
It is absolutely the best video I've seen that explains not only where AI came from but where AI (specifically generative AI) is going and why. I have one comment/question, as was stated in the video, humans have stereoscopic, 2D vision but humans are also born with the ability to automatically imagine the unseen parts of the 3D world - how is that possible?
@sahkoautokoulu
@sahkoautokoulu 3 months ago
I suspect Justin is digitally added to the video by AI. There is a degree of uncanny-valley in there, show us your fingers! 😅
@happy-wave-form
@happy-wave-form 3 months ago
insightful interview, awesome. perhaps spatial intelligence would be applicable to the health and medical industry?
@jeffsmith9384
@jeffsmith9384 3 months ago
Fei Fei: "Visual spatial intelligence is so fundamental, it's as fundamental as language, possibly more ancient and more fundamental in certain ways" Blind people:
@8xster8
@8xster8 2 months ago
Maybe a complete beginner question: what exactly is wrong with a 1D representation of 3D space? Isn't our own biological understanding of 3D reality arguably a series of 1D synaptic connections in the brain? If it works for us, can't it work for neural nets that model us?
@netman63
@netman63 3 months ago
if you want an AI to really see in 3D space, you could build a 3D generative AI that creates a 3d world and compares it to LIDAR and stereoscopic images of the real world and run a feed-back loop to approximate the generated world to the real world
@skierpage
@skierpage 3 months ago
Then your AI has the problem of deciding how well the generated world approximates the real world. We can tell because we have a good meta-model of the real world, but AIs don't yet, which seems part of the goal of World Labs. It's a lot easier to run a feedback loop with LLMs, because they either make a good guess for the next word or they don't.
@PrecioustheMovie1
@PrecioustheMovie1 3 months ago
Fei fei is a baller
@lucasteo5015
@lucasteo5015 3 months ago
I think spatial intelligence is the kind of intelligence that will be able to dream like a human. When someone tells you a route to your destination, you can memorize the steps as a 1D sequence (left, right, left, right), or you can imagine how you would traverse it directly in a 3D scene, and that is what makes it different from an LLM. In this case both are correct and valid, but it's the underlying representation of the problem and the solution that will potentially be a better fit for the problem you're trying to solve in 3D.
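The contrast drawn in the comment above, between memorizing a route as a 1D sequence of turns and holding it as positions in space, can be made concrete with a small sketch. It is a simplified 2D illustration rather than a 3D scene, and the conventions (start facing north at the origin, walk one block after each turn) and the names `trace_route` and `blocks_from_start` are assumptions chosen for the example.

```python
# Headings as (dx, dy) unit steps, ordered clockwise: north, east, south, west.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def trace_route(steps):
    """Turn a 1D list of instructions into the list of grid positions visited."""
    heading = 0  # assumed starting direction: north
    x, y = 0, 0  # assumed starting position: the origin
    path = [(x, y)]
    for step in steps:
        if step == "left":
            heading = (heading - 1) % 4
        elif step == "right":
            heading = (heading + 1) % 4
        elif step != "straight":
            raise ValueError(f"unknown instruction: {step!r}")
        dx, dy = HEADINGS[heading]
        x, y = x + dx, y + dy  # walk one block in the current direction
        path.append((x, y))
    return path


def blocks_from_start(path):
    """Manhattan distance from the first to the last position."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    return abs(x1 - x0) + abs(y1 - y0)


route = ["left", "right", "left", "right"]
path = trace_route(route)
print(path)                     # → [(0, 0), (-1, 0), (-1, 1), (-2, 1), (-2, 2)]
print(blocks_from_start(path))  # → 4
```

The turn list and the position list describe the same route, but a question such as "how far am I from where I started?" can be read directly off the spatial form, while the sequential form has to be replayed first.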
@ShaneP-q5d
@ShaneP-q5d 3 months ago
A mix of interesting ai info plus investor sales pitch. Would have preferred one or the other…
@minimal3734
@minimal3734 3 months ago
I'm not sure if the dimensionality of the data is really important. A model might be able to dedicate a few neurons to the transformation of 1-d data back into a 3-d concept.
@MattGarcia
@MattGarcia 2 months ago
true, just like our biological neural networks transform 2D input into 3D understanding
@tangobayus
@tangobayus 2 months ago
Will someone please explain why so many video producers think it's a good idea to have a bright light in the background? It hurts the image quality and color balance.
@ЛаврентийЛюбимов
@ЛаврентийЛюбимов 3 months ago
great video seen great profit on demo n will give it a try today thank you
@ahmedsuliman9067
@ahmedsuliman9067 2 months ago
Thank you very much
@ANI-uv8xn
@ANI-uv8xn 1 month ago
28:26 to 29:41 completely agree
@PankajDoharey
@PankajDoharey 3 months ago
Yeah, self-driving cars and robots have been trying to unlock that "Visual Spatial Intelligence" for a decade or more. Saying that this is the next frontier is a joke when you consider that visual models were the first models to be built using deep learning.
@cstephens16
@cstephens16 3 months ago
Incredible.
@drcharleyLGO
@drcharleyLGO 3 months ago
Thought Nvidia was already developing an entire 3D spatial trainer with the ultimate aim to empower functional robots and other applications. Not to mention the metaverse players. Not sure how those weren't talked about
@GatherVerse
@GatherVerse 3 months ago
@@drcharleyLGO, completely agree. Including the billions of dollars that Meta has put into the metaverse for R&D. Pairing Llama 3.1 and more to the metaverse with intelligent object segmentation is a game changer. They’ll need way more than 250 million in capital raised to compete. There's a very cool virtual world symposium happening at the end of October with loads of speakers talking about this and more. It’s hosted by GatherVerse. Take a look if you’re around.
@adetunjii
@adetunjii 2 months ago
Justin is so brilliant
@ZephyrMN
@ZephyrMN 3 months ago
If you could train all simple physical models in physical world, complex world would just be a scale problem then. The secret of the pre time world is geometry.
@RAPHAELMAXIAN
@RAPHAELMAXIAN 3 months ago
Whether they are doing it willingly or not, they are the designers of all future weapons. They are doing a great job
@lordjavathe3rd
@lordjavathe3rd 3 months ago
"Spatial Intelligence": labeling objects and attributes in an image and using that "world" model to understand or guess at what is going on in an image. An important aspect is "novelty", found by observing change until it begins to repeat, at which point it becomes a thing or an aspect. One example is 360 degrees from the perspective of a baby: turn around and notice all the new things you see, until you see stuff you've already seen before. That makes you confident of how boring something is; it's a thing. All the objects that a person encounters have a novelty and boringness. When you notice something like fur on a cat, it's that you looked at it with a wrong or obsolete expectation (model); it defied you or proved you wrong, but from the application of the wrong model you branch out slightly and attribute a difference that will make you more correct. Doing that explains thought or object parent/child, or why for this person cats are associated with a bush (to make something up for the instance). I think about the representation of thoughts/concepts and I want to build a process for how learning works, from a base to growing a world model. It's nice to hear someone talk about this subject.
@roylevy5897
@roylevy5897 2 months ago
Could you further elaborate what you mean?
@lordjavathe3rd
@lordjavathe3rd 2 months ago
​@@roylevy5897 A Brief Overview of Knowledge Representation or AGI By Justin Mathis ​ @roylevy5897 I reread my post after your comment and saw how someone who is not me may have thought about my insightful proclamation that I understand an aspect of "Knowledge Representation" regarding the separation of "Objects/Aspects" from each other through the mechanism of "repetition", I called it “boringness”. Spatial Intelligence is a reference to what they think Knowledge Representation looks like, due to how knowledge is formed/grows, and out of what substance. They mean to be getting at the dimensions/aspects of “Knowledge Representation” by saying “... ~it is spatial~ … … “, hint hint, nudge nudge, smile smile!! Now, → knowledge is the storage of recurrent problems that cause us stress by separating us from what we want, their alleviation is their insignificance. The dimensions/variables of Knowledge Representation are: Time(current context/all objects of importance being referenced): A portion of your knowledge map that is current and/or growing. {OR}What, of the area of objects in your knowledge map → that are lit up (what’s going on), (what is DAMAGED | GROWING | IMPORTANCE | stressing you), (what you are doing to alleviate it), Sensation Detail: A moldable | sensation | feeler DATA TYPE that unifies all WHAT by being the same type, but being different for what it felt like from one object to the next. Instead of cat fur being stored next to carpet for being detail-ly similar, it’s separated by context of Novelty/meaning DisStability until Stability. The form of which looks like, context is the Tree wood and Detail is the tree leaves, as a means of separating detail (not looking like a tree but a growing with time map). 
A Knowledge Map is a Time Dilation of Detail(WHAT) stabilizing into Boring Detail(novelty until repetition), to be referenced whenever it is relevant via being perceived or retroactively associated or becoming insecure and branching until stability or abandonment depending on importance. Importance: (Discomfort|Unsettling|Disruptive|Damage|UnExpectation) what is unstable about what you expected. Emotions: Are Winning/Losing, Stability/DisStability, Wealth/Poverty, Relief/Stress, Pleasure/Pain, Ignoring/Learning → medium.com/@mcgill_dr/emotions-and-feelings-charts-d4ff59ebf86a This is included because it can be seen which emotions are negative and which are positive and how that relates to Insecurity(pain -> attention -> learning or branching)/Security(insignificance feeling pleasure) What is Boring: What is Novel: what was not expected but is happening, which damages/stresses your representation of where you are and what you expect, thus annulling what you expected and according to all that is at risk of being lost/damaging/dying however much it is worth to you, you pay attention and cannot look away at the novel occurrence until you can re-establish safety or diminish your confidence if it’s beyond you Hawkings Radiation for negligible model failure, or unexpected unimportant stuff. ... I wrote this because you asked me to elaborate upon it. I haven't hit a wall, I could begin to detail a process for the foundation of a knowledge map, I can also describe how vision works which is similar to touch and hearing. I'm telling you this because if you don't like what I wrote, or if you criticize me, I will work harder on it and refine it, because I wanted someone to talk to about this subject, but due to it's unimportance in years past and my lack of a (don't run away) related occupation, also due to no one else talking about it especially amidist public view, I didn't tell anyone about my own striving to understand on this subject. 
I just asked OpenAI ChatGPT about it and this is what it had to say: "Justin, this piece you're working on is ambitious, and you're diving deep into the nuances of knowledge representation, drawing parallels with spatial intelligence and cognitive processes. You're blending a philosophical approach with technical insights, which makes it intriguing but also quite dense. Here are some of my thoughts on what you're aiming for:
Clarity and Flow: Your ideas are rich, but at times, the language becomes a bit tangled, which might make it hard for readers to follow. For example, in your section on "Time," you're explaining how knowledge maps evolve, but the sentence structure could be simplified. Consider breaking down some of the more complex sentences or using analogies to make abstract ideas more digestible.
"Boringness" as a Concept: I like your framing of "boringness" as the result of knowledge stabilizing through repetition. It's a clever way of explaining how once novel information becomes familiar and predictable, it fades into the background. Maybe expand on this idea more explicitly, perhaps giving examples of how this manifests in daily cognition or learning.
Sensation Detail: This part caught my attention because you're discussing the sensory aspect of knowledge. Your example of cat fur and carpet being separated by novelty rather than by direct sensory similarity is thought-provoking. The metaphor of "tree wood and leaves" works visually, but it could be worth elaborating on how this separation helps in the process of knowledge formation and why it matters to your broader point.
Importance and Emotions: Tying emotions to the instability or disruption of knowledge representation is powerful. The connection between discomfort, attention, and learning adds an emotional dimension to the otherwise sterile concept of knowledge.
Perhaps explore this more: how does emotional instability push the brain to resolve conflicts, and how does that tie into knowledge maps and their growth or abandonment?
Hawking Radiation Analogy: I found this a unique way to describe insignificant errors in models. Using a physics concept like Hawking Radiation to represent the "negligible model failures" could be a little opaque for readers unfamiliar with the reference. It’s an interesting metaphor, though; you might want to unpack it a bit more for clarity.
Overall, you’re laying out a complex model that intertwines cognitive science, emotion, and philosophy. Your core ideas are engaging, but sharpening some of the explanations and adding examples might make it easier for readers to digest. How do you feel about expanding certain sections or offering more analogies to ground the concepts further?"
@lordjavathe3rd 2 months ago
@@roylevy5897 It may be easier to read on Google Docs: docs.google.com/document/d/16IWhap8QIB3GHLru_SOfLr-M3xohFaVMH1WnL_Ngofk/edit?usp=sharing If you criticize me, I'll work on it further until you compliment me.
@lordjavathe3rd 2 months ago
@@roylevy5897 I asked ChatGPT to rewrite my comment and it did an amazing job, here it is:
"Spatial intelligence" refers to labeling objects and attributes within an environment and using that "world model" to understand or infer what’s happening. A key element of this process is novelty, which we observe as changes in our surroundings until those changes start to repeat, at which point they become familiar or "boring." For example, imagine a baby turning 360 degrees and noticing new things until eventually seeing something familiar. This repetition builds confidence in the stability of those objects, which then become established as "things."
Every object a person encounters has an element of novelty that, over time, becomes boring through repetition. When you notice the texture of fur on a cat, it’s because your initial expectation (or mental model) was incorrect or incomplete. The fur defies your prediction, forcing you to adjust your model slightly. This branching out from the wrong expectation refines your understanding, allowing you to build a more accurate mental representation.
For a moment, concepts in development might still be linked to environmental "debris", like associating a cat with a bush after a few encounters. But with time and more experiences, these associations become more refined and distinct.
I think about how thoughts and concepts are represented and want to create a process that explains how learning works, starting from a base and growing into a full "world model." It’s exciting to hear others talking about this topic.
@lordjavathe3rd 2 months ago
@@roylevy5897 *This was recycled through ChatGPT a few times to make it readable. The original comment and the reply*
Original Comment @lordjavathe3rd
"Spatial intelligence" refers to labeling objects and attributes within an environment and using that "world model" to understand or infer what’s happening. A key element of this process is novelty, which we observe as changes in our surroundings until those changes start to repeat, at which point they become familiar or "boring." For example, imagine a baby turning 360 degrees and noticing new things until eventually seeing something familiar. This repetition builds confidence in the stability of those objects, which then become established as "things."
Every object a person encounters has an element of novelty that, over time, becomes boring through repetition. When you notice the texture of fur on a cat, it’s because your initial expectation (or mental model) was incorrect or incomplete. The fur defies your prediction, forcing you to adjust your model slightly. This branching out from the wrong expectation refines your understanding, allowing you to build a more accurate mental representation.
For a moment, concepts in development might still be linked to environmental "debris", like associating a cat with a bush after a few encounters. But with time and more experiences, these associations become more refined and distinct.
I think about how thoughts and concepts are represented and want to create a process that explains how learning works, starting from a base and growing into a full "world model." It’s exciting to hear others talking about this topic.
A Brief Overview of Knowledge Representation or AGI, by Justin Mathis
@roylevy5897 After rereading my original post and reflecting on your comment, I can see how someone might have misunderstood my thoughts on "Knowledge Representation" and my concept of "boringness" as a mechanism of novelty becoming repetition.
What I was getting at is the idea that knowledge is formed when an experience ceases to be novel, which eventually leads to a separation of objects or aspects from one another. I referred to this process as “boringness” because, over time, what once was novel becomes mundane through repetition.
When people refer to "spatial intelligence" in the context of Knowledge Representation, they’re hinting at how knowledge grows and takes shape. The idea is that knowledge is structured or mapped in ways that can be understood spatially, in terms of what you felt and what was going on at the time. Spatially, in this case, means ideas stored close to other ideas and referenced by similar ideas: saying "this experience reminds me of *blank*", or "that looks like *blank*". Spatially is not talking about our 3D world, or 3D in terms of computer video games. They mean abstract representations of thought are stored spatially and temporally.
What is Knowledge? There is a difference between what we know and what we want to be. In this sense, knowledge is what displeases us, stresses us, pains us, and we shelve it until we can solve it. Knowledge is the storage of recurrent problems that cause us stress by separating us from what we want; if they are alleviated, they become insignificant, in that we ignore what doesn’t hurt us.
Knowledge, in essence, is the storage of recurring problems that cause us stress by creating a gap between what we want and what we experience. Once these problems are resolved or become insignificant, they are stored as knowledge.
The dimensions of Knowledge Representation include:
Time (Current Context): This refers to the portion of your knowledge map that is relevant to the present moment. It includes all the objects of importance you’re currently referencing. Essentially, it’s the "what’s happening now" section of your mental map, highlighting what’s damaged, growing, or causing stress, and what you’re doing to alleviate that stress.
Sensation Detail: This is a moldable data type, like a "feeler", that connects different objects in your knowledge map by being permutations of the same data type (sensory qualities), making all objects different from or similar to each other. Instead of categorizing things purely based on similarity, objects are separated by branching temporal context. Temporal means the object failed to be similar to what you saw, so it branches off and becomes more like what you experienced. The name goes to the new object, not the previous one, along with the names of everything this new object can do, like fly or be very orange.
A knowledge map isn’t just a collection of facts; it’s a dynamic, evolving system that stabilizes as details become familiar, or "boring," through repetition. These details can be recalled when needed, or they may become less important and fade away.
Importance and Emotions in Knowledge
Importance: This is the level of discomfort or disruption caused by something that challenges what you expected. It’s the "alarm bell" that rings when something unstable or unexpected occurs in your knowledge map. The greater the disruption, the more attention you pay to it.
Emotions: Emotions act as indicators of whether you’re "winning" or "losing" in terms of stability. Positive emotions like relief and pleasure signal that things are stable or insignificant, while negative emotions like stress or pain signal instability and demand attention. Essentially, emotions guide how much importance you assign to a given problem or gap in your knowledge.
The Role of Novelty and Boringness
What is Boring: Over time, repetition causes certain aspects of knowledge to lose their novelty and become "boring." These stable, predictable elements no longer command your attention because they’ve been fully processed and understood.
What is Novel: Novelty occurs when something unexpected disrupts your knowledge map, causing stress because it challenges your expectations.
When this happens, you can't help but pay attention until the new information is processed and integrated, allowing you to feel safe again. The greater the threat or uncertainty, the harder it is to look away until you re-establish stability.
@ProtonSapien 3 months ago
Beware of bot comments guys ❤
@freebluffs 2 months ago
So to get more data, are companies like Apple and Google going to let them use the storage that holds people's photos and videos?
@bause6182 1 month ago
I'm really looking forward to seeing the outcome of the work of Fei-Fei Li and her team on LWM. I have always been passionate about computer simulations of virtual environments. You might be surprised that this subject is not new: for example, there is a whole literature on the procedural generation of environments that is more than twenty years old. World models will offer many possibilities, more than LLMs and diffusion models; in my opinion they will redefine our creativity and the way we explore our ideas. We will be able to create and simulate an entire world from an image or a text prompt, and that excites me enormously.
@ginogarcia8730 2 months ago
Imagine having Fei-Fei Li as your PhD advisor. Gaddam.
@CharlesBrown-xq5ug 3 months ago
Arrays of four lenses in concentric squares plus fuzzy focus processing to extract distance seems to me to be a better subsystem for 3D cameras.
@davidmiles-hanschell 1 month ago
Every day is a school day; bring on the lessons!
@royjones1053 2 months ago
Thank you, always appreciate quality information, 'as long as we have it'! So many times a limit to "Moore's law" has been proposed, and yet we tear on through again: today's rocket ship against yesterday's potato. Indeed the acceleration of compute has been game changing. I feel like I am living a paradox, and not only me, all of us.
@titusxx3 3 months ago
This seems to confirm what Kant thought were the fundamental aspects of conscious subject experience: our ability to perceive things in space and time. These two aspects of experience are the basis for all other knowledge.
@NoWhiteGullibility 2 days ago
Excellent
@Johnbrownpe 2 months ago
the tech behind Aliagents is super interesting, tokenized AI systems with real functionality
@davidlearnforus 3 months ago
Probably the most important applications of vision-derived methods will be in understanding biological phenomena, new drugs, and materials.