PROOF Tesla Is Creating a WORLD MODEL!! What are End-To-End Training, and Foundation World Models?

Views: 63,136

11 months ago

Evidence is mounting quickly that Tesla has made a MAJOR move in their Full Self Driving (FSD Beta) technology: they're creating a foundation world model, likely with end-to-end training to solve car and Teslabot autonomy! But there is also a LOT of confusion about what all of these terms mean. Come with me to discover exactly what world models, foundation models, and end-to-end training mean, why they are NOT 3D reconstructions, and why they're so important!
I recorded this prior to xAI's Twitter Spaces on July 14th, which provides EVEN MORE evidence that what I'm saying is correct. Stay tuned for a "part two" to this video!

Comments: 227

@KentLester 11 months ago

Just a word of encouragement to say that I love the geekiness of your videos on AI! You do a great job of taking an esoteric subject and breaking it down into understandable analogies for us "laygeeks." I'm particularly interested in MICS, (massively complex interconnected systems) and am attempting to write a layman's guide on the subject, and find your geek sessions to be highly inspiring and informative. Bravo.

@RedRyan 11 months ago

Totally agree

@MooseOnEarth 11 months ago

Without formulae and numbers this is nowhere near geeky. I still like it, but do not mix up his explanations with actual geek stuff.

@tech-utuber2219 11 months ago

29:47 This is exactly what versions FSD 10 & 11+ have been missing, SELF-LEARNING. It presently encounters a driving scenario, as though it has never seen it before, even if it has been driven through that same scenario for weeks. (The Tesla team has been hand-tweaking the algorithms to compensate.) What will distinguish versions 12 and beyond is this fundamental ability to learn from previous errors, and then predict and avoid those errors in the future. It took some time to get here, but this is huge progress.

@johnross6314 11 months ago

Elon Musk way back when talked up Edge Inference. As he should have. Including discussions on chip design that mimics the brain. Perfect for his “visual” approach. The only chip design that is visual, neural at its basic core is FPGA architecture. He has shifted away from this idea (mistake), as part of his mass FSD firing several years ago. Field-programmable gate arrays actually process data into visuals like our eye does. And move those bits in a natural neural method at the silicon level. Very fast. Very brain like. This should be exploited. Makes it so easy to do edge inference (self learning, with communication back to AI father trainer, DoJo). Also, any self learning HAS to be at human eyeball 3D and quality, 360 degrees. That could make it faster than us humans who have to move our head for the 360 view (and/or use mirrors to augment).

@russm8193 10 months ago

Self learning happens only during training in batch mode on the server. Only inference (run-time) can occur in the car - takes huge compute and time to train. Models are iteratively trained and deployed to the cars with updates.
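The split described in this comment, batch training on the server and inference only in the car, can be sketched in a few lines. This is a toy illustration only: the one-parameter model, the function names `train_on_server` and `infer_in_car`, and the data are all hypothetical and say nothing about Tesla's actual stack.

```python
def train_on_server(samples, epochs=200, lr=0.01):
    """Batch gradient descent on a toy model y = w * x (hypothetical)."""
    w = 0.0
    for _ in range(epochs):
        # Average gradient of the squared error over the whole batch.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w  # frozen weight that would be shipped to the fleet


def infer_in_car(w, x):
    """Run-time inference: uses the shipped weight and never updates it."""
    return w * x


# Made-up (input, target) pairs standing in for collected fleet data.
fleet_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weights_v1 = train_on_server(fleet_data)
prediction = infer_in_car(weights_v1, 4.0)
```

The expensive loop lives entirely in `train_on_server`; the car-side function is a single cheap evaluation, which is why improvements arrive as new weights in an update rather than being learned on the road.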

@tech-utuber2219 10 months ago

@@russm8193 That's clear but when FSD is eventually deployed in the future, what does each individual vehicle receive from DOJO and what is it computing? A tokenized subset of the geographical area model in which the vehicle is currently operating, which was gathered by the local vehicle "hive" in the area and compiled by DOJO? This is memory and bandwidth limited, implying that when there is no connectivity, in a new geographic location, the vehicle computation will revert to a simplified general model, or will HW4 or HW5 be able to do this stand-alone? How many adjustable parameters will the vehicle computer be using in order to accomplish the processing? (A modern LLM uses billions of parameters, for example) Encoding of the area model will have to be very efficient, in order to reduce both memory and computation.

@RobertDickert 11 months ago

As a software engineer who is not in AI but took the Andrew Ng online ML class years ago, I found this extremely helpful and interesting. Things have progressed a long way.

@LoisSharbel 11 months ago

Thank you for this explanation that actually seems clear to me, although my understanding of most of the details is fuzzy. It seems this World Model is going to learn more closely to how we humans learn to navigate the world. Your ability to clarify dense information is amazing. Thank you!

@2ndfloorsongs 11 months ago

Yes, very clear explanations. So clear that I actually think I understand it for at least 5 minutes after I finish it. But that's 5 minutes more than before. 😁

@alsetalokinalset 11 months ago

IMO the best and most insightful video you have created in a long time 👍🙂

@Flat4fun 11 months ago

Hope all is well as can be with your family. You being able to put things out while supporting your loved ones (in your special way) is inspiring.

@johnholman5341 11 months ago

Great video! One of your best so far IMO. I keep coming back to your channel for these great explanations.

@JohnCorrUK 11 months ago

Your most brilliant video yet, superb explanation . Bravo 👏👏👏

@danielmadison4451 11 months ago

Thank you, thank you. Excellent video. Foundation Model! That's the part I was trying to put a word to. I'm a 30-year software engineer who came through the maze of 8080 (Z80) assembly, Pascal, C, C++ object design. My major was Software Architecture. I knew I was missing the doorway into AI because it is NOT based on deterministic algorithms. I have made a few comments and didn't mean to be snarky, but I knew there was something deeper that I wasn't understanding. Love your videos along with Warren's.

@davidhawkins7138 11 months ago

This was a great summary. Thank you. I'm pretty sure an ontology will still be required to generate the FSD onboard display and as an audit trail for accident and situational analysis. If my guess is correct, hydra heads will be part of the reinforcement learning to keep key ontological elements in the foundation model so they can be extracted as needed. As you've said in your videos on this, a foundation world model is a significant breakthrough. I'm interested in your thoughts about Tesla going with a diffusion-based approach instead of generative transformers.

@alexanderpoplawski577 11 months ago

Thank you for breaking it down! You have to give it to George Hotz. He said from the beginning, that end to end neural networks will be the solution and Tesla will come around eventually. Also when he saw the occupancy network image, his immediate reaction was: "They created a LIDAR image in software".

@DavidDuwaer 11 months ago

Your best video I’ve seen, by far. Thanks for this. More of these tech deep dives please 🙏 ! You also explain it really well.

@PeteCorp 11 months ago

So basically the zero system from Gundam Wing where it predicts the future and feeds it to the pilot.

@briansilver9652 11 months ago

I can see this system actually reading the text on signs and interpreting their intent and authority. In other countries it could translate this text to align with the existing AI training. ARRET = STOP etc. LIDAR cannot compete with this.

@MattOGormanSmith 11 months ago

I think, for inference efficiency, separately trained context blocks of models could be switched and potentially blended at a high level, rather than having French road signs in the model used to drive in US, or snowy conditions to be considered when driving in a hot desert. This would be contrary to the end-to-end approach, so it doesn't look like it's the way they're going.
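The "switched and potentially blended" idea in this comment can be illustrated with a weighted average over context-specific models. This is only a sketch of the commenter's suggestion, not anything Tesla is known to do: the function `blend_outputs`, the context names, and the numbers are all made up.

```python
def blend_outputs(expert_outputs, context_weights):
    """Weighted average of per-context model outputs (hypothetical)."""
    total = sum(context_weights.values())
    weighted = sum(expert_outputs[name] * w for name, w in context_weights.items())
    return weighted / total


# Made-up steering suggestions (degrees) from two separately trained models.
outputs = {"snow": 2.0, "desert": 6.0}

# Hard switch: only the desert model is consulted.
pure_desert = blend_outputs(outputs, {"desert": 1.0})

# Soft blend: mostly desert, with some weight on the snow model.
mixed = blend_outputs(outputs, {"snow": 0.25, "desert": 0.75})
```

A context with zero weight never needs to be evaluated, which is the inference saving the comment is after. As the comment itself notes, this sits in tension with a single end-to-end model.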

@briansilver9652 11 months ago

@@MattOGormanSmith So if I was driving from, say, Ontario to Quebec, the car would download the French context module at the border, or have to have all languages loaded all the time just in case I decided to make a run for the border. This would need to be considered for left-hand-drive jurisdictions that border right-hand-drive ones. The world is such a complex problem.

@zoemayne 11 months ago

Lidar is just a sensor; it is not comparable to anything with regards to AI or coding.

@donfields1234 11 months ago

Hey Dr know it all... just wanted to tell you you're awesome... I gave a bit of healthy criticism on your last video and just wanted to add that that's all it was... you're the bomb and we all can still improve, right? Keep up the great work bud.

@glwicks 11 months ago

One of your best ever ... now I get it ... AMAZING stuff... and can't wait for it to show up in my car!

@garyrooksby 10 months ago

Really well covered, given the complexity of the subject. Thanks!

@jackcoats4146 11 months ago

Great explanation. I see that getting corner and extreme cases are still a problem, but they are for people too.

@justchaz. 11 months ago

LOL. Stuff must be sold to sold buyers.

@HubertHeller 11 months ago

RLHF: Reinforcement Learning from Human Feedback
NeRF: Neural radiance fields (NeRFs) are a technique that generates 3D representations of an object or scene from 2D images by using advanced machine learning.

@TheEriclwarner 11 months ago

Soo... I'm in the industry - but in different forms of AI... predictive maintenance... Security models... are some. Super great article for the layperson ... Thanks DKIA!

@nettlesoup 11 months ago

Thank you for getting the right understanding and separating out the different aspects of what the CVPR videos taught us. It seems others don't get it unless it's spelled out like here. Final step is just to be able to pronounce Ashok's name correctly. At one point you mentioned Ashok and smoke in the same sentence and yet pronounced Ashok the way you always have!

@MarcelR-89 11 months ago

Great video! Great explanation! Keep doing your work… at some point people will never drive better or safer than a Tesla car! The question is not if it will happen… the question is only when it will happen!

@vuththiwattanathornkosithg5625 11 months ago

Very good explanation, and it gave me goosebumps when it clicked in my head.

@jimparr01Utube 11 months ago

Hardly understood a word, but REALLY appreciate what you are doing here Sir.

@Harkonic 11 months ago

In the xAI Q&A Twitter discussion Elon said: "If I look at the experience with Tesla, what we're discovering over time is that we actually overcomplicated the problem. I can't speak in too much detail about what Tesla's figured out, except to say that in broad terms the answer was much simpler than we thought. We were too dumb to realize how simple the answer was." Do you think what he talks about is related to this?

@chriswright9096 11 months ago

Classic Musk nonsense. He makes it up as he goes along.

@markhooker8520 11 months ago

Thank you for making this fantastic informative video.

@marce8760 11 months ago

Super video. Great explanation. Love it

@JohnEAvenson 11 months ago

Excellent presentation

@stewartmcleod4094 11 months ago

You made it very easy to understand, thank you

@simonschmitz4894 11 months ago

Amazing explanations, thank you so much!! This whole debate reminds me of the days when I was studying philosophy, especially Wittgenstein. He first thought that explaining the world can only be done by describing "what is" (more like what is coming in via the photons), and who later completely changed his mind and said something like language creates the world (more like what ChatGPT is doing). And it is fascinating to understand that now when trying to emulate the human brain we are essentially using this same process of deep learning for both dimensions (space and linguistics). Do you think with driving, that part of the human feedback will be to edit out things where other humans made mistakes? otherwise the system may not be safe and may even repeat accidents...

@sekching8197 11 months ago

GPT-4 serves as a foundational model for understanding the world linguistically and is used to create systems like ChatGPT. In a similar vein, Tesla's world model, which learns to understand the world through raw images or photons, will form the foundation for their Full Self-Driving system. Correct me if I am wrong!

@ChattingwithKendall 11 months ago

You went to Everest Base Camp? Wow! I’m impressed (and a bit jealous)!

@spookytrigger 11 months ago

Thank you, I learned heaps and filled in some knowledge gaps from the way you explained this.

@MathewBoorman 11 months ago

Thanks, this is great to understand! Was definitely worth clarifying. It would be great to understand the kick-starting of these models, and where it is similar or different to biological brains. Random training from scratch you clearly want to shortcut. I'd be interested to learn if the solution has been to merge the previously developed NNs together as the starting point for a single world model evolution, and train from there? If that is so, I suspect some future features could be helped along by developing new specific classifiers and adding them to the model for a kick along. For example, new regions with some totally different structures, vehicles, signage and road rules. I imagine that is closer to how we learn, recognizing the new things as a special case until they are integrated into the lower level 'don't think about it'.

@alexandreblais8756 11 months ago

very informative. thank you for this video

@2009RayMD 11 months ago

This is awesome didactics. Thank you for making this incredible step change easy to understand.

@JohnBrown-pw3bz 11 months ago

Great explanation.

@NaughtyGoatFarm 11 months ago

Awesome videos. Thank you.

@roydenvickers6382 11 months ago

You ROCK DOC! Love your videos.

@ApteraEV2024 11 months ago

15:00 Thank you! Sharing your Brilliant Educational Course ❤

@dousiastailfeather9454 11 months ago

Ya, we saw your video of same thing last year where you say the same thing.... You sure have the spirit! Too bad it was reported Tesla's resale values plummeted along with virtually ALL ev's recently. Keep pumping! (Same time, next year?)

@haroldvargas01 11 months ago

Excellent explanation now I have a better understanding thank you!

@vlad_417 11 months ago

it's all cool, but as I understand it, it doesn't run on the car's computer. This is what will be run on Dojo to train neural networks. I think it will take a lot of additional data collection and enormous computing power.

@nickfosterxx 11 months ago

OK, now I get it, thanks so much for persevering. In this purely visual world model, I assume they will have to devise some way to more explicitly teach about physical properties such as slipperiness of road surfaces that are eg wet, icy etc, perhaps by directly setting up training episodes so it can learn the safety envelopes? Also, I imagine, judgments about human interactions such as being flashed when it might mean 'after you, please go first' versus, 'warning, hazard ahead' etc. And of course 'Road Closed', 'No stopping' etc. With best wishes. Edit: I liked your ChatGPT analogy about building a purely statistical foundation model, onto which you train human preferences and higher level subtleties.

@XiallaLife 11 months ago

Awesome explanation thank you!!

@adriansaenz6853 11 months ago

Thanks a lot man! Great vid

@bertwright1790 11 months ago

It seems like a major difference between HW 3 and HW 4 is the radar module. Do we have any idea when and if radar is coming back into the picture? Could radar better calculate distances in inches and/or see in low light/visibility situations? Putting radar back without a plan to use it doesn’t make sense, but having two major standards is more of a difficult problem. It is possible to design a new computer (HW 4 lite) that has the computing power of HW 4, can be upgraded, and uses the older cameras, but the cars with no radar or LiDAR are forever limited no matter the computer or software.

@at3941 11 months ago

I think I understand what you’re saying about the world model and predictive training. But when I drive I also have a model of the actual environment I drive in - a 3D map in my head. I can draw it out - lane directions, # of lanes, curbs, buildings, changes in elevation, pot holes etc. on roads I actually drive on which helps me predict (in advance of even being at a particular location) what I need to do on the road I drive on every day - is this also happening or is this all part of their world model?

@andrewsteinhaus8267 11 months ago

Great deep dive! I know a lot of this is possible because of the transparency of Tesla. Could you also do a deep dive and maybe a comparison with other autonomous programs (Mobileye, Waymo, etc.), even though they may not have the same level of transparency? That would be super interesting.

@alexvoigt4247 11 months ago

Great video, appreciate your work!

@andromeda3542 11 months ago

An End-to-End Model, in the context of machine learning and AGI, refers to a model that takes raw input data and feeds it through a series of transformations, ultimately outputting a direct prediction or control action. This approach contrasts with traditional methods that involve multiple stages of processing and human intervention. For instance, in the case of Tesla's self-driving technology, an end-to-end model would take raw sensor data (like images from cameras) and output control actions for the vehicle (like steering, acceleration, and braking commands) without any intermediate steps like object detection or path planning. The advantage of an end-to-end model is its ability to learn complex tasks without requiring explicit programming for each step. However, training such models can be challenging due to the complexity and the amount of data required.
A World Model, on the other hand, is a representation of the environment that an AGI system uses to make predictions about future states based on its current state and actions. It's a kind of internal simulation of the world that the AGI uses to plan and make decisions. The world model should be able to understand and simulate the dynamics of the world in a way that's similar to how humans understand the world. In the case of Tesla's self-driving technology, a world model would need to understand the dynamics of driving, including the behavior of other vehicles, pedestrians, and the rules of the road. It would use this understanding to predict the outcomes of different actions and make decisions about how to control the vehicle.
Tesla is working on creating a foundation model, which is a kind of world model that trains itself on a large amount of data without needing explicit labels. This foundation model would then be fine-tuned using reinforcement learning through human feedback to optimize its performance.
An end-to-end model and a world model are key components of AGI systems. They allow the system to learn complex tasks from raw data and make decisions based on an understanding of the world, respectively.
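The world-model idea in this comment (predict future states from the current state and an action, then use those predictions to choose what to do) can be sketched as a tiny planning loop. Everything here is an assumption for illustration: a real world model would be a learned network over camera input, whereas `world_model` below is a hand-written stand-in, and `rollout`, `choose_action`, the candidate actions, and the safety margin are all hypothetical.

```python
def world_model(state, action, dt=0.5):
    """Hand-written stand-in for a learned dynamics model.

    state  = (gap_to_obstacle_m, speed_m_per_s)
    action = acceleration in m/s^2 (negative means braking)
    """
    gap, speed = state
    new_speed = max(0.0, speed + action * dt)
    new_gap = gap - new_speed * dt
    return (new_gap, new_speed)


def rollout(state, action, steps=10):
    """Imagine the future by applying the same action repeatedly."""
    min_gap = state[0]
    for _ in range(steps):
        state = world_model(state, action)
        min_gap = min(min_gap, state[0])
    return min_gap, state[1]


def choose_action(state, candidates=(1.0, 0.0, -2.0, -6.0), safe_gap=2.0):
    """Pick the first (least cautious) action whose imagined future stays safe."""
    for action in candidates:  # ordered from accelerating to hard braking
        min_gap, _ = rollout(state, action)
        if min_gap >= safe_gap:
            return action
    return candidates[-1]  # nothing looks safe: brake as hard as allowed


far_action = choose_action((200.0, 10.0))  # obstacle 200 m ahead
near_action = choose_action((20.0, 10.0))  # obstacle 20 m ahead
```

With the obstacle far away the planner keeps accelerating; with it close, only the hardest braking keeps the imagined gap above the margin. In an end-to-end system the prediction and the choice would be learned together rather than written out as separate functions like this.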

@DarylOster 11 months ago

It also represents the ability for virtual time travel... into the past (to the point of the start of the 4D model), and a short distance into the most plausible futures...

@z4zuse 11 months ago

How long before the car would recognize it is driving towards itself, when confronted with a large mirror?

@alexanderpoplawski577 11 months ago

Doesn't really matter, it would have to brake anyway, as a human would. You don't know what's behind the mirror.

@ApteraEV2024 11 months ago

23:00 fine motor skills 😉

@ApteraEV2024 11 months ago

Where do You work?? Viewer ,instantly watching released videos. Then commenting a Paragraph response.❤😮

@tomturnbull3723 11 months ago

Thank you, thank you, THANK YOU!!! You've helped many people to now truly understand how Tesla will finally solve L5 for everywhere, and why nobody else can.

@-whackd 11 months ago

Comma can

@williamdeoradesilva9444 11 months ago

AMAZING!!!! Blade Runner is here. An exciting time to be alive.

@olyalphy 10 months ago

John, do you still think they will have heuristic code (software 1.0) for things like safety and hard boundaries (don’t over accelerate or turn wheel at a certain rate) and also do you think they will still use the voxel approach to define drive-able space. Or all this just gets replaced with the new end to end world model?

@FinlayPG 11 months ago

Thanks!

@Marc_de_Car 11 months ago

Thank you

@cherubin7th 11 months ago

Foundation models are basically unsupervised learning making a comeback with superpowers.
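What "unsupervised" buys you here is that the training targets come from the data itself. A minimal sketch, assuming a next-frame prediction setup: the function `make_training_pairs` and the string stand-ins for frames are hypothetical, and real inputs would be image tensors.

```python
def make_training_pairs(frames, context=2):
    """Turn an unlabeled sequence into (past frames, next frame) pairs.

    No human labels are involved: the target is simply the frame that
    actually came next in the recording.
    """
    pairs = []
    for i in range(context, len(frames)):
        pairs.append((tuple(frames[i - context:i]), frames[i]))
    return pairs


# Stand-ins for consecutive video frames from one clip.
clip = ["f0", "f1", "f2", "f3", "f4"]
pairs = make_training_pairs(clip)
```

Every recorded clip yields training examples this way without anyone drawing boxes or naming objects, which is why the approach scales with raw footage.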

@lewiswithrow1936 11 months ago

Freaking great! Very informative. Thanks

@JosefSvenningsson 11 months ago

John Vervaeke is a neuropsychologist who's developed the 4P model of cognition and it explains and summarizes very well the difference between "propositional" knowledge, i.e. seeing a baseball game on TV vs "participatory" knowledge, i.e. playing baseball.

@unomilan 11 months ago

Excellent!

@M3W3 11 months ago

Thanks a lot

@paulmeynell8866 11 months ago

So the big question, if you collect all data then release an update all cars drive the same. If each car learns like a human will we have high mileage cars being much better at driving than one that hardly gets used? Or will they download the best driving car to everyone as a starting point? Loving the videos thank you.

@Koro2810 11 months ago

Thanks

@retrodraggin5540 11 months ago

Thanks for dumbing it down for the rest of us

@evj6043 11 months ago

I just want to know how through all of this they address the biggest flaw that is affecting my drives today, which is old/inaccurate map data. Right now map data from an outdated system acts first and will constantly put the car in say the right lane when you are about to turn left because the road was different 8 years ago and the maps are from that time. Is their solution to upgrade to a better mapping system or somehow have the vision system lead and the mapping is somehow second. Other than not recognizing signs this is the only thing that causes me to disengage frequently where I live.

@ApteraEV2024 11 months ago

18:00 just wish the text was LARGER, so I could read it... I'll just keep listening...❤

@sujanpoudel2 11 months ago

Like your shirt. Have you been to EBC?

@jabulaniharvey 11 months ago

Voice and text intervention for FSD

@appl314 11 months ago

Does show why cars that are coming towards you from a backwards angle are not identified. That angle doesn't appear to be in the shot. I wonder if they do similar projections from other than the front looking cameras?

@zaneenaz4962 11 months ago

world model essentially is that the system is defined .... uniform exits, lane markings, signage. woe be the case where the cones blew away or were removed.

@fredhearty1762 11 months ago

Since the car's sensors/cameras cannot 'see' any of the car itself (though it must 'understand' its own physical boundaries), wouldn't Optimus need a significantly upgraded version of the World Model in which it can see/perceive its own hands, limbs, physical 'body' relative to the World Model attributes? An end-to-end training regime would be vastly more efficient if ego had self-awareness... seeing where its hand is relative to an object it is picking up, for instance. This interaction would border on consciousness.

@BenefitOfTheDoubtInquiry 11 months ago

FSD knows its body in the world, and its wheels. Park assist shows this. Optimus just has a more anthropomorphic footprint in the 3D space.

@CharlesVanNoland 11 months ago

I just feel like there's going to be a loooooong tail of edge cases as long as FSD is only trained on the road. It will always end up encountering situations it fails to handle that any human could negotiate without issue. Humans understand the world through a lifetime of experience with it and the things in it. A computer backprop-trained on driving alone will never understand the world and the things in it in a way that will allow it to behave like a human reliably would. That being said, FSD will of course become (I believe it has been already) statistically safer than the 'average driver' (which doesn't exist), but when it makes a glaring mistake nobody can imagine making, which results in someone's death, the consensus will be that it's not safer. People want FSD to be at least as good as, or better than, themselves, and witnessing it make mistakes they can't fathom making erodes confidence in it. As long as Tesla has to tell drivers to be alert and ready to take control, it's not really "Full Self-Driving". When they can say "Lie back, take a nap, watch a movie, enjoy the ride. We'll get you where you're going quickly and safely!" because it doesn't make any more obvious mistakes, that is when it's FSD. I appreciate their pursuit, but I just don't see something that doesn't understand the world like a human reliably behaving like a human.

@mariusm62 11 months ago

Exactly. For all this talk about perception, I have not seen so far a plan from Tesla in how to resolve edge cases. I mean if even now they are still struggling with perception, when are they going to tackle the almost as complex space of actions?

@damonknutson2855 11 months ago

“There’s a difference between knowing the path and walking the path.”

@michelmasse6225 11 months ago

From a pay now for ever FSD model, I foresee a massive lawsuit if Tesla owners like me who own HW3 FSD can’t do the same as HW4 FSD. And who says the first HW4 rendering won’t be reengineered to be swapped? All that being said, from a future legal perspective, how many current HW3 FSD owners like me will still have their current Tesla anyway? I will certainly consider replacing my low-mileage Y with a 2025 Tesla Y with all the improvements since my 2020 edition purchase. I believe this is what Tesla is betting on. If you can send a rocket and land it back on earth, you can either upgrade your HW on older models or take the legal hit, whichever is cheaper and less time-consuming. Don’t forget we are tens of thousands of paying beta testers financing the AI development we will never own unless we buy Tesla shares.

@johnross6314 11 months ago

You are correct. I am afraid Elon Musk at times gets bamboozled by his semi English speaking FSD team. They make all kinds of wild claims in meetings, he then turns those around to make what turns out to be “false claims”. As a shareholder, customer, supporter of the disruption, like you, I am not self brainwashed. What you say is real. So too are a variety of other dependencies that if not worked correctly will continue to be a big problem.

@TheSpartan3669 11 months ago

@@johnross6314 It's very likely the other way around. Musk isn't dumb enough to be fooled by engineers for years on end. He has a habit of making extraordinary claims in all areas (Humans on Mars by 2022, FSD, Hyperloop) so it seems like he's the one knowingly pushing unrealistic expectations.

@a_linikus 11 months ago

Wow, you are a total machine!

@0602bill 11 months ago

Making sense... Once the world model is formed, you said that it organically becomes increasingly complete, or perhaps higher resolution, or in some way simply better and better based on its ongoing experience with the world. The news of a few days ago is that Tesla is proposing to hire people to drive their cars later this summer and early fall for, I think it was, $18 to $45 per hour. Does this hiring integrate with the world model refining you describe?

@DarylOster 11 months ago

Unfortunately FSD is also learning some bad habits of average drivers - for instance camping out in the fast lane. In most states the left lane is for passing only.

@sportbikeguy9875 11 months ago

28:09 the driver feedback could be hitting the brakes, re-taking control of the car, facial expressions recorded on the interior camera, body language, etc.... the options are endless

@changtimwu 7 months ago

Finally I found this.

@frank2you 11 months ago

Brilliant

@earleyelisha 11 months ago

If you’re citing LeCun as an authority, just know he doesn’t think the field will have any agents with a world model for at least a decade. And even then his thoughts are that they may only be as capable as a mouse.

@FrunkensteinVonZipperneck 11 months ago

Doc, RE: ChatGPT: Are researchers creating foundation models in every language?

@charlesrovira5707 11 months ago

@29:29 I only have two words to say: mirror neurons. People who can dance, not just spaz out to a beat, usually learn the kinematics of a dance from just watching it a dozen times before they try it. Dojo can be considered like Ego's mirror neurons.

@johntessier7248 10 ай бұрын

I like this system

@mariusmeyer14 11 ай бұрын

Thx, I think my brain just exploded, but I understand it a little more. 😅

@stevenjensen3653 11 ай бұрын

Very few of these comments sound like they actually understood the video and the potential of Tesla growing a Foundational World Model... for Tesla investors, it means we get all da monies

@ApteraEV2024 11 ай бұрын

19:00 they need a Terminal Output OS...😅❤DARK MODE😅❤

@quantum2330 11 ай бұрын

Ahhhhh, OK, OK, I think I'm starting to understand how they might be training the model, and yeah, it's super simple. It's just like a big game of match the probable images. To simplify further: it's like giving the car a list of known images that look like "being in the right-hand lane", for example. Then the car can simply look at what pictures it is currently seeing and work out what it needs to do to make its live images match up with what it should look like to be "in the right-hand lane". I can definitely see how this would solve most of Tesla's problems, and how this could apply to many other things. This would also make training a self-driving system in different environments very easy. At this point, who cares if your traffic cone looks like x or looks like y? The system can just learn it.

@aomurdock 11 ай бұрын

Future Headline: 'Hackers change FSD USA to right hand drive & create chaos!' Hopefully this is just me being paranoid.

@crawkn 11 ай бұрын

I suspect it might make some uncomfortable that it seems to be creating a sort of "black box" to process in this way, making the learning and interpretation process opaque to humans. What I realized is that this doesn't eliminate labeling for the purpose of the human interface, it simply separates it from the real-time prediction process.

@julie-xd7rr 11 ай бұрын

I think you are correct most of the time. However, concerning the problem with FSD going faster than the speed limit: FSD will not do that. The slow car is not causing dangerous conditions; it is the vehicles going too fast that are the problem. I am not sure how FSD is going to handle road rage.

@danam579 11 ай бұрын

I was talking to some safety drivers who got the opportunity to work with a self-driving system which had been end-to-end trained. It was capable of delivering spectacularly good behavior, better behavior than they had ever seen in an SD system made of modules like perception and planning. Unfortunately, the end-to-end trained system was also capable of absolutely terrifying behavior, and there was no ability to determine what would cause either behavior. Although the DMV does make you take a road test, you can only do that once you’ve demonstrated a high level of capacity to recognize signage and other abstract concepts of operating a vehicle on the public roads. I’m still wondering how Tesla can make strong assurances that their FSD system won’t keep smashing into first responders' vehicles, or first responders themselves.

@blengi 11 ай бұрын

hmm, I slow down and use the high level executive function module of my neocortex to address edge cases in the real world appropriately and not rely on some heavy duty black box world model inference process. Scares me to think others don't and that FSD wouldn't aim for a more modular high level abstraction layer over real world physical scenarios....

@andreasklossek9252 11 ай бұрын

2:56 Most LIDAR scans do look more consistent, to be honest.

@alexanderpoplawski577 11 ай бұрын

Because they contain a lot of redundant information.

@freddydad1 11 ай бұрын

scratching my speculative itch..

@zaneenaz4962 11 ай бұрын

Amazing that you stop to explain what RGB means, but what is this CPR or CVPR? Please explain. Foundation Model... where have I heard this before... maybe Microsoft Foundation Classes in programming Windows, ugh...