Thank you to the comma team for this amazing event and for sharing the livestream and high-quality videos. All credit to them. Follow their official kzbin.info for more livestreams.
Check out github.com/commaai/commavq | github.com/commaai/comma2k19 | github.com/commaai/comma10k | blog.comma.ai
Buy the comma 3X from $1250 -> comma.ai/shop/comma-3x
openpilot.comma.ai is the best ADAS system in the world
If you have skill, apply at comma.ai/jobs
Join the community on github.com/commaai/openpilot & discord.comma.ai
Check if your car is supported -> comma.ai/vehicles
How to set up your car: comma.ai/setup
Stay up to date by following twitter.com/comma_ai
Support comma.ai at comma.ai/shop

Chapters:
00:00 intro
00:20 learning a driving simulator
00:36 imagined, predicted videos via ML
01:44 why a simulator
02:50 why not a classical simulator
04:00 small offset simulator
06:25 learning a driving simulator paper
07:10 ML simulator architecture
07:25 image tokenizer
08:32 source - 5120 bits - 1280 bits
10:25 pose tokenizer (np.digitize)
11:06 dynamics transformer
14:00 it works: straight, turning right, left
14:56 accelerating, braking
15:25 commaVQ open-source dataset
15:50 2x $1000 bounties
16:50 side note 1: tokenized driving models
17:25 side note 2: smooth decoder
18:29 raw vs smooth rollout
18:44 side note 3: loss function for driving in the ML simulator
20:02 next steps, auto-regressive sampling
22:03 using the learned simulator for training driving models
22:13 beautiful rollout properties
23:40 questions
23:52 conditioning on language?
24:10 pose tokens
25:18 bottlenecks to real-time inference
26:23 applications for human-indistinguishable rollouts
27:13 model collapse from simulation data
28:35 masked autoencoders
29:40 video decoder
30:00 I and B frames, general I-frame and P-frame
30:40 output from the simulator used for training policy models
31:00 two models, end to end, same loss function
32:08 the policy model will be trained on the simulator, not the vision model
32:25 how to keep up with machine learning research
33:15 you don't want to be the first one to replicate a paper
33:48 state of the art, leveraging other work and open sourcing
34:35 where do the model names come from: Nicki Minaj, Nicolas Cage
34:55 tokenize because cross entropy is the best loss function
35:50 interpreting the tokens
36:45 conditioning on geolocation
37:35 scaling to any type of robotics problem
38:41 adding language as the first input
40:05 why does it flicker
41:30 smoothing decoder, RNN layer
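The pose tokenizer mentioned at 10:25 quantizes continuous pose values into a discrete vocabulary so they can be modeled with cross entropy, using np.digitize for the binning. Here is a minimal sketch of that idea; the bin range and vocabulary size below are illustrative assumptions, not comma's actual values.

```python
import numpy as np

# Hypothetical binning setup: the real value range and vocabulary size
# used in the talk are not specified here, so these are placeholders.
N_BINS = 256
edges = np.linspace(-1.0, 1.0, N_BINS - 1)  # N_BINS - 1 edges define N_BINS bins

def pose_to_tokens(pose):
    """Quantize continuous pose values into integer tokens in [0, N_BINS - 1]."""
    return np.digitize(pose, edges)

def tokens_to_pose(tokens):
    """Approximate inverse: map each token back to a representative value
    (bin midpoints for interior bins, the nearest edge for the outer bins)."""
    centers = np.concatenate(([edges[0]], (edges[:-1] + edges[1:]) / 2, [edges[-1]]))
    return centers[tokens]

pose = np.array([-0.73, 0.0, 0.42])
tokens = pose_to_tokens(pose)         # discrete tokens, ready for a transformer
recovered = tokens_to_pose(tokens)    # close to pose, up to quantization error
```

The round trip loses only quantization error (at most one bin width), which is the usual trade-off for getting pose into the same token stream as the image tokens.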
@KhaosVFX 10 months ago
Great timing, Harald (comma CTO) just published on Twitter today that they finally got a working model fully trained on this simulator!
@blaclee 10 months ago
Wonderful presentation, thank you Yassine and thank you comma! Keep it up 🙌
@dev_navdeep 10 months ago
Great presentation, well explained. I've been working on generating synthetic data, and this seems really exciting.
@Ella_1994 10 months ago
I truly expect this project to completely revolutionize the self-driving car industry!
@MrFlawor 10 months ago
Awesome to watch! 13:00 "In theory" triggers my Siri haha
@ArtOfTheProblem 10 months ago
Amazing work. So is the path ahead learning a universal simulator, i.e. "the world model" problem everyone is after?
@-mwolf 10 months ago
I think the caption for "mass autoencoder" is supposed to be "masked autoencoder" :)
@nishantnikhil2907 10 months ago
What's the smooth decoder? Is there a paper?
@avchor 10 months ago
Will Comma ever transition to making self-driving possible for semi-trucks?