MIT 6.S091: Introduction to Deep Reinforcement Learning (Deep RL)

  310,191 views

Lex Fridman

1 day ago

Comments: 146
@lexfridman 6 years ago
Deep RL is my favorite subfield of AI, because it asks some fundamental questions about what it takes to build safe and intelligent robots that operate in the real world. So many open problems and interesting challenges to solve!
@thesk8erdav 6 years ago
we love you Lex!
@farhadsafaei1910 6 years ago
It's my favorite one, too. Thanks for the lecture, I really enjoyed watching it.
@colouredlaundry1165 6 years ago
With these lectures and interviews you are sharing and creating immense value: knowledge. Thank you!
@dklvch 6 years ago
Thank you Lex, awesome presentation!
@liuculiu8366 6 years ago
Love your spirit in sharing the latest information. Much appreciated!
@NakedSageAstrology 2 years ago
I wish you still did videos like this, we appreciate you sharing such knowledge.
@自我表达 1 day ago
Wow, KZbin nailed it! This video couldn't have come at a better time.
@KeepingUp_withAI 6 years ago
Deep RL is the field that excites me the most. Thank you Lex.
@kawingchan 6 years ago
I really like that tongue-in-cheek chuckle when Lex talked about the multiverse and whoever created it...
@wendersonj 5 years ago
Since 2017, Lex has improved his lessons spectacularly! Now (2019), I watch a more fluid video with the feeling that this guy knows exactly what he's talking about, without hesitating. Once again, thanks Lex, for sharing these videos. Congratulations and thanks from Brazil.
@Techieadi 6 years ago
Thank you for bringing these lectures to us.
@nova2577 5 years ago
"Every type of machine learning is supervised learning", cannot agree more!!!
5 years ago
In fact, learning itself is a supervised process, otherwise it is acquiring not learning.
@samuelschmidgall2090 5 years ago
Seriously the best Deep RL lecture out there to date.
@ronaldolum464 9 months ago
Certainly, one of the best videos on deep learning I have come across.
@akarshrastogi3682 5 years ago
Professor Lex, can we get the entirety of 6.S091 on MIT OCW? This is an incredibly interesting topic that I've been working on (Evolutionary Computing), and I am currently enrolled in a project that requires thorough knowledge of Deep RL. This research field has very few online resources besides Stanford's CS 234 and Berkeley's CS 285. Your explanations are immensely helpful and intuitive. Humanity will present its gratitude if this whole course is made available! AGI and AI safety issues need more attention before they become the greatest immediate existential risk; your courses can help raise general AI awareness and advance our civilization to higher dimensions. Loved the fact that you grinned while just casually mentioning the Simulation Hypothesis.
@akarshrastogi3682 5 years ago
1:04:40 Best part, that grin after he just casually dropped that line in an MIT lecture.. All of infinite universes being Simulations
@judedavis92 2 years ago
Loved the lecture. Definitely recommend his podcast. Quality.
@tarunpaparaju5382 5 years ago
I have tried to study and understand Deep RL using several books and lectures over the last few years, but I only feel like I understood something in RL after listening to this lecture. Thanks, Lex. I am grateful to you for posting this lecture on KZbin. Thank you!
@sivaa6130 6 years ago
Every Lecture has a historical context, evolution, mathematics and inspiration, Technical overview, Network Architecture overview. Well Summarized!!
@amandajrmoore3216 2 years ago
As always, Lex, a generous share, which will be a useful resource for loads of folks. Thanks.
@merebhayl5826 2 years ago
I like how you quoted many theorems from Dostoevsky and also a few axioms from Nietzsche's texts
@merebhayl5826 2 years ago
I had never seen Lex's lecture videos other than the philosophical podcasts. This is my first. And I just wrote the above comment as a joke without seeing the video and three minutes in, I found Socrates, Kant, Nietzsche... 😂😂 That's very Lex👌
@danielvelazquez4472 6 years ago
Haha he says "that is super exciting", without being excited! He is a robot! Thanks for the open lectures
@mrr5183 3 years ago
I appreciate the philosophical insights sprinkled throughout the lecture!
@MistaSmilesz 5 years ago
I've seen a lot of these videos and read some of the books in ML; Lex has a clarity that's rare.
@chinbold 5 years ago
I like his lecture because it's more understandable. And I also like his tones.
@ArghyaChatterjeeJony 5 years ago
Lex Fridman, I just love your videos. I am a great fan of yours, sir. Carry on.
@DennisZIyanChen 4 years ago
I honestly don't care about AlphaGo or Dota 2 or the robots, I just cannot get over how incredible the thought structure is behind this. What I mean by thought structure is the strategy behind quantifying the right things, asking the right questions, and modeling the policy upon which growth can be created. IT IS SICK
@vast634 4 years ago
Important detail when trying to transfer from a simulation to the real world: make the simulation have many random variations in its behavior and mechanics during runtime (such as drag, gravity, friction, size of the agent, random perturbations, etc.). This forces the agent to generalize more and not over-optimize on the details of the sim, which makes it easier to transfer the agent's capabilities to a real-world environment.
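The technique described in this comment is commonly called domain randomization. Here is a minimal sketch of the sampling step, assuming a hypothetical simulator that accepts these parameters; the `SimParams` fields, the ranges, and the helper names are illustrative assumptions, not part of any specific library or of the lecture.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Physics settings for one episode (hypothetical names and units)."""
    gravity: float      # m/s^2
    friction: float     # multiplier on the nominal friction coefficient
    drag: float         # drag coefficient
    agent_scale: float  # relative size of the agent's body


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw a fresh set of physics parameters, once per training episode."""
    return SimParams(
        gravity=rng.uniform(8.8, 10.8),     # assumed range around 9.81
        friction=rng.uniform(0.5, 1.5),     # assumed range
        drag=rng.uniform(0.0, 0.2),         # assumed range
        agent_scale=rng.uniform(0.9, 1.1),  # assumed range
    )


def perturb_action(action: float, rng: random.Random, noise_std: float = 0.05) -> float:
    """Add a small Gaussian perturbation to a commanded action."""
    return action + rng.gauss(0.0, noise_std)


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        # A real training loop would rebuild or reconfigure the simulator here.
        params = sample_sim_params(rng)
        print(episode, params)
```

Resampling at the start of every episode means the agent never sees exactly the same physics twice, so a policy that only works for one friction or gravity value tends to score poorly during training.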
@user-sc8ph2ds2m 2 years ago
gravity is fake buddy ;)
@vast634 2 years ago
@@user-sc8ph2ds2m Take a brick, stand still, throw it straight up, then you can observe if gravity exists, or not. Very simple experiment to administer.
@user-sc8ph2ds2m 2 years ago
@@vast634 you will experience buoyancy 🤦
@charlesotieno6309 5 years ago
Thanks Lex! Deep Reinforcement Learning opens up a new world. Life is not that complex, like the baby in your video taking his first steps: unsupervised learning. Take into account the amount of time and effort (brains + USD) it takes to get an AI to do what the baby is doing: WALK in a few days and, in the years to come, be a professor and continue with this subject. The baby is the moral of the story: what we are doing is not working, and we need a radical way of thinking. Your radical way is the way forward.
@abdulrahmankerim2377 6 years ago
One of the best lectures I have ever watched. Keep it up.
@jefferysherwood7424 4 years ago
🐸🐸🐸🐸🐸
@Arghamaz 1 year ago
This is interesting for me as this is my favorite Mathematics n Statistics combined Algebraic equations 🎉 MATHEMATICS is the Best Subject in World 🌎 👌 ❤🎉🎉
@neutrinocoffee1151 6 years ago
Loved this lecture. I learned a lot. Thank you.
@sofina527 1 year ago
very helpful, thanks a lot dear prof.
@AbhishekKumar-mq1tt 6 years ago
Thank you for this awesome video
@datta97 4 years ago
Thanks for the last slide.
@samferrer 5 years ago
Another detail I have noticed in many presentations ... those agents are not trying to model the environment ... that is semantically impossible ... what they are trying to do instead, I believe, is to model AN INSTANCE OF A DUAL SPACE associated to the environmental space. It is very common to use linear regressions for instance ...
@samferrer 5 years ago
Kevvy Kim hmmm ... we are saying the same thing ... it seems that practitioners and lecturers keep it short, perhaps without realizing that a big conceptual gap is being created.
@CarlosGutierrez-go9hq 1 year ago
Since I began my journey in data science, machine learning, and AI, I have been seeing patterns. Am I the only one who sees that we are probably just programs seeking a never-ending end of this simulation? The way Q-learning is designed is the most realistic comparison to human thought, so in order to maximize my output, do I have to reconsider my reward mechanism? (Taking some info from Huberman also.)
@msamogh96 4 years ago
This guy is a better Siraj Raval.
@LidoList 5 years ago
Very good explanation of RL, thanks to the speaker!
@noname76787 2 years ago
thank you so much for the lecture!
@oldPrince22 3 years ago
very good lecture! Thanks.
@AviaEfrat 4 years ago
27:24 - There is no "reload" in Doom =)
@hansharajsharma2765 5 years ago
Love this. Thanks Lex.
@junxu147 3 years ago
Great lecture!
@alec1975 2 years ago
very good intro
@kaneelsenevirathne7085 4 years ago
I took the engineering plasma class taught by your dad at Drexel :D
@benyaminewanganyahu 1 year ago
This guy should do podcasting.
@jonk.3947 6 years ago
Love the Digital Physics reference at 1:04:00 :)
@Lunsterful 6 years ago
Excellent talk.
@bayesianlee6447 6 years ago
Lex, I heard that DL professionals are now using simulations with nature-based environments to teach AI agents, for example making an agent learn how to walk or run by itself. Yoshua Bengio said the next evolution will be based on simulation environments for AI. Would you have any ideas or information to share on that? I really, really appreciate all your work and the spirit you have. Everyone in the world with an interest in AI really appreciates your work and sharing. Thank you! :)
@borispyakillya4777 5 years ago
Do you mean something like Gym-based simulations? MuJoCo is based on physical laws, and you can already train on it with RL methods.
@yu-siangwang1818 6 years ago
Great overview of DRL
@Asmutiwari 4 years ago
Amazing lecture on DRL. Can you also show us how we can implement the Q-function in a neural network?
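One way to picture a Q-function inside a neural network: the network takes a state vector and returns one Q-value per action, and each update nudges the chosen action's Q-value toward the target r + γ · max Q(s', a'). The following is a minimal sketch in plain Python with one hidden ReLU layer and manual gradients. The class name `TinyQNetwork`, its sizes, and its hyperparameters are assumptions for illustration; it leaves out the replay buffer and target network that DQN uses, and it is not the lecture's own code.

```python
import random


class TinyQNetwork:
    """One-hidden-layer network mapping a state vector to one Q-value per action."""

    def __init__(self, n_state, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_state)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions

    def forward(self, state):
        """Return (Q-values, hidden activations) for a state."""
        z = [sum(w * s for w, s in zip(row, state)) + b for row, b in zip(self.w1, self.b1)]
        h = [max(0.0, v) for v in z]  # ReLU
        q = [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(self.w2, self.b2)]
        return q, h

    def q_values(self, state):
        return self.forward(state)[0]

    def td_update(self, state, action, reward, next_state, done, gamma=0.99, lr=0.05):
        """One gradient step on 0.5 * (Q(s, a) - target)^2; returns the TD error."""
        # The target is treated as a constant (no gradient flows through it).
        target = reward if done else reward + gamma * max(self.q_values(next_state))
        q, h = self.forward(state)
        delta = q[action] - target
        # Backpropagate through the output layer before changing its weights.
        grad_h = [delta * w if x > 0.0 else 0.0 for w, x in zip(self.w2[action], h)]
        for j, x in enumerate(h):
            self.w2[action][j] -= lr * delta * x
        self.b2[action] -= lr * delta
        for j, g in enumerate(grad_h):
            for i, s in enumerate(state):
                self.w1[j][i] -= lr * g * s
            self.b1[j] -= lr * g
        return delta


if __name__ == "__main__":
    net = TinyQNetwork(n_state=2, n_hidden=8, n_actions=2)
    for _ in range(200):
        # Repeated terminal transition with reward 1.0 for action 0.
        net.td_update([1.0, 0.0], 0, 1.0, [0.0, 1.0], True)
    print(net.q_values([1.0, 0.0]))
```

In practice an automatic-differentiation library would replace the manual gradient code; the manual version is used here only to keep the example dependency-free.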
@stabgan 5 years ago
You are my idol lex
@bryanbocao4906 6 years ago
I would appreciate it if anyone could give the specific steps, in detail, for deriving all the directions on the map from 18:51 to 21:32.
@heinrichwonders8861 6 years ago
I have been waiting for this.
@mrektor 6 years ago
Amazing work. Excellent lecture
@scorpion7434 5 years ago
The funniest part is where he was trying to explain the ability of human brains by evolution at 6:33! He literally said, "it is somehow being encoded", which contradicts the reward concept he is introducing! Son, the most logical reason for having a predefined encoding scheme that has never been trained is the existence of a creator!
@jasonabc 6 years ago
Really great lecture, learned a lot
@sauravsingh9177 2 years ago
Check out "Spinning Up in Deep RL" by OpenAI
@kaiwang2924 5 years ago
Wonderful lecture.
@Lorkin32 5 years ago
Much better than the Stanford University lecture, where the lady basically only reads the equations without giving any real intuition about what's going on.
@emilecureau 2 years ago
"when the reward flips, the optimal path is grad school, taking as long as possible and never reaching the destination....pffff" lol 21:20
@konouzkartoumeh 5 years ago
Great lecture! Thank you.
@caizifeng 5 years ago
great lecture
@samlaf92 5 years ago
@50:06 DQN can't learn stochastic policies. DQN has a softmax output on actions... isn't that a stochastic policy in itself?
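On this question: in the original DQN design the output layer is linear, with one Q-value per action and no softmax, and actions are typically chosen with an epsilon-greedy rule on top of those values. The randomness there is an exploration schedule rather than something the network learns, whereas a policy-gradient method learns the action probabilities directly. A small sketch of the difference follows; the function names are illustrative assumptions, not taken from the lecture.

```python
import math
import random


def greedy_action(q_values):
    """Deterministic choice: index of the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy_action(q_values, epsilon, rng):
    """DQN-style behavior: random with probability epsilon, greedy otherwise."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy_action(q_values)


def softmax_policy(preferences, temperature=1.0):
    """Policy-gradient-style output: a probability for every action."""
    m = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp((p - m) / temperature) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 2.0, -1.0]
    print(greedy_action(q))
    print(epsilon_greedy_action(q, epsilon=0.1, rng=rng))
    print(softmax_policy(q))
```

With epsilon annealed toward zero, the epsilon-greedy rule collapses to the deterministic greedy policy, which is the sense in which DQN does not represent a learned stochastic policy.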
@ruinsaneornot 6 years ago
30:30 "you know, MIT does better than Stanford that kind of thing" xD
@Twgvlogs539 5 years ago
Super
@eeee8677 5 years ago
THANK YOU MIT
@rorylennon 2 years ago
Nice video...
@onwrdandupwrd5303 3 years ago
that DeepRL animation looks like something out of Bamzooki
@putzz67767 6 years ago
very good!!
@stmandl 5 years ago
Hi Lex, thanks for this great lecture! Which books of Nietzsche did you have in mind around 4:33?
@OEFarredondo 5 years ago
Remove the human factor. Have the traffic be free of human crossing
@aabkhcdcz6067 5 years ago
Thank you very much
@deeplearningpartnership 6 years ago
Nice.
@inaamilahi5007 4 years ago
Awesome
@vincentschmitt392 3 years ago
nice tie
@el_lahw__el_khafi 3 months ago
where are the rest of the lectures?
@johnmacleod7789 5 years ago
Brilliant!!
@benaliamima9903 3 years ago
Thank you for this amazing video. I want to know if I can use the DRL principle to enhance the QoS requirements in a vehicular network. Any suggestions?
@ProfessionalTycoons 6 years ago
very good
@liberator328 5 years ago
Which Nietzsche book is he recommending at 4:12 ?
@nisman.lo.desvivieron 1 year ago
27:07 lex is scared of Doom
@abhaysap 6 years ago
Can we take ideas or clues from biomimicry architecture in reinforcement learning?
@OldGamerNoob 6 years ago
My naive perception is that every frame of "video" entering into each of our eyes and every second of sensory data we receive from birth constitutes a rather large data set for our brains to train on (although having the possibility to constantly train and update the network)
@mutyaluamballa 6 years ago
Yes, but my perception is that the brain is already a model trained on the data from all our ancestors. At the time of birth, we have a trained model with all the necessary weights, but not the dataset it was trained on (our ancestors' lives), and it can be retrained on the go, based on our experiences. :)
@kawingchan 6 years ago
I think this may be mostly true for other mammals: the less intelligent, the more hard-wired. When it comes to humans, I'm not so sure how much we rely on genetic wiring vs. neural plasticity, aka training. Not sure if any ethical experiments can bring any insight.
@thepalad1n197 5 years ago
oh shit i listen to your podcast lmao
@pittyconor2489 4 years ago
nice
@sarathrnair9499 6 years ago
Why is no one asking any questions? Or were those portions edited out? Nice lecture
@samferrer 5 years ago
I am having a hard ... very hard time believing that the brain uses backpropagation as a learning mechanism ... it just makes no sense in a space-time governed universe ... god damn good lecture ... by the way ...
@MrPeregrineFalcon 5 years ago
Lex doesn't say the brain uses it (he says it's a mystery). And more generally most cognitive neurologists don't believe it does - although some think there are similar biological correlates. But it's a very efficient algorithm for ANNs to perform gradient descent.
@petevenuti7355 3 years ago
As far as I know, biological brains don't use backpropagation. But there are neural circuits where the flow of information goes in the opposite direction. There is also the chemical side of things, integrating many levels of homeostasis from hunger to pain to emotion. I would say the combination of those two is the mysterious correlate of backpropagation, with backpropagation being the obviously oversimplified version.
@samferrer 3 years ago
@@petevenuti7355 got it ...
@jeanjacqueslundi3502 4 years ago
Are we really morally equipped to build AI that is safe, and to build it for the right reasons? This is my problem with contemporary science/technology... We don't focus on whether we SHOULD do something. Just because it's doable doesn't mean it should be made.
@ns4235 3 years ago
just create a large number of random simulations. if you're successful in a large number of other realities then this one should be easy. o_o
@kevinayers7144 4 years ago
Is the entire deep RL course available?
@Lorkin32 5 years ago
How/why can you even upload this for free? Doesn't university cost loads in the US? Great stuff though!
@m3awna 5 years ago
I guess that's because MIT is focusing more on workshops/hands-on learning, AND to raise the bar for other universities/institutes... hhh
@petevenuti7355 3 years ago
But if a diploma is your goal, it's sometimes helpful to sit in on a class before you take it for credit. It can make it easier, but sometimes it just makes it boring and counterproductive the second time around.
@petevenuti7355 3 years ago
Sitting in doesn't get you credits or a diploma.
@msp9331 4 years ago
Isn't that the guy from Joe Rogan's podcast? It takes me a week to grasp what he says in 5 minutes.
@abhiastronomy 4 years ago
Nice yo
@reinerwilhelms-tricarico344 2 years ago
Couldn't always follow. Was distracted by the two cats and then later by the fool who fell in the water. 🙂
@midishh 5 months ago
hugest*
@skyfeelan 3 years ago
34:12
@arsh2489 1 year ago
2:15
@fizzfox8886 4 years ago
the robots won't be happy to see that we kicked them in our labs instead of being friendly :/
@rikelmens 6 years ago
Lex is super low on cortisol and super high on GABA. So much so that he sounds quite sleepy sometimes.
@raphaelrehman 6 years ago
Insert comment
@juggy666 2 years ago
I like Lex's interviews but he is a terrible tutor
@vincentschmitt392 3 years ago
sounds BS