A friendly introduction to deep reinforcement learning, Q-networks and policy gradients

90,000 views

Serrano.Academy

A day ago

A video about reinforcement learning, Q-networks, and policy gradients, explained in a friendly tone with examples and figures.
Introduction to neural networks: • A friendly introductio...
Introduction: (0:00)
Markov decision processes (MDP): (1:09)
Rewards: (5:39)
Discount factor: (8:51)
Bellman equation: (10:48)
Solving the Bellman equation: (12:43)
Deterministic vs stochastic processes: (16:29)
Neural networks: (19:15)
Value neural networks: (21:44)
Policy neural networks: (25:44)
Training the policy neural network: (30:46)
Conclusion: (34:53)
Announcement: Book by Luis Serrano! Grokking Machine Learning. bit.ly/grokkingML
40% discount code: serranoyt
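
The chapters on the discount factor and the Bellman equation can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the video: a hypothetical one-dimensional corridor of five states, a terminal state worth +5, a cost of 1 per move, and the function name `value_iteration`. Each sweep applies the deterministic Bellman update V(s) = max over neighbors n of (step reward + gamma * V(n)).

```python
# Hypothetical corridor: states 0..4, state 4 is terminal with value +5.
# Every move costs 1. None of these numbers come from the video.
N_STATES = 5
TERMINAL = 4
TERMINAL_VALUE = 5.0
STEP_REWARD = -1.0


def value_iteration(gamma=0.9, sweeps=100):
    """Repeatedly apply the Bellman update until the values settle."""
    values = [0.0] * N_STATES
    values[TERMINAL] = TERMINAL_VALUE
    for _ in range(sweeps):
        new_values = values[:]
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # the terminal value is fixed
            # Deterministic moves: one step left or right, staying on the grid.
            neighbors = [n for n in (s - 1, s + 1) if 0 <= n < N_STATES]
            # Bellman update: best of (step reward + discounted neighbor value).
            new_values[s] = max(STEP_REWARD + gamma * values[n] for n in neighbors)
        values = new_values
    return values


if __name__ == "__main__":
    print(value_iteration(gamma=1.0))  # no discounting
    print(value_iteration(gamma=0.9))  # future rewards count for less
```

With `gamma=1.0` each state is worth exactly one less than its better neighbor; lowering `gamma` shrinks the values of states farther from the goal more quickly.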

Comments: 145
@-xx-7674 a day ago
This is probably the friendliest video while still covering all the important concepts of RL. Thank you.
@zentootv4687 5 months ago
Hands down, this explanation of reinforcement learning is like winning a dance-off against a robot: smooth, on point, and utterly unbeatable!
@SerranoAcademy 5 months ago
Thanks! Lol, I love it!
@reyhanehhashempour8522 2 years ago
Fantastic as always! Whenever I want to learn a new concept in AI, I always start with Luis's video(s) on that. Thank you so much, Luis!
@achyuthvishwamithra 2 years ago
I feel super fortunate to have come across your channel. You are doing an incredible job! Just incredible!
@srinivasanbalan5903 3 years ago
One of the best videos on RL algorithms. Kudos to Dr. Serrano.
@colabpro2615 3 years ago
you're one of the best teachers I have ever come across!
@wfth1696 a year ago
One of the clearest explanations of the topic that I saw. Excellent!
@renjithbaby 3 years ago
This is the simplest explanation I have seen on RL! 😍
@EshwarNorthEast 3 years ago
The wait ends! Thank you sir!
@jeromeeusebius 2 years ago
Luis, great video. Thanks for putting this together explaining the most important concepts and terms in Reinforcement Learning.
@therockomanz a year ago
I'd like to thank the creators for this video. This is the best video to learn the basics of RL. Helped a lot in my learning path.
@riddhimanmoulick3407 7 months ago
Thanks for such a great video! Your visual descriptions combined with your explanations really presented a wonderful conceptual understanding of Deep-RL fundamentals.
@geletamekonnen2323 2 years ago
I can't pass without appreciating this great, great lecture. Thanks, Luis Serrano. 😍
@alsahlawi19 a year ago
This is by far the best video explaining DRL. Many thanks!
@jjhj_ a year ago
I've been binge-watching your "friendly intro to" series since yesterday and it has been amazing. I've worked with ML models as part of my studies and my work over the past two years, but even so, you've enriched my conceptual understanding by so much more than any of my professors could. Really appreciate your clever visualizations of what's going on "under the hood" of the ML/DL algos. Great videos, awesome teacher!
@SerranoAcademy a year ago
Thank you, so happy to hear you’re enjoying the series! :)
@user-cy2hk8bb3x 9 months ago
Yeah
@zhangeluo3947 9 months ago
@user-cy2hk8bb3x Yes
@LuisGonzalez-jx2qy 3 years ago
Amazing work fellow Luis! Looking forward to more of your videos
@TheOnlyAndreySotnikov 9 months ago
Great video!
@emanuelfratrik1251 2 years ago
Excellent explanation! Thank you!
@Andy-rq6rq 2 years ago
Amazing explanation! I was left confused after the MIT RL lecture but it finally made sense after watching this
@NoNTr1v1aL 2 years ago
Absolutely amazing video! You are my saviour!
@shreyasdhotrad1097 3 years ago
Very intuitive, as always. Hoping for more intuitions on semi-supervised learning and energy models. Thank you so much, sir!! 🙏
@miguelramos3424 a year ago
it's the best video that I've seen about this topic, thanks.
@saphirvolvianemfogo1717 2 years ago
Amazing explanation. Thank you, it gives me a good starting point on DRL
@eeerrrrzzz 2 years ago
This video is a gem. Thank you.
@mariogalindoq 3 years ago
Luis: congratulations! Again a very good video, very well explained and with a beautiful presentation. Thank you.
@alexandermedina4950 a year ago
Great starting point for RL! Thank you.
@prakashselvakumar5867 2 years ago
Very well explained! Thank you
@beltusnkwawir2908 2 years ago
I love the analogy of the discount factor with the dollar depreciation
@piyaamarapalamure5927 9 months ago
This is the best tutorial so far on Q-learning. Thank you so much 😍😍
@karlbooklover a year ago
best explanation I've seen so far
@charanbirdi a year ago
Absolutely brilliant, especially the neural network and loss function explanation.
@zamin_graphy a year ago
Fantastic explanation.
@yo-sato a year ago
Excellent tutorial. I have recommended this tutorial to my students.
@code_with_om a year ago
After a day of searching I found a great explanation 😀😀 thank you so much
@zeio-nara 2 years ago
An excellent explanation, thank you
@nishanthplays195 2 years ago
No words sir! Finally found another great yt channel ✨
@pellythirteen5654 2 years ago
Fantastic! Having watched many lessons on this subject, your explanation really made things clear. Now my fingers are itching to try it out and write some Delphi code. I will start with your grid world first, but if that works I want to write a chess engine. I have already written a chess program using the alpha-beta algorithm, and it will be fun to compare it with a neural-network-based one.
@mutemoonshiner a year ago
Huge thanks for the nice and lucid content, especially on how to train the network, the loss function, and how to create datasets.
@overgeared 2 years ago
Excellent as always! Thank you from an MSc AI student working on DQNs.
@pandharpurkar_ 3 years ago
Luis is a master at explaining complex things simply! Thank you, Luis, for such a great effort.
@lebohangmbele283 2 years ago
Wow. I can show this to my pre-school nephew and at the end of the video they will understand what RL is all about. Thanks.
@mustafazuhair2830 2 years ago
You have made my day, thank you!
@scooby95219 2 years ago
great explanation. thank you!
@DrMukeshBangar 2 years ago
great video. easy explanation! thank you.
@flwi a year ago
Wow - that was a very understandable explanation! Well done!
@infinitamo 2 years ago
You are a God-send. Thank you so much
@shreyashnadage3459 3 years ago
Finally here it is....been waiting for this for ages! Thanks Luis! Regards from India
@francescserratosa3284 2 years ago
Excellent video. Thanks a lot!!
@ahmarhussain8720 a year ago
amazing explanation
@KathySierraVideo 2 years ago
Thank-you for this 🙏
@ahmedoreby2856 2 years ago
Very good video with an excellent elaboration of the equation. Thank you very much for this.
@kr8432 3 months ago
I am not stupid but AI still does not come easy to me. Sometimes I wonder, besides having more slots in the working memory, how a genius or simply more intelligent people think about this subject so that it comes more naturally to them. I feel like this video was a very good insight on how easy such a complicated topic can appear, if you just have a very good intuitive understanding for abstract concepts. Very nicely done!
@AyaAya-fh2wx a year ago
You are a genius!! Thank you!
@adrianfiedler3520 2 years ago
Incredible video, I love the animations!
@bostonlife8589 2 years ago
Fantastic explanation!
@sricinu 2 years ago
Excellent explanation
@CrusadeVoyager 3 years ago
Nice vid with gr8 explanation on RL.
@randomdotint4285 2 years ago
Oh my god. This was god level teaching. How I envy your real world students.
@paul-andrejacques2488 2 years ago
Just Fantastic. Thank you
@Shaunmcdonogh-shaunsurfing 2 years ago
Excellent video! Hoping for more on RL.
@diwakerkumar5910 9 months ago
Thanks 🙏
@sergeipetrov5572 3 years ago
Thank you so much! Very useful!
@svein2330 3 years ago
This video is brilliant!
@roshanid6523 2 years ago
Thanks for sharing
@debobabai 2 years ago
Excellent explanation. I don't know why this video has so few views. It deserves a billion views.
@faisaldj 2 years ago
I wish at least my bachelor's math teacher had been like you, but I would like to be like you for my students.
@li-pingho1441 7 months ago
This is the best RL tutorial on the internet.
@seraphiusNoctis 2 years ago
Loved the video. Quick question on the policy network section, because something still seems a little “disjointed”, in the sense that the roles of the two networks do not seem clear (I might be missing something). I don’t understand why we would use a decreasing/recursive “gain” function instead of just using the value network to establish values for the policy. Doesn’t the value network already have a built-in feedback mechanism that would be well suited to this?
@ZirTaaah a year ago
best vids on the subject for suuuuuuuure im mad that i didnt see it earlier nice broo
@teetanrobotics5363 3 years ago
Amazing. Could you please make a course on RL and Deep RL?
@ishwargowda 2 years ago
This is perfect!!!
@AlexisKM100 6 months ago
God damn it, this explanation was just straightforward, I loved it, it helped me to clarify many doubts I had, thanks :D Just how every explanation should be, concise and with practical examples.
@antonioriz 2 years ago
This is simply GREAT! I would love to see more videos on Reinforcement Learning. By the way, I'm really enjoying your book Grokking Machine Learning, but I would like to know more about RL.
@banaiviktor6634 2 years ago
Yes, agreed. There is no clear explanation of this topic apart from this video. Thanks a lot, it is awesome! :)
@nothing21797 a year ago
Wonderful!!!
@bjornnorenjobb 2 years ago
wow, extremely good video my friend! Big thanks!
@ahmedshamz a month ago
Thanks for these videos Luis. Are these from a course?
@honghaiz 5 months ago
Nice presentation
@msantami 3 years ago
Thanks, great video. Bought the book!
@SerranoAcademy 3 years ago
Great to hear, thank you! I hope you like it!
@brok4498 2 years ago
great job!
@aliza207 3 years ago
in love with your videos😍
@AI_Financier 2 years ago
Great video. A question: if I go for a value network, do I still need the policy network too, or vice versa? With only one of them, can I still reach my target? Thanks in advance.
@william_8844 9 months ago
WTF!!! Like I am half way through and I am already blown by the way you explain content. This has been the best video so far explaining RF..... Wow. New sub❤❤😅
@Lukas-zl5zs 2 years ago
amazing video, good work!
@lucianoinso 5 months ago
Truly great video and explanation! Loved that you went deep (haha) into the details of the neural network, thanks!
@SerranoAcademy 5 months ago
Thanks! Lol, I see what you did there! :D
@Alpacastan21m 2 years ago
Amazing.
@paedrufernando2351 3 years ago
Cool... it took so long for this video to drop. I was expecting RL videos from your site earlier, but then I turned to Prof. Olivier Sigaud and completed RL there. Now I understand how DDPG works and its internals. But I definitely want to see your take and perspective on this topic, so here I go again to watch this video on RL.
@studgaming6160 a year ago
Finally, a good video on RL
@AlexandriaLibraryGame a year ago
I don't understand how to train the NN at 34:09, what are the features and what is the label?
@RobertLugg a year ago
You are one of the best teachers around. Thank you. What if the grid is different or the end goals change location? Do you need to start training over?
@SerranoAcademy a year ago
Thank you! Great question! If the environment changes, in general you do have to start again. However, there may be cases in which you can piggyback on having learned the game in a simpler situation, so it depends.
@ottodgs4031 3 months ago
Very nice video! When you say that the label of the new dataset is a "big increase" or a "small decrease", what is that in practice? Just the gain?
@elimelechschreiber6937 a year ago
Thank you. Question: in the last section you use the term 'gain', but I believe you actually use the 'value' function. Shouldn't the gain be the difference in value (in your example, always positive 1, then), i.e., the gained value associated with the given action?
@rohitchan007 2 years ago
This is by far the best explanation.
@wilmarariza9020 3 years ago
Excellent! Luis.
@SerranoAcademy 3 years ago
Thank you, Wilmar!
@outtaspacetime a year ago
1234's vote up! thanks for this great overview
@jaivratsingh9966 2 years ago
@Luis Serrano - thanks for this. Excellent! At 30:15, shouldn't (4,0) be -2, and hence (4,1) be -3, and so on? A query on policy training: if you freeze the video at 28:52 and look at the table, I see it as a random walk where you end up at a reward location, infer each value from the next point on the path (subtracting 1), and come up with 3, 2, ..., -1. Why would you say that one should decrease p(->) for (0,0)? At (0,0) (or any chosen node on the simulated path), the moves always increase the value (a better value state), so the change should never be "decrease". Also, while training the net you don't use "Change". Then why are we discussing "Change" at all? Shouldn't it simply be that the probability of the actual step taken at each point is made higher than the rest, as it points to a path leading to a reward?
@AyaAya-fh2wx a year ago
Thanks!
@SerranoAcademy a year ago
Thank you so much for your contribution Aynur! And I'm so glad you like the video! :)
@joselee5377 5 months ago
I fucking love this video. Oh my goodness... the level of satisfaction of understanding something that I struggled to grasp ;)
@aditimahajan9278 2 years ago
Kindly make a video on Response Surface Methodology
@kafaayari 11 months ago
Great lecture Mr. Serrano, thx. But some parts are inconsistent and confusing. For example at 29:49, for the state (3,1) the best action is to move left and agent went left. However you try to decrease its probability during training as seen in the table. That doesn't make sense.