Q Learning Intro/Table - Reinforcement Learning p.1

  299,552 views

sentdex

A day ago

Welcome to a reinforcement learning tutorial. In this part, we're going to focus on Q-Learning.
Q-Learning is a model-free form of machine learning, in the sense that the AI "agent" does not need to know or have a model of the environment that it will be in. The same algorithm can be used across a variety of environments.
For a given environment, everything is broken down into "states" and "actions." The states are observations and samplings that we pull from the environment, and the actions are the choices the agent has made based on the observation. For the purposes of the rest of this tutorial, we'll use the context of our environment to exemplify how this works.
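As a rough sketch of the state/action idea described above, the snippet below keeps a table of Q-values keyed by (state, action) and applies the standard tabular Q-learning update once. The state labels, the two-action set, and the LEARNING_RATE and DISCOUNT values are illustrative assumptions, not taken from the tutorial's sample code.

```python
from collections import defaultdict

LEARNING_RATE = 0.1   # assumed value, for illustration only
DISCOUNT = 0.95       # assumed value, for illustration only
ACTIONS = [0, 1]      # hypothetical action set

# Q-table: maps (state, action) to an estimated long-term value, default 0.0.
q_table = defaultdict(float)


def q_update(state, action, reward, new_state):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Best value the agent currently expects from the state it landed in.
    max_future_q = max(q_table[(new_state, a)] for a in ACTIONS)
    current_q = q_table[(state, action)]
    # Blend the old estimate with the newly observed target.
    new_q = (1 - LEARNING_RATE) * current_q + LEARNING_RATE * (
        reward + DISCOUNT * max_future_q
    )
    q_table[(state, action)] = new_q
    return new_q


# Pretend the agent already values action 1 in state "s1".
q_table[("s1", 1)] = 2.0
print(q_update("s0", 0, reward=-1.0, new_state="s1"))
```

Nothing in the update refers to how the environment works internally, which is what "model-free" means here: the agent only needs the observed state, the action it took, the reward, and the state that followed.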
Text-based tutorial and sample code: pythonprogramming.net/q-learn...
Channel membership: / @sentdex
Discord: / discord
Support the content: pythonprogramming.net/support...
Twitter: / sentdex
Instagram: / sentdex
Facebook: / pythonprogramming.net
Twitch: / sentdex
#reinforcementlearning #machinelearning #python

Comments: 313
@aravindsuresh8157 5 years ago
When I think about a topic, he posts it. Awesome!
@Stinosko 5 years ago
I'm more than happy to support this awesome channel! Keep up the great work, I love your tutorials :-D
@Ilya_Sss 3 years ago
Thank you so much for supporting free education! You are a great man!
@Totial 5 years ago
Man, you make learning so easy. I think you have no idea how much you are changing this world for good! So many tutorials out there link to you, and so many people are becoming able to reach their dreams because of you. Respect!! Keep up the amazing job.
@saisritejakuppa9856 5 years ago
The wait is over... The only reason I came into this AI field from electrical engineering is watching your videos instead of taking some random courses. Keep going... Thanks a lot, sentdex.
@hyperistica 4 years ago
I just got started with reinforcement learning and your tutorial is really helpful. On a side note, I also love the way you laugh (that deep inhale gets me every time).
@ahmedhany5037 5 years ago
I can't thank you enough for these awesome tutorials you give us. It is the most practical reinforcement learning guide I have ever seen. Please keep up this AWESOME work.
@sentdex 5 years ago
Happy to do it!
@marsf7089 8 months ago
Great video and such an energetic presentation. I was learning reinforcement learning this week, and this is the only material that doesn't just talk about vague and abstract concepts. Such concrete and deliberate content!
@andreydev2132 5 years ago
One of the most interesting topics for me. Please continue! It would be very interesting to see a self-driving car with Q-Learning (table / deep).
@flosset9640 4 years ago
carla
@damianwysokinski3285 3 years ago
Running out of memory ;)
@ayaanp 5 years ago
I LOVE THIS! I have been wanting to learn Reinforcement Learning and this is the start. Your videos are NEVER bad. You are teaching this 9 year old (me) with your website and YouTube channel. I now know all Python basics, AI, robotics, almost all because of YOU!
@lunapopo8415 1 year ago
For the latest gym package, to avoid backward-compatibility warnings: 1) define env = gym.make("MountainCar-v0", new_step_api=True, render_mode='human'), 2) remove env.render().
@atharvachouhan474 1 year ago
Bro, you literally saved my life, thanks a lot.
@alansabok7462 1 year ago
As of November 2022, env.step() also returns more values, so you have to replace line 10 with: new_state, reward, done, truncated, info = env.step(action)
@onlyshorts6837 9 months ago
Has any of you encountered this problem? When I write his project, it does run but nothing is shown. I am using Python version 3.10.
@lommoberry7312 9 months ago
@onlyshorts6837 You need to import pygame.
@Nxck2440 6 months ago
The code that worked for me (it also required $ pip install pygame):
    import gym  # OpenAI gym

    env = gym.make("MountainCar-v0", render_mode='human')
    env.reset()
    done = False
    while not done:
        action = 2  # go right
        new_state, reward, done, truncated, info = env.step(action)
        env.render()
    env.close()
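The comments above describe env.step() returning either four or five values depending on the installed gym version. One possible way to keep tutorial-style code working with both shapes is a small adapter like the one below. The name unpack_step is hypothetical (it is not part of gym), and the sketch assumes the five-value form is ordered as (state, reward, terminated, truncated, info), as the comments report.

```python
def unpack_step(result):
    """Normalize an env.step() result to (state, reward, done, info).

    Hypothetical helper, not part of gym. Accepts either the older
    4-value result or the newer 5-value result described in the comments.
    """
    if len(result) == 5:
        state, reward, terminated, truncated, info = result
        # Treat either ending condition as the end of the episode.
        return state, reward, bool(terminated or truncated), info
    if len(result) == 4:
        state, reward, done, info = result
        return state, reward, bool(done), info
    raise ValueError(f"unexpected step result of length {len(result)}")


# Stand-in tuples in place of real env.step(action) calls.
old_style = ((-0.5, 0.0), -1.0, False, {})
new_style = ((-0.5, 0.0), -1.0, False, True, {})
print(unpack_step(old_style))
print(unpack_step(new_style))
```

With a real environment the call would look like new_state, reward, done, info = unpack_step(env.step(action)), which leaves the rest of a loop written for the older API unchanged.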
@vickymar3836 5 years ago
There is an acute lack of good reinforcement learning study materials on the net (especially videos). I literally jumped from my seat. I want to binge watch this series.
@gomes8335 5 years ago
Exactly
@simonec3511 1 year ago
Incredible work, the best explanation I have found. It makes these concepts so easy to understand compared to the many books on this topic that I have been studying. Thanks for your awesome work!!!
@vibekdutta6539 5 years ago
The thing I've been waiting for, you're awesome!
@aradarbel4579 5 years ago
I'm so excited about this new series! Good luck, will be looking for the next episodes :D
@abhinavpy2748 5 years ago
Most awaited topic. And it comes from the one and only Sentdex!! Thanks a lot. Please make as many tutorials as possible.
@jayhu6075 5 years ago
Hi, first the switch from JavaScript to Python, and then a topic about reinforcement learning. That is amazing... The way you explain the learning curve makes a developer's life so easy and simple. Thank you, Mr. Sentdex.
@gomes8335 5 years ago
Omg. Thank you for starting reinforcement 😍
@Artificial_Intelligence_AI 5 years ago
I have completed several machine and deep learning courses over these months (from Udemy, KZbin, Coursera, etc.), and I even read some famous books about this field. I think your courses are easily in the top 3, because they are a perfect combination of a well-conducted intuitive approach and a fundamental programming part that is executed even better. Congratulations on these amazing videos, you deserve our gratitude. I really hope you get more subscribers over the coming years; your content is still underrated. Regards from Spain.
@sentdex 5 years ago
That's really awesome to hear!
@michaelfrangos8587 5 years ago
You're the best. My simple networks are just not doing the job well enough. Perhaps this series will be what's needed.
@fuuman5 5 years ago
Uhh, just sitting on the toilet and the notification comes in. Some nice quality ML content from my favorite Python buddy
@harkishansinghbaniya2784 5 years ago
Just love your videos and explanations. I was just waiting for the Q-Learning Tutorial Series.
@MrDan2512 5 years ago
Just what I needed for my master's thesis.
@ernestassimutis6239 5 years ago
Nice topic! Hope it will have at least 100 episodes. Thank you!
@Phateau 5 years ago
Finally, I have been waiting for this. Please do a long series! Thank you
@sentdex 5 years ago
It will be JUST the right length :D
@Tweakimp 5 years ago
Thank you very much for this. It would be cool to see Q-learning applied to some little game like Connect 4. Keep up the good work!
@mockingbird3809 5 years ago
Wow... This is the video I was waiting for... Thanks, Harrison.
@mannycalavera121 5 years ago
Love the videos and the series, thanks for putting these out
@ruantwice 3 years ago
You are an absolute boss. Thank you for the quality content!
@loukask.9111 5 years ago
Dude, how do you always know what kinds of videos I need?! This is perfect!
@user-ur9fs8cx4f 5 years ago
More videos on this topic, please!
@GabrielCarvv 4 years ago
He's made a secretive and expansive AI that monitors every single viewer
@s16ray_ 5 years ago
Learned a lot from you... Started machine learning from your channel only...
@jorostuff 4 years ago
I feel like this guy knows everything. Whatever I google, he has a tutorial on that.
@ahmedgabr8009 5 years ago
Thanks for the great tutorial! Can't wait for the next video!!!
@Sporkredfox 5 years ago
Oh, this is funny! I am currently going through your python-sc2 tutorial and might attempt to include Q-Learning once I learn about it (I know what you said in the video about why you didn't use it). Looking forward to this tutorial! Thank you for the content!
@prathamprasoon2535 4 years ago
Yay! Thank you Sentdex for these brilliant tutorials.
@cyborgx1156 2 years ago
Hey, I know you, Twitter guy
@st00ch 5 years ago
Omg! RL I'm so excite!
@Mvobrito 5 years ago
Was waiting for this!
@tejasshah9881 5 years ago
Man, thank you so much. I love you for this.
@RutgerMusicOnline 5 years ago
Was really looking forward to you doing more RL stuff :)
@thomaswoo6276 5 years ago
Can't wait for the next episode! Great work, and ofc thank you.
@sentdex 5 years ago
Next one is out :D
@thomaswoo6276 5 years ago
@sentdex Watched!!! And love you as yesterday!!!
@siamakvakili6349 5 years ago
I really enjoy your lessons. Thank you very very much.
@lukerhoads 5 years ago
Awesome content that is always new to me. Thanks!
@martinprinceton9858 3 years ago
This is really a great explanation. I love this
@nagLostInEntropy 3 years ago
Great video! Thank you so much!
@alazahir 5 years ago
I was waiting for this... RL taught by you!! And I commented even before seeing the video
@rdwansrhan3209 5 years ago
Great video, as always.
@ftmftm7627 3 years ago
You are a legend, man! Thank you
@varmhund 1 year ago
For others coming here in late 2022 and struggling with the rendering due to module updates:
    import gym

    env = gym.make("MountainCar-v0", render_mode="human")
    observation, info = env.reset()
    done = False
    while not done:
        action = 2
        observation, reward, done, truncated, info = env.step(action)
        if done or truncated:
            observation, info = env.reset()
    env.close()
@danielma2824 1 year ago
Thank you
@shreyashsinha933 1 year ago
Hi, could you point to a resource where I could find an updated version of this?
@onlyshorts6837 9 months ago
How on God's green earth did you find the answer? Please
@dabunnisher29 2 months ago
Thank you so much.
@berkc5323 3 years ago
Amazing channel man, keep doing this!!!
@douglasferreira3506 5 years ago
Finally!! You are the best
@girishkumar2759 5 years ago
That's what I was waiting for
@AbhishekKumar-mq1tt 5 years ago
Thank you for this awesome video and series
@indivarmishra6119 5 years ago
Thanks for the tutorials! (Eagerly waiting for DQN because I am kind of stuck there.) Loved your thinking about the education system!
@user-bj4jq5gy6n 5 years ago
Thank you for the useful lessons, sentdex. It would be very interesting to understand the problem of learning from time series: there is some record of a battle, and you need to train the algorithm on it to choose the best action. I would also like to understand how to prepare such time series for feeding to the algorithm. Something like that. Have a good day!
@Ruddradev 4 years ago
Thank you for this tutorial. I knew the theory, but your tutorial helped me put it into practice. Also, for anyone looking for theoretical background on RL, check out David Silver's 10-lecture series on Reinforcement Learning.
@ahmedelsayedabdelnabyrefae1365 3 years ago
You are a great man, actually. You are my mentor now.
@MultiWolfxxx 5 years ago
Love this channel.
@gunjanmimo 3 years ago
Your RL videos helped me a lot in my research work. Thank you. Please make some videos on the Unity Machine Learning agent; I hope the audience will benefit from them.
@CodeWithDerrick 5 years ago
Great, well explained intro. Thanks!
@vigeshmadanan 5 years ago
Excited for this tutorial series :D
@sentdex 5 years ago
Same!
@gauravsingh1963 5 years ago
Hey, will you be covering Dopamine 2.0?
@dr.mikeybee 5 years ago
This is very clear. Thank you.
@vulthuryol8051 4 years ago
15:34 _"That's gotta be the best table I've ever seen"_ _"So it would seem..."_
@jhgfdjhgfdhdjfjhd6721 5 years ago
Thanks so much for your efforts
@priyankrajsharma 5 years ago
Q-learning is difficult to understand. I read so many blogs before coming to your channel. You made it easy.
@MrDan2512 5 years ago
I am trying to use DQN to plan an agent's route in a dense moving crowd. My tools are UE4 and TF + CUDA. Can't wait for the deep Q-learning video.
@ebimonaca 2 years ago
Thank you for the nice deep "Q" learning video
@balavigneshk5382 5 years ago
Yes! I have the hard copy of Sutton and Barto. Now is the time to open it :)
@RutgerMusicOnline 5 years ago
Any chance you could do a tutorial on an actor-critic or PPO algorithm after the DQN tutorial? ;) Maybe in the long term a tutorial on combining these algorithms with the Unity environment.
@varuntotakura8139 5 years ago
I guess you will be showing us many real-time examples which have a broad scope. Thank you! :)
@ManuelMendez1 4 years ago
To people clicking on the "skip ad" button: these people also make money from those ads. Avoiding this is like saying: "Thank you for taking your time to teach people like me, who otherwise would have to pay for this".
@adityaretissin1856 3 years ago
"How do we do that? We just do this *shows the Q function*, DUH!" That cracked me up xD
@user-or7ji5hv8y 4 years ago
Amazing how you know so many things!
@pujanagarwal7316 5 years ago
Can you upload a series on GANs? I really need to know the intuition behind them.
@macmos1 5 years ago
yesssss
@MartinSFesty 5 years ago
Great video! Will you be doing an episode on how to create environments and action spaces? I am quite interested in environments where two learning players act simultaneously (or even sequentially), and in seeing whether or not they are able to cooperate to achieve the best possible outcome. It would be cool to see you exploring and explaining these subjects in this series. I am an amateur when it comes to programming, but I find your tutorials easy to follow, and I have made great use of the skills you have taught me!
@flosset9640 4 years ago
This is super cool
@gamzeetuncay 3 years ago
It is so helpful for my thesis, thanks a lot
@olokix 5 years ago
Are you going to cover TD learning, Ant Colony Optimization and Particle Swarm Optimization? And if yes, when?
@hobby_coding 3 years ago
I watched this months ago and didn't understand a thing. Now, after watching David Silver's course, I can finally understand what he's talking about. If you are like me, don't get frustrated, just read more on the subject.
@Fire6 3 years ago
Yeah, I think this is not really for complete beginners haha
@hjchew9810 5 years ago
Great job!
@user-wi2lb6fr5e 4 years ago
Thanks a lot 😊, very helpful
@bobsamuelson8130 5 years ago
Excellent!
@sachinaugustine9023 3 years ago
This is gold
@ntchindagiscard3870 5 years ago
You are awesome, man. I love your channel
@masoudmasoumimoghaddam3832 5 years ago
Great video, Sentdex, as always! I was wondering if you could add some episodes to this tutorial series explaining and implementing deep reinforcement learning, especially the kind used in AlphaGo. By the way, you're awesome at racing!!
@fuadkhan3449 3 years ago
Love your mug
@gondorian1640 1 year ago
Hello there! Great video! I have a question: do the values inside the DISCRETE_OS_SIZE list have to be the same? I created an env where the first observation space element only has 6 combinations in total, so that's why I'm asking. Thanks in advance!
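On the question above: as far as the table itself is concerned, nothing requires every dimension to use the same number of buckets. The sketch below is only an illustration under assumed values; the bucket counts [6, 20] are hypothetical, and nested lists stand in for the NumPy array a tutorial-style Q-table would normally use.

```python
import math

# Hypothetical bucket counts per observation dimension; they need not match.
DISCRETE_OS_SIZE = [6, 20]
N_ACTIONS = 3  # MountainCar-v0 has three discrete actions

# One Q-value per (bucket_0, bucket_1, action) combination.
table_entries = math.prod(DISCRETE_OS_SIZE) * N_ACTIONS

# Nested lists standing in for an array of shape (6, 20, 3).
q_table = [
    [[0.0] * N_ACTIONS for _ in range(DISCRETE_OS_SIZE[1])]
    for _ in range(DISCRETE_OS_SIZE[0])
]

print(table_entries)  # 360
```

The only requirement is that whatever maps an observation to bucket indices uses the same per-dimension counts as the table, so that every index it produces is in range.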
@arshshah1871 4 years ago
"paint is love, paint is life" -sentdex 2019
@ELarivie 5 years ago
Sentdex you're the best!
@szajbon 5 years ago
Great video! I have one suggestion though: consider mentioning that gym uses NumPy arrays and not basic Python lists. It might be confusing for someone that you basically divide a list by a list and get another list. It is a specific behavior of numpy.array that gives you that high-level convenience. I just stumbled on your video, so maybe you pointed that out in some other videos, but for a newcomer it can be mind-bending to hit some weird bug later on.
@kiarashgeraili8595 4 years ago
That was exactly one of my problems! Thanks a lot
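The point raised in this thread can be checked with a few lines. The bounds below are the documented MountainCar-v0 observation limits; the bucket counts [20, 20] are an assumption about what the tutorial uses. This is only a sketch of the NumPy behavior, not the tutorial's own code.

```python
import numpy as np

low = np.array([-1.2, -0.07])    # MountainCar-v0 observation lower bounds
high = np.array([0.6, 0.07])     # MountainCar-v0 observation upper bounds
DISCRETE_OS_SIZE = [20, 20]      # assumed bucket counts per dimension

# NumPy applies the arithmetic element by element, even against a plain list.
win_size = (high - low) / DISCRETE_OS_SIZE
print(win_size)

# The same expression on two plain Python lists raises a TypeError.
try:
    [0.6, 0.07] / [20, 20]
    list_division_works = True
except TypeError:
    list_division_works = False
print(list_division_works)
```

So the convenient "list divided by list" in the video only works because the observation-space bounds are NumPy arrays; with ordinary lists the values would have to be divided one pair at a time.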
@ramzykaram296 5 years ago
Dude, you're super awesome, waiting for the DL part. Actually, I got the book "Deep Reinforcement Learning Hands-On" and I am really struggling. Also, "Reinforcement Learning: An Introduction" by Richard S. Sutton and Andrew G. Barto is really great but too hard to go through.
@cruelworld4732 5 years ago
Please be quick with the next videos. I am working on a project and I am going to need your help. Keep up the good work
@ammarshahzad9627 5 months ago
If someone is following this tutorial with the new gymnasium update, you need to use new_state, reward, term, trun, _ = env.step(action) instead of new_state, reward, done, _ = env.step(action). Follow it with: if term or trun: done = True. This will ensure that the env runs fine.
@PLAZMAKE 5 years ago
What are the combinations you are talking about, and what is discrete_os_win_size used for?
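One way to read the question above: the "combinations" are the possible pairs of bucket indices, and the window size is the width of one bucket, used to turn a continuous observation into such a pair. The sketch below assumes the MountainCar-v0 bounds and 20 buckets per dimension; get_discrete_state here is an illustrative pure-Python version, and the clamping at the upper edge is an addition for safety, not something the source confirms.

```python
LOW = [-1.2, -0.07]           # MountainCar-v0 observation lower bounds
HIGH = [0.6, 0.07]            # MountainCar-v0 observation upper bounds
DISCRETE_OS_SIZE = [20, 20]   # assumed bucket counts per dimension

# Width of one bucket in each dimension (the "window size").
win_size = [(hi - lo) / n for lo, hi, n in zip(LOW, HIGH, DISCRETE_OS_SIZE)]


def get_discrete_state(state):
    """Map a continuous (position, velocity) pair to a tuple of bucket indices."""
    indices = []
    for value, lo, width, n in zip(state, LOW, win_size, DISCRETE_OS_SIZE):
        index = int((value - lo) / width)
        # Clamp so a value exactly on the upper bound stays inside the table.
        indices.append(min(max(index, 0), n - 1))
    return tuple(indices)


print(get_discrete_state((-0.5, 0.01)))  # (7, 11)
```

With 20 buckets in each of the two dimensions there are 20 * 20 = 400 such index pairs, and the Q-table stores one value per action for each of them.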
@skatingskelly 4 years ago
Whenever I try to launch it, the command prompt says nothing and it does nothing.
@viktorkuzmanov3086 4 years ago
Number one AI channel on YT by far
@shmarvdogg69420 5 years ago
Amazing!
@KasimierLP 5 years ago
Thank you for your videos :D
@sentdex 5 years ago
Happy to share!
@gautamj7450 5 years ago
YEEESSSS!!!!
@renanbuchan1633 1 year ago
“We just do this! *shows big complicated equation* duh!” Earned a subscriber lol
@wahab487 4 years ago
I think you might be confusing the discount factor with the learning rate. The discount factor can be based on how the reward is distributed across time.
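The distinction this comment draws can be made concrete with two small functions. This is only a sketch: the reward sequence and the parameter values are made up for illustration, and the function names are hypothetical.

```python
def discounted_return(rewards, discount):
    """Sum rewards, weighting the reward t steps ahead by discount**t."""
    return sum(reward * discount ** t for t, reward in enumerate(rewards))


def blend(old_q, target, learning_rate):
    """Move an existing Q-value toward a new target estimate."""
    return (1 - learning_rate) * old_q + learning_rate * target


rewards = [-1, -1, -1, 0]  # hypothetical rewards over four steps
print(discounted_return(rewards, discount=0.5))           # -1.75
print(blend(old_q=0.0, target=10.0, learning_rate=0.5))   # 5.0
```

The discount factor controls how much rewards further in the future count toward the value of a state, while the learning rate controls how far a single update moves the stored estimate toward the newly computed target.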