Python Reinforcement Learning using Gymnasium - Full Course

80,630 views

freeCodeCamp.org

1 day ago

Learn the basics of reinforcement learning and how to implement it using Gymnasium (previously called OpenAI Gym). Gymnasium is an open source Python library maintained by the Farama Foundation that provides a collection of pre-built environments for reinforcement learning agents. It provides a standard API to communicate between learning algorithms and environments, as well as a standard set of environments compliant with that API.
Reinforcement learning is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward.
💻 Google Colab Notebook (full tutorial code): colab.research.google.com/dri...
Gymnasium documentation: gymnasium.farama.org/
✏️ Course developed by @EverythingTechWithMustafa
🔗 Mustafa on LinkedIn: / mustafa-esoofally-aab0...
⭐️ Contents ⭐️
⌨️ (0:00:00) Introduction
⌨️ (0:04:19) Reinforcement Learning Basics (Agent and Environment)
⌨️ (0:12:15) Introduction to Gymnasium
⌨️ (0:14:59) Blackjack Rules and Implementation in Gymnasium
⌨️ (0:18:27) Solving Blackjack
⌨️ (0:19:46) Install and Import Libraries
⌨️ (0:23:19) Observing the Environment
⌨️ (0:27:55) Executing an Action in the Environment
⌨️ (0:33:01) Understand and Implement Epsilon-greedy Strategy to Solve Blackjack
⌨️ (0:42:28) Understand the Q-values
⌨️ (0:47:29) Training the Agent to Play Blackjack
⌨️ (0:57:10) Visualize the Training of Agent Playing Blackjack
⌨️ (1:04:34) Summary of Solving Blackjack
⌨️ (1:09:57) Solving Cartpole Using Deep-Q-Networks(DQN)
⌨️ (2:29:29) Summary of Solving Cartpole
⌨️ (2:34:07) Advanced Topics and Introduction to Multi-Agent Reinforcement Learning using Pettingzoo
Correction:
00:09 Gymnasium is maintained by the Farama Foundation and is not associated with OpenAI.
🎉 Thanks to our Champion and Sponsor supporters:
👾 Nattira Maneerat
👾 Heather Wcislo
👾 Serhiy Kalinets
👾 Erdeniz Unvan
👾 Justin Hual
👾 Agustín Kussrow
👾 Otis Morgan
--
Learn to code for free and get a developer job: www.freecodecamp.org
Read hundreds of articles on programming: freecodecamp.org/news

Comments: 85
@EverythingTechWithMustafa · 1 year ago
I hope you all liked this course. Make sure to leave your feedback.
@TransmentalMe · 1 year ago
You did great, Mustafa. I appreciate your effort.
@G2Chanakya · 1 year ago
The course is actually good if people already know reinforcement learning in some way. Otherwise, people will definitely find it hard to understand what epsilon, discount factor, argmax, Q-value, NN, optimisation, loss function, activation function, etc. mean.
@EverythingTechWithMustafa · 1 year ago
@G2Chanakya It is complex for sure, but I tried.
@G2Chanakya · 1 year ago
@EverythingTechWithMustafa u did great. Good job 🤝
@joelmendonsa3246 · 10 months ago
There are errors in your Google Colab. Could you please fix them and show in the video how you fixed them?
@kingvolpes352 · 8 months ago
Some typos, to help people get through the first hour:
1) 21:21 - collection should be collections; otherwise the error raised says that it doesn't exist.
2) 22:30 - patch should be capitalised, i.e. 'Patch'.
3) 24:46 - true should be capitalised, i.e. 'True'.
4) 30:20 - I recommend including env.reset() on line 4 to reset the episode.
5) 35:15 - Note that there are TWO underscores on either side of init, not just one, i.e. __init__ NOT _init_. It can be difficult to see :)
6) 35:25 - I am fairly sure the colons were meant to be underscores, i.e. change learning:rate to learning_rate.
7) 36:45 - typo: env.action_dpace.n should be env.action_space.n.
8) 40:20 - typo: selg should be self.
9) 40:32 - typo: tupe should be tuple (fixed later on anyway :D).
10) 44:48 - a colon is needed after the closing bracket of update on line 47.
11) 45:54 - typo: actions should be action.
12) 47:24 - typo: the missing self references cause problems in the class. I think line 41 should actually read: self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)
13) 47:54 - typo: it should be learning_rate = 0.01, NOT learning_rate:0.01.
14) 50:53 - OK, wow, this one took some time to get to a place where it wasn't going to give me a syntax error. For some reason, the instantiation of the wrapper was just not working with any variation of this code. To solve this, I first ensured I had imported deque and also imported RecordEpisodeStatistics, as shown below:
    from collections import deque
    from gym.wrappers import RecordEpisodeStatistics
and then wrote the following line instead of line 2 of this cell:
    env = gym.wrappers.RecordEpisodeStatistics(env, deque_size=n_episodes)
(Again, sorry for any mistakes of my own. By the way, he does fix that before he runs it at 55:42.)
15) 53:05 - typo: ob should be changed to obs.
I will be honest: the content of the tutorial is very good, and what it can be used for is definitely something that people should look into.
However, the problem is the quality of the teaching process. Within the first hour I found the mistakes listed above, and it is tedious not being able to verify whether the typos are right or wrong, because the code cells were never run until around 55 minutes in. It is good to have the notebook available as a Google Colab notebook, but if the material is not cleared up along the way, it is harder to be sure the code will work the way it is expected to. I appreciate you taking the time to make the tutorial, Mustafa, and wish you the best in the future courses you provide :)
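Several of the corrections above (`collections`, `__init__`, `learning_rate`, `self`, and the epsilon-decay line) fit together into one small class. The following is a minimal sketch of an epsilon-greedy tabular Q-learning agent, not the notebook's exact code: it takes a plain `n_actions` integer instead of reading `env.action_space.n`, uses Python lists instead of NumPy arrays, and the default `discount_factor` is an assumption.

```python
import random
from collections import defaultdict


class BlackjackAgent:
    def __init__(self, n_actions, learning_rate, initial_epsilon,
                 epsilon_decay, final_epsilon, discount_factor=0.95):
        self.n_actions = n_actions
        # One list of Q-values per observation, created on first access.
        self.q_values = defaultdict(lambda: [0.0] * n_actions)
        self.lr = learning_rate
        self.discount_factor = discount_factor
        self.epsilon = initial_epsilon
        self.epsilon_decay = epsilon_decay
        self.final_epsilon = final_epsilon

    def get_action(self, obs):
        # Explore with probability epsilon, otherwise act greedily.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        q = self.q_values[obs]
        return q.index(max(q))  # argmax over the action values

    def update(self, obs, action, reward, terminated, next_obs):
        # Q-learning: move Q(obs, action) toward reward + gamma * max Q(next_obs).
        future_q = 0.0 if terminated else max(self.q_values[next_obs])
        td_error = reward + self.discount_factor * future_q - self.q_values[obs][action]
        self.q_values[obs][action] += self.lr * td_error

    def decay_epsilon(self):
        # Linear decay, clipped at final_epsilon (the corrected line from item 12).
        self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)


agent = BlackjackAgent(n_actions=2, learning_rate=0.01, initial_epsilon=1.0,
                       epsilon_decay=0.4, final_epsilon=0.1)
agent.update((13, 10, False), action=1, reward=1.0, terminated=True,
             next_obs=(23, 10, False))
```

In a training loop, `get_action`, `env.step`, and `update` would be called once per step, and `decay_epsilon` once per episode.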
@sarthaknimbalkar5868 · 8 months ago
Thanks man
@a.wgpfhl9590 · 6 months ago
Thank you! In addition to solving the RecordEpisodeStatistics error (14), I copied the creation of the environment, env = gym.make("Blackjack-v1", sab=True, render_mode="rgb_array"), to just before the wrapping into the recorder. Otherwise, with multiple attempts to execute the cell, env is wrapped and wrapped again -> 🍬. Apart from that, I am thankful for the tutorial. It is one of the most recent ones and reflects all the changes that have happened to gym in recent years. Others may be better but are already outdated.
@viatrufka · 2 months ago
Thank you very much
@shreyasmishra9573 · 2 months ago
Thank you very much!
@RyanMartinRAM · 6 months ago
Wow, a bait and switch. When I saw the intro at the beginning, I thought "oh great, a clear accent, nice microphone, high production quality, excellent!" and then it switches to, well, you know, something a bit more typical for KZbin.
@jordanterry1692 · 1 year ago
Hey, I'm the maintainer of Gymnasium. It's not affiliated with OpenAI in any way (though Gym used to be); it's part of the Farama Foundation. That's why the repo is under the Farama Foundation organization, as is the website. The home page of the Gymnasium website and the readmes of Gym and Gymnasium both make this clear.
@EverythingTechWithMustafa · 1 year ago
Noted. Edits have been made to fix this.
@freecodecamp · 1 year ago
Sorry about this misunderstanding. We've updated everything to make it clearer that Gymnasium is not associated with OpenAI.
@sketchwaretagalogtutorials · 1 year ago
So Gym and Gymnasium are not the same?
@xsrpma · 1 year ago
Gym (legacy) -> Gymnasium (active fork)
@user-zk7px5xo5k · 6 months ago
The Google Colab link is not working. Can you please make it available? I have a lot of doubts.
@Fetrah2 · 1 year ago
⭐ Contents ⭐ ⌨ (0:00:00) Introduction ⌨ (0:04:19) Reinforcement Learning Basics (Agent and Environment) ⌨ (0:12:15) Introduction to OpenAI Gymnasium ⌨ (0:14:59) Blackjack Rules and Implementation in Gymnasium ⌨ (0:18:27) Solving Blackjack ⌨ (0:19:46) Install and Import Libraries ⌨ (0:23:19) Observing the Environment ⌨ (0:27:55) Executing an Action in the Environment ⌨ (0:33:01) Understand and Implement Epsilon-greedy Strategy to Solve Blackjack ⌨ (0:42:28) Understand the Q-values ⌨ (0:47:29) Training the Agent to Play Blackjack ⌨ (0:57:10) Visualize the Training of Agent Playing Blackjack ⌨ (1:04:34) Summary of Solving Blackjack ⌨ (1:09:57) Solving Cartpole Using Deep-Q-Networks(DQN) ⌨ (2:29:29) Summary of Solving Cartpole ⌨ (2:34:07) Advanced Topics and Introduction to Multi-Agent Reinforcement Learning using Pettingzoo
@amirrezasafaei3152 · 8 months ago
Thank you very much for this tutorial.
@seesea191 · 1 month ago
The Google Colab link is not working. Can you please make it available?
@swastikgorai2332 · 1 year ago
Just what I needed at the right moment! Thanks!
@powerfuel297 · 1 year ago
same
@Ireile · 1 year ago
Thank you for all your hard work! 😊
@EverythingTechWithMustafa · 1 year ago
You're very welcome!
@adamharb5011 · 5 months ago
I'm trying to get to the code, but the link you shared isn't working: "Sorry, the file you have requested does not exist". So please share the right link with us so we can access the code!
@amirrezasafaei3152 · 8 months ago
Thank you man. It helped me a lot. :)
@queenslaands · 10 months ago
Thank you for the tutorial and the code! But I think you should focus more on "explaining" the mechanism of the code rather than just reading and typing word by word. I actually learned by reading the Colab notebook; I could not finish your video. Anyway, thank you very much!
@snowyfield1906 · 4 months ago
Could anyone please send me a working link to this notebook? :( I'm currently studying with this video, but the lack of a notebook makes it a struggle.
@user-jr3ue1nv1o · 2 months ago
Does anyone have the source code from the video? The Colab link in the video description is not working. Thank you.
@joelmendonsa3246 · 10 months ago
There are tons of errors in the Google Colab. Could you please make a video about fixing them properly? Thanks.
@kaustubh3334 · 1 year ago
Thank you so so much ❤
@EverythingTechWithMustafa · 1 year ago
You're very welcome!
@deekshantwadhwa · 1 year ago
Woah, I received an assignment on Deep Q-Networks for reinforcement learning which I have to submit by the weekend and here this is 😮😂
@EverythingTechWithMustafa · 1 year ago
@ianchang7446 · 9 months ago
Hi, this video is so good for me to learn RL. Thanks a lot. But I got a problem when running the cell "Default title text" in your code. Colab shows "RuntimeError: maximum size for tensor at dimension 0 is 100 but size is 1000". Do you know why? How can I fix it?
@lorenzojamone6463 · 8 months ago
It's a typo in the code. It should be "means = durations_t.unfold(0,100,1).mean(1).view(-1)" and then "means = torch.cat((torch.zeros(99),means))", i.e. 100 instead of "1000" and zeros instead of "zeroes".
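The fix above is the usual moving-average pattern with `Tensor.unfold`: the window size must not exceed the number of recorded episodes, and padding with `window - 1` zeros keeps the result the same length as the input. A minimal sketch, assuming PyTorch is installed; the function name `moving_average` and the early return for short inputs are illustrative choices, not the notebook's code.

```python
import torch


def moving_average(durations, window: int = 100) -> torch.Tensor:
    """Mean over a sliding window, zero-padded so the output aligns with the input."""
    durations_t = torch.tensor(durations, dtype=torch.float)
    if len(durations_t) < window:
        # unfold() raises a RuntimeError if the window is larger than the tensor.
        return torch.empty(0)
    # unfold(dim, size, step) yields every length-`window` slice; average each one.
    means = durations_t.unfold(0, window, 1).mean(1).view(-1)
    # The first window - 1 episodes have no complete window yet, so pad with zeros.
    return torch.cat((torch.zeros(window - 1), means))


smoothed = moving_average(list(range(150)))
print(smoothed.shape)
```

With a window of 1000 and only 100 recorded episodes, `unfold` fails with the error quoted in the question, which is why the window and the padding both need to be based on 100.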
@HubertRozmarynowski · 4 months ago
At 1:04:01, why is the "episode lengths" plot starting and ending with impossibly low values?
@curiousphilosopher2129 · 1 year ago
Book recommendation: "A Primer to the 42 Most commonly used Machine Learning Algorithms (With Code Samples)."
@briangwaze9530 · 1 year ago
Thank you a million!
@EverythingTechWithMustafa · 1 year ago
@Narendra_19 · 1 year ago
Thank you so much sir
@EverythingTechWithMustafa · 1 year ago
@zed625 · 1 year ago
Why did the presenter change?
@nikolaaswillaert2951 · 10 months ago
Some constructive feedback: I noticed a lot of typos in the code. I suggest running the code instead of jumping to the next section. Showing errors helps us understand as well. Also, you are a bit hard to understand.
@ishaquenizamani9800 · 1 year ago
I am getting an error when I use OBS: tuple [int, int, bool] as a function parameter. Can anybody explain why?
@EverythingTechWithMustafa · 1 year ago
Check out the Google Colab notebook. The link is in the description.
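The question above does not include the error message, so the cause can only be guessed at. One common cause, assuming an interpreter older than Python 3.9, is that the built-in `tuple` cannot be subscripted in an annotation there, whereas `typing.Tuple` works on old and new versions alike. Identifiers are also case-sensitive, so the parameter has to be spelled the same way in the signature and the body. The function `describe_obs` below is hypothetical and only illustrates the annotation.

```python
from typing import Tuple


def describe_obs(obs: Tuple[int, int, bool]) -> str:
    # A Blackjack observation: (player sum, dealer's visible card, usable ace).
    player_sum, dealer_card, usable_ace = obs
    return f"player={player_sum}, dealer={dealer_card}, usable_ace={usable_ace}"


print(describe_obs((13, 10, False)))
```

On Python 3.9 or newer, `tuple[int, int, bool]` can be used directly in place of `Tuple[int, int, bool]`.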
@ApexArtistX · 5 months ago
Can you do Flappy Bird and Space Invaders by watching screen pixels with Gymnasium?
@ojasvisingh786 · 1 year ago
🎉🎉
@EverythingTechWithMustafa · 1 year ago
🎉
@andrewrice9362 · 1 year ago
I've been having a meltdown today trying to look all the gymnasium environments and had to go back to gym. Can't believe this came up 29 mins ago
@EverythingTechWithMustafa · 1 year ago
haha
@G2Chanakya · 1 year ago
Bro, that website is amazing. But when I tried it, all sorts of library errors were coming up and I didn't know how to go about using wrappers.
@ash-pq9nf · 5 months ago
I am unable to download the code. Could you please reupload it?
@ArjunMahes · 3 months ago
same
@imaginime · 1 year ago
Can anyone tell me how to import the IDE he is using to code for RL?
@EverythingTechWithMustafa · 1 year ago
It's Google Colab.
@prashlovessamosa · 1 year ago
This is something new.
@EverythingTechWithMustafa · 1 year ago
That was the goal
@usmanmarkaz · 1 year ago
Please upload a React Native tutorial and projects.
@freecodecamp · 1 year ago
Coming in the next two weeks.
@Nikhil-dp2mw · 1 year ago
This is new..
@EverythingTechWithMustafa · 1 year ago
It is
@veraciousperson8055 · 1 year ago
You people need 100 million subscribers.
@Samuelsward96 · 11 months ago
One of the worst coding guides I've ever seen. Such a shame, because I really want to learn this topic and there are so few guides out there. Oh well, at least I can copy the code and analyze it.
@bestboy007 · 1 year ago
this video is only for india speaking people.
@EverythingTechWithMustafa · 1 year ago
It's not that bad
@LoVeAmBiEnT · 1 year ago
Nice. Just wish I could understand what you are saying.
@user-wd5nx7pd8c · 1 year ago
Hello, I wish you would add an audio track in Arabic, the new feature added by KZbin.
@m.s.2927 · 1 year ago
I think "gym" is referring to an institution, where one can train the muscles... Not "Gymnasium"
@abdum1493 · 7 months ago
In the Blackjack example, I'm wondering who learned less, the agent or myself. This is a worthless tutorial and should be deleted. Aside from the effort put in, I have nothing good to say.
@andrewz641 · 11 months ago
horrible
@srinivasanrajagopal9062 · 1 year ago
First :P
@jeansmith2663 · 1 year ago
🥳
@EverythingTechWithMustafa · 1 year ago
Congrats
@lxx2952 · 1 year ago
this damned Indian slang! (
@EverythingTechWithMustafa · 1 year ago
🤨
@Abstract.x · 1 year ago
What do you mean?
@stupid_opinion · 1 year ago
His 'slang' is perfectly fine!