Python Reinforcement Learning using Gymnasium

Python Reinforcement Learning using Gymnasium - Full Course

80,630 views

1 day ago

Learn the basics of reinforcement learning and how to implement it using Gymnasium (previously called OpenAI Gym). Gymnasium is an open source Python library maintained by the Farama Foundation that provides a collection of pre-built environments for reinforcement learning agents. It provides a standard API to communicate between learning algorithms and environments, as well as a standard set of environments compliant with that API.
Reinforcement learning is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward.
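The agent-environment loop described above can be sketched in a few lines. This is a minimal illustration, not code from the course: `CoinFlipEnv` and `run_episode` are hypothetical names, and the toy class only mimics the shape of Gymnasium's API (`reset()` returning an observation and an info dict, `step()` returning observation, reward, terminated, truncated, info) so that it runs without installing anything.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that mimics the shape of Gymnasium's API.

    The observation is the current coin face (0 or 1). The agent earns a
    reward of 1.0 whenever its action matches the face it just observed.
    """

    def __init__(self, max_steps=10, seed=None):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0
        self.coin = 0

    def reset(self):
        # Like Gymnasium, return (observation, info).
        self.steps = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin, {}

    def step(self, action):
        # Like Gymnasium, return (obs, reward, terminated, truncated, info).
        reward = 1.0 if action == self.coin else 0.0
        self.steps += 1
        self.coin = self.rng.randint(0, 1)
        terminated = False                         # no natural end state here
        truncated = self.steps >= self.max_steps   # time limit reached
        return self.coin, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and return the cumulative reward the agent collected."""
    obs, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward


if __name__ == "__main__":
    env = CoinFlipEnv(max_steps=10, seed=0)
    # A policy that copies the observation always matches the coin.
    print(run_episode(env, lambda obs: obs))
```

With a real Gymnasium environment the same loop should apply, with `env` created by `gymnasium.make(...)` instead of the toy class.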
💻 Google Colab Notebook (full tutorial code): colab.research.google.com/dri...
Gymnasium documentation: gymnasium.farama.org/
✏️ Course developed by @EverythingTechWithMustafa
🔗 Mustafa on LinkedIn: / mustafa-esoofally-aab0...
⭐️ Contents ⭐️
⌨️ (0:00:00) Introduction
⌨️ (0:04:19) Reinforcement Learning Basics (Agent and Environment)
⌨️ (0:12:15) Introduction to Gymnasium
⌨️ (0:14:59) Blackjack Rules and Implementation in Gymnasium
⌨️ (0:18:27) Solving Blackjack
⌨️ (0:19:46) Install and Import Libraries
⌨️ (0:23:19) Observing the Environment
⌨️ (0:27:55) Executing an Action in the Environment
⌨️ (0:33:01) Understand and Implement Epsilon-greedy Strategy to Solve Blackjack
⌨️ (0:42:28) Understand the Q-values
⌨️ (0:47:29) Training the Agent to Play Blackjack
⌨️ (0:57:10) Visualize the Training of Agent Playing Blackjack
⌨️ (1:04:34) Summary of Solving Blackjack
⌨️ (1:09:57) Solving CartPole Using Deep Q-Networks (DQN)
⌨️ (2:29:29) Summary of Solving CartPole
⌨️ (2:34:07) Advanced Topics and Introduction to Multi-Agent Reinforcement Learning using PettingZoo
Correction:
00:09 Gymnasium is maintained by the Farama Foundation and is not associated with OpenAI.
🎉 Thanks to our Champion and Sponsor supporters:
👾 Nattira Maneerat
👾 Heather Wcislo
👾 Serhiy Kalinets
👾 Erdeniz Unvan
👾 Justin Hual
👾 Agustín Kussrow
👾 Otis Morgan
--
Learn to code for free and get a developer job: www.freecodecamp.org
Read hundreds of articles on programming: freecodecamp.org/news

Comments: 85

@EverythingTechWithMustafa 1 year ago

I hope you all liked this course. Make sure to leave your feedback.

@TransmentalMe 1 year ago

You did great Mustafa, I appreciate your effort.

@G2Chanakya 1 year ago

The course is actually good if you already know reinforcement learning in some way. Otherwise, people will definitely find it hard to understand what epsilon, discount factor, argmax, Q-value, NN, optimisation, loss function, activation function, etc. are.

@EverythingTechWithMustafa 1 year ago

@@G2Chanakya It is complex for sure, but I tried.

@G2Chanakya 1 year ago

@@EverythingTechWithMustafa You did great. Good job 🤝

@joelmendonsa3246 10 months ago

There are errors in your Google Colab notebook. Could you please fix them and show in the video how you fixed them?

@kingvolpes352 8 months ago

Some typos to help get some people through:
1) 21:21 - collection should be collections; otherwise the error raised will say that it doesn't exist.
2) 22:30 - patch should be capitalised, i.e. 'Patch'.
3) 24:46 - true should be capitalised, i.e. 'True'.
4) 30:20 - I recommend including env.reset() on line 4 to reset the episode already.
5) 35:15 - Just to note, there are TWO underscores on either side of init, not just one, i.e. __init__ NOT _init_. It can be difficult to see :)
6) 35:25 - The colons were meant to be underscores, I am fairly sure, i.e. change learning:rate to learning_rate.
7) 36:45 - typo: env.action_dpace.n should be env.action_space.n.
8) 40:20 - typo: selg should be self.
9) 40:32 - typo: tupe should be tuple (fixed later on anyway :D).
10) 44:48 - A colon is needed after the closing bracket for update on line 47.
11) 45:54 - typo: actions should be action.
12) 47:24 - typo: missing self statements for the class are causing problems. I think line 41 should actually read: self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)
13) 47:54 - typo: it should be learning_rate = 0.01, NOT learning_rate:0.01.
14) 50:53 - OK, wow, this one took some time to get to a place where it wasn't going to give me a syntax error. For some reason, the instantiation of the wrapper was just not working with any variation of this code. So, to solve this, I first ensured I had imported deque and also imported RecordEpisodeStatistics as shown below:
from collections import deque
from gym.wrappers import RecordEpisodeStatistics
and then wrote the following line instead of line 2 of this cell:
env = gym.wrappers.RecordEpisodeStatistics(env, deque_size=n_episodes)
(Again, sorry for any mistakes of my own. By the way, he does fix that before he then runs it at 55:42.)
15) 53:05 - typo: ob should be changed to obs.

I will be honest, the content within the tutorial is very good and what it can be used for is definitely something that people should look into.
However, the problem is the quality of the teaching process. Within the first hour I found the mistakes listed above, and it is tedious not being able to verify whether the typos are right or wrong, as the code cells were never run until around 55 minutes in. It is good to have the notebook available as a Google Colab notebook, but if the material is not cleared up along the way, it is definitely harder to be sure the code will work the way it is expected to. I appreciate you taking the time to make the tutorial, Mustafa, and wish you the best in the future courses you provide :)
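Several of the corrections above concern the agent class: its `__init__`, the `update` method, and the epsilon decay line. As a rough sketch of how those pieces could fit together, here is a tabular epsilon-greedy Q-learning agent using only the standard library. It is not the notebook's code: the class name, the `seed` parameter, and the default values are assumptions made for illustration. Only the decay line is taken from the comment.

```python
import random
from collections import defaultdict


class EpsilonGreedyAgent:
    """Illustrative tabular Q-learning agent (names and defaults are assumptions)."""

    def __init__(self, n_actions, learning_rate=0.01, initial_epsilon=1.0,
                 epsilon_decay=0.001, final_epsilon=0.1, discount_factor=0.95,
                 seed=None):
        self.n_actions = n_actions
        # One list of Q-values per observation, created on first access.
        self.q_values = defaultdict(lambda: [0.0] * n_actions)
        self.learning_rate = learning_rate
        self.epsilon = initial_epsilon
        self.epsilon_decay = epsilon_decay
        self.final_epsilon = final_epsilon
        self.discount_factor = discount_factor
        self.rng = random.Random(seed)

    def get_action(self, obs):
        # Explore with probability epsilon, otherwise take the argmax action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        q = self.q_values[obs]
        return max(range(self.n_actions), key=lambda a: q[a])

    def update(self, obs, action, reward, terminated, next_obs):
        # Q-learning: move Q(obs, action) toward reward + gamma * max Q(next_obs).
        future_q = 0.0 if terminated else max(self.q_values[next_obs])
        td_error = (reward + self.discount_factor * future_q
                    - self.q_values[obs][action])
        self.q_values[obs][action] += self.learning_rate * td_error

    def decay_epsilon(self):
        # The corrected line from item 12: every attribute needs the self prefix.
        self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)


if __name__ == "__main__":
    agent = EpsilonGreedyAgent(n_actions=2, learning_rate=0.5, seed=0)
    obs = (14, 10, False)  # Blackjack-style tuple: player sum, dealer card, usable ace
    agent.update(obs, action=1, reward=1.0, terminated=True, next_obs=obs)
    print(agent.q_values[obs])
```

Observations must be hashable to serve as dictionary keys, which is why a tuple such as (player sum, dealer card, usable ace) works here.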

@sarthaknimbalkar5868 8 months ago

Thanks man

@a.wgpfhl9590 6 months ago

Thank you! In addition to solving the RecordEpisodeStatistics error (14), I copied the creation of the environment, env = gym.make("Blackjack-v1", sab=True, render_mode="rgb_array"), to just before the wrapping into the recorder. Otherwise, with multiple attempts to execute the cell, env is wrapped and wrapped again -> 🍬. Apart from that, I am thankful for the tutorial. It is one of the most recent ones and reflects all the changes that happened to gym in recent years. Others may be better but are already outdated.

@viatrufka 2 months ago

Thank you very much

@shreyasmishra9573 2 months ago

Thank you very much!

@RyanMartinRAM 6 months ago

Wow, a bait and switch. When I saw the intro at the beginning, I thought "oh great, a clear accent, nice microphone, high production quality, excellent!" and then it switches to, well, you know, something a bit more typical for YouTube.

@jordanterry1692 1 year ago

Hey, I’m the maintainer of Gymnasium. It’s not affiliated with OpenAI in any way (though Gym used to be); it’s part of the Farama Foundation. That’s why the repo is under the Farama Foundation organization, as is the website. The home page of the Gymnasium website and the READMEs of Gym and Gymnasium both make this clear.

@EverythingTechWithMustafa 1 year ago

Noted. Edits have been made to fix this

@freecodecamp 1 year ago

Sorry about this misunderstanding. We've updated everything to make it more clear that Gymnasium is not associated with OpenAI.

@sketchwaretagalogtutorials 1 year ago

So Gym and Gymnasium are not the same?

@xsrpma 1 year ago

Gym (legacy) -> Gymnasium (active fork)

@user-zk7px5xo5k 6 months ago

The Google Colab link is not working. Can you please make it available? I have a lot of questions.

@Fetrah2 1 year ago

⭐ Contents ⭐ ⌨ (0:00:00) Introduction ⌨ (0:04:19) Reinforcement Learning Basics (Agent and Environment) ⌨ (0:12:15) Introduction to OpenAI Gymnasium ⌨ (0:14:59) Blackjack Rules and Implementation in Gymnasium ⌨ (0:18:27) Solving Blackjack ⌨ (0:19:46) Install and Import Libraries ⌨ (0:23:19) Observing the Environment ⌨ (0:27:55) Executing an Action in the Environment ⌨ (0:33:01) Understand and Implement Epsilon-greedy Strategy to Solve Blackjack ⌨ (0:42:28) Understand the Q-values ⌨ (0:47:29) Training the Agent to Play Blackjack ⌨ (0:57:10) Visualize the Training of Agent Playing Blackjack ⌨ (1:04:34) Summary of Solving Blackjack ⌨ (1:09:57) Solving Cartpole Using Deep-Q-Networks(DQN) ⌨ (2:29:29) Summary of Solving Cartpole ⌨ (2:34:07) Advanced Topics and Introduction to Multi-Agent Reinforcement Learning using Pettingzoo

@amirrezasafaei3152 8 months ago

Thank you very much for this tutorial.

@seesea191 1 month ago

The Google Colab link is not working. Can you please make it available?

@swastikgorai2332 1 year ago

Just what I needed at the right moment! Thanks!

@powerfuel297 1 year ago

same

@Ireile 1 year ago

Thank you for all your hard work! 😊

@EverythingTechWithMustafa 1 year ago

You're very welcome!

@adamharb5011 5 months ago

I'm trying to get to the code, but the link you shared isn't working: "Sorry, the file you have requested does not exist". So please share the right link with us so we can access the code!

@amirrezasafaei3152 8 months ago

Thank you man. It helped me a lot. :)

@queenslaands 10 months ago

Thank you for the tutorial and the code! But I think you should focus more on "explaining" the mechanism of the code rather than just reading and typing it word by word. I actually learned by reading the Colab notebook and could not finish your video. Anyway, thank you very much!

@snowyfield1906 4 months ago

Could anyone please send me a working link to this notebook? :( I'm currently studying with this video, but not having the notebook makes it a struggle.

@user-jr3ue1nv1o 2 months ago

Does anyone have the source code from the video? The Colab link in the video description is not working. Thank you.

@joelmendonsa3246 10 months ago

There are tons of errors in the Google Colab notebook. Could you please make a video about fixing them properly? Thanks.

@kaustubh3334 1 year ago

Thank you so so much ❤

@EverythingTechWithMustafa 1 year ago

You're very welcome!

@deekshantwadhwa 1 year ago

Woah, I received an assignment on Deep Q-Networks for reinforcement learning which I have to submit by the weekend and here this is 😮😂

@EverythingTechWithMustafa 1 year ago

❤

@ianchang7446 9 months ago

Hi, this video is so good for learning RL. Thanks a lot. But I got a problem when running the cell "Default title text" in your code. Colab shows "RuntimeError: maximum size for tensor at dimension 0 is 100 but size is 1000". Do you know why? How can I fix it?

@lorenzojamone6463 8 months ago

It's an error/typo in the code. It should be "means = durations_t.unfold(0,100,1).mean(1).view(-1)" and then "means = torch.cat((torch.zeros(99),means))", instead of "1000" and then "zeroes".
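For readers without PyTorch at hand, the idea behind the corrected lines can be sketched in plain Python: take the mean of every 100-episode sliding window, then prepend 99 zeros so the result lines up with the episode index. The function name `padded_moving_average` is hypothetical, and this is an approximation of what the `unfold(0, 100, 1).mean(1)` plus `torch.cat` pair computes, not the notebook's code. Asking for a window larger than the data (1000 on a 100-element tensor) is consistent with the RuntimeError quoted above.

```python
def padded_moving_average(values, window=100):
    """Mean of each sliding window, left-padded with zeros to len(values).

    Returns an empty list when there are fewer values than the window size,
    since no full window exists yet.
    """
    if len(values) < window:
        return []
    # One mean per full window: len(values) - window + 1 entries.
    means = [sum(values[i:i + window]) / window
             for i in range(len(values) - window + 1)]
    # Pad with window - 1 zeros so entry k corresponds to episode k.
    return [0.0] * (window - 1) + means


if __name__ == "__main__":
    durations = [1, 2, 3, 4, 5]
    print(padded_moving_average(durations, window=3))
```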

@HubertRozmarynowski 4 months ago

At 1:04:01, why is the "episode lengths" plot starting and ending with impossibly low values?

@curiousphilosopher2129 1 year ago

Book recommendation: "A Primer to the 42 Most commonly used Machine Learning Algorithms (With Code Samples)."

@briangwaze9530 1 year ago

Thank you a million!

@EverythingTechWithMustafa 1 year ago

❣

@Narendra_19 1 year ago

Thank you so much sir

@EverythingTechWithMustafa 1 year ago

❤

@zed625 1 year ago

Why did the presenter change?

@nikolaaswillaert2951 10 months ago

Some constructive feedback: I noticed a lot of typos in the code. I suggest running the code instead of jumping to the next section. Showing errors helps us understand as well. Also, you are a bit hard to understand.

@ishaquenizamani9800 1 year ago

I am getting an error when I use OBS: tuple [int, int, bool] as a function parameter. Can anybody explain why?

@EverythingTechWithMustafa 1 year ago

Check out the Google Colab notebook. The link is in the description.

@ApexArtistX 5 months ago

Can you do Flappy Bird and Space Invaders with Gymnasium, by watching screen pixels?

@ojasvisingh786 1 year ago

🎉🎉

@EverythingTechWithMustafa 1 year ago

🎉

@andrewrice9362 1 year ago

I've been having a meltdown today trying to look at all the Gymnasium environments and had to go back to gym. Can't believe this came up 29 mins ago.

@EverythingTechWithMustafa 1 year ago

haha

@G2Chanakya 1 year ago

Bro, that website is amazing. But when I tried it, all sorts of library errors were coming up and I didn't know how to go about using wrappers.

@ash-pq9nf 5 months ago

I am unable to download the code. Could you please reupload it?

@ArjunMahes 3 months ago

same

@imaginime 1 year ago

Can anyone tell me how to import the IDE he is using to code for RL?

@EverythingTechWithMustafa 1 year ago

It's Google Colab.

@prashlovessamosa 1 year ago

This is something new.

@EverythingTechWithMustafa 1 year ago

That was the goal

@usmanmarkaz 1 year ago

Please upload a React Native tutorial and projects.

@freecodecamp 1 year ago

Coming in the next two weeks.

@Nikhil-dp2mw 1 year ago

This is new..

@EverythingTechWithMustafa 1 year ago

It is

@veraciousperson8055 1 year ago

You people need a 100 million subscribers

@Samuelsward96 11 months ago

One of the worst coding guides I've ever seen. Such a shame, because I really want to learn this topic and there are so few guides out there. Oh well, at least I can copy the code and analyze it.

@bestboy007 1 year ago

this video is only for india speaking people.

@EverythingTechWithMustafa 1 year ago

It’s not that bad

@LoVeAmBiEnT 1 year ago

Nice. Just wish I could understand what you are saying.

@user-wd5nx7pd8c 1 year ago

Hello, I wish you would add an audio track in Arabic, the new feature added by YouTube.

@m.s.2927 1 year ago

I think "gym" is referring to an institution, where one can train the muscles... Not "Gymnasium"

@abdum1493 7 months ago

In the Blackjack example, I'm wondering who learned less, the agent or myself. This is a worthless tutorial and should be deleted. Aside from the effort put in, I have nothing good to say.