Thank you for your tutorial, I hope to see how you can visualize the environment in the upcoming tutorial!
@NicholasRenotte3 жыл бұрын
Me too! Keen to do a ton more stuff with RL and possibly PyGame!
@NicholasRenotte3 жыл бұрын
@Eliseo Raylan awesome! Let me know how you go with it!
@tomtalkscars94943 жыл бұрын
Love these. Building custom environments is one of the biggest areas missing with the OpenAI stuff imo. Would be cool to see one bringing in external data. Like predicting the direction of the next step of a Sine Wave or something simple like that.
@NicholasRenotte3 жыл бұрын
Definitely, got way more stuff on RL planned once the Python course is out!
@laser78612 жыл бұрын
Great tutorial. Simple and to the point, especially for someone who is familiar with RL concepts and just wants to get the nuts and bolts of an OpenAI gym env.
@tawsifkamal883 жыл бұрын
Really informative video! As a high schooler self-learning RL, tutorials such as these are really helpful for showing applicability in RL.
@NicholasRenotte3 жыл бұрын
Ohhh man, going to have something awesome for you in a few days time then!
@Techyisle3 жыл бұрын
Your tutorials are awesome, and I just finished your 3-hour RL tutorial; I would like to see a Pygame implementation as soon as possible :) If possible, try to create a different set of advanced videos where you explain the math and intuition behind RL, along with code implementations (to cater to a different audience). Something I like about you is that you respond to each and every comment, a characteristic I don't see often from others. Kudos to you! Thanks again mate! Stay safe!
@NicholasRenotte3 жыл бұрын
Thanks @Techy Isle, I'm definitely going to be going into more detail. Been studying some hardcore DL stuff like crazy while producing the Python basics course!
@baronvonbeandip Жыл бұрын
This is way more useful than the last one. The more you can modify OpenAI's envs, it seems, the more that you can get out of the reinforcement learning schema.
@user___013 жыл бұрын
Man you can't stop giving us this gold of a tutorial!
@NicholasRenotte3 жыл бұрын
Definitely! Two a week that's the goal man!
@MuazRazaq3 жыл бұрын
I can't stop myself from commenting on this exceptionally good tutorial. Sir, really amazing job. I must say you should continue this good work; the way you explain each and every line is something very rare in the material available so far. Much love from a Pakistani student currently in South Korea 😍
@NicholasRenotte3 жыл бұрын
Ohhhh thanks so much @Muaz! Soo glad you enjoyed it.
@pratyushpatnaik46173 жыл бұрын
Sir that was exceptionally good!!! 🔥 I would really love to see the render function in play using pygame. Waiting eagerly for it!!!!
@NicholasRenotte3 жыл бұрын
Definitely, can't wait to finally do something with Pygame!
@viswanathansankar37893 жыл бұрын
@@NicholasRenotte Yeah Please do it as soon as possible...
@albertsalgueda10362 жыл бұрын
Yes! there is a need for Environment viz.
@prakhars9623 жыл бұрын
just recommended this video to one of my coursemate. your videos are worth sharing.
@NicholasRenotte3 жыл бұрын
Thanks soo much!
@charlesewing97722 жыл бұрын
Hi, great video! I was just wondering what happens if, say, the temperature is at 100 and the model tries to add 1 to the temperature (so it is now outside the limits). Does it then resample automatically, or would you have to implement this in the code yourself?
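Gym won't resample an out-of-range result for you; the environment itself has to keep the state in bounds. A minimal sketch of one way to do that, assuming the shower env from the video (temperature Box limited to 0-100):

import numpy as np

# inside a hypothetical ShowerEnv.step() - clamp the state after applying the action
def step(self, action):
    self.state += action - 1                          # 0/1/2 maps to -1/0/+1 degrees
    self.state = int(np.clip(self.state, 0, 100))     # keep within the Box limits
    self.shower_length -= 1
    reward = 1 if 37 <= self.state <= 39 else -1
    done = self.shower_length <= 0
    return self.state, reward, done, {}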
@yxzhou54022 жыл бұрын
As a beginner in RL, all your videos really help me a lot, so thank you!!! I was just wondering if there is any chance of seeing a tutorial on how to build an env with a multi-dimensional action space?
@SatoBois3 жыл бұрын
Hello Nick! I love your tutorial and it's actually helping so much at university, especially considering the lack of documentation for OpenAI. I was building a custom environment for tic-tac-toe to practice, but for some reason when I run dqn.fit() like you did, with the same everything for the keras-rl training part, I get this: "ValueError: Error when checking input: expected dense_16_input to have 2 dimensions, but got array with shape (1, 1, 3, 3)". I don't quite understand why it got that shape, because my tic-tac-toe game's observation space is np.array([Discrete(3)]*9) to represent the nine tiles and the three possibilities of what could be in them. Again, thank you for the helpful tutorials!
@myceliumbrick1409 Жыл бұрын
Yep, I have the same error. Did you manage to solve the issue?
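A hedged guess at the cause: keras-rl stacks observations with a window_length dimension, so the network usually needs a Flatten layer whose input_shape is (window_length,) plus the observation shape; also, np.array([Discrete(3)]*9) isn't a standard Gym space (MultiDiscrete([3]*9) or a Box would be more typical). A sketch assuming a 3x3 board observation and window_length=1:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

def build_model(actions):
    model = Sequential()
    model.add(Flatten(input_shape=(1, 3, 3)))   # (window_length, board rows, board cols)
    model.add(Dense(24, activation='relu'))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))
    return model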
@ProfSoft3 жыл бұрын
Great job, thanks. I have a question: why did you set the last layer's activation function to 'linear' when building the model? I think we should make it softmax, because it seems like a classification problem?
@NicholasRenotte3 жыл бұрын
Hmmm, could definitely change the activation function there!
@ProfSoft3 жыл бұрын
@@NicholasRenotte Thank you very much, I just wanted to make sure there was no specific reason for choosing the linear activation function. God bless you for this great effort.
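For context, as a hedged note rather than a correction: a DQN's output layer predicts one Q-value per action, and Q-values are unbounded real numbers rather than class probabilities, so a linear activation is the usual choice; the action is then picked by argmax (or by the exploration policy). Roughly:

import numpy as np

q_values = model.predict(observation_batch)   # shape (batch, n_actions); observation_batch is a placeholder
action = int(np.argmax(q_values[0]))          # greedy action from the linear outputs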
@oliverprislan39402 жыл бұрын
Thank you Nicholas, this is a very good example to give it a kick start.
@stevecoxiscool2 жыл бұрын
I guess what irks me the most about all the universe/retro/baselines gym examples is that it's not straightforward to get your bright, shiny, newly trained model to run in other environments. These gym examples have so many interdependencies and one doesn't really know what is going on inside the box. This is why I am glad you are doing the video on getting other environments to work with RL algos. Unreal is my choice since Unity already has ML examples.
@NicholasRenotte2 жыл бұрын
100%, I took a look into the Unity environment over the Christmas break and was gobsmacked. Well documented, and logging and training were clear. I love OpenAI Gym, but seriously, Unity ML-Agents appears to be so much easier to deal with.
@stevecoxiscool2 жыл бұрын
@@NicholasRenotte I really wish Unreal was on par with Unity on the ML technology. I am using UnrealPythonPlugin to send images to a remote Python client running OpenCV DNN. The video doing this on my YouTube is a few years old. Your custom gym environment linked to Unreal is doable. Thanks for your videos!!!!
@sommojames2 жыл бұрын
Great video, but what's the point of the observation space? It looks like your agent is not using it.
@markusbuchholz35183 жыл бұрын
Nicholas, as I mentioned some time ago, your YT channel is outstanding and your effort impressive. RL is my favourite branch of ML, so I especially enjoyed watching your performance. Exceptionally, you also built a customised environment. The idea can easily be extended and applied to other specific tasks. It is a great pleasure to watch your channel and I will recommend everyone to be here (to subscribe). Have a nice day!
@NicholasRenotte3 жыл бұрын
Thank you so much @Markus! Glad you enjoyed the RL videos, I think it's a super interesting field with a ton of interesting applications. I'm hoping later on this year we might be able to apply some of it into hardware applications with Raspberry Pi or ROS!
@markusbuchholz35183 жыл бұрын
@@NicholasRenotte Thank you for the wonderful feedback! Yes, ROS/ROS2 is a great robotics framework. Now I am more inspired by the Nvidia Jetson Xavier since it is "slightly" more powerful. Good luck!!!
@NicholasRenotte3 жыл бұрын
@@markusbuchholz3518 oooh yeah, I took a look at that yesterday. Looks awesome! The OAK camera looks promising as well!
@islam69163 жыл бұрын
Thank you for the video ⚡⚡⚡ I hope you can make a Custom Agent Next time ✅ Looking forward to see that ✨
@NicholasRenotte3 жыл бұрын
Heya! Definitely, code is 80% of the way there, I should have it up in the coming weeks!
@islam69163 жыл бұрын
@@NicholasRenotte Great !!!
@jiajun8983 жыл бұрын
Great tutorial. A question though. What would be the benefit of transferring your reinforcement learning from the keras implementation to the openai gym environment implementation?
@NicholasRenotte3 жыл бұрын
This is Gym, it's more the rl agents that I've started migrating (better stability, control and exporting).
@TheNativeTwo Жыл бұрын
Great video, I like how you explain each line of code. My one complaint is not your fault... Getting the right environment and versions of the packages. I got right to the end... And couldn't get it working. A bit frustrating lol.
@davidowusu11843 жыл бұрын
Great Video. I was able to use this as a basis to create an environment for my specific needs. I have one question though Once you've trained your model and saved your weights, how do you use it? I mean actually pass values to the model to get an action as a response
@NicholasRenotte3 жыл бұрын
You can pass the new state to the model and actions are returned as the output. Can then go and plug it into the real model/iot suite etc.
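A rough sketch of that workflow with the keras-rl agent from the video (the file name and the reshaping are assumptions and depend on your model's input shape):

import numpy as np

dqn.save_weights('dqn_weights.h5f', overwrite=True)   # after training

# later: rebuild the model/agent the same way, then load and query it
dqn.load_weights('dqn_weights.h5f')
obs = env.reset()
q_values = model.predict(np.array(obs).reshape(1, 1, -1))  # (batch, window_length, obs_dim)
action = int(np.argmax(q_values))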
@davidowusu11843 жыл бұрын
@@NicholasRenotte Thanks so much for the wonderful content and thanks even more these replies. You're awesome.
@sebatinoco3 жыл бұрын
Hi Nicholas, amazing video! Quick question, how can I access the current action that the AI is taking? Thanks!
@NicholasRenotte3 жыл бұрын
Heya @Sebastian, I don't believe it's easily accessible through keras-rl. If you're using StableBaselines, you can access it through DQN.predict(obs) e.g. github.com/nicknochnack/StableBaselinesRL/blob/main/Stable%20Baselines%20Tutorial.ipynb shown towards the end.
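For reference, the Stable Baselines call looks roughly like this (a sketch assuming a trained model and a Gym env):

obs = env.reset()
action, _states = model.predict(obs)          # the action the agent would take for this observation
obs, reward, done, info = env.step(action)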
@padisalashanthan98 Жыл бұрын
Great video! I am a little confused on how to solve a multi-states problem. Can you please give some pointers on that?
@KEFASYUNANA Жыл бұрын
Great videos. Any idea how to handle 2 or 3 states/observations in the code, say temperature and pressure or humidity?
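One hedged way to handle two or three readings is a single Box with one entry per variable (a Dict space is another option, as mentioned further down the thread). A sketch with made-up names and bounds:

import numpy as np
from gym import Env
from gym.spaces import Box, Discrete

class PlantEnv(Env):                      # hypothetical environment
    def __init__(self):
        super().__init__()
        self.action_space = Discrete(3)
        # observation = [temperature, pressure, humidity]
        self.observation_space = Box(low=np.array([0.0, 0.0, 0.0]),
                                     high=np.array([100.0, 10.0, 100.0]),
                                     dtype=np.float32)
        self.state = np.array([38.0, 5.0, 50.0], dtype=np.float32)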
@melikad27682 жыл бұрын
Hi Nick. Thank you very much, I learned a lot. But I have a question: how can I see which action the shower should take? I mean, how can I understand which action the agent will take based on the reward?
@Spruhawahane2 жыл бұрын
On my mac the kernel keeps dying when I run the basic cartpole example. Don't know how to troubleshoot. Pls help.
@OmarAlolayan3 жыл бұрын
Thank you Nicholas! Can you please advise me on how to use the step function if I have a MultiDiscrete action space?
@NicholasRenotte3 жыл бұрын
Heya @Omar, what does your output look like if you run env.action_space.sample()
@@OmarAlolayan oh wow, can you try using this with stable-baselines instead? It might be easier to model as the algorithm will pick up the observation space without the need to define the neural network.
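For anyone hitting the same thing: with a MultiDiscrete space, env.action_space.sample() returns an array with one sub-action per dimension, and step() just unpacks it. A sketch with invented names and numbers:

import numpy as np
from gym import Env
from gym.spaces import Box, MultiDiscrete

class TwoKnobShowerEnv(Env):                       # hypothetical example
    def __init__(self):
        super().__init__()
        self.action_space = MultiDiscrete([3, 5])  # 3 temperature moves x 5 flow moves
        self.observation_space = Box(low=np.array([0.0, 0.0]), high=np.array([100.0, 30.0]), dtype=np.float32)
        self.state = np.array([38.0, 15.0], dtype=np.float32)

    def step(self, action):
        temp_move, flow_move = action              # the agent hands back an array like [2, 4]
        self.state[0] += temp_move - 1
        self.state[1] += flow_move - 2
        reward = 1 if 37 <= self.state[0] <= 39 else -1
        return self.state, reward, False, {}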
@AldorCrap Жыл бұрын
Very useful tutorial, although I would need some help with mine. I'm working on training a model for optimal path routing. I'm struggling with defining the observation_space: does it have to be the whole road graph (as a spaces.Graph), or a Box with the parameters (like current_coords, dest_coords, edges_max_car_speed), or maybe both? How should I approach this?
@idrisbima53692 жыл бұрын
Hello Nick, wonderful video. I am having the same error message you pointed out in the video and tried resolving it as shown, but it is giving me a different error message stating that the name 'model' is not defined. Please help.
@julian.estevez8 ай бұрын
Thanks a lot for the clarity of explanation.
@Sam-iy1kv Жыл бұрын
Hi, very nice video! May I ask one question: what if I need a continuous model for the training task, where discrete actions are not feasible? How can I do that?
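One hedged approach: give the env a Box action space and train with an algorithm that supports continuous actions (DDPG, TD3 or SAC, e.g. via Stable Baselines3, rather than the DQN used here). A sketch, where env stands for your own continuous-action environment:

import numpy as np
from gym.spaces import Box

# inside the env's __init__: a real-valued temperature adjustment instead of Discrete(3)
# self.action_space = Box(low=np.array([-1.0]), high=np.array([1.0]), dtype=np.float32)

from stable_baselines3 import SAC
model = SAC('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=50000)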
@khaileng3020 Жыл бұрын
To fix the Sequential error, just rearrange the order of the library imports:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
import tensorflow as tf
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

states = env.observation_space.shape
actions = env.action_space.n
print(actions)

def build_model(states, actions):
    model = tf.keras.models.Sequential()
    model.add(Dense(24, activation='relu', input_shape=states))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))
    return model

model = build_model(states, actions)
print(model.summary())

def build_agent(model, actions):
    policy = BoltzmannQPolicy()
    memory = SequentialMemory(limit=50000, window_length=1)
    dqn = DQNAgent(model=model, memory=memory, policy=policy,
                   nb_actions=actions, nb_steps_warmup=10, target_model_update=1e-2)
    return dqn

dqn = build_agent(model, actions)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
@traze78 Жыл бұрын
thanksssssss!! it helped a lot
@nilau84632 жыл бұрын
Thank you for the guide, really gave me a good idea how to implement my own models!
@NuHoaNgonLa3 жыл бұрын
what does the Env argument inside the ShowerEnv() class do?
@NicholasRenotte3 жыл бұрын
It should be the parent class; I may have forgotten to run super().__init__() inside of the __init__ function.
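For clarity, a minimal sketch of the skeleton with the parent class and the super().__init__() call in place (bounds and starting state roughly as in the video):

import numpy as np
from gym import Env
from gym.spaces import Box, Discrete

class ShowerEnv(Env):                      # Env is gym's base class being inherited
    def __init__(self):
        super().__init__()                 # initialise the gym.Env parent
        self.action_space = Discrete(3)
        self.observation_space = Box(low=np.array([0.0]), high=np.array([100.0]), dtype=np.float32)
        self.state = 38 + np.random.randint(-3, 3)
        self.shower_length = 60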
@erfankhordad94033 жыл бұрын
Thanks Nicholas for the great explanation. I have tested this custom environment with PPO and MlpPolicy and got very low rewards around -40 (even with 200000 time steps for model.learn). Any idea why I get poor results? thanks
@NicholasRenotte3 жыл бұрын
Same env as this one or custom one? Might need a little HPO or possibly an alternate algorithm, I think I did it with a slightly different model in the full RL course with better results!
@kayleechu9312 жыл бұрын
Hi Nicholas, thanks a lot for your video! I wonder how can I know about the objective function of the agent? Is there a way that I can change the objective function myself? Thanks a lot!
@candychebet89611 ай бұрын
Hello. Did you get to doing the visualization?
@DreamRobotics3 жыл бұрын
Very simple and very nice. Good work.
@NicholasRenotte3 жыл бұрын
Thanks so much @Dr. Abdul-Mannan Khan!
@talitaaraujo13273 жыл бұрын
Man, your videos are so great! Congrats!!!!! I have one question: i can't install a keras-r12. Maybe you can help me.
@NicholasRenotte3 жыл бұрын
Try keras-rl2, it should be an l instead of 1
@talitaaraujo13273 жыл бұрын
@@NicholasRenotte Thank you!!!
@ehrotraabhishekm4824 Жыл бұрын
Fantastic video, thank you so much for it. I have one doubt regarding DQN in gym: can you please share some details on how to proceed with DQN for a multi-dimensional state space (4D), which was 1D (temp) in your case?
@travelthetropics61903 жыл бұрын
Thanks for the informative series on reinforcement learning. Are you running this on CPU or GPU ? At [23:23]. I have noticed that in your PC, it is like 47-55 sec per 10000 steps. I am getting 118-120 sec with my GPU and 59-63 sec with my CPU only. It seems like, this small model works better with CPU only, may be due to the extensive copying time to GPU :D
@NicholasRenotte3 жыл бұрын
Yeah, I noticed that as well, with RL oftentimes the model won't benefit as much from GPU acceleration.
@svh023 жыл бұрын
Hey @Nicholas, awesome as usual!! Any reason why you chose to build your agent with Keras-RL and not with the ones provided by Stable-Baselines? Hope you keep making videos about custom environments; I think that's what's most useful. YouTube is already crowded with videos about the common environments for games and stuff like that.
@NicholasRenotte3 жыл бұрын
Was a little early on when I did this, I've since transitioned most of my rl projects to sb! Got plenty planned on custom environments, stay tuned!
@samuelebolotta80073 жыл бұрын
Hi Nicholas, great great work! It would be interesting to see a parallel with ML agents from Unity, to see the differences with OpenAI Gym. Thanks!
@NicholasRenotte3 жыл бұрын
YESS! I've been waiting for someone to ask for it, I've started testing it out already, should have a tutorial on it kinda soonish!
@jugalyadav71102 жыл бұрын
Hello Nicholas, firstly this video helped me a lot to get my basics cleared up regarding RL. Currently I am working with my own custom env and building a SAC model over it. I wanted to plot the actor and critic losses, and from your video I get that it should be done within the render function. It would be great if you could post some video summarizing the plots in render function. Cheers !
@SuperHockeygirl98 Жыл бұрын
Hey , thank you so much for this video. It really helped me. I have a question: can you define your observation space using CSV files and then iterate over it, so the agent needs to deal with differing environments?
@RafalSwiatkowski3 жыл бұрын
Greetings from Poland. Awesome tutorial; it would be great if you showed how to combine pygame with reinforcement learning.
@NicholasRenotte3 жыл бұрын
Woah Poland, what's happening! Definitely, I'll get cracking on it. Much love from Sydney!
@RafalSwiatkowski3 жыл бұрын
@@NicholasRenotte Thank u master ;)
@NicholasRenotte3 жыл бұрын
@@RafalSwiatkowski anytime!! 🙏
@monirimmi86162 жыл бұрын
Hi Nick, thank you very much for your nice explanation with an outstanding implementation. To this end, I have a question: how can I check the model parameter updates, for example the weights of each layer, when each training episode is done? Is there any way to check those parameters?
@NicholasRenotte2 жыл бұрын
I think you can export the final keras model, this should allow you to see the model weights etc
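A rough sketch of the Keras calls involved, assuming the model built in the video (the file name is just illustrative):

# per-layer weights and biases of the underlying Keras network
for layer in model.layers:
    print(layer.name, [w.shape for w in layer.get_weights()])

# or save the whole network after training and inspect/reload it later
model.save('dqn_shower_model.h5')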
@vincentroye3 жыл бұрын
Excellent tutorial, thanks! Is it possible for a RL model to output a pair of ints or floats ( like [1.5, 2.8] ) instead of a discrete value? What would the output layer look like?
@NicholasRenotte3 жыл бұрын
I believe so, you would need your final layers to have a linear activation function and need a box state space! What's the use case if you don't mind me asking @Vincent?
@vincentroye3 жыл бұрын
@@NicholasRenotte thanks for answering. I'd be interested to see how a model could output the best geometrical coordinates for a given state. It could be a 2D game where the player would have to avoid bombs that pseudo-randomly hit a finite surface for example.
@NicholasRenotte3 жыл бұрын
@@vincentroye oh got it! Might I suggest you approach it slightly differently. You would ideally store the state of the objects coordinates and just output the actions for your agent in response to those coordinates. It would be akin to your agent walking around using something like sonar.
@vincentroye3 жыл бұрын
@@NicholasRenotte could the actions in that case be to move x (left or right) and y (up or down) at the same time? That would be the reason for having 2 outputs. I'd be interested to see how a dqn agent would train the model in that case. In your video it takes a discrete value as nb_actions, how would that be done with 2 continuous outputs? that's where I'm a bit confused, that would give place to a huge amount of possible actions.
@andreamaiellaro6581 Жыл бұрын
Hi Nicholas, I found this tutorial a great help! Thank you. However, I'd like to ask you if what I have in mind is correct or not: can I update the observation/state from within the step function?
@sandeepagarwal85662 жыл бұрын
Thank you for the tutorial. Like sklearn has hyperparameter tuning, please let us know how we can tune hyperparameters in the case of DQN; any package, library, or other kind of reference would be helpful. Thanks.
@kushangpatel9833 жыл бұрын
Really useful tutorial, Nick! Keep it up, mate!
@NicholasRenotte3 жыл бұрын
Thanks @Kushang!
@Oriol.FernandezPena3 жыл бұрын
Your content is the best!! 🔥🔥
@NicholasRenotte3 жыл бұрын
Thanks so much!!! 🙏 🙏
@master2310903 жыл бұрын
Amazing video! I had a question about the activation layer and performing the final action. Your output layer uses a linear activation function; how does that link to picking the action?
@aayusheegupta2 жыл бұрын
Hello Nicholas! Great tutorial on building customized environment with Gym. Could you please share any pointers on how to load our own dataset while building an environment? I want to load and train RL agent with natural language sentence embeddings and create a proof tree.
@PedroAcacio10002 жыл бұрын
Great video! Thank you very much, sir! What about if my problem only allow me to determine the reward on the next step, after we took some action? Do you have any video talking about such problems? Thanks again, your content is helping a lot.
@boonkhao8 ай бұрын
Hi, it is a great video tutorial for customizing an environment. However, when I copy your code and run it in a Jupyter notebook, I get stuck on a problem where rl.agents cannot find a compatible version of Keras. I have tried many ways to solve this but still cannot. So, please help me.
@montraydavis2 жыл бұрын
Fantastic tutorial! I have a question though regarding the DQNAgent test. I noticed that at the test function, the only two actions being sent to step are the low and high values. Why is that? How would I go about this because I need action to be equal to 0, 1 or 2 for my application. Thanks a lot for this resource!
@dhiyamdumur62452 жыл бұрын
Hi Nicholas! Very informative video! I would like to know if we can implement DDPG in the context of routing in a simulated networking environment to assess its performance in terms of network delay. Thank you
@NicholasRenotte2 жыл бұрын
Yeah probably! I would think you would have different routes or paths for network load then reward based on latency or something of the like!
@ameerazam32693 жыл бұрын
Again Best ever explanation Sir appreciate your work keep it up for us
@NicholasRenotte3 жыл бұрын
Thanks so much @Ameer!
@jordan69213 жыл бұрын
Ooo pygame would be such a cool thing to see. I wonder if Retro Gym environments work too!
@NicholasRenotte3 жыл бұрын
IKR, Pygame is definitely on the list! I've tested with some of the Atari envs and they seem to work; they take a while to train but they work!
@sahilahammed73862 жыл бұрын
Hi, are the observation space and the state the same thing? The observation space isn't used during model training here, right?
@mzadeh3 жыл бұрын
Thank you very much, very clear and clean.
@NicholasRenotte3 жыл бұрын
Thank you so much @Mostafa!
@saaddurrani89303 жыл бұрын
I am doing a project: RL for a smart car (prototype) using DQN or another RL algorithm. I am thinking of feeding in images as the state (from the camera mounted on the car), and my car can take 3 actions (forward, right and left). I am keeping it quite simple, i.e. the car starts in front of our goal, and as long as the car sees the goal I want to reward it and take the next action; if it takes such a random action that the goal is no longer in the camera's view, it gets a penalty (state, action, reward/penalty, next state and so on). The episode time is limited to 2 mins. My aim is that the car moves towards its goal (and the closer it gets, the larger that feature appears, so it gets another reward because it's moving towards its goal). The goal would be an image (a "triangle") at the end of the room in front of the car's initial position. Now, before implementing my DQN on the real-life prototype I need to train it in an OpenAI Gym (3D) environment. I have no idea how to build such an environment where I can train my DQN by simulation. Any help and suggestions are appreciated.
@NicholasRenotte2 жыл бұрын
Take a look at how some of the video game driving environments are built! Should be a good start for how to kick it off!
@fernandomelo84603 жыл бұрын
First, your channel is amazing. Second, I tried to adapt your custom env to a trading env, but when I use deep learning / Keras-RL2 it doesn't look good: my reward is always the same (and the maximum). I think the problem is the NN architecture and/or the RL components (BoltzmannQPolicy/DQNAgent), because the loop before the deep learning part looks OK. Do you have any tips?
@NicholasRenotte3 жыл бұрын
Check this out: kzbin.info/www/bejne/emrWhmSegbljh7s
@mohamadalifahim3443 жыл бұрын
With TensorFlow 2, "from stable_baselines import A2C" does not work for me. How do I solve this?
@anamericanprofessor2 жыл бұрын
Any good links to actually overriding the render function for showing our own custom visualization?
@hanswurst96673 жыл бұрын
Cheers Nicholas! How would one go about adding another dimension to the state? For example, the velocity of the water?
@NicholasRenotte3 жыл бұрын
Heya @hans, you could add another Box which represents the velocity and group them into a Dict space, e.g.:

from gym.spaces import Discrete, Box, Dict

spaces = {
    'temperature': Box(low=np.array([0]), high=np.array([100])),  # degrees
    'velocity': Box(low=np.array([0]), high=np.array([30]))       # litres/sec
}
dict_space = gym.spaces.Dict(spaces)
self.observation_space = dict_space
@hanswurst96673 жыл бұрын
@@NicholasRenotte Thanks so much for the great reply! You wouldnt know by chance any great resources/tutorials concerning multi agent rl?
@NicholasRenotte3 жыл бұрын
@@hanswurst9667 nothing that I've tested out already unfortunately :( Will let you know when I find something that works seamlessly!
@vigneshpadmanabhan2 жыл бұрын
Is there a deep reinforcement learning algorithm we can experiment with on regression-based tabular data, sensor data, etc.? If so, it would be much appreciated if you could make a video on it. Thanks!
@christiansiemering81293 жыл бұрын
Hi Nicholas, thx for your great video! Is there an easy way to create customized multi-agent environments with OpenAI Gym? I want to create an AI that competes against another agent in a multiplayer "game".
@NicholasRenotte3 жыл бұрын
AFAIK it's not super straightforward with pre-built RL packages that are out there atm. Will probably get to it in future vids!
@zahrarezazadeh2932 жыл бұрын
Thanks Nicholas for the nice tutorial! I have two questions.
1. I'm trying to implement this in PyCharm with Python 3.10 on macOS Monterey (Core i3), but with the built-in Python 2-something. I can't install and import tensorflow; it says it can't find a satisfying version. Any idea where the problem comes from? Different Python versions? Any solutions?
2. I'm starting to build my own environment, which is not like any of the ones available. It's a 2D path an agent should try to stay close to, by going left and right, with some gravity. Any suggestions where to start coding it, or any environments you know that are similar to this? THANK YOU!
@NicholasRenotte2 жыл бұрын
Woah fair few there, take a look at some of the existing Gym envs, I think there might be some path focused ones I might have seen a while ago.
@alirezaghavidel45943 жыл бұрын
Thank you for your amazing work. I have a question regarding the defined environment. I defined self.state as a vector in the __init__ function (self.state = np.zeros(shape=(5,), dtype=np.int64)), but when I access self.state in the step function, it is an integer. How can I have the vector state in the step function as well?
@NicholasRenotte3 жыл бұрын
I just took a look at my code and I think it's not perfect tbh. Try setting initial state inside of the reset method.
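A sketch of what that could look like for a vector state (the episode-length attribute name is just illustrative):

import numpy as np

# inside your custom Env subclass
def reset(self):
    self.state = np.zeros(shape=(5,), dtype=np.int64)   # re-initialise the vector each episode
    self.episode_length = 60
    return self.state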
@alirezaghavidel45943 жыл бұрын
@@NicholasRenotte Thank you. I did
@alirezaghavidel45943 жыл бұрын
@@NicholasRenotte I have another question and I would appreciate it if you could help me. I defined the environment for a multi-component example with the action space and observation space as vectors, and now I want to use them for RL in Keras (you used states = env.observation_space.shape and actions = env.action_space.n as the input parameters of the build_model function). How can I use them for my multi-component example? Do you have any multi-component example for RL in Keras? Thank you
@frankkreher48322 жыл бұрын
Thank you, once again, for the very educational video. Great work!
@kheangngov80052 жыл бұрын
Hello Nicholas, your video is very helpful. I have some questions to ask. I wonder if it is possible to customize the action space for each state, with the reward only given at the terminal state. For example state 1 with 3 actions, state 2 with 5 actions, state 3 with 10 actions, and the reward can be calculated based on that action sequence, i.e. whether it is a win or a loss. Thank you.
@lukejames35703 жыл бұрын
Two questions: does the action that comes out of the network go to the next step of the env? And how can you be sure the output of the network is 0, 1 or 2 instead of some other random number?
@hariprasad11683 жыл бұрын
Thanks! This is really helping me a lot.
@bananabatsy37083 жыл бұрын
I am getting a NotImplementedError in the for-loop cell. I looked it up and it has to do with inheritance, but I cannot figure out how.
@NicholasRenotte3 жыл бұрын
Heya @BananaBatsy, where about's is the error being triggered?
@TheOfficialArcVortex2 жыл бұрын
Any chance you could do a tutorial on how to use this for physical computing? For example, how would you implement two led's say using GPIO on a raspberry pi when temp goes up or down and an input sensor for temperature. Or say an accelerometer and a motor for balancing.
@PhilippWillms2 жыл бұрын
Where does the 24 come from in defining the neural network layers?
@NicholasRenotte2 жыл бұрын
Completely subjective, could change it to a larger or smaller value depending on the complexity of the problem you're trying to solve Philipp!
@fidelesteves63933 жыл бұрын
What an amazing tutorial! Thanks
@NicholasRenotte3 жыл бұрын
Thanks so much @Fidel!
@zitongstudio3 жыл бұрын
Hi, I don't understand why during training the reward is around -0.5, while during testing the reward is around -60. Is it because that the number of steps used for training and testing are different? For training it is over 10000 steps, for testing only 60 steps.
@NicholasRenotte3 жыл бұрын
Different starting points without enough steps to get to the final result. Increasing testing steps would allow the agent to iterate closer.
@caio.cortada Жыл бұрын
Have you done any of those using MuJoCo environments? Do you have any bibliography on that?
@Antonio-om4sg2 жыл бұрын
Would it be possible to see the evolution (a plot) of the temperature of the water when the agent is run on the scenario? For each episode we would see, for each step, the evolution of the water temperature
@edzme3 жыл бұрын
Yep this is exactly what I was looking for. Could you make an example with a Dict space? Please thx!
@NicholasRenotte3 жыл бұрын
You got it, added to the list @Ed!
@edzme3 жыл бұрын
@@NicholasRenotte yessssssss!
@NicholasRenotte3 жыл бұрын
@@edzme yeahhhhyaaaa!!
@samiul20093 жыл бұрын
Awesome!! I was wondering how it would work with pathfinding with some obstacles.
@NicholasRenotte3 жыл бұрын
Normally the RL agent takes a while to find its way but eventually works around it. I'm actually working on RL for Super Mario atm; it's definitely taking a while though @Samiul!
@samiul20093 жыл бұрын
@@NicholasRenotte wow! Waiting for rendering work of the environment
@NicholasRenotte3 жыл бұрын
@@samiul2009 awesome, yup got pygame in the pipeline!
@ihebbibani71223 жыл бұрын
As usual , excellent content. Thank you so much :)
@NicholasRenotte3 жыл бұрын
Thanks so much @Iheb!
@raihankhanphotography60413 жыл бұрын
Thank you for the tutorial. This was super helpful!
@NicholasRenotte3 жыл бұрын
Anytime @RaihanKhan!
@jossgm7480 Жыл бұрын
How can I access the Q-Values? I see that in the training the "mean_q" variable is displayed. How can I access these values (mean q values)?
@alessandroceccarelli68893 жыл бұрын
Shouldn’t you use a softmax output function instead?
@NicholasRenotte3 жыл бұрын
In retrospect, I think I could've done some tweaking to support an alternate activation.
@apreceptorswanhindi2 жыл бұрын
Hey man, thanks for the wonderful video. I made a custom environment, and keras-rl2 is taking a lot of time and not utilizing the GPUs. How can I optimize the training of this or similar code using TensorFlow 2 on a remote GPU running Ubuntu 20.04.4 LTS (GNU/Linux 5.13.0-52-generic x86_64), NVIDIA-SMI 515.48.07, Driver Version 515.48.07, with four NVIDIA GeForce RTX 3080 GPUs (10 GB each)?
@JJGhostHunters Жыл бұрын
I love these tutorials; however, I have spent hours and hours trying to get even the most simple environment to "render". I have tried two computers with Spyder, Jupyter Notebooks and even the command line, and have never been able to get a window to pop up with a rendered environment. I am continuing to learn RL theory; however, it is very frustrating not to be able to follow along with these tutorials.
@juanguitarte54803 жыл бұрын
How do I customize my environment to make it visual? Using pygame, or how? Thank you.
@NicholasRenotte3 жыл бұрын
Yep, can definitely do it with PyGame!
@Mesenqe2 жыл бұрын
Nice tutorial, could you please make one tutorial on how to use RL in image classification.
@summanthreddemulkkalla67862 жыл бұрын
Sir, can you implement optimal placement of electric vehicle charging stations using DQN? Thank you.
@tareklimem2 жыл бұрын
Hey Nick, I need a class on reinforcement learning. How can we get in contact, please?
@davidromens95413 жыл бұрын
Have the libraries for keras-rl or keras-rl2 been updated recently? I have been building a custom environment and training a NAF agent to solve it. Last week it was working, but when I came back this week, the code throws a KeyError: 0 when my NAF.fit line runs. Any suggestions or help would be greatly appreciated.
@davidromens95413 жыл бұрын
FYI I am using google colab which requires me to reinstall all libraries every session. I know its not ideal, but unfortunately I am on Windows.
@NicholasRenotte3 жыл бұрын
Heya @David, not too sure I haven't been using keras-rl2 lately, I've been working with stable baselines in it's place. Did you have errors that you can share?
@davidromens95413 жыл бұрын
@@NicholasRenotte Turned out to be a weird bug with google colab. Restarted my computer and is working fine now. Thanks for the reply though!
@NicholasRenotte3 жыл бұрын
@@davidromens9541 anytime, you're welcome. Weird though. Building anything interesting in the RL space?
@fufufukakaka3 жыл бұрын
keras-rl2 is already archived.
@vts_222 жыл бұрын
Hi Nicholas, I watched all of your reinforcement learning videos and others on the internet (and I have been trying for 10 hours). What if my state is [10,20,30,40] and my action_space is Discrete(4)? I am getting "DQN expects a model that has one dimension for each action, in this case 4". My shape is (None, 1, 4) and I can't fix it.
@vts_222 жыл бұрын
I should probably change DQN and ADAM
@NicholasRenotte2 жыл бұрын
Discrete(4) will return actions 0,1,2,3 as integer values. That sounds like your observation space is incorrect, looks like that would be (None, 4) if you've got [10,20,30,40]
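A hedged sketch of the usual fix for that mismatch: keras-rl adds a window_length dimension in front of the observation, so a four-value state typically needs a Flatten over (1, 4) before the Dense layers, with one output unit per action:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

def build_model(states, actions):                    # e.g. states = (4,), actions = 4
    model = Sequential()
    model.add(Flatten(input_shape=(1,) + states))    # (window_length, obs_dim) -> flat vector
    model.add(Dense(24, activation='relu'))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))   # one Q-value per discrete action
    return model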
@vts_222 жыл бұрын
@@NicholasRenotte I edited the Sequential layers by looking at your videos; my observation_space was the same as yours in one video. I watched all your videos, thanks for them. I built a basic snake game, but I'm aiming to design a 2D space travel game with gravitation and orbits. My problem was creating the wrong layers; I understand it better now. Thanks again for your excellent videos.
@sarash50613 жыл бұрын
This was a very good video and I've learned a lot. Thanks. I'm wondering if you could make a custom environment, or basically a custom agent, for price optimisation? I want to train an agent for price optimisation; however, I don't know what the reward should be for the agent. I do have the distribution of past sales but don't know how to define the reward for my agent. Should it be the probability of buying or should it be the price? Can you please help me here or create some content for it, please?
@NicholasRenotte3 жыл бұрын
Depends what you're optimizing for, for price presumably it would be max profit or sales?
@sarash50613 жыл бұрын
@@NicholasRenotte in both cases my agent would be a customer, right? Logically increasing sale or profit would be in favor of the company not the customer. Then how do I train my agent while the reward is in favor of the company and not necessarily in favor of customer (agent)?
@NicholasRenotte3 жыл бұрын
@@sarash5061 the agent is more likely to be the dynamic price adjuster, so it will make dynamic adjustments to the price in response to sales or profit. E.g.:
1. Agent sets the initial price
2. Environment (aka customers) responds by buying X quantities
3. Calculate reward as sales or profit
4. Agent learns to adjust the price to maximize sales or profit
5. Customers buy more... and so on
@sarash50613 жыл бұрын
@@NicholasRenotte thank you very much. I will give it go.
@perryfisch20033 жыл бұрын
Your tutorials are all amazing! Thank you for taking the time to make all of these videos! I do have a question regarding making a custom Gym environment in which there's an OpenGL rendered environment that can be seen/viewed, similar to the race car environment. Say a 50 x 50 cell grid with a robot contained within it and several obstacles to navigate around as well as a target/goal. I've searched high and low and cannot find any good examples. Could you please point me in the right direction? Many thanks in advance.
@NicholasRenotte3 жыл бұрын
Not sure about OpenGL, but I've started doing it with PyGame and integrating that way!
@perryfisch20033 жыл бұрын
@@NicholasRenotte Great, I have no attachment to OpenGL, I just thought that was the way its done. Do you have an example with PyGame?