Deep Reinforcement Learning in Python Tutorial - A Course on How to Implement Deep Learning Papers

Views: 289,782

Days ago

Comments: 85

@Krigalishnikov 1 year ago

@1:48:25 This cannot be overstated for people getting into the field: even for very experienced and skilled people like Phil, these papers can take days or weeks to truly understand. We are very lucky to get such content for free. Thanks for this wonderful course, Phil!

@manuelnovella39 2 years ago

This guy is a master! Do check out his own KZbin channel. Thanks to him, my undergraduate final project went so well.

@masoudmasoumimoghaddam3832 5 years ago

This is a video tutorial of real academic quality. I am amazed by this video and by the ability to implement a paper at this pace. Would you please keep up the good work? Thanks, bro.

@kiddbrand3056 2 years ago

Poooo

@kiddbrand3056 2 years ago

Ooo

@kiddbrand3056 2 years ago

Ooooooo

@kiddbrand3056 2 years ago

Oooooooo

@connor-shorten 5 years ago

Thank you! This is an incredible Reinforcement Learning tutorial!

@MachineLearningwithPhil 5 years ago

You're welcome!

@aman-tp3xb 2 years ago

🙏8j

@yairphyo1994 2 years ago

😀😀

@yairphyo1994 2 years ago

😀

@yairphyo1994 2 years ago

😀

@jupzchris 2 years ago

Yeah man... I'm in some super deep rabbit hole of programming videos while eating candy. Idk how I got here.

@huseyinyldz7966 5 years ago

Thanks so much, bro ;) The new magicians of the future are coders, programmers, engineers, and all designers.

@hossein_haeri 4 years ago

At 43:01, you say: "i is each element of that minibatch transitions", which is wrong. i is just the index into the replay memory, i.e. state i+1 follows state i. And thanks for your great explanation; it helped me a lot.

@lahaale5840 4 years ago

Really? Then how do you loop over the entries by index i? (Of course, if you are doing it with matrices then i is not needed, but given the pseudocode, I think it is more reasonable to consider it the index of the element in the minibatch.)

@sorvex9 2 years ago

"Sample a random minibatch of N transitions from R" is right above the equation, and you say that i is not an element of that minibatch of transitions? Lol?

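A minimal sketch of the point debated in this thread, assuming a NumPy replay memory with made-up sizes and a stand-in `target_q` function (both hypothetical, not the code from the video). In the DDPG pseudocode, each sampled transition i stores its own next state, so s_{i+1} is "the next state stored in transition i" and i ranges over the N sampled transitions, which need not be consecutive in the replay memory. The `(1 - done)` factor is an assumption commonly added in implementations and is not part of the quoted pseudocode line.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical replay memory R holding 1000 stored transitions.
memory_size, state_dim, batch_size, gamma = 1000, 3, 64, 0.99
rewards = rng.normal(size=memory_size)
next_states = rng.normal(size=(memory_size, state_dim))
dones = rng.integers(0, 2, size=memory_size)


def target_q(next_state_batch):
    """Stand-in for Q'(s', mu'(s')); a real agent would call its target networks."""
    return next_state_batch.sum(axis=1)


# "Sample a random minibatch of N transitions from R":
# these are random positions in the memory, not consecutive time steps.
memory_idx = rng.choice(memory_size, size=batch_size, replace=False)

# Vectorized targets: one y_i per sampled transition.
q_next = target_q(next_states[memory_idx])
y = rewards[memory_idx] + gamma * q_next * (1 - dones[memory_idx])

# Equivalent explicit loop: i indexes the minibatch, j the replay memory.
y_loop = np.empty(batch_size)
for i, j in enumerate(memory_idx):
    q_j = target_q(next_states[j][np.newaxis, :])[0]
    y_loop[i] = rewards[j] + gamma * q_j * (1 - dones[j])
```

Both forms give the same targets; the loop only makes explicit that i counts minibatch entries while j is the position in the replay memory.
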
@priyanshu_hiro 2 years ago

Why is this in my recommendations after 3 years? I wish it had been recommended earlier.

@vandangorade6738 5 years ago

Please make more videos on implementing research papers on your channel 😃

@pmarcowski9473 2 years ago

I'll

@gshan994 5 years ago

It's very advanced for me, I guess (still watched for 20 mins)... Hope to get some advice from Phil for beginners on really reaching the level of implementing papers. Any advice on a learning roadmap would be helpful. I have subscribed to your channel also. Thanks, Phil. :)

@MachineLearningwithPhil 5 years ago

Thanks for subbing! Check out my RL course on my channel to get started. Just search for Reinforcement Learning Jump Start.

@ansupbabu8557 2 years ago

Priceless stuff

@jimothyus 5 years ago

Phil, you're a fucking legend

@jimothyus 5 years ago

I forgot to say thank you. Thank you for this amazing video.

@satyendrashukla9137 3 years ago

Thank you so much. I have one question: when implementing D3QN in a dynamic environment where obstacles are continuously moving, how can one implement it on hardware? And which is better, DDPG or D3QN, in the scenario stated above?

@deepaks.m.6709 5 years ago

Thank you for this video :D Are you a brother of Bucky Roberts (thenewboston)?

@MachineLearningwithPhil 5 years ago

Hah! I am not, but I was an avid consumer of his content many years ago.

@mrt1680 1 year ago

@@MachineLearningwithPhil Hi Phil, would you mind sharing your full code for this tutorial? Many thanks 🙏

@AdamsProgrammer 5 years ago

Bunch of thanks 🙏

@JavaAidTutorials 5 years ago

interesting..!!

@imranq9241 2 years ago

Amazing!

@arunarkamukhopadhyay6443 5 years ago

Thank you!!

@alvinkuruvilla5163 5 years ago

This looks awesome. But I have a question: do you think this could be applied to website fingerprinting, where an AI is able to figure out the port and server information as you visit a website to make sure it's secure?

@MachineLearningwithPhil 5 years ago

I think applying RL to that would be possible, and an interesting niche case. Depending on how large you made the action space, DDPG may or may not be a good algorithm to leverage.

@b.sgaming6870 2 years ago

Thank you for tNice tutorials video. I just downloaded soft soft and I was so, so lost. I couldn't even figure out how to make a soft. Your video

@herabidas6615 3 years ago

Perfect

@thcoura 2 years ago

Hi, I don't know if you are the right person, and I'm ignorant of machine learning. That being said, I would like to ask just one simple question to decide whether I should jump into the world of ML. The problem is the development of a question-answering system. Think of a project with many disciplines and 200-300 people, where the information is dynamic, spread across many documents and wikis, and the data changes over time. Is it possible to have a natural-language question-answering system that can understand the progression of time? Say two pieces of information had a relationship in the past but are no longer related; the system should refrain from mixing the past information with the new when answering the current question. The system can show how the question changed over time, but should not infer relationships between past events and current ones.

@keithmason5036 3 years ago

thanks

@devisle3169 5 years ago

Awesome! Looking to learn more and post on my channel.

@shahzadmcs 10 days ago

Hi, I need this code. Where can I find the code? If someone could provide it, please do. Thanks.

@omarelkhatib150 5 years ago

Thanks! Please make more videos like this on reading scientific papers.

@fktudiablo9579 5 years ago

2:45:50 Michael Jackson is still alive, guys

@thijsvandenhout6279 4 years ago

Why not use an IDE to see typos before running it?

@jasmin2795 3 years ago

Where can I find this paper, and what is it called?

@aminrabiei1069 1 year ago

Thank you so much for this helpful video. I ran the code, but there was an error that I was not able to fix, although I searched a lot. Can anyone help me? The error: "I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled"

@AkshayAradhya 4 years ago

41:00 I don't think you need to wait a million steps. If your minibatch size is 64, then you just need to wait 64 steps, right?
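
A minimal sketch of the warm-up condition raised here, assuming a simplified NumPy replay buffer that stores only states and rewards (the class and its method names are hypothetical, not the code from the video). Sampling a minibatch without replacement only requires that at least `batch_size` transitions have been stored, regardless of the buffer's maximum capacity.

```python
import numpy as np


class ReplayBuffer:
    """Simplified circular buffer; a full agent would also store actions, next states, dones."""

    def __init__(self, max_size, state_dim, seed=0):
        self.max_size = max_size
        self.counter = 0  # total number of transitions ever stored
        self.states = np.zeros((max_size, state_dim))
        self.rewards = np.zeros(max_size)
        self.rng = np.random.default_rng(seed)

    def store(self, state, reward):
        idx = self.counter % self.max_size  # overwrite the oldest entry once full
        self.states[idx] = state
        self.rewards[idx] = reward
        self.counter += 1

    def __len__(self):
        return min(self.counter, self.max_size)

    def can_sample(self, batch_size):
        return len(self) >= batch_size

    def sample(self, batch_size):
        idx = self.rng.choice(len(self), size=batch_size, replace=False)
        return self.states[idx], self.rewards[idx]


buffer = ReplayBuffer(max_size=10_000, state_dim=3)
batch_size = 64
first_learning_step = None

for step in range(1, 201):
    buffer.store(np.ones(3) * step, float(step))
    if first_learning_step is None and buffer.can_sample(batch_size):
        first_learning_step = step
        states, rewards = buffer.sample(batch_size)
```

In this sketch the first minibatch becomes available at step 64, long before the buffer is full. Whether to wait longer before learning is a separate design choice.
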

@noreasonchannel4012 5 years ago

Reminded of Sherlock's assistant.

@liudaniel6800 5 years ago

Hi, I have run the code, but it did not converge at all, so I want to know your hyperparameter settings. Thanks a lot =-=

@hejarshahabi114 5 years ago

I'd really like to learn Python, and I have a question: what is this video about? Because I didn't get anything.

@SushritPasupuleti 5 years ago

It's about implementing an ML model from a research paper from given math equations and info.

@saisrinivasmurthynarayanam5340 2 years ago

Using the code state = state[np.newaxis,:] is giving an error: ValueError: Cannot feed value of shape (3,) for Tensor TargetActor_9/inputs:0, which has shape (None, 3). Can anyone help me with this?
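
A minimal sketch of the shapes involved, assuming a 3-dimensional observation as in the error message. `state[np.newaxis, :]` does turn shape (3,) into (1, 3), so one possible cause (a guess, not a diagnosis of the commenter's code) is that some other call still feeds an unbatched state to the network. The helper `as_batch` is hypothetical.

```python
import numpy as np

state = np.array([0.1, -0.2, 0.3])  # a single observation, shape (3,)

batched = state[np.newaxis, :]      # adds a leading batch axis, shape (1, 3)

# Indexing returns a new array view; `state` keeps shape (3,)
# unless the result is assigned back to it.
original_shape = state.shape


def as_batch(x, state_dim=3):
    """Hypothetical helper: return x with shape (batch, state_dim) for (3,) or (N, 3) input."""
    x = np.asarray(x, dtype=np.float32)
    return x.reshape(-1, state_dim)


single = as_batch(state)                 # shape (1, 3)
many = as_batch(np.zeros((64, 3)))       # shape (64, 3), unchanged
```

Routing every network input through one helper like this makes it harder for an unbatched (3,) array to reach a placeholder that expects (None, 3).
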

@that_endo 2 years ago

why did i wake up to this

@debayondharchowdhury2680 5 years ago

Where is the code of this video?

@carissa8283i 2 years ago

You don't spy on other human beings or their broadcasted thoughts. Especially a human that created the universe that you reside in.

@CrazyMiso 2 years ago

Hello, thank you for your tutorial. I have only one issue. I tried to replicate your code, but I get the error "cannot import name 'plotLearning' from 'utils'". Do you have any idea how I can fix that?

@yahayamuhammedawwal754 2 years ago

Have you resolved it? I'm experiencing the same thing here.

@CrazyMiso 2 years ago

@@yahayamuhammedawwal754 Not really, but you can still plot the graph using matplotlib. Hope it helps; if you need the code, I will provide it.

@ashuthe1 1 year ago

@@CrazyMiso Hi, can you provide the code?
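
A minimal sketch of the matplotlib workaround mentioned in this thread, assuming all that is needed is a running average of episode scores saved to an image file. The functions `running_average` and `plot_learning_curve` are hypothetical stand-ins and are not the `plotLearning` helper from the video's `utils` module, whose exact signature is not shown here.

```python
import numpy as np


def running_average(scores, window=100):
    """Mean of the last `window` scores at each episode (fewer at the start)."""
    scores = np.asarray(scores, dtype=float)
    return np.array(
        [scores[max(0, i - window + 1): i + 1].mean() for i in range(len(scores))]
    )


def plot_learning_curve(scores, filename, window=100):
    """Save a plot of the running-average score per episode to `filename`."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    avg = running_average(scores, window)
    fig, ax = plt.subplots()
    ax.plot(np.arange(1, len(avg) + 1), avg)
    ax.set_xlabel("Episode")
    ax.set_ylabel(f"Average score (last {window} episodes)")
    fig.savefig(filename)
    plt.close(fig)


smoothed = running_average([1, 2, 3, 4], window=2)
```

Usage would look like `plot_learning_curve(score_history, "pendulum.png")`, where `score_history` is the list of per-episode scores collected in the training loop (both names are assumptions).
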

@NextbyYounes 3 years ago

Thank you, this is an amazing tutorial. I want to ask about the traveling thief problem and its environment: if I want to solve it with deep reinforcement learning, can you give me some advice about this approach?

@Quodergo 2 years ago

woke up to this lol

@martincervantes4775 2 years ago

Hello, I'm trying to decompose the whole problem, and I have a question about your use of OUActionNoise, based on the Ornstein-Uhlenbeck process: x = ( self.x_prev + self.theta * (self.mu - self.x_prev) * self.dt + self.sigma * np.sqrt(self.dt) * np.random.normal(size=self.mu.shape) ) I checked the equations of the OU process, but I don't see how "np.sqrt(self.dt)" is a valid implementation of a differential.
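
A minimal sketch of where the square root comes from, assuming the standard Euler-Maruyama discretization of the OU equation dx = theta * (mu - x) dt + sigma dW. The Brownian increment dW over a step of length dt is Gaussian with variance dt, so its standard deviation is sqrt(dt), which is why the noise term is sigma * sqrt(dt) * N(0, 1). The default values and the seeded generator below are assumptions for illustration.

```python
import numpy as np


class OUActionNoise:
    """Ornstein-Uhlenbeck noise, discretized with the Euler-Maruyama scheme."""

    def __init__(self, mu, sigma=0.2, theta=0.15, dt=1e-2, seed=0):
        self.mu = np.asarray(mu, dtype=float)
        self.sigma = sigma
        self.theta = theta
        self.dt = dt
        self.rng = np.random.default_rng(seed)  # seeded only for reproducibility
        self.x_prev = np.zeros_like(self.mu)

    def __call__(self):
        # Brownian increment over one step: dW ~ N(0, dt) = sqrt(dt) * N(0, 1)
        dW = np.sqrt(self.dt) * self.rng.normal(size=self.mu.shape)
        x = (
            self.x_prev
            + self.theta * (self.mu - self.x_prev) * self.dt  # drift term
            + self.sigma * dW                                 # diffusion term
        )
        self.x_prev = x
        return x


noise = OUActionNoise(mu=np.zeros(2))
sample = noise()

# Empirical check: the variance of sqrt(dt) * N(0, 1) is close to dt.
dt = 1e-2
check_rng = np.random.default_rng(1)
increments = np.sqrt(dt) * check_rng.normal(size=200_000)
empirical_var = increments.var()
```

The drift term scales with dt while the noise scales with sqrt(dt); multiplying the noise by dt instead would make it vanish too quickly as the step size shrinks.
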

@mobeenalhalabi4835 5 months ago

How can I contact you?

@yairphyo1994 2 years ago

😀😀

@armelinedembomafuta2583 4 years ago

Hello, I need help with a paper, "A new Deep-Q-Learning-Based Transmission Scheduling Mechanism for the Cognitive Internet of Things". How can I contact you? By inbox?