Check out the corresponding blog and other resources for this video at: deeplizard.com/learn/video/my207WNoeyA
@beltusnkwawir2908 · 2 years ago
Can we take a second and just appreciate the work put in producing such high-quality videos in bites that are easy to understand?
@aparvkishnov4595 · 4 years ago
Thanks, deeplizard, for doing the hard work on illustrations to explain it to the feeble-minded. It's like training a donkey how to solve calculus.
@drewwilkins9963 · 5 months ago
"Eee-ore!" says me. Oh, and THANKS!
@amirhosseinesteghamat7621 · 4 years ago
I've watched different channels, but no one explained this topic better than you. Thanks a lot!
@mike13891 · 2 years ago
I’m so glad you produced this series of videos. I was intimidated by all the math and algorithm variations covered in the first four lectures of my graduate course. After watching these videos and then revisiting my grad lectures, I now actually understand what my professor was trying to teach. Thank you!
@SandwichMitGurke · 5 years ago
This is by far the best tutorial I've seen on this topic. I'm about to watch the whole series :D
@deeplizard · 5 years ago
Whoop! Thank you :) More videos will continue to be added to this series as well!
@cuteruby7392 · 4 years ago
Subscribed!
@alokk7347 · 4 years ago
I was wandering here and there; looks like I've landed on the perfect place to learn deep learning. Thanks, I will continue.
@muomgu · 4 years ago
You are awesome. This series will help me with my project. Thank you so much. Best regards.
@danielzoulla3898 · 4 years ago
Amazing explanation of what RL is. I will watch the whole series from now on.
@haneulkim4902 · 3 years ago
Seriously... amazing tutorial! I really like how you offer a text version as well. Thank you :)
@sahanakaweerarathna9398 · 6 years ago
Best YouTube channel for learning ML.
@thusharadunumalage709 · 4 years ago
Great tutorial; I understood the concept clearly for the first time, after going through many others. Thank you very much.
@theliterunner · 8 months ago
- **Introduction to Markov Decision Processes (MDPs)**: 0:00 - 0:17
- **Components of MDPs**: 0:23 - 1:43
- **Mathematical Representation of MDPs**: 1:47 - 3:59
- **Probability Distributions and Transition Probabilities**: 4:02 - 4:56
- **Conclusion and Next Steps**: 5:01 - 5:47
@sahand5277 · 6 years ago
Keep up the good work, and thank you for the time you are putting into making this series :)
@jscf92 · 5 years ago
This series is awesome. It makes learning a lot easier. Thank you so much.
@MrJoeDone · 2 years ago
There really should be more videos in this style. I hope there will be a lot more videos on this channel that are useful to me.
@nossonweissman · 2 years ago
This video can be denoted by n as n approaches perfection.
@ilovemusic465 · 5 years ago
Very intuitive and easy explanation. Thank you! 🤗😀
@asdfasdfuhf · 4 years ago
Second video completed; the video was clear as day.
@Galinator9000 · 3 years ago
Great video with intuitive explanations 👌
@christopherherrera5015 · 3 years ago
Thank you so much; the explanation of MDPs is very clear.
@harshadevapriyankarabandar5456 · 5 years ago
Very, very helpful. Thanks for making these videos. Please keep it going!
@amadlover · 6 years ago
More power to you @Deeplizard
@jeffreyredondo · 3 years ago
Well explained and easy to listen to.
@ushnishsarkar7000 · 4 years ago
{
  "question": "The state and reward at time t depend on the",
  "choices": [
    "state-action pair at time (t-1)",
    "cumulative reward at time t",
    "agent dynamics",
    "state-action pairs at all time instances before t"
  ],
  "answer": "state-action pair at time (t-1)",
  "creator": "Ushnish Sarkar",
  "creationDate": "2020-06-01T16:24:16.894Z"
}
@deeplizard · 4 years ago
Thanks, ushnish! Just added your question to deeplizard.com/learn/video/my207WNoeyA :)
@adamhendry945 · 4 years ago
Please give credit to "Reinforcement Learning: An Introduction" by Richard S. Sutton and Andrew G. Barto, copyright 2014, 2015. You allow viewers to pay you through Join, and this book's material is copyrighted, but you do not reference the authors anywhere on your website. The equations and material are pulled directly from the text, which presents an ethical issue: though the book is open-sourced, it is copyrighted, and you are using this material for financial gain. This textbook has been used in several university courses on reinforcement learning. I love these videos, but proper credit must be given and approval from the authors must be obtained!
@deeplizard
Thank you, Rooney! First quiz question for this video :D I believe you mistakenly chose the wrong answer, so I corrected it and just pushed it to the site. Take a look :) deeplizard.com/learn/video/my207WNoeyA
@grandson_f_phixis9480 · 5 months ago
Thanks a lot, much appreciated
@nossonweissman · 2 years ago
{
  "question": "If a math student is the agent, then the _______________ is the environment.",
  "choices": [
    "math quiz",
    "math professor",
    "quiz score",
    "Swiss mathematician Leonhard Euler"
  ],
  "answer": "math quiz",
  "creator": "N Weissman",
  "creationDate": "2022-03-21T22:50:05.763Z"
}
@deeplizard · 2 years ago
Thanks for the great quiz question!
@نسرينة · 2 years ago
This is the best lecture on RL, thank you. Could I get the presentation, please?
@patrick.t1978 · 5 years ago
Thanks a lot, your explanation is very clear and detailed.
@alexusnag · 5 years ago
Really friendly beginning.
@mash-sings · 6 years ago
Thanks for this content. Good going!
@elshroomness · 1 year ago
OMG, it's clicking. It's actually clicking in my head!!!
@deeplizard · 1 year ago
💡🤯
@thatipelli1 · 5 years ago
Excellent explanation. It would be great if you could make a video series on all the math concepts behind machine learning.
@deeplizard · 5 years ago
Thanks, Anirudh. If you haven't checked out our Deep Learning Fundamentals course, I'd recommend it, as it has some of the major math concepts fully detailed there.
@tingnews7273 · 6 years ago
What I learned:
1. An MDP formalizes the decision-making process. (Everyone teaches the MDP first; nobody told me why until now. It's a strange world.)
2. The reward is indexed R(t+1) because it results from action A(t); before, I always thought R(t) was paired with A(t).
3. The agent cares about the accumulated reward (for others who don't know).
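Point 2 deserves a quick illustration. Below is a minimal sketch (the two-state MDP and all names are invented for illustration, not taken from the video) showing how the reward caused by action A_t arrives one step later, as R_(t+1):

```python
import random

# A hypothetical two-state MDP, invented purely for illustration:
# transitions[state][action] -> list of (probability, next_state, reward)
transitions = {
    "s0": {"left": [(1.0, "s0", 0.0)],
           "right": [(0.7, "s1", 1.0), (0.3, "s0", 0.0)]},
    "s1": {"left": [(1.0, "s0", 0.0)],
           "right": [(1.0, "s1", 2.0)]},
}

def step(state, action):
    """Sample (next_state, reward) from the dynamics p(s', r | s, a)."""
    outcomes = transitions[state][action]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = random.choices(outcomes, weights=weights)[0]
    return next_state, reward

state = "s0"  # S_0
for t in range(3):
    action = random.choice(["left", "right"])  # A_t, from a random policy
    next_state, reward = step(state, action)   # env answers with R_(t+1), S_(t+1)
    print(f"t={t}: S_t={state}, A_t={action} -> R_(t+1)={reward}, S_(t+1)={next_state}")
    state = next_state
```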
@deepakkumarmeena1890 · 5 years ago
Appreciate the cute example
@deeplizard · 5 years ago
🐿️😊
@rapisode1 · 3 years ago
You guys rock! Thanks so much!
@faqeerhasnain · 1 year ago
The agent is not part of the MDP itself but rather interacts with it. The agent's role is to select actions based on the current state and the policy it's following, and it receives feedback in the form of rewards and new state observations from the environment, which is modeled as an MDP.
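To make that separation concrete, here is a minimal sketch (names and structure invented for illustration) in which the MDP dynamics live entirely inside an `env_step` function and the agent only maps states to actions:

```python
# Hypothetical sketch: the agent sits *outside* the MDP. It only maps
# states to actions; the environment owns the dynamics and the rewards.
class Agent:
    def __init__(self, policy):
        self.policy = policy  # e.g., a dict mapping state -> action

    def select_action(self, state):
        return self.policy[state]

def run_episode(env_step, agent, start_state, horizon=5):
    """Interaction loop: the agent acts, and the MDP (env_step) responds
    with a reward and the next state observation."""
    state, total_reward = start_state, 0.0
    for _ in range(horizon):
        action = agent.select_action(state)
        state, reward = env_step(state, action)
        total_reward += reward
    return total_reward
```

Any function with the signature (state, action) -> (next_state, reward), such as the `step` sketch earlier in this thread, can be plugged in as `env_step`.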
@atmadeeparya2454 · 4 years ago
Hi, this is extremely intuitive and super easy to understand. I was wondering if you could tell me what resources you used to learn this material? How do you learn material like this (your best practices), and how much time did it take you to learn it (for making deeplizard content)? Thanks a lot for making this content; awaiting your reply.
@deeplizard · 4 years ago
As formal resources, I used the book "Reinforcement Learning: An Introduction," second edition, by Richard Sutton and Andrew Barto, along with this DeepMind paper: www.cs.toronto.edu/~vmnih/docs/dqn.pdf. I also used various informal resources, like reading many blog articles, forums, etc.
@MrRynRules · 3 years ago
Thank you!
@qusayhamad7243 · 3 years ago
Thanks!
@dukedaffy5457 · 3 years ago
{
  "question": "Which is the correct order for the components of an MDP?",
  "choices": [
    "Agent--->Environment--->State--->Action--->Reward",
    "Environment--->Agent--->State--->Action--->Reward",
    "State--->Agent--->Environment--->Action--->Reward",
    "Agent--->State--->Environment--->Action--->Reward"
  ],
  "answer": "Agent--->Environment--->State--->Action--->Reward",
  "creator": "Duke Daffin",
  "creationDate": "2021-01-16T12:19:28.304Z"
}
@deeplizard · 3 years ago
Thanks, Duke! Just added your question to deeplizard.com/learn/video/my207WNoeyA :)
@drewwilkins9963 · 5 months ago
How do you represent the trajectory including the final state? Like this: S_0, A_0, R_1, S_1, A_1, R_2, …, R_T, S_T? If not, what is it, and why?
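In Sutton and Barto's notation (the book referenced elsewhere in this thread), an episodic trajectory does end with the terminal state, essentially as the comment guesses:

```latex
S_0, A_0, R_1, S_1, A_1, R_2, \dots, S_{T-1}, A_{T-1}, R_T, S_T
```

No action or reward follows $S_T$, since the episode terminates there; the final reward $R_T$ is the one received on entering $S_T$.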
@3maim · 6 years ago
Will you cover Q-learning in this series? I really like your tutorials, very well explained!
@deeplizard · 6 years ago
Hey Marius - Yes, Q-learning will be covered! Check out the syllabus video to see the full details for everything we'll be covering: kzbin.info/www/bejne/pKrNk5aIhpKoh5o
@3maim · 6 years ago
Super, thanks!
@avishekhbt · 6 years ago
Awesome!! Thanks! :)
@carlosromero-sn9nm · 5 years ago
Great video
@alevilghost · 3 years ago
Thanks for the subtitles in Spanish. 🤗
@actionchaplin149 · 4 years ago
Hey, thanks for the awesome videos. This is maybe a stupid question, but what's the difference between s and s'?
@deeplizard · 4 years ago
s' is the symbol we use in this episode to denote the next state that occurs after state s.
@ashabrar2435 · 3 years ago
{
  "question": "In an MDP, which component's role is to maximize the total reward R?",
  "choices": [
    "Agent",
    "State",
    "Action",
    "Reward"
  ],
  "answer": "Agent",
  "creator": "Hivemind",
  "creationDate": "2020-12-27T00:22:07.005Z"
}
@deeplizard · 3 years ago
Thanks, ash! Just added your question to deeplizard.com/learn/video/my207WNoeyA :)
@thinhdao7023 · 3 years ago
I am reading a paper that applies Q-learning to a repeated Cournot oligopoly game in economics, where the agents are firms choosing their level of production to maximize profit. I understand that in that environment, the actions are the possible output levels a firm can choose to produce. However, it is unclear to me what the states are in this situation. Could you please provide a further explanation for this case?
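One common choice in the repeated-Cournot Q-learning literature (an assumption here; the specific paper may define the state differently) is to let the state be the output levels all firms chose in the previous round. A hypothetical tabular sketch:

```python
import random
from collections import defaultdict

QUANTITIES = [0, 1, 2, 3]            # hypothetical discrete output levels
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def profit(my_q, rival_q, a=10.0, cost=1.0):
    """Hypothetical linear inverse demand: price = a - total output."""
    price = max(a - (my_q + rival_q), 0.0)
    return (price - cost) * my_q

Q = defaultdict(float)               # Q[(state, action)], zero-initialized
state = (0, 0)                       # state = last round's (my_q, rival_q)

for _ in range(10_000):
    # Epsilon-greedy choice of this round's output level
    if random.random() < EPSILON:
        my_q = random.choice(QUANTITIES)
    else:
        my_q = max(QUANTITIES, key=lambda q: Q[(state, q)])
    rival_q = random.choice(QUANTITIES)   # stand-in rival; could be a second learner
    reward = profit(my_q, rival_q)
    next_state = (my_q, rival_q)          # next state: this round's observed outputs
    best_next = max(Q[(next_state, q)] for q in QUANTITIES)
    Q[(state, my_q)] += ALPHA * (reward + GAMMA * best_next - Q[(state, my_q)])
    state = next_state
```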
@ArpitDhamija · 3 years ago
It's more like a podcast; it took me 20x longer to write down everything you said from the captions 😵
@animystic5970 · 3 years ago
Hi! Loved the video and I think I have a solid understanding of the MDP. But I'm having trouble making sense of the equation. Why is the LHS a probability and the RHS a set? And what does Pr stand for?
@deeplizard · 3 years ago
Thanks! Pr stands for "probability", so the RHS is a probability as well.
@animystic5970 · 3 years ago
@deeplizard Oh, now I see. It's an expansion of the same thing! Thanks for clarifying!
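For other readers with the same question: the equation under discussion is presumably the MDP dynamics function in Sutton and Barto's notation, where Pr denotes probability, so both sides are the same probability, just written out:

```latex
p(s', r \mid s, a) \doteq \Pr\{\, S_t = s',\ R_t = r \mid S_{t-1} = s,\ A_{t-1} = a \,\}
```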
@ns3lover779 · 6 years ago
Awesome, thank you.
@benvelloor · 5 years ago
Thank you.
@louerleseigneur4532 · 4 years ago
Thanks
@keshavsairam3615 · 2 years ago
Came to learn, but uh oh, I saw Dota.
@santoshkumarganji1801 · 4 years ago
Could you please provide any notes/PPT related to the MDP process?
@chyldstudios · 6 years ago
Will you be using OpenAI Gym to demonstrate reinforcement learning concepts?
@deeplizard · 6 years ago
Hey Chyld - Yes, we'll be utilizing OpenAI Gym once we get into coding! Check out the syllabus video to see the full details for everything we'll be covering: kzbin.info/www/bejne/pKrNk5aIhpKoh5o
@an_omega_wolf · 6 years ago
Dota
@thak456 · 4 years ago
When are you restarting?
@prathampandey9898 · 2 years ago
What is the difference between s and s' (s prime)?
@deeplizard · 2 years ago
s' denotes the next state, i.e., the state that follows state s.
@aaronbaron6468 · 4 years ago
I came here to learn about a topic and left sad that OG.JeRax and OG.ana aren't on the active roster; hopefully OG.Sumail will carry as well as ana did.
@christianliz-fonts3524 · 4 years ago
Where is the Discord link?
@designwithpicmaker2785 · 6 years ago
When are the next videos coming? Is there any schedule?
@deeplizard · 6 years ago
Hey navaneetha - Currently aiming to release a new video in this RL series at least every 3-4 days.
@mash-sings · 6 years ago
Could we please get the code files for free, just for students?
@deeplizard · 6 years ago
Hey Mayank - We currently don't have any systems in place to implement or track a setup like that. Just for clarity, note that all of the code will be fully shown in the videos, so the code itself is freely available. Also, the corresponding blogs for each video are freely available at deeplizard.com. The convenience of downloading the pre-written organized code files is what is available as a reward for members of the deeplizard hivemind. deeplizard.com/hivemind