Reinforcement Learning: on-policy vs off-policy algorithms

5,525 views

CodeEmporium

Let's talk about on-policy vs off-policy algorithms in reinforcement learning
ABOUT ME
⭕ Subscribe: kzbin.info...
📚 Medium Blog: / dataemporium
💻 Github: github.com/ajhalthor
👔 LinkedIn: / ajay-halthor-477974bb
RESOURCES
[1] Reinforcement Learning book: incompleteideas.net/book/RLboo...
[2] Paradigms of ML: idapgroup.com/blog/types-of-m...

Comments: 13
@MrFalk358 5 months ago
OK, I will indulge your quiz time questions, since your videos are really great!
Question 1: A is correct. It would not learn at all, since the target policy is the policy we are trying to learn. Fixing it means it never changes, so it stays random and nothing is learned.
Question 2: I'm not completely sure, but I would say B is correct, since SARSA uses its target policy both to choose the action and to "look" at the follow-up state (by taking the action according to the target policy).
Hope more people comment so the algorithm boosts your channel!
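The reasoning behind the answer to Question 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the video: the function names (`epsilon_greedy`, `sarsa_target`, `q_learning_target`), the toy Q-table, and the values of `gamma` and `epsilon` are all assumptions made up for the example. It contrasts the on-policy SARSA target, which bootstraps from the action the agent actually takes next, with the off-policy Q-learning target, which bootstraps from the greedy action regardless of what is taken.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: uses the action the same policy will take next."""
    return reward + gamma * q[next_state][next_action]


def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: uses the greedy action, whatever is taken next."""
    return reward + gamma * max(q[next_state])


# Hypothetical toy Q-table: 2 states x 2 actions (values are made up).
q = [[0.0, 0.0], [1.0, 5.0]]
rng = random.Random(0)
reward, gamma, next_state = 1.0, 0.9, 1

# SARSA: the policy that chooses actions also supplies the next action.
next_action = epsilon_greedy(q[next_state], epsilon=0.5, rng=rng)
print("SARSA target:     ", sarsa_target(q, reward, next_state, next_action, gamma))
print("Q-learning target:", q_learning_target(q, reward, next_state, gamma))
```

When the exploratory policy happens to pick the non-greedy action, the two targets differ; when it picks the greedy action, they coincide. That is the sense in which SARSA evaluates the same policy it follows.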
@CodeEmporium 5 months ago
Ding ding ding! You have been paying attention :) Also thanks a ton for indulging me here. I am trying new ways to make sure this content is engaging and educational at the same time. So the more people like yourself that participate, the more I see the value in this content.
@MrFalk358 5 months ago
@CodeEmporium I'm taking a course on RL at the moment that is quite disorganized; your content definitely helps a ton with understanding!
@0xabaki 3 months ago
@CodeEmporium I love quiz time! It felt best when professors would quiz us on topics so I could re-engage.
@mumbo2526 5 months ago
Amazing Video, thank you!
@zhezhe3351 1 month ago
Good video! There is a small typo on the summary page about on-policy.
@moaaathkhalil 5 months ago
Well explained!
@alonsovalderramahickmann940 5 months ago
Very nice video man
@kiranbade9481 1 month ago
well explained brother
@broccoli322 6 months ago
Thanks for the video! ☺
@CodeEmporium 6 months ago
You are very welcome :)
@hugeturnip3520 2 months ago
Thank you so much dude
@user-xv9qk3iz7b 3 months ago