Reinforcement Learning Made Simple - Policy

Views: 2,748

Edan Meyer

days ago

Comments: 6
@Maxwellpaulwall 1 year ago
Phenomenal content. Great work, very easy to understand.
@zuloo37 3 years ago
The Gaussian shape of the output makes sense if the output can be any real number, but if it's on a finite continuous range like [0, 1], wouldn't that make the probability densities at the endpoints unusually high and discontinuous (assuming you clip the output)? Would it make more sense to use something like a beta distribution for that kind of space?
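To put a number on the concern raised in this question, here is a minimal sketch that computes how much probability piles up on each endpoint when a Gaussian sample is clipped to [0, 1]. The mean and standard deviation (0.5 and 0.5) are arbitrary illustrative values, not taken from the video, and the function names are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def clipped_endpoint_mass(mu, sigma, low=0.0, high=1.0):
    """Probability that a clipped Gaussian sample lands exactly on each endpoint.

    Everything below `low` is mapped to `low`, and everything above `high`
    is mapped to `high`, so the tails turn into point masses.
    """
    mass_low = normal_cdf(low, mu, sigma)
    mass_high = 1.0 - normal_cdf(high, mu, sigma)
    return mass_low, mass_high

# Illustrative parameters only (an assumption, not from the source).
mass_low, mass_high = clipped_endpoint_mass(mu=0.5, sigma=0.5)
print(f"P(action == 0) = {mass_low:.4f}, P(action == 1) = {mass_high:.4f}")
```

With these values each endpoint is one standard deviation from the mean, so roughly 16% of samples end up on each boundary, which is the spike the question describes.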
@EdanMeyer 3 years ago
This is a really great question! You absolutely could use something like a Beta distribution. As far as I'm aware, there isn't any particular mathematical reason that OpenAI likes to use Gaussian distributions (other than their naturally convenient properties). You could even drop the distribution entirely and just predict a specific value, but predicting the distribution instead can help with exploration.
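As a rough illustration of the Beta alternative mentioned in this reply, the sketch below samples actions from a Gaussian and from a Beta distribution with the same mean of 0.9. In a real policy the parameters would come from a network; here they are fixed, hypothetical values chosen only for the comparison, and the helper names are made up for this example.

```python
import random

def sample_gaussian_action(mu, sigma, rng):
    """Sample from a Gaussian policy head; the result can be any real number."""
    return rng.gauss(mu, sigma)

def sample_beta_action(alpha, beta, rng):
    """Sample from a Beta policy head; the result always lies in [0, 1]."""
    return rng.betavariate(alpha, beta)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical parameters: both distributions have mean 0.9.
gaussian_actions = [sample_gaussian_action(0.9, 0.3, rng) for _ in range(10_000)]
beta_actions = [sample_beta_action(9.0, 1.0, rng) for _ in range(10_000)]

# Count how many Gaussian samples would need clipping for a [0, 1] action space.
out_of_range = sum(1 for a in gaussian_actions if a < 0.0 or a > 1.0)
print(f"Gaussian samples outside [0, 1]: {out_of_range} of {len(gaussian_actions)}")
print(f"Beta samples outside [0, 1]: {sum(1 for a in beta_actions if a < 0.0 or a > 1.0)}")
```

The Beta samples never leave the valid range, so no clipping is needed, while a sizeable share of the Gaussian samples would be pushed onto the boundary.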
@zuloo37 3 years ago
Is there any problem with lack of differentiability at the endpoints if you use a Gaussian on a clamped output range? Or does this system not use backprop?
@EdanMeyer 3 years ago
@zuloo37 Just thinking off the top of my head here, but one thing you can do if you are using a learning algorithm that requires differentiability is to perform the loss calculation on the unclipped output. But you can still clip it before you apply it to the environment, and that should work.
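The idea in this reply can be sketched as follows: the log-probability (and its gradient) is computed from the raw, unclipped sample, and only the copy sent to the environment is clipped. This is a minimal illustration under assumed values, not the method used in the video; `act`, `env_action`, and the parameter values are hypothetical, and the gradient is written out by hand instead of using an autodiff library.

```python
import math
import random

def gaussian_log_prob(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def grad_log_prob_wrt_mu(x, mu, sigma):
    """Derivative of the Gaussian log-density with respect to mu."""
    return (x - mu) / sigma ** 2

def act(mu, sigma, rng, low=0.0, high=1.0):
    """Sample an action, keeping the raw value for the loss and a clipped copy for the environment."""
    raw_action = rng.gauss(mu, sigma)
    env_action = min(max(raw_action, low), high)         # only this value reaches the environment
    log_prob = gaussian_log_prob(raw_action, mu, sigma)  # loss term uses the unclipped sample
    return raw_action, env_action, log_prob

rng = random.Random(0)
mu, sigma = 0.9, 0.3  # hypothetical policy outputs
raw_action, env_action, log_prob = act(mu, sigma, rng)

# The gradient is defined for any raw sample, including ones that were clipped.
grad_mu = grad_log_prob_wrt_mu(raw_action, mu, sigma)
print(f"raw={raw_action:.3f} env={env_action:.3f} log_prob={log_prob:.3f} dlogp/dmu={grad_mu:.3f}")
```

Because the clip never enters the loss, the non-differentiable points at the boundaries do not affect the gradient computation.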
@さやまめ-q6w 3 years ago
So talented 😲