Reinforcement Learning from scratch

Views: 28,648

Days ago

How does Reinforcement Learning work? A short cartoon that intuitively explains this amazing machine learning approach, and how it was used in AlphaGo and ChatGPT.
Part 1 of 3.
0:00 - intro
0:13 - pong
0:28 - the policy
0:51 - policy as neural network
1:32 - supervised learning
2:51 - reinforcement learning using policy gradient
4:24 - minimizing error using gradient descent
4:45 - probabilistic policy
5:01 - pong from pixels
6:58 - visualizing learned weights
8:18 - pointer to Karpathy "pong from pixels" blogpost

Comments: 36

@darthvader4899 a month ago

This video is super underrated. In fact, the whole channel is underrated.

@themathguy3149 6 months ago

Your Channel IS SO GREAT, I share with all my eng friends for you to get more visibility!

@metaljacket8102 22 days ago

This is really awesome! It's the best video that explains DRL in such an easy-to-understand way!

@tushargupta1999 a month ago

This video is amazing. You explained everything in such a simple manner. I am feeling really motivated to learn more about reinforcement learning and neural networks after watching this.

@a.aspden 7 months ago

Your videos are great. Looking forward to more!

@ashketchum1244 8 months ago

I don't know how I stumbled upon this video but that was very interesting and intuitive to understand. Thank you.

@marcinstrzesak346 7 months ago

Great video, very helpful, easy to understand.

@gmjammin4367 8 months ago

Amazing video as always :)!

@moldo800 3 months ago

Excellent. Congratulations ❤

@mado.madeleine 8 months ago

Super helpful! Thank you 🙏🏽

@jameslibby5215 7 months ago

Very very underrated channel

@benc7910 3 months ago

Underrated, two Rs

@jameslibby5215 3 months ago

@benc7910 thank ya sir

@nikbivation 8 months ago

thank you for this!

@cloudysh 22 days ago

This was so surprisingly great :3

@themax2go a month ago

agi: 1. ai develops understanding of win-loss conditions and sets policy params (inputs & actions) accordingly. 2. ai creates (= designs & builds) training env(s). 3. ai iterates, evals & adjusts policy parameters accordingly. 4. done (or validation run(s) w/ human(s))

@ireoluwaTH 8 months ago

Thank you!!!

@mohajeramir 16 days ago

Excellent

@CptDoge-rn3ou 6 months ago

I really like the way you visualize what you are talking about. Thank you for putting in the effort!

@kniv0gaffel 6 months ago

Brilliant

@solveigberling1662 a month ago

That was dope

@BlueBirdgg 8 months ago

Can you playlist each one of your topics plz? I wanted to post on Twitter(X) your video topics but could only post a single video at a time. Great content by the way. Ty very much. Your perspective on some topics helped me a lot to get a more intuitive understanding.

@g5min 8 months ago

Good idea!
Here's one on generative AI: kzbin.info/aero/PLWfDJ5nla8UoR8P7AGqVw7ZPjXajUFLMo
Here's one on reinforcement learning: kzbin.info/aero/PLWfDJ5nla8UoexEaLqVMw7q3Ft0vRYscL
Here's one on LLMs + text-to-image: kzbin.info/aero/PLWfDJ5nla8UoG2mvvHs_OS0asAKC5HJeu

@BlueBirdgg 8 months ago

@g5min Ty!