Reinforcement Learning: AlphaGo

8,556 views

Graphics in 5 Minutes

1 day ago

How AlphaGo works, based on Reinforcement Learning.
Part 2 of RL from scratch series.
• Reinforcement Learning...
0:00 - intro
0:06 - how to play Go
0:21 - introducing AlphaGo
0:46 - analyzing expert games
2:17 - training an expert policy
2:47 - value functions
4:05 - search trees
5:42 - reinforcement learning
6:17 - AlphaGo's value function
7:47 - AlphaZero
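
The value-function and reinforcement-learning chapters listed above can be illustrated with a toy sketch. This is not AlphaGo's method: it assumes a tiny Nim-style game (take 1 to 3 stones, whoever takes the last stone wins) instead of Go, a lookup table instead of a neural network, and no search tree. All names (legal_moves, choose_move, self_play_train) are hypothetical. It only shows the general idea of estimating the chance of winning from each state by playing many games against oneself.

```python
import random

def legal_moves(stones):
    # A player may take 1, 2, or 3 stones, but never more than remain.
    return [m for m in (1, 2, 3) if m <= stones]

def choose_move(stones, value, epsilon, rng):
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        # Explore: occasionally try a random move.
        return rng.choice(moves)
    # Exploit: leave the opponent in the state where they are least likely to win.
    return min(moves, key=lambda m: value[stones - m])

def self_play_train(start=10, games=20000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # value[s]: estimated win probability for the player to move with s stones left.
    value = [0.5] * (start + 1)
    value[0] = 0.0  # no stones left: the player to move has already lost
    visits = [0] * (start + 1)
    for _ in range(games):
        stones = rng.randint(1, start)
        history = []  # (state, player to move) pairs seen in this game
        player = 0
        while stones > 0:
            history.append((stones, player))
            stones -= choose_move(stones, value, epsilon, rng)
            player = 1 - player
        winner = 1 - player  # the player who just moved took the last stone
        for s, p in history:
            # Running average of game outcomes seen from state s.
            visits[s] += 1
            outcome = 1.0 if p == winner else 0.0
            value[s] += (outcome - value[s]) / visits[s]
    return value

if __name__ == "__main__":
    for s, v in enumerate(self_play_train()):
        print(s, round(v, 2))
```

In this toy game the learned table should rate piles of 4 and 8 stones as bad for the player to move and the other pile sizes as good, which matches the known strategy for this variant of Nim. The same policy is used for both sides, which is the self-play idea in miniature.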

Comments: 2
@ireoluwaTH 8 months ago
Thank you for these rather clear explanations!
@onhazrat 8 months ago
🎯 Key Takeaways for quick navigation:
00:41 🧠 AlphaGo, the Go-playing AI, learns from human experts by analyzing prior games and then plays millions of games against itself using reinforcement learning to improve.
02:25 🤖 A policy neural network is trained to predict good moves based on the state of the Go board.
03:41 🌐 The value function estimates the likelihood of winning from a given state, helping the AI plan ahead and make strategic moves.
06:10 🔄 AlphaGo uses reinforcement learning to refine its move policy and value estimation through self-play, simulating millions of games.
07:51 🤯 AlphaZero, a newer approach, relies solely on reinforcement learning and is even more advanced, eliminating the need for learning from human experts.
Made with HARPA AI