From AlphaGo to MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

7,505 views

Harvard CMSA

3/31/2021 Colloquium
Speaker: Thore Graepel (DeepMind/UCL)
Title: From AlphaGo to MuZero - Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Abstract: Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the MuZero algorithm which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. MuZero learns a model that, when applied iteratively, predicts the quantities most directly relevant to planning: the reward, the action-selection policy, and the value function. When evaluated on 57 different Atari games - the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled - our new algorithm achieved a new state of the art. When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
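The abstract's statement that the learned model, "when applied iteratively," predicts reward, policy, and value can be illustrated with a small sketch. What follows is not DeepMind's implementation: the class `ToyMuZeroModel`, the function `unroll`, and all dimensions are hypothetical names chosen for illustration, and the weights are random and untrained. It assumes the three-part structure usually described for MuZero (a representation function, a dynamics function, and a prediction function) and shows only how the model is unrolled along a sequence of actions without ever querying a simulator. The tree search and the training loop are omitted.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; stands in for a trained network (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax, so the policy sums to 1."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMuZeroModel:
    """Hypothetical stand-in for the learned model; weights are untrained."""

    def __init__(self, obs_dim, hidden_dim, num_actions, seed=0):
        rng = random.Random(seed)
        self.num_actions = num_actions
        self.w_repr = make_matrix(hidden_dim, obs_dim, rng)
        self.w_dyn = make_matrix(hidden_dim, hidden_dim + num_actions, rng)
        self.w_reward = make_matrix(1, hidden_dim + num_actions, rng)
        self.w_policy = make_matrix(num_actions, hidden_dim, rng)
        self.w_value = make_matrix(1, hidden_dim, rng)

    def represent(self, observation):
        """Representation: observation -> hidden state."""
        return [math.tanh(x) for x in matvec(self.w_repr, observation)]

    def dynamics(self, state, action):
        """Dynamics: (hidden state, action) -> (next hidden state, reward)."""
        one_hot = [1.0 if i == action else 0.0 for i in range(self.num_actions)]
        joined = state + one_hot
        next_state = [math.tanh(x) for x in matvec(self.w_dyn, joined)]
        reward = matvec(self.w_reward, joined)[0]
        return next_state, reward

    def predict(self, state):
        """Prediction: hidden state -> (policy, value)."""
        policy = softmax(matvec(self.w_policy, state))
        value = matvec(self.w_value, state)[0]
        return policy, value


def unroll(model, observation, actions):
    """Apply the model iteratively along a hypothetical action sequence."""
    state = model.represent(observation)
    policy, value = model.predict(state)
    # No reward is predicted for the root; 0.0 is a placeholder.
    steps = [{"reward": 0.0, "policy": policy, "value": value}]
    for action in actions:
        state, reward = model.dynamics(state, action)
        policy, value = model.predict(state)
        steps.append({"reward": reward, "policy": policy, "value": value})
    return steps


model = ToyMuZeroModel(obs_dim=4, hidden_dim=8, num_actions=3)
steps = unroll(model, [0.1, 0.2, 0.3, 0.4], actions=[0, 2, 1])
```

Each entry in `steps` holds the three quantities the abstract names. In the full algorithm, as the abstract describes, a tree-based search would query such a model instead of a perfect simulator.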

Comments: 7
@edupignatelli 2 years ago
Thore's explanations are always as clear as if he were explaining to a five-year-old child, and as respectful as if he were speaking at a UN meeting, both at the same time.
@JimSlattery 27 days ago
26:10 this part really stuck with me. We see all the handcrafted expert logic in the Stockfish engine, and yet machine learning can achieve all of that and more in an automated way. This is amazing technology!
@DrTune 2 years ago
Great vid, very approachable. Thanks!
@martinsosmucnieks8515 2 years ago
But very great vid!!!
@martinsosmucnieks8515 2 years ago
Why Mu?
@jeshweedleon3960 2 years ago
Pretty sure "mu" is Japanese for "emptiness" or "absence", referring to the fact that MuZero isn't given hardcoded rules.
@themanwhoknewtoomuch6667 1 year ago
@@jeshweedleon3960 Hofstadterish. Computer scientists sure have a siccc sense of humor and cool artistic taste!