Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation

38,235 views

AI Prism

Comments: 15
@mansurZ01 5 years ago
1:12 Outline
1:36 Approaching New Problems
2:00 When you have a new algorithm
4:50 When you have a new task
6:21 POMDP design
9:31 Run baselines
10:56 Run algorithms reproduced from paper with more samples than stated
13:00 Ongoing development and tuning
13:18 Don't be satisfied if it works
14:50 Continually benchmark your code
15:25 Always use multiple random seeds
17:10 Always be ablating
18:21 Automate experiments
19:17 Question on frameworks for tracking experiment results
19:47 General tuning strategies for RL
19:58 Standardizing data
22:17 Generally important hyperparameters
25:10 General RL Diagnostics
26:15 Policy Gradient strategies
26:21 Entropy
27:02 KL
28:07 Explained variance (see the sketch below)
29:41 Policy initialization
30:21 Q-learning strategies
31:27 Miscellaneous advice
35:00 Questions
35:21 How long to wait until deciding whether code works or not
36:18 Unit tests
37:35 What algorithm to choose
39:28 Recommendations on older textbooks
40:27 Comment on evolution strategies and the OpenAI blog post on it
43:49 Favorite hyperparameter search framework
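One of the diagnostics in the outline above, explained variance (28:07), is easy to add to any training loop. Below is a minimal sketch of that diagnostic, not taken from the lecture itself; the function name and the `value_preds`/`returns` arrays are illustrative assumptions about how rollout data is stored.

```python
import numpy as np

def explained_variance(value_preds, returns):
    # Explained variance = 1 - Var[returns - value_preds] / Var[returns]
    # ~1  : the value-function baseline predicts the empirical returns well
    # <=0 : the baseline is no better than predicting a constant
    value_preds = np.asarray(value_preds, dtype=np.float64)
    returns = np.asarray(returns, dtype=np.float64)
    var_returns = np.var(returns)
    if var_returns == 0:
        return float("nan")  # undefined when all returns are identical
    return 1.0 - np.var(returns - value_preds) / var_returns
```

Tracking this number over training, alongside the entropy and KL diagnostics also listed above, gives a quick read on whether the value-function baseline is learning anything useful.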
@TheAIEpiphany 3 years ago
I love John's presenting style; he's super positive and enthusiastic. Great tips, thank you!
@agarwalaksbad 7 years ago
This is a super useful lecture. Thanks, John!
@FalguniDasShuvo 2 years ago
Wow! I love how simply John conveys great ideas. Very interesting lecture!
@SinaEbrahimi-ee3fq 6 months ago
Awesome talk! Still very relevant!
@sclmath5 7 years ago
What a number to end the video, 44:44.
@ProfessionalTycoons 6 years ago
This was a great talk.
@zhenghaopeng6633 4 years ago
Hi there! Can I upload this lecture to Bilibili, a popular YouTube-like video site in China? Many students there would love access to this insightful talk! Thanks!
@piyushjaininventor 1 year ago
Maybe view it on YouTube? It's free :)
@BahriddinAbdiev 6 years ago
We (3 students) are exploring DQN and its variants, i.e. Double DQN, Double Dueling DQN, Prioritized Experience Replay, etc. There is one thing we are all facing: even when it converges, if you run it long enough it diverges again at some point. Is this normal, or should it converge and stay there (or keep improving)? Cheers!
@alexanderyau6347 6 years ago
Hi, I think it's normal, but I don't know why it happens. Maybe the model learned too much and became stupid, LOL.
@yoloswaggins2161 6 years ago
No, this is not supposed to happen. I've seen it happen for a couple of reasons, but the most common is people scaling by a standard deviation that gets very close to 0 because of too much similar data.
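To make that failure mode concrete, here is a minimal sketch (mine, not from this thread or the lecture) of online data standardization with an epsilon floor; the `RunningStandardizer` name and API are illustrative assumptions. With a long stretch of near-identical data the running std approaches 0, and without the floor the scaled values blow up, which can produce exactly the kind of late-training divergence described above.

```python
import numpy as np

class RunningStandardizer:
    """Online mean/std tracker (Welford's algorithm) with an epsilon floor."""

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.eps = eps

    def update(self, x):
        # Welford's online update keeps the mean/variance numerically stable
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def scale(self, x):
        var = self.m2 / max(self.count - 1, 1)
        std = np.sqrt(var)
        # The eps floor is the point: if the data is nearly constant,
        # std -> 0 and (x - mean) / std would otherwise explode.
        return (x - self.mean) / max(std, self.eps)
```

Usage would just be calling `update` on each incoming reward or observation feature and `scale` when feeding the value to the network.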
@georgeivanchyk9376 4 years ago
If you cut all the times he said 'ah', the video would be two times shorter.