The Alberta Plan for AI Research: Tea Time Talk with Richard S. Sutton

6,589 views

Amii

Comments: 13
@borntobemild- 2 years ago
When you found the gizmo, it was a good metaphor on how you are freeing up the agent in the world with technology.
@Gabriel-oi6zb 2 years ago
Minute 11: Our interaction with the environment is not continual. There are special training periods: sleep -- a crucial step in all mammals, might even extend back to all vertebrates.
@erkinalp 1 year ago
There are total insomniacs who cannot sleep for years, but they do not exhibit significant learning-related disabilities. Hence sleep should not be considered the only factor for unlearning the falsehoods.
@Gabriel-oi6zb 1 year ago
@erkinalp You might want to wiki that: total insomnia (also called fatal insomnia because you die from it) causes hallucinations.
@schok51 1 year ago
@erkinalp Sources? I thought sleep deprivation and disorders were pretty universally harmful to cognitive abilities. You cannot simply not sleep and be healthy and functional.
@howtobe7460 1 year ago
This entire comment section looks AI generated 😂😂
@judgeomega 2 years ago
It seems a contradiction to say you want a model with no domain knowledge that still has a reward function. Doesn't knowledge of a reward imply knowledge of the domain of that reward? The amount of knowledge in the universe is nigh infinite, and we need that reward to anchor our focus on just that which has utility with respect to our goals (rewards).
@schok51 1 year ago
I guess that's just semantics and that the point is that the reward function should encode all that is relevant about the domain?
@LionKimbro 1 year ago
I was wondering the very same thing. What's your reward function? With ChatGPT, the score comes from "did I predict the next word accurately?" I have no idea what this system is going to use. One possibility is -- is it going to be an auto-decoder? Don't know.
@ArtOfTheProblem 11 months ago
Here I believe he means that the "value function" defines the reward, specifically whether things are getting better or worse. It's not inputting an external reward. Reward is part of perception and is learned by the value function (if you understand TD learning).
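For readers who have not met TD learning, the idea mentioned in the comment above can be sketched in a few lines. This is a minimal tabular TD(0) sketch and is not taken from the talk or its slides. The five-state random walk, the function name `td0_random_walk`, and all parameter values are assumptions chosen for illustration. In this textbook form the reward arrives as a signal on each step, and the value function learns to predict the discounted sum of future reward; the TD error is positive when things turn out better than predicted and negative when they turn out worse.

```python
import random


def td0_random_walk(num_states=5, episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Tabular TD(0) value prediction on a hypothetical random walk.

    States are 0..num_states-1. Each episode starts in the middle state and
    moves left or right with equal probability. Stepping off the right end
    yields reward 1, stepping off the left end yields reward 0.
    """
    rng = random.Random(seed)
    values = [0.5] * num_states  # initial value estimate for every state

    for _ in range(episodes):
        state = num_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))

            if next_state < 0:
                # Terminated on the left: no reward, terminal value is 0.
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= num_states:
                # Terminated on the right: reward 1, terminal value is 0.
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # TD error: how much better or worse the step turned out
            # compared with the current prediction for this state.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    for index, value in enumerate(td0_random_walk()):
        print(f"state {index}: estimated value {value:.3f}")
```

With an undiscounted return, the true value of each state in this walk is the probability of exiting on the right, so the estimates should drift toward 1/6, 2/6, and so on up to 5/6, with some noise because the step size is constant.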
@googm 1 year ago
"paltry worries like THE ECONOMY IS IN TROUBLE".
@richsutton366 2 years ago
The slides can be found on my web site richsutton.com.