Reinforcement Learning in DeepSeek R1 Explained

7,100 views

AGI Lambda

A day ago

Comments: 8
@AGI.Lambdaa 17 hours ago
Join our Discord server for discussions on topics related to AI, especially its future and the development of AGI. discord.com/invite/pBj2MwHhVk Note: By definition, the state 's' and the observation 'o' are different, but for this video you can treat them as the same thing. I used both letters, 'o' and 's', by mistake.
@AbdallahBoukouffallah 5 days ago
The fact that KZbin suggested this means its algorithm is working well. This is one of the few videos that talk about R1 and actually go into the details. Keep it up, bro.
@SahimKazi-cl2hz 5 days ago
Thank you for making this kind of video. It is very helpful for beginners, and I have now subscribed to the channel.
@SphoenixBlue 3 days ago
For the future, can you please link the paper in the description for reference?
@sunshineda666 4 days ago
Thanks for this video, it was easy to follow!👍 Could you share the link to the prerequisite video? I didn't find it on your channel, only some similar ones (which also mention the same video...), so I'm not sure whether I missed anything. Thanks in advance.
@AGI.Lambdaa 3 days ago
kzbin.info/www/bejne/h6itlHlmbc2ZqZosi=1X7foNRWbuXedfip and for reinforcement learning: "Reinforcement Learning: An Introduction" by Sutton and Barto.
@sunshineda666 3 days ago
@AGI.Lambdaa Oh, thanks! My bad, I saw it but under a different name, so I must have overlooked it. Thanks again for your video. FOLLOWED :)