Reinforcement Learning in DeepSeek R1 Explained

Views: 7,100

Days ago

Comments: 8

@AGI.Lambdaa 17 hours ago

Join our Discord server for discussions on topics related to AI, especially its future and the development of AGI. discord.com/invite/pBj2MwHhVk Note: By definition, state 's' and observation 'o' are different, but for this video you can treat them as the same thing. I used both letters 'o' and 's' by mistake.

@AbdallahBoukouffallah 5 days ago

The fact that KZbin suggested this means its algorithm is working well. This is one of the few videos that talk about R1 and actually go into the details. Keep it up, bro!

@SahimKazi-cl2hz 5 days ago

Thank you for making this kind of video. It is very helpful for beginners, and I have now subscribed to the channel.

@SphoenixBlue 3 days ago

In the future, can you please link the paper in the description for reference?

@sunshineda666 4 days ago

Thanks for this video, easy to follow!👍 Could you share the link to the prerequisite video? I didn't find it on your channel, only some similar ones (which also mention the same video...), so I'm not sure whether I missed anything. Thanks in advance.

@AGI.Lambdaa 3 days ago

kzbin.info/www/bejne/h6itlHlmbc2ZqZosi=1X7foNRWbuXedfip and for reinforcement learning: Reinforcement Learning: An Introduction by Sutton and Barto.

@sunshineda666 3 days ago

@AGI.Lambdaa Oh, thanks! My bad, I saw it but under a different name, so I must have overlooked it. Thanks again for your video. FOLLOWED :)