Everything You Wanted to Know About LLM Post-Training, with Nathan Lambert of Allen Institute for AI

  2,309 views

Cognitive Revolution "How AI Changes Everything"

A day ago

Comments
@IvanBudiselic 6 days ago
Nathan, the quality of your guests and conversations is so high that this is literally the only channel on all of KZbin where I tolerate in-video ads. Though I have to say that the in-video ads are also the only explanation I can find for why this channel doesn't already have 100k+ subs :)
@alexm1815 5 days ago
This is my favorite episode yet, incredibly information dense. Thank you Nathan^2!
@zyzhang1130 5 days ago
Very juicy content that is really lacking elsewhere (as far as I know)
@AR-iu7tf 19 hours ago
As others have already said below, this is the most substantive and informative conversation on post-training I have seen to date. Thank you so much for shedding light on an area that is almost like a black box now! All we can find online is tidbits of speculation. You mention a paper on Verifier RL. I couldn't find a link to it online; perhaps it is not published yet? Could you please share it if it is? Also, I know we can only speculate about what o1 or Deepseek is doing for the reasoning sequences, but would it be fair to assume that during training they are doing some form of reward model/verifier feedback at intermediate stages of a sequence that leads to a correct result, as opposed to just one reward signal for an entire sentence like what ChatGPT (perhaps!) does? In other words, is it likely to be a Bellman update over all the tokens at the end of the sequence, or at intermediate stages? Also, thank you so much for clarifying how the single reward value at the end is converted to individual rewards for each token that constituted that sentence.