12:12 "This something we did in our project" Are you talking about the allenai open instruct CoT model, which is on Hugging Face?
@interconnects 17 days ago
Tulu 3 is trained with RLVR, but it's a general chat model, not a reasoning-focused model.
@noa-h5b 6 days ago
Hi Nathan, thanks for the videos you are creating! I wanted to ask for your opinion. I want to get back into research, this time on LLMs (I did a PhD in information retrieval in 2014 and have not done research since), and I am quite lost with all the new papers and research topics. So I decided to get into RLHF, but I don't know which directions are worth exploring or which research questions to tackle. If you have any recommendations on that, I would appreciate them. Thanks for your help!
@interconnects 6 days ago
I have an upcoming post on the blog about character training and other things I'm interested in. Generally, you have to commit to one thing and keep at it through the noise.