Scaling Test Time Compute: How o3-Style Reasoning Works (+ Open Source Implementation)

2,476 views

A day ago

Comments

@wwkk4964 3 days ago

This was amazing! Thank you for sharing! This is waiting to explode as soon as we have a single breakthrough in the cost of compute. Everything, no matter how small, will be able to hack away and reason about a task.

@AdamLucek 3 days ago

Intelligence too cheap to measure is the hope! Given the insane competition in the space currently, I suspect we'll see similar downward cost trends as we have with regular input/output token consumption

@wwkk4964 3 days ago

@AdamLucek Yes, it seems biology is a slower, scaled-up version of test time compute working at the cellular and somatic level. Some experiments by Michael Levin of Tufts show their lab used bioelectric signalling to cells in animals like frogs and planaria to have them grow or regrow completely novel morphology just by repeatedly interfering with their "ongoing computation" about what to build. I imagine we will be able to harness biological computation in nature and set it off to solve tasks for indefinite periods of time.

@bentouss3445 11 hours ago

Your videos are so full of useful information, thank you so much! Love them

@edwardtbaum2169 3 days ago

Love your videos man. You have a good grasp on the concepts. I've been going back through your catalogue of videos to help me brainstorm some of my own ideas.

@AdamLucek 3 days ago

Thanks! Hope they help!

@tspis 2 days ago

Subscribed immediately - excellent content & packed with value!

@adomicarts 3 days ago

Very informative and well explained, Adam

@AdamLucek 3 days ago

Thanks for watching!

@MaJetiGizzle 3 days ago

10:23 What about the original STaR paper? That seems to be a likely candidate as well, hence the original Q-Star codename for o1.

@AdamLucek 3 days ago

That’s a good one too! It’s likely a mix of a bunch of things, including the STaR methodology. I chose to highlight SCoRe since their experiments showed clear improvements over STaR, but as you’ve pointed out, it’s hard to tell exactly which without knowing!

@lavamonkeymc 3 days ago

This is awesome, great video man. Can u do a vid on more advanced training of embedders for RAG or Graph RAG?

@AdamLucek 3 days ago

🤔

@JJBoi8708 2 days ago

Are we able to do this locally with quants on a MacBook Pro M4 Pro?

@tobywoolsey7844 2 days ago

Wouldn’t you think the OpenAI o-series models are more likely to use the “search against a verifier” method than the self-refinement method? I understand why the first method, “self-refinement”, sounds most probable, but you’re looking at “search against a verifier” in a way that doesn’t consider the possibility that training would work something like this: the LLM generates multiple thought steps or “actions” for a given state, which can be a previous thought step of a query. Then the verifier, likely the LLM itself, picks the best one, and this happens iteratively until a tree-like structure is formed. The LLM then checks the tree for errors before picking the best path! Then the LLM is rewarded if the response from that path is correct. This way you don’t require a separate verifier. Trust me, I started with the Microsoft Everything of Thoughts paper, and I can’t remember the names, but similar papers have come out since then that replace the verifier with the LLM itself. Also, self-refinement can be part of this method if you introduce a graph-like structure instead of a tree-like structure.
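
For anyone who wants to see the loop described above in code, here is a minimal toy sketch in Python. It is only an illustration of the idea in this comment, not how any OpenAI model is known to work. Everything in it is a hypothetical stand-in: `propose_steps` plays the role of the LLM proposing candidate thought steps, `self_score` plays the role of the same LLM judging its own candidates, and the "task" is just reaching a target number with simple arithmetic moves. The tree-wide error check mentioned above is left out to keep it short.

```python
from typing import List, Tuple

# A path is the sequence of "thought steps" taken so far (toy: running integers).
Path = Tuple[int, ...]


def propose_steps(path: Path) -> List[int]:
    """Hypothetical stand-in for the LLM proposing candidate next thoughts."""
    current = path[-1]
    return [current + 1, current * 2, current - 3]


def self_score(value: int, target: int) -> float:
    """Hypothetical stand-in for the LLM scoring its own candidates.

    Higher is better; here it is just negative distance to the target.
    """
    return -float(abs(target - value))


def tree_search(start: int, target: int, depth: int, beam: int = 2) -> Path:
    """Grow a tree level by level, keeping the `beam` best-scored paths."""
    frontier: List[Path] = [(start,)]
    for _ in range(depth):
        # Expand every kept path with every proposed next step.
        candidates = [p + (nxt,) for p in frontier for nxt in propose_steps(p)]
        # Drop exact duplicate paths while preserving order.
        candidates = list(dict.fromkeys(candidates))
        # The "verifier" (the same model, in this sketch) ranks the candidates.
        candidates.sort(key=lambda p: self_score(p[-1], target), reverse=True)
        frontier = candidates[:beam]
        if frontier[0][-1] == target:
            break
    return frontier[0]


def outcome_reward(path: Path, target: int) -> float:
    """Reward of 1.0 only if the chosen path ends at the correct answer."""
    return 1.0 if path[-1] == target else 0.0


if __name__ == "__main__":
    best = tree_search(start=1, target=10, depth=5)
    print(best, outcome_reward(best, target=10))
```

In a real setup the reward from `outcome_reward` would feed a training update, and swapping the tree for a graph (letting paths revisit and revise earlier steps) is where self-refinement could come in, as the comment suggests.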