With the automatic audio dubbing from YouTube/Google, you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@AdenFromEden · a month ago
No support for Arabic dubbing 😥😥
@NaveenReddy-p5j · a month ago
Impressive deep dive into o3 reasoning! The step-by-step guide on building datasets and aligning SFT/RL is very useful. Kudos! #datascience #o3
@tokenranxomizsr · a month ago
I like your theme colors! As always, great presentation!
@AlexJohnson-g4n · a month ago
Solid approach to the o3 training shift. Interested in seeing the impacts on LLM safety and performance.
@Apollomuscles · 23 days ago
It's preventing derogatory statements and isn't even related to safety. This shit is soft.
@CharlotteLopez-n3i · a month ago
A how-to on building o3-style training datasets would be invaluable for implementing this in 7B LLMs. Insightful on o3 reasoning! Aligned SFT and RL training procedures are key. Thanks!
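For concreteness, here is a minimal sketch of what one chain-of-thought SFT record could look like; the JSONL layout and the field names (prompt, cot, response) are illustrative assumptions, not a format from the video:

```python
import json

# Hypothetical JSONL record for chain-of-thought SFT data.
# Field names (prompt, cot, response) are illustrative assumptions.
record = {
    "prompt": "If 3x + 5 = 20, what is x?",
    "cot": "Subtract 5 from both sides: 3x = 15. Divide by 3: x = 5.",
    "response": "x = 5",
}

# Append one record per line, the usual JSONL convention for SFT corpora.
with open("sft_cot_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```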
@kaio0777 · a month ago
Man, you are on top of things.
@msokokokokokok · a month ago
@28:00 We should evolve SFT CoT into RL CoT via rewarding the response instead of evolving response generation. The current approach is <CoT>..</CoT> <Response>..</Response>, where content under <CoT> is learnt by SFT and content under <Response> is learnt by RL. But we should instead learn content under <CoT> by RL, and content under <Response> should just be used to compute the reward, as KL divergence from the optimal response.
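One way to read this proposal as code: train the CoT tokens with RL, and score each rollout by how close the response distribution it induces is to an optimal (reference) response. A minimal PyTorch sketch; the tensor shapes, the KL direction, and the reference model are my assumptions:

```python
import torch
import torch.nn.functional as F

def kl_reward(policy_logits: torch.Tensor,
              reference_logits: torch.Tensor) -> torch.Tensor:
    """Reward for an RL-trained CoT: negative KL divergence between the
    policy's response-token distribution (conditioned on its own CoT)
    and a reference 'optimal response' distribution.
    Both tensors: (response_len, vocab_size)."""
    log_p = F.log_softmax(policy_logits, dim=-1)     # policy log-probs
    log_q = F.log_softmax(reference_logits, dim=-1)  # reference log-probs
    # KL(policy || reference), summed over vocab, averaged over positions
    kl = (log_p.exp() * (log_p - log_q)).sum(dim=-1).mean()
    return -kl  # smaller divergence from the optimal response => higher reward

# Toy usage: random logits stand in for the two models' outputs.
reward = kl_reward(torch.randn(10, 32000), torch.randn(10, 32000))
```

A CoT that steers the model toward the optimal response then earns a high reward, without the response tokens themselves being gradient targets.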
@KitcloudkickerJr · a month ago
As far as I see it, o3 did not fail the test at 30:40. That question is ambiguous at best, and the answer it gave was logical. That was one of the 9 it slipped on, and it needs to be reevaluated.
@ramiroramirez8503 · a month ago
I disagree. For an AGI, ambiguous questions like this shouldn't cause it to stumble. As humans, we have the ability to see through ambiguity and still come up with logical answers. That's the whole point of tests like the ARC-AGI benchmark: to evaluate whether an AGI can reason abstractly and handle uncertainty in the same way we do. If the question seems ambiguous, it's even more important for the AGI to step up and demonstrate its ability to interpret and respond intelligently, just like a human would in these contexts.
@KitcloudkickerJr · a month ago
@ramiroramirez8503 Its answer was right but was marked wrong by humans. So who's not general intelligence?
@wwkk4964 · a month ago
The question is stupid. We have 3 examples, and they cover 2 types of cases: a pair of points, and two pairs of points. Then a third set is presented to "generalize". It's literally asking us to presume that whatever happens with 2 coordinates is what will happen with 3. It's ridiculous, and I am happy that o3 didn't give a non-general answer to a not-so-well-posed problem.
@saiderico · a month ago
@code4AI How do automated theorem-proving programs work? I don't know much about this, but it seems to me that this is the direction to look in. 1. Can an AI model replace such a program? 2. Is it possible to translate a chain of thought into some formal logic and work with this data from a formal point of view?
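On question 2, a minimal sketch of the idea: translate one chain-of-thought step into propositional logic and check it mechanically, here with the Z3 SMT solver (the example statement is mine, for illustration):

```python
from z3 import And, Bool, Implies, Not, Solver, unsat

# CoT step in natural language: "If it rains, the ground gets wet.
# It is raining. Therefore the ground is wet."
rains = Bool("rains")
wet = Bool("wet")

premises = And(Implies(rains, wet), rains)
conclusion = wet

# An inference is valid iff (premises AND NOT conclusion) is unsatisfiable.
s = Solver()
s.add(And(premises, Not(conclusion)))
print("valid" if s.check() == unsat else "invalid")  # prints: valid
```

Classical automated theorem provers search for such proofs or refutations directly; the open question is whether an LLM's informal chain can be reliably autoformalized into statements like these.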
@davidwynter6856 · a month ago
At 23:30 you say that the OpenAI o1 API has access to Python etc. to solve for mathematical correctness. My question is: how does o1 know to use Python in the cases where it would be an advantage in solving for mathematical correctness, or for any other problem where a programmatic approach is an advantage?
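For what it's worth, the publicly documented pattern (OpenAI's internal mechanism isn't disclosed) is a tool-calling loop: the model is told a Python tool exists, learns from training on tool-use traces to emit a structured call whenever computation would help, and the harness executes the code and feeds the result back. A minimal sketch; call_llm is a hypothetical stand-in for any chat-model API, hard-coded here so the loop runs end to end:

```python
import json
import subprocess
import sys

def call_llm(messages):
    """Hypothetical stand-in for a chat-model API. Hard-coded: the first
    turn emits a Python tool call, the next turn answers in plain text."""
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "python", "code": "print(2**100)"})
    return "2^100 = " + messages[-1]["content"].strip()

def run_python(code: str) -> str:
    # Execute model-written code in a subprocess; a real harness would sandbox it.
    out = subprocess.run([sys.executable, "-c", code],
                         capture_output=True, text=True, timeout=10)
    return out.stdout or out.stderr

def agent_loop(messages):
    while True:
        reply = call_llm(messages)
        try:
            call = json.loads(reply)   # structured tool call?
        except json.JSONDecodeError:
            return reply               # plain text: final answer
        result = run_python(call["code"])
        messages.append({"role": "tool", "content": result})

print(agent_loop([{"role": "user", "content": "What is 2**100?"}]))
```

So "knowing to use Python" comes down to the model having learned when emitting such a call beats answering directly.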
@wwkk4964 · a month ago
30:51 You are saying o3 was not intelligent because it noticed that whatever happens with 2 pairs and 1 pair cannot be generalized to 3 pairs? Come on. Imagine the case involved the x, y, z axes. Could we easily claim we know what will happen based on planar examples? It's ridiculous.