o3 Inference Reasoning: How to Build the Training Data Set

3,473 views

Discover AI

1 day ago

Comments: 16
@code4AI
@code4AI 1 month ago
With the automatic audio dubbing from YouTube/Google, you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@AdenFromEden
@AdenFromEden 1 month ago
No support for Arabic dubbing 😥😥
@NaveenReddy-p5j
@NaveenReddy-p5j 1 month ago
Impressive deep dive into o3 reasoning! The step-by-step guide on building datasets and aligning SFT/RL is very useful. Kudos! #datascience #o3
@tokenranxomizsr
@tokenranxomizsr 1 month ago
I like your theme colors! As always, great presentation!
@AlexJohnson-g4n
@AlexJohnson-g4n 1 month ago
Solid approach to the o3 training shift. Interested in seeing the impacts on LLM safety and performance.
@Apollomuscles
@Apollomuscles 23 days ago
It's preventing derogatory statements and isn't even related to safety. This is soft.
@CharlotteLopez-n3i
@CharlotteLopez-n3i 1 month ago
A how-to on building o3 training datasets? Invaluable for implementing this in 7B LLMs. Insightful on o3 reasoning! Aligned SFT and RL training procedures are key. Thanks!
@kaio0777
@kaio0777 1 month ago
Man, you are on top of things.
@msokokokokokok
@msokokokokokok 1 month ago
@28:00 We should evolve SFT CoT into RL CoT by rewarding the response instead of evolving response generation. The current approach is <CoT>...</CoT> <Response>...</Response>, where the content under <CoT> is learnt by SFT and the content under <Response> is learnt by RL. Instead, we should learn the content under <CoT> by RL, and the content under <Response> should just be used to compute the reward as the KL divergence from the optimal response.
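To make the reward proposed in that comment concrete, here is a minimal sketch, assuming PyTorch and placeholder logits standing in for a policy model and a reference ("optimal") response; the shapes, models, and function name are hypothetical, not the training setup from the video:

```python
# Sketch: score a model's response distribution by its KL divergence from a
# reference ("optimal") response distribution, and use the negative divergence
# as the RL reward signal for the chain of thought that produced it.
import torch
import torch.nn.functional as F

def kl_divergence_reward(policy_logits: torch.Tensor,
                         reference_logits: torch.Tensor) -> torch.Tensor:
    """Both tensors: (batch, seq_len, vocab). Returns one reward per sample."""
    policy_logprobs = F.log_softmax(policy_logits, dim=-1)
    reference_probs = F.softmax(reference_logits, dim=-1)
    # KL(reference || policy) per token: sum over vocab, then average over tokens
    kl_per_token = (reference_probs *
                    (reference_probs.clamp_min(1e-12).log() - policy_logprobs)
                    ).sum(dim=-1)
    kl_per_sample = kl_per_token.mean(dim=-1)
    return -kl_per_sample  # lower divergence from the reference -> higher reward

# Usage (random tensors stand in for real model outputs):
policy_logits = torch.randn(2, 16, 32000)
reference_logits = torch.randn(2, 16, 32000)
rewards = kl_divergence_reward(policy_logits, reference_logits)
print(rewards.shape)  # torch.Size([2])
```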
@KitcloudkickerJr
@KitcloudkickerJr 1 month ago
As far as I see it, o3 did not fail the test at 30:40. That question is ambiguous at best, and the answer it gave was logical. That was one of the 9 it slipped on, and it needs to be re-evaluated.
@ramiroramirez8503
@ramiroramirez8503 1 month ago
I disagree. For an AGI, ambiguous questions like this shouldn't cause it to stumble. As humans, we have the ability to see through ambiguity and still come up with logical answers. That's the whole point of tests like the ARC AGI benchmark: to evaluate whether an AGI can reason abstractly and handle uncertainty in the same way we do. If the question seems ambiguous, it's even more important for the AGI to step up and demonstrate its ability to interpret and respond intelligently, just like a human would in this context.
@KitcloudkickerJr
@KitcloudkickerJr 1 month ago
@ramiroramirez8503 Its answer was right but was marked wrong by humans. So who's not the general intelligence?
@wwkk4964
@wwkk4964 1 month ago
The question is stupid. We have 3 examples, and they cover 2 types of cases: a pair of points, and two pairs of points. Then a third set is presented to "generalize". It's literally asking us to presume that whatever happens with 2 coordinates is what will happen with 3. It's ridiculous, and I am happy that o3 didn't give a non-general answer to a not-so-well-posed problem.
@saiderico
@saiderico 1 month ago
@code4AI How do automated theorem proving programs work? I don't know much about this, but it seems to me that this is the direction to look in. 1. Can an AI model replace such a program? 2. Is it possible to translate a chain of thoughts into some formal logic and work with this data from a formal point of view?
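On the second question, here is a toy illustration of what translating a single chain-of-thought step into formal logic and checking it with an automated prover could look like. It uses the Z3 SMT solver via its Python bindings; the example statement is made up, and real CoT-to-logic translation remains an open research problem:

```python
# Encode one reasoning step as propositional logic and let Z3 verify it.
from z3 import Bools, Implies, And, Not, Solver, unsat

rains, wet = Bools("rains wet")

premises = And(Implies(rains, wet),  # "if it rains, the ground is wet"
               rains)                # "it rains"
conclusion = wet                     # "therefore, the ground is wet"

# The step is valid iff (premises AND NOT conclusion) is unsatisfiable.
s = Solver()
s.add(And(premises, Not(conclusion)))
print("step is valid" if s.check() == unsat else "step does not follow")
```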
@davidwynter6856
@davidwynter6856 1 month ago
At 23:30 you say that the OpenAI o1 API has access to Python etc. to solve for mathematical correctness. My question is: how does o1 know to use Python in the cases where it would be an advantage in solving for mathematical correctness, or for any other problem where a programmatic approach is an advantage?
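One common pattern (not necessarily how o1 is wired internally) is to expose a code-execution tool through the function-calling interface and let the model decide, per request, whether to emit a tool call or answer directly. A minimal, hypothetical sketch with the OpenAI Python client follows; the run_python tool, its prompt, and the model name are placeholders:

```python
# Sketch: offer a code-execution tool and let the model choose when to use it.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "run_python",  # hypothetical sandboxed interpreter
        "description": ("Execute Python code and return stdout. Use for "
                        "arithmetic, symbolic math, or anything better "
                        "solved programmatically."),
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "What is 17**23 mod 1000?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:            # the model chose to use the tool
    print(message.tool_calls[0].function.arguments)
else:                             # the model answered directly
    print(message.content)
```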
@wwkk4964
@wwkk4964 1 month ago
30:51 You are saying o3 was not intelligent because it noticed that whatever happens with 2 pairs and 1 pair cannot be generalized to 3 pairs? Come on. Imagine the case were the x, y, z axes: can we easily claim we know what will happen based on planar examples? It's ridiculous.