Learning to Reason with LLMs

Views: 10,895

A day ago

Noam Brown (OpenAI)
simons.berkele...
Transformers as a Computational Model
Large language models (LLMs) have demonstrated remarkable capabilities in generating coherent text and completing various natural language tasks. Nevertheless, their ability to perform complex, general reasoning has remained limited. In this talk, I will describe OpenAI's new o1 model, an LLM trained via reinforcement learning to generate a hidden chain of thought before its response. We have found that the performance of o1 consistently improves with more reinforcement learning compute and with more inference compute. o1 surpasses previous state-of-the-art models in a variety of benchmarks that require reasoning, including mathematics competitions, programming contests, and advanced science question sets. I will discuss the implications of scaling this paradigm even further.

Comments: 9

@labsanta 2 months ago

00:03 Learning from an AI expert on reasoning and game play
02:06 Evolution of poker bots and model scaling in competitions
06:20 Improving search scalability in poker algorithms
08:30 Use of search and planning in improving poker bot performance
12:36 Search algorithm made a 100,000x difference in poker and Go
14:44 Scaling up models for performance improvement
18:42 Consensus or majority voting can improve performance on exams using GPT-4
20:35 Scalability of inference compute leads to significant performance improvements
24:52 Machine learning moving towards reasoning-based models
26:42 Deciphering difficult codes using reasoning
30:41 Games provide ground truth for verifying winning states
32:40 Different approaches to allocating compute between model training and testing impact Elo rating
36:28 Effective algorithms leverage increased compute for long-term success
38:28 OpenAI launching a new multi-agent reasoning team and hiring strong engineers for research
42:35 Significant impact of scaling up inference compute
44:26 Need for restructuring academic research
48:25 Exploring different approaches for inference compute
50:11 Introducing controllable thinking time for more effective reasoning

@jasongrig 25 days ago

good presentation

@user-wr4yl7tx3w 2 months ago

But what does it mean for a model to think? Think like humans? “I think, therefore I am” type of think. Think is not defined.

@souvikbhattacharyya2480 1 month ago

24:54 sad

@athalais9332 2 months ago

Originally wrote a comment as a review of the talk, but when I re-read it, it felt a bit too mean. Instead, just have my personal recommendation that if you're limited on time, the other talks in this workshop are a great watch!

@nicksohacki7114 2 months ago

@randomuser5237 2 months ago

Literally no one cares about your review or recommendation, stop spamming the comment section.

@user-wr4yl7tx3w 2 months ago

It’s not mean if you provide valid reasons. Otherwise, it’s called woke. This is science. Not a tea party.

@islandfireballkill 2 months ago

Here is a meta review that isn't afraid to be mean. Your review sucks and contains little information, none of it useful. It's like it was generated by some highly censored LLM.