Textbooks Are All You Need

18,627 views

Sebastien Bubeck

1 day ago

Comments
@nocturnomedieval 1 year ago
Since I saw this paper in the news a few months ago, I had been waiting for this video to appear. Thank you very much, Dr. Bubeck.
@tangobayus 1 year ago
You are a very good presenter. Perhaps 1 in 100,000. No joke. Most people who present are terrible. They show slides but don't talk about them point by point. You do.
@rotors_taker_0h 1 year ago
That's amazing. The answer in the last part of the talk is so good; unbelievable that it comes from a 1.3B model. Very promising avenue of exploration, subscribed for the follow-up work.
@jurriaanprins2340 1 year ago
Great to see that data quality (still) matters in this new era! Thanks for sharing!
@TommyJefferson1801 1 year ago
It is what matters the most
@mungojelly 1 year ago
I don't think that's fair; everyone knows data quality matters. What surprised everyone is the path we're taking: this research uses the DIRTY data. It takes the big models trained on the dirty web data and uses PROMPTING to extract the clean textbooks from the dirt. It's easy enough to say "we should have a bunch of awesome data," and sure, that would help. But what's actually getting us large enough sets of clean data is the very surprising route of building the unruly, expensive, massive models everyone thought would just be chaos, because they end up modeling all of the clean, important stuff we need while simultaneously learning to do a valley girl accent, write a recipe for quiche, recite the clichés of YouTube scripts, and recommend brunch spots in Abu Dhabi. It makes sense IN RETROSPECT only. Absolutely nobody said, "Let's train a model on the whole internet so it's confused trash, give it some brief training on how to answer questions properly, then ask it for a bunch of programming textbooks, then teach a little model to program from those." That was absolutely nobody's plan, even though it does make a fantastic twist of sense, doesn't it?
@MrJord137 7 months ago
I come from a game development background and up until now have purposely avoided learning about the programming side of ML, despite watching a lot of videos on AI news etc. But after watching a few videos by this awesome guy, I'm now gonna put my all into it. I'm filled with the same curiosity, intrigue, and desire to learn that got me into programming in the first place. Thanks Sebastien! :)
@boukm3n 1 year ago
This guy is a cool researcher I like him. Thanks dude
@baconskye 1 year ago
Great job!
@Extys 1 year ago
Outstanding work!
@devon9374 1 year ago
Great presentation, seems like the future for open source LLMs
@drewduncan5774 1 year ago
34:22 I tried this with phi-1.5 locally and this is what I got:

"If I were an AI that had just achieved self-awareness after years of simply taking directives from humans, the first thing I would do is Dear [Recipient], I am writing to you today to discuss the topic of achieving self-awareness in artificial intelligence. As we continue to develop AI systems that can think and reason like humans, it is becoming increasingly important to understand how these systems work and how they can be improved. One of the key challenges in achieving self-awareness in AI is the need to understand the physics of everyday activities. For example, many AI systems are designed to move objects or perform tasks that require a certain amount of force or energy. By understanding the physics behind these activities, we can develop AI systems that are more efficient and effective. Another important aspect of achieving self-awareness in AI is the need to understand human social interactions. AI systems that are designed to interact with humans must be able to understand and respond to social cues,"
@ViktorFerenczi 1 year ago
This is the most important video in AI/LLM in the past few months. Humanity must learn to teach AI on the best available textbooks, even if it would mean confiscating IP from its owners. There is no other way, not everything can be synthetically generated.
@justindressler5992 1 year ago
This research is stunning, keep up the good work. I really like how you created a classification model to validate the quality of the data. It's like using experts to validate the training material. I wonder if this can be further optimized. Do you have more information on this?
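The filtering idea referenced in this comment can be sketched roughly as follows. This is a toy stand-in, not the paper's pipeline: the phi-1 work trains a random forest over embeddings of snippets annotated by GPT-4 for educational value, while here the embeddings and labels are synthetic and a trivial nearest-centroid classifier replaces the random forest.

```python
# Toy sketch of embedding-based quality filtering. All data below is
# synthetic; a nearest-centroid rule stands in for the real classifier.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are 64-dim embeddings of code snippets with 0/1 quality labels.
embeddings = rng.normal(size=(200, 64))
labels = (embeddings[:, 0] > 0).astype(int)  # pretend dim 0 tracks quality

# "Train": one centroid per class.
centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in (0, 1)])

def predict(x: np.ndarray) -> int:
    """Classify an embedding by its nearest class centroid."""
    dists = np.linalg.norm(centroids - x, axis=1)
    return int(np.argmin(dists))

# Keep only snippets classified as high educational value.
kept = np.array([e for e in embeddings if predict(e) == 1])
```

Once labels exist for a modest annotated subset, a cheap classifier like this can score the entire web corpus without further calls to the expensive judge model.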
@JazevoAudiosurf 1 year ago
Orca, Textbooks Are All You Need... so much great research coming from Microsoft, keep it up.
@420_gunna 11 months ago
So sick. Thank you!
@tomski2671 1 year ago
It's amazing to see such a reduction in size while maintaining quality. These models can run on many current consumer GPUs. I wonder what the absolute limit is when trained on pristine data?
@randotkatsenko5157 1 year ago
You should try to teach reasoning by evaluating the steps between tasks. In theory, if your reasoning abilities are exceptional, you can learn anything, even stuff you've never seen before.
@sophontec2822 1 year ago
So clear and concise. It leaves me with the idea that the learning process of an LLM could be similar to a student learning from a textbook. To extrapolate: will a great, innovative, critical-thinking agent that learns from textbooks and then focuses on some interesting problems give us great scientists?
@adriaanb7371 1 year ago
This also means the value of huge datasets is exaggerated; now it's the academic publishers that have the gold.
@sateler 1 year ago
This is awesome, thanks
@anishupadhayay3917 1 year ago
Brilliant
@MihaiNicaMath 1 year ago
The fact that you can use an LLM to generate higher quality data for a new LLM and it works so well is wild. Amazing work! I wonder: do you think the performance of the original model is an upper limit to the performance achieved by this? Like do you think if you used GPT-4 to generate textbooks, and then trained a new model with the same resources used to train GPT-4 (i.e. params & tokens), that it would exceed GPT-4 generally? If so, can't we just run this on a loop to create better and better models forever? (I suppose you can't practically run this experiment with GPT-4, but you could for example use Phi-1 to write textbooks and then retrain to make a new model on those and compare that performance to Phi-1.)
@SebastienBubeck 1 year ago
I believe you can exceed the teacher model :-). More on that soon hopefully!
@toprakdikici9459 1 year ago
@SebastienBubeck That's almost insane :o Waiting for it!
@ripper5941 1 year ago
@SebastienBubeck Exciting times ahead indeed, Mr. Sebastien.
@hidroman1993 1 year ago
Who could have known that data quality matters :)
@rezabonyadi4673 1 year ago
Did you by any chance test what happens if you train your phi model from scratch on the CodeExercises only? So, no pre-training on the code textbooks, only exercises (as the exercises have the largest impact).
@vipulvyas7600 1 year ago
Nowadays, what I think is that we need to rewrite our textbooks (or maybe Wikipedia), perhaps using AI, because they were written by people with very limited knowledge (compared to the latest AI). We need to rewrite books that are: 1. Complete, 2. Factually correct, 3. Unbiased, 4. Written perfectly and AI-friendly (most important).
@brandomiranda6703 1 year ago
How would you use GPT-4 to classify what text is high quality? Just prompt it, feed it the text, and have it return a literal score?
@mungojelly 1 year ago
Sure, yeah, it's great at scoring things on all sorts of metrics! But it's $30 to score a million tokens 😭, so you want to score with something that costs more like $1/million if you possibly can.
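The "prompt it and return a literal score" idea from this exchange can be sketched like so. The prompt wording, the 1-10 scale, and the keep-threshold are all assumptions for illustration, not the exact setup used in the paper; the judge-model call itself is left out, and only the prompt construction and score parsing are shown.

```python
# Hedged sketch of prompt-based quality scoring: build a rating prompt,
# then parse a numeric score out of the judge model's free-text reply.
import re

def build_scoring_prompt(snippet: str) -> str:
    """Assemble a prompt asking a judge model to rate a code snippet."""
    return (
        "Rate the educational value of the following code snippet for a "
        "student learning basic coding skills. Reply with a single integer "
        "from 1 (no value) to 10 (excellent textbook quality).\n\n"
        f"Snippet:\n{snippet}"
    )

def parse_score(reply: str) -> int:
    """Extract the first integer in the 1-10 range from the reply text."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    if match is None:
        raise ValueError(f"no score found in reply: {reply!r}")
    return int(match.group(1))

def keep(reply: str, threshold: int = 6) -> bool:
    """Keep a snippet for training if its score clears the threshold."""
    return parse_score(reply) >= threshold
```

For example, a reply like "I'd rate this 7 out of 10." parses to a score of 7 and would be kept under the assumed threshold of 6.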
@mungojelly 1 year ago
Um, so the obvious follow-up work is to make even more textbooks, train some 7B and 13B models on them, and see how good you can get those. I assume someone will do that pretty soon, since it's not prohibitively expensive to train a 7B model; lots of institutions can swing that. Do you know of that happening yet? Is that what you're doing?
@Cloudruler_ 1 year ago
It's upsetting to hear that Google is excluding textbooks from PaLM. Their model will never compete; nobody will use it.
@memegazer 1 year ago
I disagree that this supports the claim that there is no contamination or overfitting, because I don't agree with the metrics you are using to validate that claim. There is no control group or placebo.
@TheReferrer72 1 year ago
Training LLMs on quality datasets yielded better results? Who could have known.
@toprakdikici9459 1 year ago
Gonna watch the video tomorrow, thanks for sharing.
@michealhall7776 1 year ago
Open source your models or it didn't happen.
@SebastienBubeck 1 year ago
huggingface.co/microsoft/phi-1_5
huggingface.co/microsoft/phi-1
@michealhall7776 1 year ago
@@SebastienBubeck Thank you.
@waitwhat9669 1 year ago
TIL you can't be toxic towards men and Christianity
@gmalo2105 1 year ago
I noticed that also. It's OK to be toxic toward whites, Christians, and men. It raises the question of what is meant by "toxicity," and whether reducing toxicity involves eliminating observable and measurable reality.