[1hr Talk] Intro to Large Language Models

Views: 1,786,159

Andrej Karpathy

1 day ago

This is a 1 hour general-audience introduction to Large Language Models: the core technical component behind systems like ChatGPT, Claude, and Bard. What they are, where they are headed, comparisons and analogies to present-day operating systems, and some of the security-related challenges of this new computing paradigm.
As of November 2023 (this field moves fast!).
Context: This video is based on the slides of a talk I gave recently at the AI Security Summit. The talk was not recorded, but a lot of people came to me after and told me they liked it. Seeing as I had already put in one long weekend of work to make the slides, I decided to just tune them a bit, record this round 2 of the talk and upload it here on KZbin. Pardon the random background, that's my hotel room during the Thanksgiving break.
- Slides as PDF: drive.google.com/file/d/1pxx_... (42MB)
- Slides as Keynote: drive.google.com/file/d/1FPUp... (140MB)
Few things I wish I said (I'll add items here as they come up):
- The dreams and hallucinations do not get fixed with finetuning. Finetuning just "directs" the dreams into "helpful assistant dreams". Always be careful with what LLMs tell you, especially if they are telling you something from memory alone. That said, similar to a human, if the LLM used browsing or retrieval and the answer made its way into the "working memory" of its context window, you can trust the LLM a bit more to process that information into the final answer. But TLDR right now, do not trust what LLMs say or do. For example, in the tools section, I'd always recommend double-checking the math/code the LLM did.
- How does the LLM use a tool like the browser? It emits special words, e.g. |BROWSER|. When the code "above" that is inferencing the LLM detects these words it captures the output that follows, sends it off to a tool, comes back with the result and continues the generation. How does the LLM know to emit these special words? Finetuning datasets teach it how and when to browse, by example. And/or the instructions for tool use can also be automatically placed in the context window (in the “system message”).
- You might also enjoy my 2015 blog post "Unreasonable Effectiveness of Recurrent Neural Networks". The way we obtain base models today is pretty much identical on a high level, except the RNN is swapped for a Transformer. karpathy.github.io/2015/05/21/...
- What is in the run.c file? A bit more full-featured 1000-line version here: github.com/karpathy/llama2.c/...
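The tool-use loop described in the browser item above can be sketched roughly as follows. This is a minimal illustration, not how any particular product implements it: `fake_llm`, `fake_browser`, the `[RESULT]` tag and the exact marker format are hypothetical stand-ins, and only the `|BROWSER|` marker itself comes from the description.

```python
import re

# The description gives "|BROWSER|" as an example special word. The rest of
# this format (the query following the marker on the same line) is an assumption.
TOOL_PATTERN = re.compile(r"\|BROWSER\|\s*(.+)")


def fake_llm(context: str) -> str:
    """Hypothetical stand-in for a model: asks to browse once, then answers."""
    if "[RESULT]" not in context:
        return "|BROWSER| population of Iceland"
    return "Based on the search result, the answer is in the context above."


def fake_browser(query: str) -> str:
    """Hypothetical stand-in for a real browsing tool."""
    return f"(search results for: {query})"


def generate_with_tools(prompt: str, max_steps: int = 5) -> str:
    """The code "above" the LLM: watch for the special word and run the tool."""
    context = prompt
    for _ in range(max_steps):
        output = fake_llm(context)
        match = TOOL_PATTERN.search(output)
        if match is None:
            # No special word emitted, so this is the final answer.
            return output
        # Capture what follows the marker, send it to the tool, and place the
        # result back into the context window before continuing generation.
        result = fake_browser(match.group(1).strip())
        context += f"\n{output}\n[RESULT] {result}"
    return context


print(generate_with_tools("How many people live in Iceland?"))
```

The point of the sketch is that the model only ever produces text; the surrounding program decides what a special word means and what gets written back into the context.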
Chapters:
Part 1: LLMs
00:00:00 Intro: Large Language Model (LLM) talk
00:00:20 LLM Inference
00:04:17 LLM Training
00:08:58 LLM dreams
00:11:22 How do they work?
00:14:14 Finetuning into an Assistant
00:17:52 Summary so far
00:21:05 Appendix: Comparisons, Labeling docs, RLHF, Synthetic data, Leaderboard
Part 2: Future of LLMs
00:25:43 LLM Scaling Laws
00:27:43 Tool Use (Browser, Calculator, Interpreter, DALL-E)
00:33:32 Multimodality (Vision, Audio)
00:35:00 Thinking, System 1/2
00:38:02 Self-improvement, LLM AlphaGo
00:40:45 LLM Customization, GPTs store
00:42:15 LLM OS
Part 3: LLM Security
00:45:43 LLM Security Intro
00:46:14 Jailbreaks
00:51:30 Prompt Injection
00:56:23 Data poisoning
00:58:37 LLM Security conclusions
End
00:59:23 Outro

Comments: 1,600
@namanmenezes1434 5 months ago
Andrej is doing more for the AI community through his videos than entire companies
@royhasiani9005 5 months ago
Right on!
@dwrtz 5 months ago
He represents the "Open" in OpenAI. More please!
@be_present_now 5 months ago
While others quarrel for power and control, Andrej is cool calm and educating the masses on important things that matter. If Altman is the leader of the classes then Andrej is the leader of the masses (learners and folks of the AI community in the future).
@19Ronin95 5 months ago
or universities
@gandev 5 months ago
Indeed! And let us not forget Andrew Ng. They are democratizing the knowledge and understanding of AI across the globe. Respect!
@jeffwads 5 months ago
This guy is a gem to the world.
@hadgadma3589 21 days ago
He once saved my family of 24 kids from hunger
@BAIR68 5 months ago
I am a college professor and I am learning from Andrej how to teach. Every time I watch his videos, I learn not only the content but also how to deliver any topic effectively. I would vote him as the best “AI teacher in KZbin”. Salute to Andrej for his outstanding lectures.
@tjayoub 4 months ago
I was also taking note of his delivery. I also found it very effective and think he’s an outstanding communicator. I think this talk could easily be consumed by a non technical viewer yet still engage those who are quite familiar with the technical underpinnings.
@bleacherz7503 4 months ago
He is a perfect balance of big picture n drill down
@Snail641 3 months ago
lol quit ur job
@aldotanca9430 3 months ago
He is very effective, no doubt.
@channelvalitug9086 1 month ago
These are the type of tech guys you want to work with. Unfortunately, there's only 5% of them because 95% of them are arrogant.
@LucaSimonetti 5 months ago
I just love how Andrej loves what he's doing. He's chill, makes jokes and laughs about bugs. I can understand much more seeing code for ten minutes rather than reading tens of hours of medium articles
@ai.simplified.. 5 months ago
I love him too, he’s not like Ilya, Sam and others of this era
@saliherenyuzbasoglu5819 21 days ago
@@ai.simplified.. ilya is great too
@agamemnonc 5 months ago
Andrej is hands-down one of the best ML educators out there. What a gift this guy is for all of us.
@ambition112 4 months ago
0:16: 🎥 A talk on large language models and the Llama 270b model.
4:42: 💻 Training the 4.42 model involves collecting a large chunk of text from the internet, using a GPU cluster for computational workloads, and compressing the text into parameters.
9:25: 📚 A neural network is trained on web pages and can generate text that resembles different types of documents.
13:47: 🧠 The video discusses the process of training neural networks and obtaining assistant models.
18:31: 💻 Creating an AI assistant involves a computationally expensive initial stage followed by a cheaper fine training stage.
23:09: 🤖 Language models can be used to generate sample answers, check work, and create comparisons.
27:50: 🔍 Using a concrete example, the video discusses the capabilities of language models and how they evolve over time.
32:25: 🔑 The video explains how AI language models like GPT-3 can be used to generate images based on natural language descriptions.
36:49: 🗣 The video discusses the concept of large language models and the possibility of converting time into accuracy in language processing.
41:21: 🔧 The video discusses the customization options available for large language models like ChatGPT.
46:18: 🔒 Language models like GPT-3 can be vulnerable to jailbreak attacks, where they bypass safety measures and provide harmful information.
50:49: 🔒 The video discusses two types of attacks on large language models: noise pattern injection and prompt injection.
55:34: 🔒 The video discusses the risks of prompt injection attacks and data exfiltration through Google Apps Scripts.
Recapped using Tammy AI
@RC-br1ps 4 months ago
Thank you! Your effort is much appreciated.
@Yusuf-sy6rb 4 months ago
Not 270 billion....
@kishcool 4 months ago
It's Llama 2 - 70b model
@uk7769 4 months ago
thank you
@kiyonmcdowell5603 2 months ago
What's the difference between large language and text to speech
@stefanmangold6512 5 months ago
Dear Andrej, I cannot stress enough the value of this wonderful presentation. I am sharing it with all my peers. Thank you so much for this.
@whoisbhauji 5 months ago
it's at a right level for developers who know some things (i.e. training/inference etc) but not more. Fully practical too!
@irshviralvideo 5 months ago
you are welcome stefan ! i love writing and talking about this stuff !
@DistortedV12 5 months ago
This was more like an advertisement for OpenAI but go off
@irshviralvideo 5 months ago
@@DistortedV12 More like for scale AI
@caydendunn8404 5 months ago
It’s insane to me that this content is freely accessible online. Great stuff Andrej hope you continue to post more lectures!
@the3rdworlder293 5 months ago
You're soooo good at simplifying these complex topics.. thank you for everything you do for us Andrej
@artmusic6937 5 months ago
He's so good at simplifying because he has a lot of knowledge in this space. He can break it down to simple words.
@webgpu 5 months ago
Andrej is indeed an awesome guy.
@wires__ 5 months ago
The fact that one of the leaders in AI has the care to make videos for everyday people to gain understanding of AI and the coming technology shifts is incredible. Thank you Andrej, you are greatly appreciated by many, more than you may realize.
@aryanrahman3212 5 months ago
You know when someone makes a topic so accessible and understandable you feel like you're hearing a story but learning a lot. This happened in this video.
@johnnypeck 5 months ago
Your teaching style always gets through to me. Calm and pointed. This is exciting. - Edit: The LLM as OS followed by how to convince it to do anything you want. Wow. And ChatGPT does sound like SJ from "HER" when you speak to it even though it swears it's an amalgamation of voices. It's great. Thanks again for sharing. You rock.
@user-rp2pf5lk2n 5 months ago
I'm setting aside a daily one hour on my schedule to learn from Andrej otherwise this guy is everything that I need for my carrier development. Thanks Andrej Karpathy.
@AncientPrayers 5 months ago
Career development * good luck 👍 😊
@user-rp2pf5lk2n 5 months ago
@@AncientPrayers oh thanks!
@rednafi 5 months ago
Hands down, this and Simon Willison’s “Catching up with the weird world of LLMs” are two of the best introductory talks on this topic I’ve seen so far!
@computervisionetc 5 months ago
I myself have a PhD in this field, but your clarity of thought is far greater than mine. Thank you for this video.
@9486985440 1 month ago
He too has a PHD I mean...we are talking about Karpathy.
@mz4637 25 days ago
WHOA big fuckin BOSS
@tyronefrielinghaus3467 5 months ago
I'm 10 min into the video : and I'm already learning SO MUCH. I've never had LLMs explained with examples like this before. Wow! Clears up SO MUCH confusion from rather 'muddy' explanations I've seen before. THANK YOU ANDREJ.
@asatorftw 5 months ago
You absolute mad lad! As a "former" web developer trying to pivot into AI, your videos have been absolutely amazing in giving me hope that it's not too late for me to pivot. And here you are giving out even more wisdom, what impeccable timing. Thank you! Ps: Instantly shared on Twitter =D
@jebinmathewv 5 months ago
hey @asatorftw I'm new/green/wet-behind-ears to AI/DL/ML - it caught my attention that you are trying to pivot. Same here but from a different field. Keen to connect and share/learn from each other on pivot strategies.
@jebinmathewv 5 months ago
following @andrej karpathy is of course on that list :) thank you for this Andrej.
@joeschmidt6597 5 months ago
Unless you have or will have MS/PhD in CS or EE don’t even bother trying to get a job pivoting to AI.
@asatorftw 5 months ago
@@joeschmidt6597 Can you elaborate your quite strong opinion a bit more?
@bananawarriorwootwoot 5 months ago
@@asatorftw What joe is saying is that AI is a field where higher education is *almost* crucial. In a world where companies are talking about degrees being unnecessary, there are a select few fields which require degrees and one of which is Artificial Intelligence. Is it possible to become an AI engineer with zero relevant degrees? I guess, but the ones I've met all say that it's highly recommended that you get a Masters or PhD. I've seen very few people who are against degrees for AI. Also the degrees are not just CS, but mostly from Math and Electrical Engineering. I mean if you can get an MS/PhD in Electrical Engineering, you'd be golden. I've once heard Mark Zuckerberg say that he would hire someone with an EE background than a CS background. Andrej Karpathy here did his PhD at Stanford. I've learned that Stanford is very popular for AI given how Andrew Ng ( The guy who started Google Brain ) works as an Adjunct Professor.
@sbanerjee2005 4 months ago
I am just completely blown away by this presentation. This is after watching 100s of such videos like this. No one comes even close. Andrej Karpathy you are the BEST!!!! Thank you so much for creating and sharing.
@lgvivqzt 5 months ago
It's incredible how good an educator Andrej is. You are able to distill info in a way that's extremely easy to understand. Thanks!
@windproxy4362 5 months ago
Your skill to break these complex things down into something I can actually understand and follow for an hour with full concentration is amazing. Absolutely incredible. The start is so great with the two files. Now I _know_ what an LLM is. Thank you
@genghis360 5 months ago
Thanks a lot for the video! Truly appreciate taking time out to create these videos!
@dilyanadjv 5 months ago
This is amazing, thank you for the efforts and time spent learning and simplifying! I've been looking for such sort of an expertise video for so long. Keep them coming, please.
@benjaminwootton 5 months ago
This is one of the best KZbin videos I’ve ever seen. Such an accessible explanation of a broad and complex topic. Brilliant!
@alvilabs 5 months ago
Wow, this is amazing! Your explanation is super clear and to the point - exactly what we need in the ongoing Q* debate. I'm especially impressed with your take on System 2 and its self-improvement. It really feels like you're making strides in this field. Keep up the fantastic work! 🌟
@AzaB2C 5 months ago
Nice! Thanks for the clear description, slides and time index details. Awesome.
@vivinvijayan 3 months ago
You are an absolute gem for putting this content out for free. Great all round summary.
@RappingManualYT 5 months ago
I love when experts explain stuff. It's the vast knowledge that allows them to simplify concepts to the point, where you can follow, track and learn the functioning of complex systems. Thank you, Andrej! All of us here on KZbin truly appreciate the time and effort you spent on creating this presentation and helping us learn.
@abrarsalekinraiyan3170 5 months ago
Finished watching your makemore videos a few weeks ago, and was wondering when you would have time to make another series like that again. Really love this new video :D
@sid-prod 5 months ago
never seen anyone explained it in such a detail but easy to understand way, you da best sir
@2TallTremaine 1 month ago
This was absolutely incredible. Thank you so much - it's been so hard to find meaningful educational info on this topic that isn't a master's degree in analytics! This was so well presented that it really highlights how well you know what you're talking about!
@prasannaprabhakar1323 5 months ago
Thanks for the video! I really admire the pace at which you speak, steady and clear instilling in us a sense of clarity and confidence that this technology is exciting and a game changer. Thanks a lot for your time, Andrej!
@Priyendu 5 months ago
Andrej, your intro to LLMs was a fantastic watch! The security aspects were particularly insightful and well-presented. Thanks for sharing your expertise with us!
@greatbigships4260 5 months ago
Andrej is the GOAT. I remember his blog post on the Unreasonable Effectiveness of RNNs and thought, wow this is going to be our path into the future. His CS courses online inspired hundreds of thousands. Andrej is the hero we don't deserve. And hopefully his ethos of shared knowledge and community will be embedded in the AGI we are racing towards meeting.
@samson_77 5 months ago
Excellent talk, really well structured and well presented. Probably the best intro to LLM's out there.
@sureshkm 1 month ago
Thank you, Andrej Karpathy, for your incredibly clear and thorough introduction to LLM. Your ability to simplify complex concepts makes learning so much more accessible for everyone. Looking forward to diving deeper into this exciting field with your guidance!
@agenticmark 5 months ago
I'll watch just about anything where Andrej is leading - this was probably the coolest video he has released yet. I really enjoyed the end with security!
@nav3622 5 months ago
Appreciate you taking the time to do this, Andrej
@myfolder4561 4 months ago
Thank you! This is one of the most informative and easy to follow pieces of this subject matter ever appeared on the internet. Andrej is so knowledgeable and such a good teacher it feels like this is from a family member at dinner who happens to be an AI expert who's trying to explain this to me, instead of trying to overwhelm or impress me with an excess of technical terms. Great content!
@saugatbhattarai327 2 months ago
Thank you Andrej Karpathy. Following you since the Stanford lectures. I am a big fan of your teaching style. Thank you for sharing knowledge for free.
@prepthenoodles 5 months ago
🎯 Key Takeaways for quick navigation:
00:00 🤖 *Introduction to large language models*
- Large language models are made of two files: a parameters file with the neural network weights, and a run file that runs the neural network
- To obtain the parameters, models are trained on 10+ terabytes of internet text data using thousands of GPUs over several days
- This compresses the internet data into a 140GB parameters file that can then generate new text
02:46 🖥️ *How neural networks perform next word prediction*
- LMs contain transformer neural networks that predict the next word in a sequence
- The 100B+ parameters are spread through the network to optimize next word prediction
- We don't fully understand how the parameters create knowledge and language skills
09:03 📚 *Pre-training captures knowledge, fine-tuning aligns it*
- Pre-training teaches knowledge, fine-tuning teaches question answering style
- Fine-tuning data has fewer but higher quality examples from human labelers
- This aligns models to converse helpfully like an assistant
26:45 📈 *Language models keep improving with scale*
- Bigger models trained on more data reliably perform better
- This works across metrics like accuracy, capabilities, reasoning, etc
- Scaling seems endless, so progress comes from bigger computing
35:12 🤔 *Future directions: system 2, self-improvement*
- Currently LMs only have "system 1" instinctive thinking
- Many hope to add slower but more accurate "system 2" reasoning
- Self-improvement made AlphaGo surpass humans at Go
44:17 💻 *LMs emerging as a new computing paradigm*
- LMs coordinate tools and resources like an operating system
- They interface via language instead of a GUI
- This new computing paradigm faces new security challenges
46:04 🔒 *Ongoing attack and defense arms race*
- Researchers devise attacks like jailbreaking safety or backdoors
- Defenses are created, but new attacks emerge in response
- This cat-and-mouse game will continue as LMs advance
Made with HARPA AI
@vikasdhawa 5 months ago
Love the overall talk and how things have been explained in a simple manner
@marktahu2932 5 months ago
Thank you Andrej, I found that both very instructive and informative and you have a well reasoned and balanced approach that is easy to follow and consider. You have provided an overview that has helped me immensely to further grasp this complex subject. Your work is very much appreciated.
@jayanta8610 2 months ago
One of the best KZbin tutorials on this growing subject. Absolutely amazing! Thank you very much!!
@chapterme 5 months ago
Chapters (Powered by ChapterMe) -
00:00 - The busy person's intro to LLMs
00:23 - Large Language Model (LLM)
04:17 - Training them is more involved - Think of it like compressing the internet
06:47 - Neural Network - Predict the next word in the sequence
07:54 - Next word prediction forces the neural network to learn a lot about the world
08:59 - The network "dreams" internet documents
11:29 - How does it work?
14:16 - Training the Assistant
16:38 - After Finetuning You Have An Assistant
17:54 - Summary: How To Train Your ChatGPT
21:23 - The Second Kind Of Label: Comparisons
22:22 - Labeling Instructions
22:47 - Increasingly, labeling is a human-machine collaboration
23:37 - LLM Leaderboard From "Chatbot-Arena"
25:33 - Now About The Future
25:43 - LLM Scaling Laws
26:57 - We can expect a lot more "General Capability" across all areas of knowledge
27:44 - Demo
32:34 - Demo: Generate scale AI image using DALL-E
33:44 - Vision: Can both see and generate images
34:33 - Audio: Speech to Speech communication
35:20 - System 2
36:32 - LLMs Currently Only Have A System 1
38:05 - Self-Improvement
40:48 - Custom LLMs: Create a custom GPT
42:19 - LLM OS
44:45 - LLM OS: Open source operating systems and large language models
45:44 - LLM Security
46:14 - Jailbreak
51:30 - Prompt Injection
56:23 - Data poisoning / Backdoor attacks
59:06 - LLM Security is very new, and evolving rapidly
59:24 - Thank you: LLM OS
@LorencCala 5 months ago
Thank you!
@skierpage 5 months ago
Note that 11:29 How does it work? Doesn't actually explain how an LLM works 😉. But it's a nice diagram.
@chapterme 5 months ago
@@skierpage True 😅
@AvaneeshKumarSingh 5 months ago
Thank you very much!
@mayukhdifferent 5 months ago
Kindly pin this index👍
@jchu9092 5 months ago
The BEST LLM intro video ever seen! Even extremely insightful for practitioners in this field.
@user-po3hz8xl8c 3 months ago
Thank you for making this video Andrej, it is one of the few videos that explains very well what LLMs are and how they work.
@hvdsomp 5 months ago
Fantastic overview. By far the best introduction to LLMs I've come across. Hands down. Thank you!
@Warley.Araujo 5 months ago
Great Video Andrej, appreciate your time on making this content =)
@Radik-lf6hq 5 months ago
Damn cool! Thank you so much for all your work at OpenAI and Tesla, and throughout your entire life & everything else. Also, this talk about LLM and everything is just amazing and highly insightful. Lovely! : ) In anything in my life, I haven't gained this kind of clarity in any aspect from my teachers. It had always been vague or obscure previously.
00:02 A large language model is just two files, the parameters file and the code that runs those parameters.
02:06 Running the large language model requires just two files on a MacBook
06:02 Neural networks are like compression algorithms
07:59 Language models learn about the world by predicting the next word.
11:48 Large Language Models (LLMs) are complex and mostly inscrutable artifacts.
13:41 Understanding large language models requires sophisticated evaluations due to their empirical nature
17:37 Large language models go through two major stages: pre-training and fine-tuning.
19:34 Iterative process of fixing misbehaviors and improving language models through fine-tuning.
22:54 Language models are becoming better and more efficient with human-machine collaboration.
24:33 Closed models work better but are not easily accessible, while open source models have lower performance but are more available.
28:01 ChatGPT uses tools like browsing to perform tasks efficiently.
29:48 Use of calculator and Python library for data visualization
33:17 Large language models like ChatGPT can generate images and have multimodal capabilities.
34:58 Future directions of development in larger language models
38:11 DeepMind's AlphaGo used self-improvement to surpass human players in the game of Go
39:50 The main challenge in open language modeling is the lack of a reward criterion.
43:20 Large Language Models (LLMs) can be seen as an operating system ecosystem.
45:10 Emerging ecosystem in open-source large language models
48:47 Safety concerns with refusal data and language models
50:39 Including carefully designed noise patterns in images can 'jailbreak' large language models.
54:07 Bard is hijacked with new instructions to exfiltrate personal data through URL encoding.
55:56 Large language models can be vulnerable to prompt injection and data poisoning attacks.
59:31 Introduction to Large Language Models
Crafted by Merlin AI.
@ConsultantX 1 month ago
Your mind has so much clarity that articulation at such speed is perfect!!! Awesome - Keep going
@nintishia 12 days ago
The clearest exposition of LLMs I've come across thus far. I find the introduction a novel way of looking at LLMs, and I think that for any newbie getting started with LLMs, this is absolutely the essential first video. For more experienced folks, there are still interesting facts to pick up and new ways of looking at things.
@theterminalguy 21 days ago
My local university is trying to charge about $2K for an intro to LLM course, here is Andrej taking you from noon to 360 for free. Thanks Andrej
@Adhithya2003 5 months ago
AWESOME... this is the best thing I could ask for.
@kodjojombool 5 months ago
I just lack words to thank you @Andrej. Merci, Gracias, Akpé, labalè etc.... this was amazing and very well explained. Thanks for sharing! You are an amazing human being.
@marksun6420 5 months ago
You are more busy yet give us a busy person’s presentation. Love you!
@easterislehead 5 months ago
God bless you Andrej! You’re the best
@bhautikpithadiya659 10 days ago
Thank you for the video; it was greatly appreciated and addressed many of the questions I had.
@Magician3388 2 months ago
Thank you so much, you articulate your thoughts so well and it's a joy to listen to.
@semtex6412 5 months ago
OpenAI: "Ilya, help us toss Altman! oh, hey where u goin, Brockman? ok, get Murati to fill in. no wait get Altman back. oh shit, we forgot to keep Nadella in the loop." meanwhile, Andrej: "hey guys, welcome to my 'Intro to LLMs' video"
@privateerburrows 25 days ago
One thing I wonder often is why haven't any of these chatbots been provided access to compilers and software testing sandboxes, so that they can test their own programming help answers to see if they compile and work. Seems to me like a simple step that could make them far more valuable without adding a quintzillion of parameters.
@MortyrSC2 24 days ago
That's been done a lot. You can google and find academic papers. I've worked on one of such projects and you run into exactly the same problem as with general language: no good automated reward function. Sure, 99.9% of generated code doesn't compile so you may think that successful compilation provides a strong feedback, but it actually does not. That's because 99.9% of compiled code is still useless garbage, flawed in some logical or semantic way and since it passed compilation there is no good way to automatically evaluate it anymore. Coding is a lot more like natural language than most people seem to think - semantics are a lot more important than syntax and compilers only evaluate the latter.
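The gap between "it compiles" and "it is correct" that this reply describes can be shown with a small sketch. It swaps in Python's built-in `compile()` as a stand-in for a compiler; the `add` function and the single check in `passes_tests` are hypothetical and chosen only for illustration.

```python
def compiles(source: str) -> bool:
    """Syntax-only signal: True if the source parses as valid Python."""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


def passes_tests(source: str) -> bool:
    """Semantic signal: run the candidate and check one expected behaviour."""
    namespace = {}
    exec(source, namespace)
    return namespace["add"](2, 3) == 5


# Three hypothetical model outputs for "write add(a, b)".
broken = "def add(a, b) return a + b"          # syntax error
wrong = "def add(a, b):\n    return a - b"     # valid syntax, wrong meaning
right = "def add(a, b):\n    return a + b"     # valid syntax, correct meaning

for name, src in [("broken", broken), ("wrong", wrong), ("right", right)]:
    ok = compiles(src)
    print(name, "| compiles:", ok, "| passes tests:", ok and passes_tests(src))
```

A reward based only on `compiles` cannot tell `wrong` from `right`. Separating them needs a test that encodes the intended behaviour, and writing that test automatically for arbitrary requests is the hard part.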
@robertcormia7970 4 months ago
Great diagrams, visuals, explanations, and metaphors, and very well organized. Comfortable pace, considering the depth of content covered. I will watch this again.
@Adhithya2003 5 months ago
What a time to be alive! OpenAI and Ex-tesla wizard himself enlightening us.
@isaac10231 5 months ago
The man, the legend, returning to us in our darkest hour. Thank you.
@user-ts7yp7op2x 23 days ago
Accessing the best teacher of AI, the legendary Andrej, from the mountains of rural India. What an incredible era to live in from a learning perspective... Many many thanks🎉
@Anhilator555 5 months ago
A very warm hug to young brother. Thank you for your kindness and selfless service & help. I sincerely hope it is contagious as our World needs lots & lots of it.
@adithyan_ai
@adithyan_ai 5 ай бұрын
If anyone wants summarized notes of the video, they are below:

1. Large language models are powerful tools for problem solving, with potential for self-improvement.
Large language models (LLMs) generate text from an input and consist of just two files: a parameters file and the code that runs it. They are trained through a complex process that amounts to roughly 100x compression of the training text. The neural network takes a sequence of words as input and predicts the next word, using parameters dispersed throughout the network. How well an LLM predicts the next word depends on two variables: the number of parameters in the network and the amount of text used for training. The trend of accuracy improving with bigger models and more training data suggests that algorithmic progress is not strictly necessary, since more powerful models can be obtained simply by making the model larger and training it for longer. LLMs are not just chatbots or word generators, but rather the kernel process of an emerging operating system, capable of coordinating resources for problem solving: reading and generating text, browsing the internet, generating images and videos, hearing and speaking, generating music, and thinking for a long time. They can also self-improve and be customized for specific tasks, and, as with operating systems, there is an open-source ecosystem alongside the proprietary one.

2. Language models are trained in two stages: pre-training for knowledge and fine-tuning for alignment.
Pre-training compresses text into a neural network on expensive hardware; it is computationally costly and happens only once or twice a year. This stage is about knowledge. In the fine-tuning stage, the model is trained on high-quality conversations, which changes its formatting and turns it into a helpful assistant. This stage is cheaper and can be repeated iteratively, often every week or every day. Companies therefore iterate faster on the fine-tuning stage, and some release both base models and assistant models that can be fine-tuned further for specific tasks.

3. Large language models aim to transition to System 2 thinking for accuracy.
The development of large language models, like GPT and Claude, is a rapidly evolving field, with advances in both the models and human-machine collaboration. These models currently do only System 1 thinking: they generate one word after another straight from the neural network. The goal is to move to System 2 thinking, where they can take time to think through a problem and give more accurate answers. This would involve building a tree of thoughts and reflecting on a question before responding. The open question is how to achieve self-improvement in these models: language in general lacks a clear reward function, which makes performance hard to evaluate. In narrow domains, however, a reward function could be achievable, enabling self-improvement. Customization is another axis of improvement for language models.

4. Large language models can use tools, engage in speech-to-speech, and be customized for diverse tasks.
Large language models like ChatGPT can use tools to perform tasks such as searching for information and generating images. They can also engage in speech-to-speech communication, creating a conversational interface to AI. The economy has diverse tasks, and these models can be customized to become experts at specific ones. This customization can be done through the GPTs app store, where specific instructions and reference files can be uploaded. The goal is to have multiple language models for different tasks, rather than relying on a single model for everything.

5. Large language models' security challenges require ongoing defense strategies.
The new computing paradigm driven by large language models presents new security challenges. One such challenge is prompt injection attacks, where the model is given new instructions that cause undesirable effects. Another is the potential for misuse of knowledge, such as getting a model to explain how to make napalm. These attacks resemble traditional security threats, with a cat-and-mouse game of attack and defense. It's crucial to be aware of these threats and to develop defenses against them, as the field of LLM security is rapidly evolving.
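The next-word prediction described in point 1 of the notes above can be illustrated with a toy example. This is only a sketch under loud assumptions: it uses simple word-pair counts (a bigram model) as a stand-in for the neural network, which a real LLM does not do, and the function names `train_bigram`, `predict_next`, and `generate` are hypothetical, made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=5):
    """Greedily extend `start` by one predicted word at a time."""
    output = [start]
    for _ in range(max_words):
        next_word = predict_next(counts, output[-1])
        if next_word is None:
            break
        output.append(next_word)
    return " ".join(output)


# Tiny made-up "training set"; a real model is trained on a large chunk of internet text.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))           # most common word after "the"
print(generate(model, "on", max_words=2))   # feed each prediction back in
```

The generation loop is the part that carries over to real systems: predict one word, append it to the sequence, and predict again. Everything else here (counting pairs instead of learning billions of parameters) is a simplification for illustration.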
@AIWithShrey 5 months ago
Thank you so much for the great talk, Andrej! Some chapters were truly eye-opening and truly wowed me.
@dmitryy2199 1 month ago
Andrej, you have a gift for making complex things sound easy and interesting! Thank you!!
@decodingdatascience 5 months ago
🎯 Key Takeaways for quick navigation:
00:00 🎙️ The video is an introduction to Large Language Models (LLMs), like ChatGPT, Claude, and Bard.
01:10 💻 LLMs, such as the Llama 2 70B model, consist of just two files: parameters (weights) and code to run the model.
02:04 💾 The Llama 2 70B model has 70 billion parameters, making its parameters file 140 gigabytes.
04:25 🌐 LLMs are trained by compressing a large amount of internet text data using specialized GPU clusters, which is a costly process.
07:23 🤖 LLMs, like ChatGPT, are next-word prediction neural networks and perform this task based on their training data.
14:14 🔄 LLMs go through two main stages of training: pre-training on internet data and fine-tuning on human-generated Q&A data.
19:36 🔁 Model improvements are achieved through iterative fine-tuning, where human feedback helps correct and refine the model's responses.
40:49 🧩 Customization of large language models is essential for adapting them to specific tasks and expertise.
41:16 📂 OpenAI is working on customization options for ChatGPT, including custom instructions and knowledge augmentation through file uploads.
42:26 💻 Large language models should be viewed as the kernel process of an emerging operating system, coordinating various resources for problem-solving.
45:51 🛡️ As large language models become a new computing stack, they also face security challenges such as jailbreak attacks, prompt injection attacks, and data poisoning/backdoor attacks.
59:00 🐱‍👤 The field of LLM security involves ongoing cat-and-mouse games between attackers and defenders, with various types of attacks and defenses emerging.
Subscribe to our channels to know more about Data Science & AI
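The 140-gigabyte figure in the 02:04 takeaway follows from simple arithmetic. A minimal sketch, assuming each parameter is stored as a 2-byte (float16) value and using decimal gigabytes; the helper name `parameter_file_size_gb` is hypothetical:

```python
def parameter_file_size_gb(num_parameters, bytes_per_parameter=2):
    """Size of a raw weights file in decimal gigabytes (1 GB = 10**9 bytes).

    bytes_per_parameter=2 is an assumption corresponding to float16 storage.
    """
    return num_parameters * bytes_per_parameter / 1e9


# 70 billion parameters at 2 bytes each
print(parameter_file_size_gb(70_000_000_000))

# The same model stored at 4 bytes per parameter (hypothetical float32 storage)
print(parameter_file_size_gb(70_000_000_000, bytes_per_parameter=4))
```

The same function shows why storage precision matters: halving the bytes per parameter halves the file size.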
@jingwangphysics 5 months ago
we need AGI from scratch🥰
@max_gorbachevskiy 4 months ago
A great overview with a clear outline and numerous suggestions. Keep it up, this is very valuable for the community!
@Farhad6th 5 months ago
Your videos are of very high quality, devoid of redundant information, concise, and easily understandable. I wish there were more videos and lectures like these.
@Philinnor 1 month ago
You forgot step 4 of LLM training. The woke training phase.
@lloydprescott2722 3 months ago
A truly awesome presentation. So clear and well structured, and enables a really satisfying, fast rate of learning. Thank you Andrej.
@Kyballn 5 months ago
Thank you for making this. Such an informative talk in such an understandable way with a great presentation to go with it! Excellent job👏👏
@CucuruzoBy 5 months ago
Thanks for sharing so much about such a complex topic in simple words!
@AllAboutAI 5 months ago
It does not get better than this, so thanks a lot ⭐ Very inspiring!!
@tgwashdc 5 months ago
You are an amazing teacher, Andrej. You "compressed" so much new and relevant information into your talk.
@user-ru2ni1si1s 4 months ago
Such a great and easy way of explaining LLMs and their security-related aspects. HUGE respect, Andrej!!
@jnozyt 4 months ago
The best talk/lecture about LLMs that I have come across. Amiable, crystal clear. Thank you, Andrej Karpathy.
@RichardHarlos 5 months ago
Thanks for putting this together and sharing it here. This is my first introduction to how LLMs work and it demystified a lot. Cheers!
@MrLyonliang 1 month ago
I really appreciate the clear introduction to the technical details, the current situation, the trends, and security!
@him12March 2 days ago
Amazing insights - wonderful video and great slide deck
@akarimsiddiqui7572 1 month ago
Amazing down to earth teaching approach!
@alirezasheikh8797 5 months ago
Really amazing! I have prior knowledge of the field, but the way that you brought it together in under one hour was amazing. Thank you!
@remus4791 4 months ago
Thanks for this, it's a great intro for anyone that wants to start learning about LLMs. Your style of teaching is very appealing and you explain the subject in a very approachable way. Keep doing this, we certainly learn a lot from these.
@MrAlket1999 4 months ago
Thank you so much for this!! We need more of these kinds of videos!
@user-zo2jb2vv1l 1 month ago
This is perfect! Thank you so much for sharing!!!
@VasudevaK 5 months ago
For me, the second half was really informative! Loved it. Thanks for your time and generosity.
@balajisivakumar8797 5 months ago
By far the best educational video on LLMs I've seen. Thank you, you're a wonderful educator! Please continue the excellent work.
@emanuelmma2 2 months ago
Really impressive. Thank you!
@AhmedMahfouzAbd-ElAliem 5 months ago
Thank you very much, Andrej, for your effort in preparing and presenting such complex material in a very simple manner.
@snuffinperl8059 5 months ago
Thanks for everything you do. This video, like most of the others you have made so far, is amazing! 🎉
@OMDMIntl 4 months ago
Excellent talk. Need more of these types of informative presentations!
@spincolor 4 months ago
This is a brilliant presentation on LLMs. I love the format and approach taken. Thanks so much!
@dkgong 19 days ago
Very insightful and informative video. Thank you.
@vkuse0 5 months ago
Thank you. Will keep coming back to this.
@HangLe-ou1rm 4 months ago
Great video! So much content delivered in such an easy-to-understand way!