How Large Language Models Work

437,332 views

IBM Technology

10 months ago

Learn about watsonx → ibm.biz/BdvxRj
Large language models (LLMs), of which generative pre-trained transformers (GPTs) are a well-known type, can create human-like text and code. There's a lot of talk about GPTs and LLMs lately, but they've actually been around for years! In this video, Martin Keen briefly explains what an LLM is, how LLMs relate to foundation models, and then covers how they work and how they can be used to address various business problems.
#llm #gpt #gpt3 #largelanguagemodel #watsonx #GenerativeAI #Foundationmodels

Comments: 122
@mindofpaul9543 · 2 months ago
I don't know what is more impressive, LLMs or this guy's ability to write backwards perfectly.
@patmil8314 · 2 months ago
The whole thing is flipped, I guess. He's "writing left-handed" and we all know that is impossible
@djham2916 · 2 months ago
It's mirrors and a screen
@catherinel7718 · 1 month ago
I have a teacher who can write backwards perfectly. It's creepy lol
@chrismartin9769 · 1 month ago
There are videos that show you how people do this: it's a visual trick, not a dexterity master class ;)
@gatsby66 · 1 month ago
@djham2916 And smoke!
@dennisash7221 · 9 months ago
Very nice explanation: short and to the point, without getting bogged down in detail that is often misunderstood. I will share this with others.
@surfercouple · 3 months ago
Nicely done! You explain everything very clearly. This video is concise and informative. I will share with others as an excellent foundational resource for understanding LLMs.
@saikatnextd · 4 months ago
Martin Keen is awesome as usual... so natural. I love his talks, and somehow I owe to him my understanding of complicated subjects in AI. Thanks...
@DilshanBoange · 8 months ago
Great video presentation! Martin Keen delivers a superbly layman-friendly elucidation of what is otherwise very "high-tech talk" to people like me who do not come from a tech-based professional background. This type of content is highly appreciated, and in fact motivates further learning on these subjects. Thank you IBM, Mr. Keen & team. Cheers to you all from Sri Lanka.
@user-oq2lz4ux3c · 6 months ago
P ppl
@KageManTV · 3 months ago
Really really enjoyed this primer. Thank you and great voice and enthusiasm!
@Pontie66 · 4 months ago
Hey, nice job!!! Yeah, I'd like to see more of these kinds of subjects in the present and the future as well!!!
@rappresent · 5 months ago
Great presentation, feels like a personal assistant. Great!
@evgenii.panaite · 28 days ago
tbh, I just love his voice and am ready to listen to all his videos 🤗
@SuperRider-RS · 1 month ago
Very elaborate explanation. Thank you
@dmitriyartemyev3329 · 1 month ago
IBM, big thanks to you for all these videos! They are really helpful.
@vicweast · 3 months ago
Very nicely done.
@SatishDevana · 4 months ago
Thank you for posting this video. What are the other architectures available apart from Transformer?
@peterprogress · 4 months ago
I've liked and subscribed and done it again a thousand times in my mind
@amparoconsuelo9451 · 8 months ago
Can subsequent SFT and RLHF with different, additional, or smaller datasets change the character of, improve, or degrade a GPT model?
7 months ago
Great explanation ❤
@dsharma6694 · 1 month ago
perfect for learning LLMs
@vainukulkarni1936 · 2 months ago
Very nice explanation. Are these foundation models proprietary? How many foundation models exist?
@GuyHindle · 10 months ago
What is meant by "understanding" when referring to "sequences of words"? I mean, what does "understanding" mean in that context?
@EmpoweredWithZarathos2314 · 6 months ago
such a great video
@chetanrawatji · 3 months ago
Thank You Sir ❤
@Pontie66 · 4 months ago
Hi Martin, are you around? Could you please talk about the "Emerging LLM App Stack"? Thanks in advance!
@eddisonlewis8099 · 3 months ago
Interesting explanation
@cushconsultinggroup · 6 months ago
Intro to LLMs. Thanks
@Nursultan_karazhigit · 3 months ago
Thanks. How much does it cost to build one's own LLM?
@CyberEnlightener · 9 months ago
The term "large" does not refer to large data; to be precise, it is the number of parameters that is large. So, a slight correction.
@dennisash7221 · 9 months ago
I do believe that "large" in LLM refers both to the large amount of data and to the large number of parameters, so both are correct; but there is a prerequisite that the data be large, not only the parameters.
@TacoMaster07 · 8 months ago
There's a lot of params because of the huge dataset
@NicholasDWilson · 17 days ago
Lol. I only knew Martin Keen from Brulosophy. This is sort of mindblowing.
@ApPillon · 2 months ago
Thanks dude
@mandyjacotin8321 · 3 months ago
That's amazing! Our company has a great project that can benefit from this and then use the proceeds to benefit mankind. How can we speak more about this? I am very intrigued.
@narayanamurthy5397 · 2 months ago
Good to know how LLMs work, Mr. Martin Keen. Can you focus more on LLM modelling and what exact related skills (programming skills) are required? Thank you so much; it was a pleasant video and I appreciated it.
@korgond · 10 days ago
I got a remote job offer. The duty is AI training for LLMs. Shall I go for it? What do you think?
@kevnar · 5 months ago
Imagine a world where Wikipedia no longer needs human contributors. You just upload the source material, and an algorithm writes the articles and all sub-pages, listing everything it knows about a certain fictional character because it read the entire book series in half a second. Imagine having a conversation with the world's most eminent Star Wars expert.
@shravanardhapure4961 · 9 months ago
What is a quantized version of a model, and how would it be created?
@tonyhawk123 · 9 months ago
A model consists of lots of numbers. Those numbers would be smaller. Fewer bits per number.
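That reply can be sketched in a few lines: map 32-bit floats onto 8-bit integers plus one scale factor. This is a toy illustration only; real quantization schemes (per-channel, 4-bit, etc.) are more involved.

```python
def quantize_int8(weights):
    """Map float weights onto integer levels in [-127, 127] (8 bits instead of 32)."""
    scale = max(abs(w) for w in weights) / 127  # one float kept as metadata
    return [round(w / scale) for w in weights], scale

def dequantize(qweights, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in qweights]

weights = [0.82, -1.27, 0.003, 0.5]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each weight now fits in one byte, at the cost of a small rounding error.
print(q)       # [82, -127, 0, 50]
print(approx)
```

The smallest weight (0.003) rounds to 0 here, which is exactly the kind of precision loss quantized models trade for a ~4x smaller file.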
@She_cooks2023 · 2 months ago
Amazing!
@ArgumentumAdHominem · 4 months ago
Nice explanation! But I am still missing the most important point: how does one control the relevance of the produced results? E.g. ChatGPT can answer questions. So far, what you explained is a model that can predict, i.e. generate, the next word in a document, given what has already been written. However, given a set of existing sentences, there is a multitude of ways to produce a next sentence that would be somewhat consistent with the rest of the document. How does one go from plausible text generators to desired text generators?
@Leonhart_93 · 3 months ago
Statistical likelihood based on the training data. And then there is a random seed, so that there is a little variation in the outputs and the answer isn't always exactly the same for the same prompt.
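A minimal sketch of what that reply describes: sampling the next token from a toy probability distribution, with a temperature knob and a seed. All numbers here are hypothetical.

```python
import math
import random

def sample_next_token(probs, temperature=1.0, seed=None):
    """Sample one token from a {token: probability} dict.

    Temperature < 1 sharpens the distribution (more deterministic);
    temperature > 1 flattens it (more varied). A fixed seed makes the
    choice reproducible; a fresh seed gives the run-to-run variation
    the comment describes.
    """
    rng = random.Random(seed)
    # p ** (1/T): rescale the probabilities by the temperature, then renormalize.
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probs.items()}
    total = sum(scaled.values())
    r = rng.random() * total
    cumulative = 0.0
    for token, weight in scaled.items():
        cumulative += weight
        if r <= cumulative:
            return token
    return token  # fallback for floating-point edge cases

# Toy next-token distribution after the prompt "The sky is"
probs = {"blue": 0.7, "clear": 0.2, "falling": 0.1}
print(sample_next_token(probs, temperature=0.7, seed=42))
```

With the same seed the call always returns the same token; without one, repeated calls occasionally pick "clear" or "falling", which is why the same prompt can yield different answers.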
@yaguabina · 5 months ago
Does anyone know what program he uses to sketch on screen like that?
@sebbejohansson · 4 months ago
It's a glass window. He is physically writing on it. For it to show the correct way (and him not having to write backwards) they just flip the image!
@ChatGPt2001 · 2 months ago
Large language models like GPT-3 work by using deep learning techniques, specifically a type of neural network called a transformer. Here's an overview of how they work:
1. **Data Collection**: Large language models are trained on vast amounts of text data from the internet, books, articles, and other sources. This data is used to teach the model about language patterns, grammar, syntax, semantics, and context.
2. **Tokenization**: The text data is tokenized, which means breaking it down into smaller units such as words, subwords, or characters. Each token is assigned a numerical representation.
3. **Training**: The model is trained using a process called supervised learning. During training, the model learns to predict the next word or token in a sequence based on the preceding context. It adjusts its internal parameters (weights and biases) through backpropagation to minimize prediction errors.
4. **Transformer Architecture**: Large language models like GPT-3 use a transformer architecture, which is highly effective for handling sequential data like language. Transformers include attention mechanisms that allow the model to focus on relevant parts of the input sequence while generating output.
5. **Fine-Tuning**: After pre-training on a large dataset, language models can be fine-tuned on specific tasks or domains. This process involves additional training on a smaller dataset related to the target task, which helps the model specialize in that area.
6. **Inference**: Once trained, the language model can generate text by predicting the most likely next tokens given an input prompt. It uses the learned patterns and context from training to generate coherent and contextually relevant responses.
7. **Continual Learning**: Some language models support continual learning, which means they can be updated with new data over time to improve their performance and adapt to changing language patterns.
Overall, large language models combine sophisticated neural network architectures, extensive training data, and advanced training techniques to understand and generate human-like text.
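The training and inference steps above can be illustrated with a deliberately tiny stand-in: instead of a transformer with billions of parameters, a bigram table that predicts the next word from counts of what followed it in training text. Toy corpus, illustration only.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model

def predict_next(model, word):
    """Return the most frequent follower: the 'most likely next token'."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

corpus = "the sky is blue the sky is clear the sea is blue"
model = train_bigram(corpus)
print(predict_next(model, "is"))  # 'blue' (seen twice, vs 'clear' once)
```

A real LLM differs in degree, not in kind of task: it conditions on a long context rather than one word, and replaces the count table with a neural network, but the objective is the same next-token prediction.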
@tekad_ · 3 months ago
How did you learn to write backwards
@mauricehunter7803 · 1 month ago
Other than the physical limitations of space that any other computer has, it seems to me that technology like this should be applicable to robotics and allow for the creation of much smarter and more adaptive robotics projects and creations.
@kiyonmcdowell5603 · 3 months ago
What's the difference between large language models and text-to-speech?
@nuwayir · 9 months ago
So are transformers only for language- and text-related things??
@user-vo5gv1tk1m · 6 months ago
No, for image processing too.
@user-vo5gv1tk1m · 6 months ago
Transformer models, originally developed for natural language processing tasks, have been extended to computer vision tasks as well. Vision Transformer (ViT) is an example of a transformer model adapted for image processing. Instead of using convolutional layers, ViT uses self-attention mechanisms to capture relationships between different parts of an image.
@Private-qg5il · 10 months ago
In this presentation, there was not enough detail on Foundation Models as a baseline to then explain what LLMs are.
@Gordin508 · 10 months ago
The foundation model is trained on a gigantic amount of general text data on a very general task (such as language modeling, which is next-word prediction). The LLM is then created by fine-tuning a foundation model (a specific case of a "pretrained model") on a more specific dataset (e.g. source code), sometimes also for a more specific task. The foundation model is basically a stem cell for LLMs: it does not yet fulfill a specific purpose, but since it saw tons of data it can be adapted to (pretty much) everything. Training the foundation model is extremely expensive, but it makes the downstream LLMs much cheaper, as they do not need to be trained from scratch.
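The stem-cell analogy can be sketched with a toy stand-in for the real thing: "pretrain" a next-word count model on general text, then continue training the same model on a small domain corpus and watch its predictions shift. Both corpora below are made up for illustration.

```python
from collections import Counter, defaultdict

def train(model, text):
    """Update next-word counts in place; pretraining and fine-tuning use the same step."""
    words = text.split()
    for current, following in zip(words, words[1:]):
        model[current][following] += 1

def most_likely_next(model, word):
    return model[word].most_common(1)[0][0]

model = defaultdict(Counter)

# "Pretraining": general text (a stand-in for the gigantic general corpus).
train(model, "the function of the heart is to pump blood")

# "Fine-tuning": a smaller, code-specific corpus shifts the distribution.
train(model, "the function returns a value the function returns a value the function returns a value")

print(most_likely_next(model, "function"))  # 'returns' - the domain data now dominates
```

The expensive part in real systems is the first call; the second is comparatively cheap, which is exactly why fine-tuned LLMs don't need to be trained from scratch.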
@rangarajannarasimhan341 · 4 months ago
Lucid, thanks
@VRchitecture · 9 months ago
Something tells me “The sky is the limit” here 👀
@RC19786 · 8 months ago
Could have been better; most of it was speculative when it came to application building, not to mention the laws governing it.
@AIpowerment · 10 months ago
Did you just mirror the screen so it looks like you can write RTL, isn't it?! Wow
@IBMTechnology · 10 months ago
See ibm.biz/write-backwards
@niket1231 · 2 months ago
Need one use case
@hatersgonnalovethis · 3 months ago
Wait a minute. Did he really write in mirror handwriting?
@michaelcharlesthearchangel · 2 months ago
AI was used to make it appear that he can write on your screen.
@penguinofsky · 1 month ago
He writes it normally but the video is flipped horizontally..
@krishnakishorenamburi9761 · 2 months ago
@2:15 is a different sequence. This is just for fun.
@sheikhobama3759 · 4 months ago
1 PB = 1024 TB
1 TB = 1024 GB
1 GB = 1024 MB
1 MB = 1024 KB
1 KB = 1024 B
1 B = 8 bits
So 1 PB = 1024 × 1024 × 1024 × 1024 × 1024 bytes. Multiply it again by 8 to get the number of bits. Guys, do correct me if I'm wrong!!
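Under binary (base-1024) units, that arithmetic checks out; a few lines confirm it. (Decimal SI units, where 1 PB = 10^15 bytes, are also in common use, which is why other comments in this thread disagree.)

```python
# Binary units: each step up is a factor of 1024.
KB = 1024            # bytes
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB
PB = 1024 * TB       # = 1024**5 bytes

print(PB)            # 1125899906842624 bytes in a (binary) petabyte
print(PB * 8)        # 9007199254740992 bits
```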
@WarpRulez · 9 months ago
How does ChatGPT know about itself and its own behavior? If you ask questions about those topics, it will answer intelligently and accurately about itself and its own behavior. It will not just spout random patterns from the internet. How does it know this?
@dennisash7221 · 9 months ago
To start with, ChatGPT does not "know itself"; it is not self-aware. What you are seeing when GPT answers the question "Who are you?" is a pre-programmed response that has been put there by the trainers of the model, something like a toy with pre-recorded messages that you can hear when pressing a button or pulling a string. ChatGPT does not "know" anything; it simply responds to your prompts (or, as you see them, your questions) with the appropriate answers.
@Joyboy_1044_ · 6 months ago
GPT doesn't possess genuine awareness, but it can certainly mimic it to some extent
@uhrcru · 3 months ago
NICE VID o7
@schonsospaet22 · 1 month ago
Thank you for explaining! 🪲 Min. 3:37 is the major "bug" 🐞 within the learning system: *it does not start off with a related guess, it's random.* 🌬 I can't wait until the *brain slice chips* can last longer and get trained like a real human brain that actually learns by feeling and repeating, instead of randomly guessing and then correcting itself until the answer is appropriate. They could soon replace A.I. technology completely, so maybe we shouldn't hype it too much. After all the effort, energy, and money we have put into A.I. and new technology, there is no doubt that *we could have educated our children better* instead of creating a fake new world based on pseudo-knowledge extracted from the web. 👨‍👩‍👧‍👦👨‍👩‍👧‍👧 Nobody wants to be replaced without having the benefit of the machine. General taxes on machines and automated digital services could fund better education for humans. Dear A.I.: You know what is real fun? Planting a tree in real life! 🍒
@devperatetechno8151 · 10 months ago
But how is it possible for an LLM to innovate when it is trained within the boundaries of human knowledge?
@mauricehunter7803 · 1 month ago
I'm far from an expert on the matter, but the simple answer to your question is that it's programmed to be able to learn and adjust according to many various inputs. Arguably, that's where robot technology should be headed next: having the ability to learn and react to that learning.
@eregoldamite8739 · 1 month ago
How are you able to write that way
@7890tom7890 · 22 days ago
My chemistry professor does videos with one and explains it in a video: Chemistry with Dr. Steph (that's her channel); it's the featured video on her page.
@saadanees7989 · 4 months ago
Is this video mirrored?
@pdjhh · 5 months ago
So LLM-based AI is just language, not "intelligence"? Based on what it's read, it knows or guesses what usually comes next? So zero intelligence?
@mauricehunter7803 · 1 month ago
From what I can tell of the subject matter, it's more of a mimicked intelligence. That's why the analogy of a parrot was used: this technology can learn, repeat back, and limitedly guess what's coming next. But there's a certain level of depth and nuance that a human possesses that parrots and ChatGPT-style tech do not.
@lmarcelino555 · 4 months ago
I don’t even know where to begin. 😵‍💫
@dirkbruenner · 4 months ago
How does this presentation work? You are not mirror-writing behind a glass pane, are you?
@sebbejohansson · 4 months ago
Yea, it's a glass window! He is physically writing on it. For it to show the correct way (and him not having to write backwards) they just flip the image!
@shshe6515 · 3 months ago
Still don't get it
@varghesevg5 · 4 months ago
Getting hallucinations!
@boriscrisp518 · 9 months ago
Ugh, corporate videos... the horror
@user-en4zy4xh7i · 7 months ago
Why does a gigabyte have more words than a petabyte? I am lost already!!! 1 GB = 178 million words, 1 petabyte is 1.8×10^14 words, and there are only 750,000 words in the dictionary?
@turna23 · 7 months ago
I got this far, stopped the video and searched for a comment like this. Why isn't this the top comment?
@abdulmueed2844 · 6 months ago
It's not total unique words... basically it's text from different websites, different sentences... So let's say you want an LLM to answer you about coding: you train it on all the data on Stack Overflow, LeetCode, etc., every available resource... so it knows that when users asked how to run a loop in Java, the replies were x, y, z... It's more of a glorified, better Google search that feels like intelligence...
@dasikakn · 6 months ago
He said 178m words in a 1 GB sized file. And a petabyte-sized file has 1 million _gigabytes_ in it. So, loosely speaking, you multiply 178m by 1 million to get the number of words in an LLM's training data. But... it's not being fed unique words. It's getting word patterns. Think about how we speak... our sentences are word patterns that we use in mostly predictable structures, and then we fill in the blanks with richer words as we get older to convey what we want to say with synonyms etc.
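That multiplication can be checked directly, using decimal units and the 178-million figure quoted from the video:

```python
words_per_gb = 178_000_000    # words in ~1 GB of text, per the video
gb_per_pb = 1_000_000         # decimal SI units: 1 PB = 10**6 GB
words_per_pb = words_per_gb * gb_per_pb
print(f"{words_per_pb:.2e}")  # 1.78e+14, matching the ~1.8e14 figure in the question
```

So both numbers in the question above are consistent; the petabyte figure is simply the gigabyte figure scaled up a million-fold, and it counts repeated words, not dictionary entries.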
@ereinei · 5 months ago
1 PB = 1024 TB = 1,048,576 GB
@jks234 · 4 months ago
What makes knowledge so complex is not the words, but the way the words are used. Choose any word and you will see that it is linked with hundreds of topics and contexts. If I say "draw", I could be talking about:
drawing water
drawing class
drawing during class
drawing my friend
drawing a dog
drawing a long time
a drawing that sold for a lot of money
I like drawing
And so on. These all code for a different idea. And it is these "ideas", or relationships, that foundation models encode. With these relationships, you now have the probabilistic weights that allow you to construct realistic and correct-sounding sentences that are also likely accurate because of the enormous dataset the model was trained on. Another context example: you want to connect "fish" to "swim". This link is highly weighted in the LLM.
@Secret4us · 9 days ago
How many 'parameters' does the human brain have, I wonder.
@thiruvetti · 9 months ago
You could have finished the video by saying that an LLM like ChatGPT could have produced the entire explanation for this video. (I think you hinted at the same.)
@TheLeppus28 · 5 months ago
What should one do if a Large Language Model is still talking nonsense after putting petabytes of data into it?
@jonitalia6748 · 4 months ago
$PLTR
@Balthazar2242 · 9 months ago
How is he writing backwards
@IBMTechnology · 9 months ago
See ibm.biz/write-backwards
@karolinasobczyk-kozowska3717 · 6 months ago
Wow! It's a clever idea 😊
@cvspvr · 3 days ago
@IBMTechnology Oh yeah, then how come your tattoo is the right way round?
@Blazeww · 9 months ago
Isn't it using the most likely thing that humans defined, just using patterns of what's most expected based on how humans interact and the info put in? That's not complicated. How do they not understand how it works?
@MichaelDomer · 1 month ago
Hire someone next time who can explain it to the average John and Jane. Talk about 7 billion parameters and you already have John and Jane scratching their heads like crazy at what the fuck he's talking about. Oh, yeah, some in the comment section understand it... but they're not the average John and Jane... they're often familiar with coding, data, business processes, computers, etc.
@AGI-001 · 2 months ago
This is scarrry
@spadaacca · 7 months ago
Not a very good video. It really didn't explain much. You could have said so much more in 5:33 than slowly drawing things and talking about business applications.
@ricardog.p2610 · 2 months ago
If IBM knows that, why didn't they implement it in Watson, which was useless 😂😂😂
@yaroslavnaidyon · 9 months ago
1 petabyte is not 1m gigabytes; it is 1,000 gigabytes. I thought this speech was coming from an engineer, but perhaps it is just a hired actor.
@cgoode1057 · 9 months ago
1k terabytes, 1m gigabytes
@Admlass · 8 months ago
It's funny how it's always the most ignorant and arrogant one who points out the mistakes of others.
@DrGray-ds6ki · 5 months ago
You fool, 1000 GB is one TB, not a PB
@sunnymon1436 · 1 month ago
Explaining the constituent parts, the end product, is not the same as explaining how something works. Bad video.