How Large Language Models Work

437,332 views

IBM Technology

10 months ago

Learn about watsonx → ibm.biz/BdvxRj
Large language models (LLMs), of which generative pre-trained transformers (GPTs) are a well-known type, can create human-like text and code. There's a lot of talk about GPTs and LLMs lately, but they've actually been around for years! In this video, Martin Keen briefly explains what an LLM is, how LLMs relate to foundation models, and then covers how they work and how they can be used to address various business problems.
#llm #gpt #gpt3 #largelanguagemodel #watsonx #GenerativeAI #Foundationmodels

Comments: 122
@mindofpaul9543 · 2 months ago
I don't know what is more impressive, LLMs or this guy's ability to write backwards perfectly.
@patmil8314 · 2 months ago
The whole thing is flipped, I guess. He's "writing left-handed" and we all know that is impossible
@djham2916 · 2 months ago
It's mirrors and a screen
@catherinel7718 · 1 month ago
I have a teacher who can write backwards perfectly. It's creepy lol
@chrismartin9769 · 1 month ago
There are videos that show you how people do this: it's a visual trick, not a dexterity master class ;)
@gatsby66 · 1 month ago
@djham2916 And smoke!
@dennisash7221 · 9 months ago
Very nice explanation: short and to the point, without getting bogged down in detail that is often misunderstood. I will share this with others.
@surfercouple · 3 months ago
Nicely done! You explain everything very clearly. This video is concise and informative. I will share with others as an excellent foundational resource for understanding LLMs.
@saikatnextd · 4 months ago
Martin Keen is awesome as usual... so natural. I love his talks, and somehow I owe to him my understanding of complicated subjects in AI. Thanks...
@DilshanBoange · 8 months ago
Great video presentation! Martin Keen delivers a superbly layman-friendly elucidation of what is otherwise very "high-tech talk" to people like me who do not come from a tech-based professional background. This type of content is highly appreciated, and in fact motivates further learning on these subjects. Thank you IBM, Mr. Keen & team. Cheers to you all from Sri Lanka.
@user-oq2lz4ux3c · 6 months ago
P ppl
@KageManTV · 3 months ago
Really really enjoyed this primer. Thank you and great voice and enthusiasm!
@Pontie66 · 4 months ago
Hey, nice job!!! Yeah, I'd like to see more of these kinds of subjects in the present and the future as well!!!
@rappresent · 5 months ago
Great presentation, feels like a personal assistant. Great!
@evgenii.panaite · 28 days ago
tbh, I just love his voice and am ready to listen to all his videos 🤗
@SuperRider-RS · 1 month ago
Very elaborate explanation. Thank you
@dmitriyartemyev3329 · 1 month ago
IBM, big thanks to you for all these videos! They are really helpful.
@vicweast · 3 months ago
Very nicely done.
@SatishDevana · 4 months ago
Thank you for posting this video. What are the other architectures available apart from Transformer?
@peterprogress · 4 months ago
I've liked and subscribed and done it again a thousand times in my mind
@amparoconsuelo9451 · 8 months ago
Can subsequent SFT and RLHF with different, additional, or smaller datasets change the character of, improve, or degrade a GPT model?
7 months ago
Great explanation ❤
@dsharma6694 · 1 month ago
perfect for learning LLMs
@vainukulkarni1936 · 2 months ago
Very nice explanation. Are these foundation models proprietary? How many foundation models exist?
@GuyHindle · 10 months ago
What is meant by "understanding" when referring to "sequences of words"? I mean, what does "understanding" mean in that context?
@EmpoweredWithZarathos2314 · 6 months ago
such a great video
@chetanrawatji · 3 months ago
Thank You Sir ❤
@Pontie66 · 4 months ago
Hi Martin, are you around? Could you please talk about the "Emerging LLM App Stack"? Thanks in advance!
@eddisonlewis8099 · 3 months ago
Interesting explanation
@cushconsultinggroup · 6 months ago
Intro to LLMs. Thanks
@Nursultan_karazhigit · 3 months ago
Thanks. How much does it cost to build one's own LLM?
@CyberEnlightener · 9 months ago
The term "large" does not refer to large data; to be precise, it is the number of parameters that is large. So, a slight correction.
@dennisash7221 · 9 months ago
I do believe that "large" in LLM refers both to the large amount of data and to the large number of parameters, so both are correct; but there is a prerequisite that the data be large, not only the parameters.
@TacoMaster07 · 8 months ago
There's a lot of params because of the huge dataset
@NicholasDWilson · 17 days ago
Lol. I only knew Martin Keen from Brulosophy. This is sort of mindblowing.
@ApPillon · 2 months ago
Thanks dude
@mandyjacotin8321 · 3 months ago
That's amazing! Our company has a great project that can benefit from this and then use the proceeds to benefit mankind. How can we speak more about this? I am very intrigued.
@narayanamurthy5397 · 2 months ago
Good to know how LLMs work, Mr. Martin Keen. Can you focus more on LLM modelling and what exact related skills (programming skills) are required? Thank you so much; it was a pleasant video and I appreciated it.
@korgond · 10 days ago
I got a remote job offer. The duty is AI training for LLMs. Shall I go for it? What do you think?
@kevnar · 5 months ago
Imagine a world where Wikipedia no longer needs human contributors. You just upload the source material, and an algorithm writes the articles and all sub-pages, listing everything it knows about a certain fictional character because it read the entire book series in half a second. Imagine having a conversation with the world's most eminent Star Wars expert.
@shravanardhapure4961 · 9 months ago
What is a quantized version of a model, and how would it be created?
@tonyhawk123 · 9 months ago
A model consists of lots of numbers. Those numbers would be smaller. Fewer bits per number.
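That reply can be sketched in a few lines: map 32-bit floats onto 8-bit integers plus one scale factor. This is a toy illustration only; real quantization schemes (per-channel, 4-bit, etc.) are more involved.

```python
def quantize_int8(weights):
    """Map float weights onto integer levels in [-127, 127] (8 bits instead of 32)."""
    scale = max(abs(w) for w in weights) / 127  # one float kept as metadata
    return [round(w / scale) for w in weights], scale

def dequantize(qweights, scale):
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in qweights]

weights = [0.82, -1.27, 0.003, 0.5]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each weight now fits in one byte, at the cost of a small rounding error.
print(q)       # [82, -127, 0, 50]
print(approx)
```

The smallest weight (0.003) rounds to 0 here, which is exactly the kind of precision loss quantized models trade for a ~4x smaller file.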
@She_cooks2023 · 2 months ago
Amazing!
@ArgumentumAdHominem · 4 months ago
Nice explanation! But I am still missing the most important point: how does one control the relevance of the produced results? E.g. ChatGPT can answer questions. So far, what you explained is a model that can predict, i.e. generate, the next word in a document, given what has already been written. However, given a set of existing sentences, there is a multitude of ways to produce a next sentence that would be somewhat consistent with the rest of the document. How does one go from plausible text generators to desired text generators?
@Leonhart_93 · 3 months ago
Statistical likelihood based on the training data. And then there is a random seed, so that there is a little variation in the outputs and the answer isn't always exactly the same for the same prompt.
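A minimal sketch of what that reply describes: sampling the next token from a toy probability distribution, with a temperature knob and a seed. All numbers here are hypothetical.

```python
import math
import random

def sample_next_token(probs, temperature=1.0, seed=None):
    """Sample one token from a {token: probability} dict.

    Temperature < 1 sharpens the distribution (more deterministic);
    temperature > 1 flattens it (more varied). A fixed seed makes the
    choice reproducible; a fresh seed gives the run-to-run variation
    the comment describes.
    """
    rng = random.Random(seed)
    # p ** (1/T): rescale the probabilities by the temperature, then renormalize.
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probs.items()}
    total = sum(scaled.values())
    r = rng.random() * total
    cumulative = 0.0
    for token, weight in scaled.items():
        cumulative += weight
        if r <= cumulative:
            return token
    return token  # fallback for floating-point edge cases

# Toy next-token distribution after the prompt "The sky is"
probs = {"blue": 0.7, "clear": 0.2, "falling": 0.1}
print(sample_next_token(probs, temperature=0.7, seed=42))
```

With the same seed the call always returns the same token; without one, repeated calls occasionally pick "clear" or "falling", which is why the same prompt can yield different answers.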
@yaguabina · 5 months ago
Does anyone know what program he uses to sketch on screen like that?
@sebbejohansson · 4 months ago
It's a glass window. He is physically writing on it. For it to show the correct way (and him not having to write backwards) they just flip the image!
@ChatGPt2001 · 2 months ago
Large language models like GPT-3 work by using deep learning techniques, specifically a type of neural network called a transformer. Here's an overview of how they work:
1. **Data Collection**: Large language models are trained on vast amounts of text data from the internet, books, articles, and other sources. This data is used to teach the model about language patterns, grammar, syntax, semantics, and context.
2. **Tokenization**: The text data is tokenized, which means breaking it down into smaller units such as words, subwords, or characters. Each token is assigned a numerical representation.
3. **Training**: The model is trained using a process called supervised learning. During training, the model learns to predict the next word or token in a sequence based on the preceding context. It adjusts its internal parameters (weights and biases) through backpropagation to minimize prediction errors.
4. **Transformer Architecture**: Large language models like GPT-3 use a transformer architecture, which is highly effective for handling sequential data like language. Transformers include attention mechanisms that allow the model to focus on relevant parts of the input sequence while generating output.
5. **Fine-Tuning**: After pre-training on a large dataset, language models can be fine-tuned on specific tasks or domains. This process involves additional training on a smaller dataset related to the target task, which helps the model specialize in that area.
6. **Inference**: Once trained, the language model can generate text by predicting the most likely next tokens given an input prompt. It uses the learned patterns and context from training to generate coherent and contextually relevant responses.
7. **Continual Learning**: Some language models support continual learning, which means they can be updated with new data over time to improve their performance and adapt to changing language patterns.
Overall, large language models combine sophisticated neural network architectures, extensive training data, and advanced training techniques to understand and generate human-like text.
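The training and inference steps above can be illustrated with a deliberately tiny stand-in: instead of a transformer with billions of parameters, a bigram table that predicts the next word from counts of what followed it in training text. Toy corpus, illustration only.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model

def predict_next(model, word):
    """Return the most frequent follower: the 'most likely next token'."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

corpus = "the sky is blue the sky is clear the sea is blue"
model = train_bigram(corpus)
print(predict_next(model, "is"))  # 'blue' (seen twice, vs 'clear' once)
```

A real LLM differs in degree, not in kind of task: it conditions on a long context rather than one word, and replaces the count table with a neural network, but the objective is the same next-token prediction.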
@tekad_ · 3 months ago
How did you learn to write backwards
@mauricehunter7803 · 1 month ago
Other than the physical limitations of space that any other computer has, it seems to me that technology like this should be applicable to robotics and allow for the creation of much smarter and more adaptive robotics projects and creations.
@kiyonmcdowell5603 · 3 months ago
What's the difference between large language models and text-to-speech?
@nuwayir · 9 months ago
So are transformers only for language- and text-related things??
@user-vo5gv1tk1m · 6 months ago
No, for image processing too.
@user-vo5gv1tk1m · 6 months ago
Transformer models, originally developed for natural language processing tasks, have been extended to computer vision tasks as well. Vision Transformer (ViT) is an example of a transformer model adapted for image processing. Instead of using convolutional layers, ViT uses self-attention mechanisms to capture relationships between different parts of an image.
@Private-qg5il · 10 months ago
In this presentation, there was not enough detail on Foundation Models as a baseline to then explain what LLMs are.
@Gordin508 · 10 months ago
The foundation model is trained on a gigantic amount of general text data on a very general task (such as language modeling, which is next-word prediction). The LLM is then created by fine-tuning a foundation model (a specific case of a "pretrained model") on a more specific dataset (e.g. source code), sometimes also for a more specific task. The foundation model is basically a stem cell for LLMs: it does not yet fulfill a specific purpose, but since it saw tons of data it can be adapted to (pretty much) everything. Training the foundation model is extremely expensive, but it makes the downstream LLMs much cheaper, as they do not need to be trained from scratch.
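The stem-cell analogy can be sketched with a toy stand-in for the real thing: "pretrain" a next-word count model on general text, then continue training the same model on a small domain corpus and watch its predictions shift. Both corpora below are made up for illustration.

```python
from collections import Counter, defaultdict

def train(model, text):
    """Update next-word counts in place; pretraining and fine-tuning use the same step."""
    words = text.split()
    for current, following in zip(words, words[1:]):
        model[current][following] += 1

def most_likely_next(model, word):
    return model[word].most_common(1)[0][0]

model = defaultdict(Counter)

# "Pretraining": general text (a stand-in for the gigantic general corpus).
train(model, "the function of the heart is to pump blood")

# "Fine-tuning": a smaller, code-specific corpus shifts the distribution.
train(model, "the function returns a value the function returns a value the function returns a value")

print(most_likely_next(model, "function"))  # 'returns' - the domain data now dominates
```

The expensive part in real systems is the first call; the second is comparatively cheap, which is exactly why fine-tuned LLMs don't need to be trained from scratch.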
@rangarajannarasimhan341 · 4 months ago
Lucid, thanks
@VRchitecture · 9 months ago
Something tells me “The sky is the limit” here 👀
@RC19786 · 8 months ago
Could have been better; most of it was speculative when it came to application building, not to mention the laws governing it.
@AIpowerment · 10 months ago
Did you just mirror the screen so it looks like you can write RTL, isn't it?! Wow
@IBMTechnology · 10 months ago
See ibm.biz/write-backwards
@niket1231 · 2 months ago
Need one use case
@hatersgonnalovethis · 3 months ago
Wait a minute. Did he really write in mirror handwriting?
@michaelcharlesthearchangel · 2 months ago
AI was used to make it appear that he can write on your screen.
@penguinofsky · 1 month ago
He writes it normally but the video is flipped horizontally..
@krishnakishorenamburi9761 · 2 months ago
@2:15 is a different sequence. This is just for fun.
@sheikhobama3759 · 4 months ago
1 PB = 1024 TB
1 TB = 1024 GB
1 GB = 1024 MB
1 MB = 1024 KB
1 KB = 1024 B
1 B = 8 bits
So 1 PB = 1024 × 1024 × 1024 × 1024 × 1024 bytes. Multiply it again by 8 to get the number of bits. Guys, do correct me if I'm wrong!!
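Under binary (base-1024) units, that arithmetic checks out; a few lines confirm it. (Decimal SI units, where 1 PB = 10^15 bytes, are also in common use, which is why other comments in this thread disagree.)

```python
# Binary units: each step up is a factor of 1024.
KB = 1024            # bytes
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB
PB = 1024 * TB       # = 1024**5 bytes

print(PB)            # 1125899906842624 bytes in a (binary) petabyte
print(PB * 8)        # 9007199254740992 bits
```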
@WarpRulez · 9 months ago
How does ChatGPT know about itself and its own behavior? If you ask questions about those topics, it will answer intelligently and accurately about itself and its own behavior. It will not just spout random patterns from the internet. How does it know this?
@dennisash7221 · 9 months ago
To start with, ChatGPT does not "know itself"; it is not self-aware. What you are seeing when GPT answers the question "Who are you?" is a pre-programmed response that has been put there by the trainers of the model, something like a toy with pre-recorded messages that you can hear when pressing a button or pulling a string. ChatGPT does not "know" anything; it simply responds to your prompts (or, as you see them, your questions) with the appropriate answers.
@Joyboy_1044_ · 6 months ago
GPT doesn't possess genuine awareness, but it can certainly mimic it to some extent
@uhrcru · 3 months ago
NICE VID o7
@schonsospaet22 · 1 month ago
Thank you for explaining! 🪲 Min. 3:37 is the major "bug" 🐞 within the learning system: *it does not start off with a related guess, it's random.* 🌬 I can't wait until the *brain slice chips* can last longer and get trained like a real human brain that actually learns by feeling and repeating, instead of randomly guessing and then correcting itself until the answer is appropriate. They could soon replace A.I. technology completely, so maybe we shouldn't hype it too much. After all the effort, energy, and money we have put into A.I. and new technology, there is no doubt that *we could have educated our children better* instead of creating a fake new world based on pseudo-knowledge extracted from the web. 👨‍👩‍👧‍👦👨‍👩‍👧‍👧 Nobody wants to be replaced without having the benefit of the machine. General taxes on machines and automated digital services could fund better education for humans. Dear A.I.: You know what is real fun? Planting a tree in real life! 🍒
@devperatetechno8151 · 10 months ago
But how is it possible for an LLM to innovate when it is trained within the boundaries of human knowledge?
@mauricehunter7803 · 1 month ago
I'm far from an expert on the matter, but the simple answer to your question is that it's programmed to be able to learn and adjust according to many various inputs. Arguably, that's where robot technology should be headed next: having the ability to learn and react to that learning.
@eregoldamite8739 · 1 month ago
How are you able to write that way
@7890tom7890 · 22 days ago
My chemistry professor does videos with one and explains it in a video: Chemistry with Dr. Steph (that's her channel); it's the featured video on her page.
@saadanees7989 · 4 months ago
Is this video mirrored?
@pdjhh · 5 months ago
So LLM-based AI is just language, not "intelligence"? Based on what it's read, it knows or guesses what usually comes next? So zero intelligence?
@mauricehunter7803 · 1 month ago
From what I can tell of the subject matter, it's more of a mimicked intelligence. That's why the analogy of a parrot was used: this technology can learn, repeat back, and limitedly guess what's coming next. But there's a certain level of depth and nuance that a human possesses that parrots and ChatGPT-style tech do not.
@lmarcelino555 · 4 months ago
I don’t even know where to begin. 😵‍💫
@dirkbruenner · 4 months ago
How does this presentation work? You are not mirror-writing behind a glass pane, are you?
@sebbejohansson · 4 months ago
Yea, it's a glass window! He is physically writing on it. For it to show the correct way (and him not having to write backwards) they just flip the image!
@shshe6515 · 3 months ago
Still don't get it
@varghesevg5 · 4 months ago
Getting hallucinations!
@boriscrisp518 · 9 months ago
Ugh, corporate videos... the horror
@user-en4zy4xh7i · 7 months ago
Why does a gigabyte have more words than a petabyte? I am lost already!!! 1 GB = 178 million words, 1 petabyte is 1.8×10^14 words, and there are only 750,000 words in the dictionary?
@turna23 · 7 months ago
I got this far, stopped the video and searched for a comment like this. Why isn't this the top comment?
@abdulmueed2844 · 6 months ago
It's not total unique words... basically it's text from different websites, different sentences... So let's say you want an LLM to answer you about coding: you train it on all the data on Stack Overflow, LeetCode, etc., every available resource... so it knows that when users asked how to run a loop in Java, the replies were x, y, z... It's more of a glorified, better Google search that feels like intelligence...
@dasikakn · 6 months ago
He said 178m words in a 1 GB sized file. And a petabyte-sized file has 1 million _gigabytes_ in it. So, loosely speaking, you multiply 178m by 1 million to get the number of words in an LLM's training data. But... it's not being fed unique words. It's getting word patterns. Think about how we speak... our sentences are word patterns that we use in mostly predictable structures, and then we fill in the blanks with richer words as we get older to convey what we want to say with synonyms etc.
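That multiplication can be checked directly, using decimal units and the 178-million figure quoted from the video:

```python
words_per_gb = 178_000_000    # words in ~1 GB of text, per the video
gb_per_pb = 1_000_000         # decimal SI units: 1 PB = 10**6 GB
words_per_pb = words_per_gb * gb_per_pb
print(f"{words_per_pb:.2e}")  # 1.78e+14, matching the ~1.8e14 figure in the question
```

So both numbers in the question above are consistent; the petabyte figure is simply the gigabyte figure scaled up a million-fold, and it counts repeated words, not dictionary entries.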
@ereinei · 5 months ago
1 PB = 1024 TB = 1,048,576 GB
@jks234 · 4 months ago
What makes knowledge so complex is not the words, but the way the words are used. Choose any word and you will see that it is linked with hundreds of topics and contexts. If I say "draw", I could be talking about:
drawing water
drawing class
drawing during class
drawing my friend
drawing a dog
drawing a long time
a drawing that sold for a lot of money
I like drawing
And so on. These all code for a different idea. And it is these "ideas", or relationships, that foundation models encode. With these relationships, you now have the probabilistic weights that allow you to construct realistic and correct-sounding sentences that are also likely accurate because of the enormous dataset the model was trained on. Another context example: you want to connect "fish" to "swim". This link is highly weighted in the LLM.
@Secret4us · 9 days ago
How many 'parameters' does the human brain have, I wonder.
@thiruvetti · 9 months ago
You could have finished the video by saying that an LLM like ChatGPT could have produced the entire explanation for this video. (I think you hinted at the same.)
@TheLeppus28 · 5 months ago
What should one do if a Large Language Model is still talking nonsense after putting petabytes of data into it?
@jonitalia6748 · 4 months ago
$PLTR
@Balthazar2242 · 9 months ago
How is he writing backwards
@IBMTechnology · 9 months ago
See ibm.biz/write-backwards
@karolinasobczyk-kozowska3717 · 6 months ago
Wow! It's a clever idea 😊
@cvspvr · 3 days ago
@IBMTechnology Oh yeah, then how come your tattoo is the right way round?
@Blazeww · 9 months ago
Isn't it using the most likely thing that humans defined, just using patterns of what's most expected based on how humans interact and the info put in? That's not complicated. How do they not understand how it works?
@MichaelDomer · 1 month ago
Hire someone next time who can explain it to the average John and Jane. Talk about 7 billion parameters and you already have John and Jane scratching their heads like crazy at what the fuck he's talking about. Oh, yeah, some in the comment section understand it... but they're not the average John and Jane... they're often familiar with coding, data, business processes, computers, etc.
@AGI-001 · 2 months ago
This is scarrry
@spadaacca · 7 months ago
Not a very good video. It really didn't explain much. You could have said so much more in 5:33 than slowly drawing things and talking about business applications.
@ricardog.p2610 · 2 months ago
If IBM knows that, why didn't they implement it in Watson, which was useless 😂😂😂
@yaroslavnaidyon · 9 months ago
1 petabyte is not 1m gigabytes; it is 1,000 gigabytes. I thought this speech was coming from an engineer, but perhaps it is just a hired actor.
@cgoode1057 · 9 months ago
1k terabytes, 1m gigabytes
@Admlass · 8 months ago
It's funny how it's always the most ignorant and arrogant one who points out the mistakes of others.
@DrGray-ds6ki · 5 months ago
You fool, 1000 GB is one TB, not a PB
@sunnymon1436 · 1 month ago
Explaining the constituent parts, the end product, is not the same as explaining how something works. Bad video.