Transformers for beginners | What are they and how do they work

139,780 views

AssemblyAI

This week we’re looking into transformers. Transformers were introduced a couple of years ago in the paper Attention Is All You Need by Google researchers. Since their introduction, transformers have been widely adopted in the industry.
Get your Free Token for AssemblyAI Speech-To-Text API 👇
www.assemblyai.com/?...
Models like BERT and GPT-3 made groundbreaking improvements in the world of NLP using transformers. Since then, model libraries like Hugging Face have made it possible for everyone to use transformer-based models in their projects. But what are transformers and how do they work? How are they different from other deep learning models like RNNs and LSTMs? Why are they better?
In this video, we learn about it all!
Some of my favorite resources on Transformers:
The original paper - arxiv.org/pdf/1706.03762.pdf
If you’re interested in following the original paper with the code - nlp.seas.harvard.edu/2018/04/0...
The Illustrated Transformer - jalammar.github.io/illustrate...
Blog about positional encodings - kazemnejad.com/blog/transform...
About attention - Visualizing A Neural Machine Translation Model - jalammar.github.io/visualizin...
Layer normalization - arxiv.org/abs/1607.06450
Some images used in this video are from:
colah.github.io/posts/2015-08...
jalammar.github.io/visualizin...
/ how-to-easily-build-a-...
/ elegant-intuitions-beh...

Comments: 141
@ashermai2962 2 years ago
This channel deserves more views and likes
@AssemblyAI 2 years ago
Thank you Asher!
@sarc007 1 year ago
I agree
@pierluigiurru962 6 months ago
This is the clearest explanation of transformers I’ve found so far, and I have personally seen many while trying to wrap my head around them. No skimming over details. Very well done!
@testing3562 5 months ago
I am a programmer, I have created many tools that were actually very useful. I even claim that I have 10+ years experience. But I feel very bad to realize that I am so dumb that I did not understand anything after the first 10 minutes of the video.
@sebastianmata9603 1 month ago
They explain it with apples and pears, but it is actually a very mathematical and elaborate process. If you're not the kind of person who can easily remember how the sine and cosine functions work and who does matrix multiplication for fun, it is just a little bit harder to get.
@moonlight-td8ed 11 days ago
BRUH JUST REWATCH IT AGAIN... THE VIDEO IS A 10/10
@andybrice2711 25 days ago
Positional encodings are not that weird when you think of them as being similar to the hands on a clock: It's a way of representing arbitrarily long periods of time, within a confined space, with smooth continuous movement and no sudden jumps. Picture the tips of clock hands. Their vertical position follows a sine wave, their horizontal position follows a cosine wave. And we add precision with more hands moving at different speeds.
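The clock-hand picture above can be sketched in a few lines of plain Python. This is only an illustration of the sinusoidal formula from the original paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); the function name `positional_encoding` and the small default `d_model=8` are assumptions chosen for readability, and `d_model` is assumed to be even.

```python
import math

def positional_encoding(position: int, d_model: int = 8) -> list[float]:
    """Sinusoidal encoding for a single position (illustrative sketch, d_model assumed even)."""
    encoding = []
    for i in range(0, d_model, 2):
        # Each (sin, cos) pair behaves like one clock hand.
        # A larger i gives a larger divisor, so that hand turns more slowly.
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))  # vertical position of the hand tip
        encoding.append(math.cos(angle))  # horizontal position of the hand tip
    return encoding

# Print the encodings of the first few positions, rounded for display.
for pos in range(4):
    print(pos, [round(x, 3) for x in positional_encoding(pos)])
```

Every value stays between -1 and 1 however large the position gets, which is the "confined space" in the clock analogy, and neighbouring positions get similar vectors because each hand moves smoothly.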
@stevemassicotte4068 1 year ago
@16:14, the binary table is wrong: there are two sevens. The second column should start with 8 and not a second 7. Attention is all you need ;) Thanks for the video!
@BCSEbadulIslam 2 months ago
Came here to comment the same 👍
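For anyone who wants to check the corrected table, here is a small sketch that prints the numbers 0 to 15 as 4-bit binary in two columns. The two-column layout and the 4-bit width are assumptions about how the table in the video is arranged; `binary_table` is a hypothetical helper name.

```python
def binary_table(start: int, stop: int, width: int = 4) -> list[str]:
    """Return the zero-padded binary strings for start..stop-1."""
    return [format(n, f"0{width}b") for n in range(start, stop)]

left = binary_table(0, 8)    # 0..7
right = binary_table(8, 16)  # 8..15, so the second column starts at 8

for n, (a, b) in enumerate(zip(left, right)):
    print(f"{n}: {a}    {n + 8}: {b}")
```

The lowest bit flips on every row while the highest bit flips only once over the whole table, which is the "different frequencies" idea that the sine and cosine encodings carry over to continuous values.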
@dooseobkim2100 9 months ago
You are my savior: I can finally get ready to read all of those AI-related papers that I was completely unaware of. I was stuck at the part of my thesis where I have to provide the theoretical background of ChatGPT. As a business student I’m super grateful to learn these computer science concepts through your short lecture👍👍
@PeterKoman 1 year ago
Finally a transformer video that actually explains the theory in an understandable way. Many thanks.
@AssemblyAI 1 year ago
That's great to hear, thank you Peter!
@malayali_thaaram 1 year ago
Yes!!! I agree! Finally!
@mudasserqayyom9947 12 days ago
I've been watching a lot of videos on Transformers, but this is exactly what I wanted. Thank you so much, ma'am. And also AssemblyAI.
@moonlight-td8ed 11 days ago
cleanest and most informative video ever.. covered whole attention is all you need paper in 19 mins.. damn.. thank you MISRA TURP and assembly ai
@moeal5110 7 months ago
This is the clearest and most resourceful video I've seen. Thank you for your hard work and for sharing these resources
@bdoriandasilva 1 year ago
Great video with a clear explanation. thank you!
@Yaddu143 1 year ago
I really want you to talk about attention. Thank you, you were shining in this video.
@reshamgaire4188 7 months ago
Finally found a perfect video that cleared up all my confusion. Thank you so much ma'am, may god bless you 🙏
@nikhilshrestha4711 1 year ago
really love how you described the model. easier to understand 🙌
@AssemblyAI 1 year ago
Glad it was helpful!
@Zulu369 11 months ago
This video is the best technical explanation I have seen in years. Although Transformers are a breakthrough in the field in NLP, I am convinced that they do not describe completely and satisfactorily, the way humans process language. For all civilizations, spoken language predates written language in communications. Those who do not read and write, still communicate clearly with others. This means humans do not represent natural language in their brains in terms of words, syntax and position of tokens but rather in terms of symbols, images and multimedia shows that make up stories we relate to. Written language comes only later as an extra layer of communication to express transparently these internal representations that we carry within ourselves. If AI is able to access and decode these internal representations, then the written language, the extra layer, becomes a lot easier to understand, organize, and put on paper with simple techniques rather than using these intricate Transformers that I consider as temporary and unnatural ways of describing natural languages.
@rokljhui864 10 months ago
Your idea is represented above, in words, existing separately from your mind. Surely most intelligence is contained within written language, mathematical expression and images.
@Zulu369 10 months ago
@@rokljhui864 As I explained above, written words make up THE extra layer that is actually not necessary once you learn more persuasive communications techniques.
@evetsnilrac9689 6 months ago
​@@rokljhui864 "Surely" is not how you start an intelligent hypothesis. You must explain the rationale for your belief since it is not at all readily apparent that the intelligence to process written language was not already in our brains so that we could conceive of and learn written language.
@evetsnilrac9689 6 months ago
This is a crucial point to understand for all of us interested in fully harnessing what we perceive to be the true potential of this technology. I would start with the Adamic symbol-based language.
@yourshanky 11 months ago
Excellent explanation !! Sharp and clear. Thanks for sharing this.
@anandanv2361 11 months ago
The way you explained the concept was awesome. It is very easy to follow.👍
@sivad2895 7 months ago
The best video on transformer architecture with great explanations and charming presentation.
@imagnihton2 1 year ago
This made the concept sound incredibly simple compared to some other sources... Amazing!
@AssemblyAI 1 year ago
Great to hear, thank you!
@mohamadhasanzeinali3674 1 year ago
I saw numerous videos about Transformers architecture. In my opinion, your video is the best among them. Appreciate that.
@AssemblyAI 1 year ago
Thank you, that is great to hear. :)
@dannown 1 year ago
This is a really lovely video -- very specific and detailed, but also followable. Thanks!
@AssemblyAI 1 year ago
Glad it was helpful!
@otsogileonalepelo9610 1 year ago
Just WOW! You broke down these concepts nicely. Thank you. Live long and prosper 🖖🖖
@AssemblyAI 1 year ago
Thank you!
@nikhil182 1 year ago
Thank you so much!💓this has to be the best introduction video to Transformers. We are planning to use Transformers for our Video Processing project.
@AssemblyAI 1 year ago
Glad it was helpful!
@amitsingh7684 2 days ago
very nicely explained with clear details
@GeorgeZoto 7 months ago
A great description of transformers at both a low and a high level, thank you for creating this useful resource :)
@abinav92 1 year ago
Best video on intro to transformers!!!
@carlosroquesuarezgurruchag8681 1 year ago
Thanks for your time. The explanation is very clear.
@kalyandey5195 3 months ago
Awesome!! crystal clear explanation!!!
@vivekpetrolhead 4 months ago
Best explanation for beginners I've seen besides StatQuest
@geekyprogrammer4831 1 year ago
This high quality video deserves a lot more views!
@AssemblyAI 1 year ago
Thank you!
@goelnikhils 1 year ago
Amazing explanation. Wow. Thanks a lot
@sanketdeshmukh491 1 year ago
Thank you for the in-depth explanation. Kudos!!!
@AssemblyAI 1 year ago
You're very welcome!
@shubham-pp4cw 2 years ago
A clear explanation of quite a complex topic, explained easily in a short period of time
@AssemblyAI 1 year ago
Glad to hear you liked it!
@bysedova 7 months ago
Please make a detailed video about self-attention! Thank you for your explanation! I like that you didn't use difficult math terms and tried to explain things in an easy-to-follow way.
@_joshwalter_ 9 months ago
This is phenomenal!
@pyaephyo3633 10 months ago
i love it. Your explanation is easy to understand.
@hussainsalih3520 1 year ago
Amazing, keep doing these amazing tutorials :)
@maryammoradbeigi6690 1 year ago
Incredible explanation on the transformer... Amazing video. Thanks a lot
@AssemblyAI 1 year ago
Glad you liked it!
@donevo1 1 year ago
Very nice presentation! At 12:18 you say that attention is on 8 words. From reading the paper, I think that attention is on ALL the words, and 8 is the number of heads: each word vector (D=512) is split into 8, i.e. the vector dimension in each head is 64.
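The dimension bookkeeping in this comment can be sketched as follows. The numbers 512 and 8 are the ones from the paper; `split_heads` is a hypothetical helper, and the plain slicing is a simplification. In the paper each head gets its own learned projection down to 64 dimensions, so the slices below only show how the sizes add up, not how the projections are learned.

```python
D_MODEL = 512                      # size of each word vector
NUM_HEADS = 8                      # number of attention heads
HEAD_DIM = D_MODEL // NUM_HEADS    # 64 dimensions per head

def split_heads(vector: list[float], num_heads: int = NUM_HEADS) -> list[list[float]]:
    """Cut one d_model-sized vector into num_heads equal chunks (illustration only)."""
    head_dim = len(vector) // num_heads
    return [vector[h * head_dim:(h + 1) * head_dim] for h in range(num_heads)]

word_vector = [float(i) for i in range(D_MODEL)]  # stand-in for a real embedding
heads = split_heads(word_vector)
print(len(heads), len(heads[0]))
```

Each of the 8 heads still attends over every word in the sentence; the 8 only counts how many attention computations run side by side on the smaller 64-dimensional vectors.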
@rodi4850 1 year ago
best explanation!
@abrahamowos 1 year ago
A question @ 11:30: if, for instance, the values v are really large and you multiply them by the results from the softmax layer, won't the resulting weighted sum be too high after adding them together?
@JackoMcW 1 year ago
I'm not sure I understand your question or what you mean by "too high," but consider that all of those softmax values will be
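A small numeric sketch may help with the question in this thread. Softmax weights are all between 0 and 1 and sum to 1, so the weighted sum is an average of the values and can never exceed the largest one. The scores and values below are made-up numbers, and scalars are used instead of vectors to keep the example short.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]          # made-up attention scores
values = [100.0, 250.0, 400.0]    # made-up "large" values

weights = softmax(scores)
weighted_sum = sum(w * v for w, v in zip(weights, values))

print("weights:", [round(w, 3) for w in weights])
print("sum of weights:", round(sum(weights), 6))
print("weighted sum:", round(weighted_sum, 2))
```

The result lands between the smallest and the largest value, so large values stay large but the softmax step itself does not inflate them.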
@AddisuSeteye 1 year ago
Amazing explanation. I can't wait to watch your explanation on another AI related topic.
@AssemblyAI 1 year ago
More to come!
@wasifrock687 1 year ago
very well explained. thank you!
@AssemblyAI 1 year ago
Glad it was helpful!
@VaibhavPatil-rx7pc 10 months ago
Smile and learn, and a clean explanation!!!
@salamander7715 8 months ago
Seeing all the comments of people saying that this video made things simple just makes me feel stupid ahah! This video is amazing and the explanations are great, but I can't say I've understood more than 35% of the concepts. I'll have to watch this several times for sure
@wenshufan 10 months ago
Thank you for explaining the transformer in detail. However, I still don't get how you train the Q, K, V matrices. The attention mechanism is calculated from them. What type of feedback/ground truth can one use to train those matrix values then?
@krishnakumarik208 6 months ago
VERY GOOD EXPLANATION.
@amigospot 2 years ago
Nice video for a fairly complex architecture!
@AssemblyAI 2 years ago
Thanks Hyder! - Mısra
@MrTheyosyos 1 year ago
"attentions for beginners" will be great :)
@keithwins 4 months ago
Thank you that was excellent
@juliennoel3061 2 months ago
hi! oh yeah please a specific video on 'attention' 🙂 - And also : 'great job you are doing! Congrats! Thumbs !!'
@jaysoni6674 7 months ago
thank you soo much, damn, love your explanations
@user-fp5sx2vr4z 10 months ago
Thank you for the presentation, it has been so insightful. I wish you made a video about the word embeddings of the transformers. Thanks
@AssemblyAI 10 months ago
Great suggestion!
@lexflow2319 1 year ago
I don't understand why there are 6 decoders and encoders. The diagram shows 1 each. Also, what is the output that is used as input to the decoder? Is that the last output from the final softmax?
@thebiggerpicture__ 1 year ago
Great video. Thanks!
@AssemblyAI 1 year ago
You're welcome :)
@rufus9322 1 year ago
Thank you for your video 🤗 How can I understand more details about the word embedding method in the Transformer model?
@jayanthkothapalli9.2 9 months ago
Great work, ma'am. You made it simple to understand.
@talktovipin1 3 months ago
Very nice explanation. Incorporating animations into the images while explaining would enhance comprehension and make it even more beneficial.
@niyatisrivastava4-yearb.te820 4 months ago
best explanation
@devraj241 11 months ago
great video, well explained!
@near_. 10 months ago
What's the purpose of the output embedding? What are we feeding into it?
@6001navi 9 months ago
awesome explanation
@0Tyr 2 years ago
Very informative channel, and well presented..
@AssemblyAI 2 years ago
Thank you! - Mısra
@archowdhury007 10 months ago
Beautifully explained. Loved it. First time I understood the transformer model so easily. Great work. Please keep creating more such content. Thanks.
@kartikgadad9285 5 months ago
Thanks for explaining Transformers. Can we have a video on embeddings? They seem super interesting. The positional encoding part was difficult to understand, as it was covered only at an abstract level. Is there a better video on positional encoding?
@nogur9 9 months ago
Thanks :)
@rokljhui864 10 months ago
Interesting. Sounds like a Fourier transform: obtaining a frequency distribution from a time series reveals the underlying frequency components and amplitudes. Are you essentially distilling the 'word cycles' from the sentences to obtain meaning from the word patterns across different word combination lengths (from a single word to many thousands)? And does optimising the predictability of the next word automatically optimise for the appropriate word combination lengths that align with actual meaning, i.e. are understanding 'peaks' optimised similarly to the fundamental frequencies in a Fourier transform?
@amparoconsuelo9451 8 months ago
I have read books and watched videos on Transformers. I still don't understand Transformers. I want to order from Amazon an assembly Transformer kit, work on it and have a Transformer I understand the way I understand how Lotus 123 and Wordstar were created.
@actorjohanmatsfredkarlsson2293 1 year ago
Great video. I’m missing how the attention layers (queries, keys and values) and the output weights are trained. Also, what was the values matrix for?
@MrAmgadHasan 11 months ago
They are trained just like any neural network: we have a loss function that compares the model's output with the desired output, and then this loss is propagated backwards to the weights and biases and we use gradient descent to update the weights. Look up "backpropagation" for more info, or just look up "how neural networks are trained".
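The training loop described in this reply can be shown on a toy problem. This is a minimal sketch with a single weight and made-up data, not a transformer; the Q, K and V matrices are updated by the same loss-gradient-update cycle, only with many more weights and with the gradients computed automatically by a deep learning framework. The function name `train` and the hyperparameters are assumptions for illustration.

```python
def train(xs: list[float], ys: list[float], lr: float = 0.05, steps: int = 200) -> float:
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # start from an arbitrary initial weight
    for _ in range(steps):
        # Loss: L = mean((w*x - y)^2). Its derivative with respect to w:
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # Gradient descent step: move the weight against the gradient.
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # desired outputs, generated by the rule y = 2x

w = train(xs, ys)
print(round(w, 4))
```

The "desired output" plays the role of the ground truth asked about elsewhere in the comments; for a translation model it would be the reference translation, and the loss compares the predicted words with it.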
@user-ft2jx8io9g 7 months ago
Great work, may Allah bless you and guide you 🥰🥰😍😍
@user-tt6tg3eb7n 7 months ago
Overall I liked the video a lot. I just do not think it is enough to understand the whole concept. In particular, the masked multi-head attention layer was missing, as was how the actual output of the model is created (translation etc.)
@manjz7hm 4 months ago
You explained it well, but my brain is not digesting it 😂
@ankit9401 2 years ago
You are awesome and I appreciate your efforts. After watching your video, I can say I now understand the transformer architecture. I have a query. According to the original BERT paper, two objectives are used during training: Masked Language Model and Next Sentence Prediction. Are these training objectives present in the original transformer or in all transformer models, or are they specific to BERT? I hope you make a video to explain attention and the BERT model in the future 😊
@AssemblyAI 2 years ago
Great to hear the video was helpful Ankit! These are not the tasks that were in the original transformer model. But I think they are not specific to BERT. Other architectures also use same/similar tasks to train their models. We have a BERT video in the channel by the way. Here it is: kzbin.info/www/bejne/bJLLqYOKgb-ri5o - Mısra
@strongsyedaa7378 1 year ago
@@AssemblyAI So instead of using RNN & LSTM we directly use Transformers?
@kellenswain2049 11 months ago
11:06 from reading the paper, 64 is not the square root of the length of QKV vectors, it looks like it is d_model/h where h is the number of heads used in multihead attention. And so then I assume d_model is the length of the QKV vectors?
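To make the numbers in this comment concrete: in the paper d_model = 512 is the size of each word vector, h = 8 is the number of heads, and each head works with queries and keys of size d_k = d_model / h = 64. The dot products are then divided by the square root of d_k, which is 8. Below is a minimal pure-Python sketch of scaled dot-product attention under those definitions; the function names and the tiny example vectors are assumptions for illustration.

```python
import math

D_MODEL = 512
NUM_HEADS = 8
D_K = D_MODEL // NUM_HEADS   # 64: per-head query/key size

def softmax(scores: list[float]) -> list[float]:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d_k)) V, written with plain lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # One score per key: dot product scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

print(D_K, math.sqrt(D_K))  # 64 8.0

# Tiny example: two identical keys get equal weight, so the output is the mean of the values.
result = attention(queries=[[1.0, 0.0]], keys=[[1.0, 0.0], [1.0, 0.0]], values=[[2.0, 4.0], [4.0, 8.0]])
print(result)
```

So 64 is the per-head vector length and 8 is its square root, which is the divisor applied to the scores before the softmax.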
@mbrochh82 1 year ago
I wish someone would explain how exactly the backpropagation works and what values exactly get nudged and tweaked during learning (and by which means)
@guimaraesalysson 1 year ago
Is there any video about the attention mechanism?
@AssemblyAI 1 year ago
Not yet but it's a good idea!
@wp1300 10 months ago
13:35 Positional encoding
@EmanueleOlivetti 4 months ago
Around 16:00 the binary representation repeats 7 twice, so the right part of the binary-encoded numbers is incorrect
@RAZZKIRAN 1 year ago
thank u
@AssemblyAI 1 year ago
You're welcome!
@andersonsystem2 2 years ago
Good video
@AssemblyAI 2 years ago
Glad you enjoyed it :)
@titusfx 10 months ago
I'm still concerned about how all these papers don't have any mathematical rigour; there isn't one theorem, there is nothing. And it works....🤯 I can't imagine what the results will be once the rigour starts coming in. I'm starting to believe that deep learning is physics for knowledge 😅
@nirmesh44 3 months ago
make attention video
@M7mdal7aj 7 months ago
thanks but the explanation is not detailed enough. but nice explanation for the positional embedding. thanks
@user-bu3ds4dc6d 7 months ago
easiest explanation.
@NielsSwimberghe 2 months ago
"You might need to watch this multiple times". You don't say. 😅
@denwo1982 2 months ago
Chatgpt “explain this video to me as if I was an 8 year old”
@JaredEdwardsPM 1 year ago
I feel like you just described how a ouija board works…
@AssemblyAI 1 year ago
Comment of the year :D
@marcfruchtman9473 1 year ago
Very interesting and informative. Thank you for providing a very detailed explanation of Transformers. One note: The word "Query" is pronounced like Qw-eerie (USA English). The beginning sounds sort of like the sound of "Quarry", or "Quack" but rhyming with dearie.
@frizzsupertramp6434 1 year ago
At 16:44 the binary representations on the right side are wrong (number 7 comes twice, should start with 8 on the right side). (Sorry for being anal 😀)
@AssemblyAI 1 year ago
Thanks for the heads up! Video editing gets tedious sometimes :)
@robl39 1 year ago
What is disappointing about this video is that you have to know about or understand 50 other concepts first
@roshanverma1123 9 months ago
Great simplified content! Thanks! Btw, you look beautiful!
@prdeshnaser5297 2 months ago
needs pauses in speech, after 15min all i hear is a vector of blblblahs. a good video and human illustration of how text gets generated by machines, but that is not the point.
@strongsyedaa7378 1 year ago
Why did you divide by 8?
@homeboundrecords6955 1 year ago
knowledgeable but not exactly 'beginner' level lol
@AssemblyAI 1 year ago
To some it sounds too simple and to some too complex. 🤷‍♀️ Problem with AI topics these days. :D
@verystablegenius4720 8 months ago
bad ... just bad. you need to put 100X the time if you want to do this right. No real understanding here.
@open_source 1 year ago
The presentation is nice but are you really trying to compress video time by talking faster? Had to stop the video multiple times to Focus on each concept
@sevovo 9 months ago
Why is the first thing I thought that she must be Turkish? :D
@dlbattle100 1 year ago
Carry, kiw, and matresses.
@berkk1993 11 months ago
Idk who she is but I am sure she is Turkish.
@PrabhatKumar-fn4vy 1 year ago
You are beautiful