Transformers for beginners | What are they and how do they work

  35,706 views

Code With Aarohi

1 year ago

Over the past five years, Transformers, a neural network architecture, have completely transformed state-of-the-art natural language processing.
*************************************************************************
For queries: you can comment in the comment section or mail me at aarohisingla1987@gmail.com
*************************************************************************
The encoder takes the input sentence and converts it into a series of numbers called vectors, which represent the meaning of the words. These vectors are then passed to the decoder, which generates the translated sentence.
Now, the magic of the transformer network lies in how it handles attention. Instead of looking at each word one by one, it considers the entire sentence at once. It calculates a similarity score between each word in the input sentence and every other word, giving higher scores to the words that are more important for translation.
To do this, the transformer network uses a mechanism called self-attention. Self-attention allows the model to weigh the importance of each word in the sentence based on its relevance to other words. By doing this, the model can focus more on the important parts of the sentence and less on the irrelevant ones.
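To make the description above concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The sequence length, embedding size, and random weight matrices are illustrative assumptions, not values from the video.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.
    X: (seq_len, d_model) word vectors; returns (seq_len, d_v)."""
    Q = X @ Wq                                      # queries: what each word is looking for
    K = X @ Wk                                      # keys: what each word offers
    V = X @ Wv                                      # values: the information to mix
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of every word with every other word
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V                              # attention-weighted combination of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                         # 4 "words", 8-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                    # (4, 8): one attended vector per word
```

Each output row is a mixture of all the value vectors, weighted by how relevant the other words are to that word — the "higher scores to more important words" idea described above.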
In addition to self-attention, transformer networks also use something called positional encoding. Since the model treats words as individual entities, it doesn't have any inherent understanding of word order. Positional encoding helps the model to understand the sequence of words in a sentence by adding information about their position.
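As a concrete example of one common positional-encoding scheme — the sinusoidal encoding from the "Attention Is All You Need" paper (the video does not pin down a specific formula, so treat this as one assumed choice):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding, shape (seq_len, d_model)."""
    pos = np.arange(seq_len)[:, None]          # position of each word: 0, 1, 2, ...
    i = np.arange(d_model)[None, :]            # embedding dimension index
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])       # even dimensions use sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])       # odd dimensions use cosine
    return pe

pe = positional_encoding(seq_len=6, d_model=8)
print(pe.shape)                                # (6, 8)
# The encoding is simply added to the word embeddings: X = embeddings + pe
```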
Once the encoder has calculated the attention scores and combined them with positional encoding, the resulting vectors are passed to the decoder. The decoder uses a similar attention mechanism to generate the translated sentence, one word at a time.
Transformers are the architecture behind GPT, BERT, and T5.
#transformers #naturallanguageprocessing #nlp

Comments: 108
@lyeln · 4 months ago
This is the only video around that REALLY EXPLAINS the transformer! I immensely appreciate your step by step approach and the use of the example. Thank you so much 🙏🙏🙏
@CodeWithAarohi · 4 months ago
Glad it was helpful!
@eng.reemali9214 · a month ago
exactly
@mdfarhadhussain · 5 months ago
Very nice high level description of Transformer
@CodeWithAarohi · 5 months ago
Glad you think so!
@exoticcoder5365 · 11 months ago
Very well explained ! I can instantly grab the concept ! Thank you Miss !
@CodeWithAarohi · 11 months ago
Glad it was helpful!
@PallaviPadav · 2 months ago
I came across this video by accident; it's very well explained. You are doing an excellent job.
@CodeWithAarohi · 2 months ago
Glad it was helpful!
@BharatK-mm2uy · 3 months ago
Great Explanation, Thanks
@CodeWithAarohi · 3 months ago
Glad it was helpful!
@MrPioneer7 · a month ago
I had watched 3 or 4 videos about transformers before this tutorial. Finally, this tutorial made me understand the concept of transformers. Thanks for your complete and clear explanations and your illustrative example. Especially, your description of query, key, and value was really helpful.
@CodeWithAarohi · a month ago
You're very welcome!
@aditichawla3253 · 6 months ago
Great explanation! Keep uploading such nice informative content.
@CodeWithAarohi · 6 months ago
Thank you, I will
@user-kx1nm3vw5s · a month ago
It's great. I have only one query: what is the input of the masked multi-head attention? It's not clear to me; kindly guide me.
@MAHI-kj5tg · 7 months ago
Just amazing explanation 👌
@CodeWithAarohi · 7 months ago
Thanks a lot 😊
@harshilldaggupati · 10 months ago
Very well explained, even with such a niche viewer base. Keep making more of these, please.
@CodeWithAarohi · 10 months ago
Thank you, I will
@VishalSingh-wt9yj · 5 months ago
Well explained. Before watching this video, I was very confused about how transformers work, but your video helped me a lot.
@CodeWithAarohi · 5 months ago
Glad my video is helpful!
@servatechtips · 11 months ago
This is a fantastic, very good explanation. Thank you so much.
@CodeWithAarohi · 11 months ago
Glad it was helpful!
@animexworld6614 · 27 days ago
Great Content
@CodeWithAarohi · 26 days ago
Thanks!
@user-mv5bo4vf2v · 6 months ago
Hello and thank you so much. One question: I don't understand where the numbers in the word embedding and positional encoding come from.
@user-oo2co6xb8u · a month ago
Wow.. you are amazing. Thank you for the clear explanation
@CodeWithAarohi · a month ago
You're very welcome!
@soravsingla6574 · 8 months ago
Hello Ma’am Your AI and Data Science content is consistently impressive! Thanks for making complex concepts so accessible. Keep up the great work! 🚀 #ArtificialIntelligence #DataScience #ImpressiveContent 👏👍
@CodeWithAarohi · 8 months ago
Thank you!
@imranzahoor387 · 4 months ago
Best explanation. I saw multiple videos, but this one provides a clear concept. Keep it up.
@CodeWithAarohi · 4 months ago
Glad to hear that
@debarpitosinha1162 · 3 months ago
Great explanation, ma'am
@CodeWithAarohi · 3 months ago
Glad you liked it
@pandusivaprasad4277 · 5 months ago
excellent explanation madam... thank you so much
@CodeWithAarohi · 5 months ago
Thanks and welcome
@soravsingla6574 · 8 months ago
Very well explained
@CodeWithAarohi · 8 months ago
Thanks for liking
@satishbabu5510 · a month ago
Thank you very much for explaining and breaking it down 😀 Compared to other channels, your explanation is easy to understand. Thank you very much for making this video and sharing it with everyone ❤
@CodeWithAarohi · a month ago
Glad it was helpful!
@TheMayankDixit · 9 months ago
Nice explanation Ma'am.
@CodeWithAarohi · 9 months ago
Thank you! 🙂
@AbdulHaseeb091 · 3 months ago
Ma'am, we are eagerly hoping for a comprehensive Machine Learning and Computer Vision playlist. Your teaching style is unmatched, and I truly wish your channel reaches 100 million subscribers! 🌟
@CodeWithAarohi · 3 months ago
Thank you so much for your incredibly kind words and support!🙂 Creating a comprehensive Machine Learning and Computer Vision playlist is an excellent idea, and I'll definitely consider it for future content.
@bijayalaxmikar6982 · 5 months ago
excellent explanation
@CodeWithAarohi · 5 months ago
Glad you liked it!
@user-dl4jq2yn1c · a month ago
Best video ever explaining the concepts in a really lucid way, ma'am. Thanks a lot; please keep posting. I subscribed 😊🎉
@CodeWithAarohi · a month ago
Thanks and welcome
@blindprogrammer · 23 days ago
very high level but perfect!
@CodeWithAarohi · 22 days ago
Thanks!
@vasoyarutvik2897 · 8 months ago
Very Good Video Ma'am, Love from Gujarat, Keep it up
@CodeWithAarohi · 8 months ago
Thanks a lot
@minalmahala5260 · 2 months ago
Really very nice explanation ma'am!
@CodeWithAarohi · 2 months ago
Glad my video is helpful!
@manishnayak9759 · 7 months ago
Thanks Aarohi 😇
@CodeWithAarohi · 7 months ago
Glad it helped!
@vimalshrivastava6586 · a year ago
Thanks for making such an informative video. Please could you make a video on the transformer for image classification or image segmentation applications.
@CodeWithAarohi · a year ago
Will cover that soon
@afn8370 · a month ago
Your video is good and the explanation is excellent; the only negative I felt was the background noise. Please use a better mic with noise cancellation. Thank you once again for this video.
@CodeWithAarohi · a month ago
Noted! I will take care of the noise :)
@sahaj2805 · 3 months ago
The best explanation of transformers that I have found on the internet. Can you please make a detailed, long video on transformers with theory, mathematics, and more examples? I am not clear about the linear and softmax layers and what is done after them, how training happens, and how transformers work on test data. Can you please make a detailed video on this?
@CodeWithAarohi · 3 months ago
I will try to make it after finishing the pipelined work.
@sahaj2805 · 3 months ago
@CodeWithAarohi Thanks, will wait for the detailed transformer video :)
@_Who_u_are · a month ago
Thank you so much
@CodeWithAarohi · a month ago
Welcome!
@thangarajerode7971 · 11 months ago
Thanks. Concept explained very well. Could you please add one custom example (e.g., finding question similarity) using Transformers?
@CodeWithAarohi · 11 months ago
Will try
@sairampenjarla · 28 days ago
Hi, good explanation, but at the end, when you explained what the input to the decoder's masked multi-head attention would be, you fumbled and didn't explain it clearly. The rest of the video was very good.
@CodeWithAarohi · 28 days ago
Thank you for the feedback!
@akshayanair6074 · 11 months ago
Thank you. The concept has been explained very well. Could you please also explain how these query, key and value vectors are calculated?
@CodeWithAarohi · 11 months ago
Sure, Will cover that in a separate video.
@digambar6191 · 2 days ago
Thank you, ma'am
@CodeWithAarohi · a day ago
Most welcome 😊
@burerabiya7866 · 4 months ago
Can you please upload the presentation?
@user-wh8vy9ol8w · a month ago
Can you please let us know the input for the masked multi-head attention? You just said "decoder". Can you please explain? Thanks.
@_seeker423 · 4 months ago
Can you also talk about the purpose of the feed-forward layer? It looks like it's only there to add non-linearity. Is that right?
@abirahmedsohan3554 · 3 months ago
Yes, you can say that... but maybe also to make the key, query, and value projections trainable.
@nikhilrao20 · 7 months ago
Didn't understand what the input to the masked multi-head self-attention layer in the decoder is. Can you please explain?
@CodeWithAarohi · 7 months ago
In the Transformer decoder, the masked multi-head self-attention layer takes three inputs: queries (Q), keys (K), and values (V).
Queries (Q): vectors representing the current positions in the sequence. They are used to determine how much attention each position should give to other positions.
Keys (K): vectors representing all positions in the sequence. They are used to calculate the attention scores between the current position (represented by the query) and all other positions.
Values (V): vectors containing information from all positions in the sequence. The values are combined based on the attention scores to produce the output for the current position.
The masking in the self-attention mechanism ensures that during training, a position cannot attend to future positions, preventing information leakage from the future. In short, the masked multi-head self-attention layer helps the decoder focus on relevant parts of the sequence while generating the output, and the masking ensures it doesn't cheat by looking at future information during training.
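The masking described in this reply can be illustrated with a small NumPy sketch (the scores are hypothetical random numbers): future positions are set to -inf before the softmax, so their attention weights come out exactly zero.

```python
import numpy as np

seq_len = 4
rng = np.random.default_rng(1)
scores = rng.normal(size=(seq_len, seq_len))    # stand-in for Q @ K.T / sqrt(d_k)

# Causal mask: True above the diagonal marks "future" positions.
future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores = np.where(future, -np.inf, scores)      # block attention to the future

scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row

print(np.round(weights, 2))
# Row i has nonzero weights only for positions 0..i: nothing leaks from the future.
```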
@sahaj2805 · 3 months ago
Can you please make a detailed video explaining the Attention is all you need research paper line by line, thanks in advance :)
@CodeWithAarohi · 3 months ago
Noted!
@_seeker423 · 4 months ago
Question about query, key, value dimensionality. Given that a query is a word that is looking for other words to pay attention to, and a key is a word that is being looked at by other words, shouldn't the query and key be vectors whose size equals the number of input tokens, so that when there is a dot product between query and key, the querying word can be correctly (positionally) dot-producted with the key to get the self-attention value for that word?
@CodeWithAarohi · 4 months ago
The dimensionality of query, key, and value vectors in transformers is a hyperparameter, not directly tied to the number of input tokens. The dot product operation between query and key vectors allows the model to capture relationships and dependencies between tokens, while positional information is often handled separately through positional embeddings.
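A quick shape check of this point (all sizes are made up for illustration): the query/key width d_k is a free hyperparameter, while the (seq_len, seq_len) attention matrix comes from the dot product between queries and keys, not from their width.

```python
import numpy as np

seq_len, d_model, d_k = 5, 16, 8                # 5 tokens; note d_k != seq_len
rng = np.random.default_rng(2)
X = rng.normal(size=(seq_len, d_model))         # token embeddings
Wq = rng.normal(size=(d_model, d_k))            # learned projection matrices
Wk = rng.normal(size=(d_model, d_k))
Q, K = X @ Wq, X @ Wk                           # (seq_len, d_k) each
attn = Q @ K.T / np.sqrt(d_k)                   # token-to-token scores
print(attn.shape)                               # (5, 5): one score per token pair, for any d_k
```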
@sukritgarg3175 · 4 months ago
Great video ma'am, could you please clarify what you said at 22:20 once again... I think there was a bit of confusion there.
@AyomideFagoroye-oe2hd · 2 months ago
same here
@techgirl6451 · 7 months ago
Hello ma'am, is this transform concept the same as transformers in NLP?
@CodeWithAarohi · 7 months ago
The concept of "transform" in computer vision and "transformers" in natural language processing (NLP) are related but not quite the same.
@akramsyed3628 · 7 months ago
Can you please explain from 22:07 onward?
@mahmudulhassan6857 · 10 months ago
Ma'am, can you please make one video on classification using multi-head attention with a custom dataset?
@CodeWithAarohi · 10 months ago
Will try
@KavyaDabuli-ei1dr · 4 months ago
Can you please make a video on BERT?
@CodeWithAarohi · 4 months ago
I will try!
@user-gf7kx8yk9v · 9 months ago
How can I get the PDFs, ma'am?
@palurikrishnaveni8344 · a year ago
Could you make a video on image classification for vision transformer, madam ?
@CodeWithAarohi · a year ago
Sure, soon
@kadapallavineshnithinkumar2473 · a year ago
Could you explain it with Python code? That would be more practical. Thanks for sharing your knowledge.
@CodeWithAarohi · a year ago
Sure, will cover that soon.
@saeed577 · 4 months ago
I thought this was about transformers in CV; all the explanations were in NLP.
@CodeWithAarohi · 4 months ago
I recommend you understand this video first and then watch this one: kzbin.info/www/bejne/qpy9fp1ni7B3b9U. After watching these two videos, you will properly understand the concept of transformers used in computer vision. Transformers in CV are based on the idea of transformers in NLP, so it's easier to understand if you learn them in that order.
@Red_Black_splay · 2 months ago
Gonna tell my kids this was Optimus Prime.
@CodeWithAarohi · 2 months ago
Haha, I love it! Optimus Prime has some serious competition now :)
@jagatdada2.021 · 8 months ago
Use a mic; the background noise is irritating.
@CodeWithAarohi · 8 months ago
Noted! Thanks for the feedback.
@_Who_u_are · a month ago
Speaking in Hindi would be better.
@CodeWithAarohi · a month ago
Sorry for inconvenience
@niluthonte45 · 8 months ago
Thank you, ma'am
@CodeWithAarohi · 8 months ago
Most welcome 😊