Intuition Behind Self-Attention Mechanism in Transformer Networks

  214,854 views

Ark (ark)

1 day ago

Comments
@MinsungsLittleWorld · 10 months ago
Me - Scrolls through MSA's playlist. Also me - finds this video in the playlist, which is not even related to MSA
@ZarifaahmedNoha-ey9kh · 7 months ago
bro me too, i was just scrolling and found this
@amanlohia6399 · a year ago
Bro, this might be the best video available on the entire internet for explaining transformers. I have studied, worked on, and implemented transformers, but never have I been able to grasp it as simply and intuitively as you made it. You should really make more videos about anything you do or like: explain more algorithms and papers, implement them from scratch, and so on. Big thanks, man.
@AiDreamscape2364 · 2 months ago
Exactly 💯! I agree 💯
@totallynotreese · a year ago
anyone here from msa's playlist? edit: i left this cmt almost 2 years ago and i still get likes from it, crazy how msa still has not found out this was in their playlist
@thishmi2010 · a year ago
me
@dacine_larabe · a year ago
Same
@Irene-jj3ri · a year ago
I wonder why msa put this in their playlist..
@sjplays2 · a year ago
I am too 🤔ᴴᴹᴹᴹ
@Tasssshhhh · a year ago
Same
@zorqis · 3 years ago
This goes in my favorites list to recommend to others. You have a gift for teaching at a level rarely seen, distilling key concepts and patiently explaining every step, even from multiple angles. This teaches not only the subject, but how to think in the domain of the subject. Please use this gift as much as you can :). Respect!
@grayboywilliams · a year ago
Please do a part 2! The second half of the architecture is never covered in as much depth. Your explanation of the first half is the best I've seen.
@twyt108 · a year ago
Agreed! I've yet to find a video on the decoder that's as clear as this one.
@ednrl · a year ago
What a gem. I feel like a kid again, discovering something beautiful about the world. Teaching is certainly an art, and you are a gifted teacher
@AiDreamscape2364 · 2 months ago
Gifted teacher indeed
@hansrichter5227 · a year ago
In a landscape full of 5-to-15-minute videos where some weird dude stutters and stammers technical terms, failing both to hide his own lack of understanding of a machine learning topic and to "summarize" a vastly complex subject in a ridiculously short amount of time, you managed to explain the topic amazingly cleanly. That's how it should be done. Keep it up! Great work!
@kvazau8444 · a year ago
Despite the demand for it, competence is a rarity
@nirajabcd · 2 years ago
I am still waiting for part II. I haven't yet found a better explanation than this. The way you built the intuition for query, key, and value, which is the heart and soul of the self-attention mechanism, is impeccable.
@kks8142 · a year ago
This is good as well - kzbin.info/www/bejne/anPHlGhrn51jopo
@tvcomputer1321 · a year ago
I've been trying to wrap my head around this stuff, and between this video and ChatGPT itself explaining and answering my questions, I think I'm starting to get it. I don't think I will ever have the ability to calculate the derivatives of a loss function for gradient descent myself, though
@vmarzein · a year ago
@tvcomputer1321 usually you don't have to worry about calculating derivatives (not saying anyone shouldn't learn them), but tools such as PyTorch and TensorFlow have autograd, which does all that for you
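The exchange above can be made concrete with a small sketch. This is a hypothetical toy example (a one-parameter model y = w*x with a mean-squared-error loss, not anything from the video): the analytic derivative that autograd would produce is checked against a numerical finite-difference estimate, using only plain Python.

```python
# Toy illustration of what autograd automates: the gradient of a loss.
# Hypothetical one-parameter linear model y = w * x with MSE loss.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def loss(w):
    # mean squared error over the tiny dataset
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w):
    # analytic derivative via the chain rule: dL/dw = 2/N * sum(x * (w*x - y))
    return 2.0 / len(xs) * sum(x * (w * x - y) for x, y in zip(xs, ys))

# numerical check with a central finite difference
eps = 1e-6
w = 1.5
numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)
print(abs(numeric - grad(w)) < 1e-4)  # the two estimates agree
```

Frameworks apply exactly this chain-rule bookkeeping automatically to every parameter, which is why hand-computing derivatives is rarely needed in practice.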
@skipintro9988 · 3 years ago
Wondering why he doesn't have a million subscribers. By far the best video on self-attention.
@siddharth-gandhi · 10 months ago
The BEST source of information I've come across on the internet about the intuition behind the Q, K, and V stuff. PLEASE do part 2! You are an amazing teacher!
@hamzaarslan6490 · 3 years ago
BEST EXPLANATION I HAVE EVER SEEN ABOUT ATTENTION. Keep going, mate!
@AmerDiwan · a year ago
I've seen many videos on transformers that parrot the steps in the original paper with no explanation of why those steps matter. In contrast, this video actually gives an excellent intuitive explanation! Hands down, the best video explaining self-attention that I have seen... by a long shot.
@nirajabcd · 3 years ago
This is by far the best explanation of the attention mechanism I have seen. Finally, a true 'aha' moment with this paper. Absolutely loved it!
@CallBlofD · a year ago
Can you please do a part 2? I don't usually comment on YouTube videos, but the way you explained the intuition in the first part was the best I've seen. Thank you so much; you gave me a lot of intuition!!
@ucsdvc · 11 months ago
This is the most intuitive yet non-hand-wavy explanation of the self-attention mechanism I've seen! Thanks so much!
@kalpeshsolanki4715 · 2 years ago
I am halfway through the video and I am already in awe. I finally understood the attention mechanism. Thanks a lot.
@pravingirase6464 · a year ago
What an explanation. I read through other articles but couldn't figure out why they do what they do. But you nailed it, with explanations for everything from the dot product to the weights and, most importantly, the meaning of the query and key values. Thanks a ton!
@XhensilaPoda · a year ago
I came here after Andrej Karpathy's "building GPT from scratch" video. I have looked at many other videos, but this one explains the self-attention mechanism best. Amazing work.
@sibyjoseplathottam4828 · a year ago
This is undoubtedly one of the best and most intuitive explanations of the self-attention mechanism. Thank you very much!
@sethjchandler · 3 years ago
Thanks for a spectacularly lucid explanation of a complicated subject that tends to be explained poorly elsewhere. Fantastic!
@shankaranandahv · 2 years ago
This video is a masterpiece. I really loved it; it explains things in a very effective and simplified way. The complexities hidden behind the architecture are peeled back layer by layer. Hats off!
@basharM79 · a year ago
I haven't come across a better intuitive explanation of transformers before!! Well done!!!!
@spartacusnobu3191 · 2 years ago
Having just read the "Attention Is All You Need" paper with the intention of tackling a work problem with BERT and some specialized classification layers, your explanation here totally illuminates the self-attention mechanism component. Thanks a million times.
@wuhaipeng · a year ago
The best explanation of self-attention I've seen. Thank you so much
@michelevaccaro865 · a year ago
THANKS! Out of so many videos on the attention mechanism, this is by far the best and the most intuitive, and it explains very well how the score is calculated. THANKS!
@hubertnguyen8855 · a year ago
This is the most interesting and intuitive explanation of attention I've ever seen. Thanks
@razodactyl · a year ago
You nailed the "intuitively explained" part. Great work.
@khubaibraza8446 · 3 years ago
One of the best explanations on the internet, simply crystal clear. He should have tens of thousands of subscribers at the very least.
@idris.adebisi · a year ago
This is the best explanation I have gotten of the concept of attention in the transformer network. Thanks for this wonderful video.
@AnshulKanakia · 2 years ago
This was a fantastic video. I really hope you do the whole series on the "Attention Is All You Need" paper. It would be fantastic to cover the other parts of the architecture, as you said.
@hahustatisticalconsultancy8869 · a year ago
This is the first video I have ever seen on this deep architecture with a clear and detailed look under the hood. Thank you very much.
@pavan1006 · a year ago
I owe you a lot for this level of clear explanation of the math involved.
@bhaskarswaminathan9998 · a year ago
One of the *BEST* videos on the topic of self-attention - PERIOD!!
@rdgopal · 11 months ago
This is by far the most intuitive explanation that I have come across. Great job!
@benwillslee2713 · 5 months ago
Thanks bro, it's the BEST explanation of attention I have seen so far (and I have to say I have seen many others). Looking forward to the other parts, even though it's been almost 4 years since this part 1!
@shivangitomar5557 · a year ago
Best video on this topic!! Looking forward to one on transformers!
@RajeevSharma-c1j · a year ago
This is amazing. A very nicely explained self-attention mechanism. It seems that you are gifted with amazing teaching qualities. Thanks for sharing the information.
@kngjln · a year ago
I can't agree more. A pure display of the art of explaining complex topics in simple and complete words. Please add part 2 as you mentioned.
@AlexPunnen · a year ago
Finally, a video that talks about the learning part of transformers, which plugs a big hole in all the other videos. Great; I am finally able to understand this. Thank you
@willpearson · a year ago
So much better than defaulting to the 'key', 'query', 'value' terminology. It confused me at first, but now that I have seen this, I fully understand.
@darylgraf9460 · a year ago
I'd just like to add my voice to the chorus of praise for your teaching ability. Thank you for offering this intuition for the scaled dot-product attention architecture. It is very helpful. I hope that you'll have the time and inclination to continue providing intuition for other aspects of LLM architectures. All the best to you and yours!
@proshno · a year ago
By far the best and most intuitive introduction to the concept of self-attention I've ever found anywhere! Really looking forward to watching more of your amazing videos.
@saiyeswanth2157 · a year ago
"THE ONLY VIDEO THAT EXPLAINS SELF-ATTENTION CLEARLY"!!!! Thank you so much!!
@koladearisekola3650 · a year ago
This is the best explanation of the attention mechanism I have seen on the internet. Great job!
@ChrisHalden007 · a year ago
Excellent video. Thanks. I just can't wrap my head around how this works with sentences of different sizes.
@deepakkumarpoddar · 2 years ago
Really nice. I am going to suggest this video to people who are still in search of intuition for transformers
@prishangabora7303 · a year ago
Probably the best explanation of attention I have found here. Thank you so much. Implementing and coding these will still be a task, but at least I now have enough knowledge to know exactly what is happening under the hood of transformers.
@JonathanUllrich · a year ago
this video solved a month-long understanding problem I had with attention. thank you so much for this educational and didactic masterpiece!
@AriadnesClew82 · a year ago
I wish I could give you five thumbs up for this video. The diagrams, along with the commentary, provided the representations needed to comprehend the different aspects/steps behind multi-headed attention while not delving too deep into the weeds. This is the best video I've ever watched in terms of explaining a complex technical topic; it is like a gold standard for technical education videos in my book. Thank you.
@michaelringer5644 · 3 years ago
Pretty much the best explanation that you can find.
@Jaybearno · a year ago
Sir, you are an excellent instructor. Thank you for making this.
@youtube1o24 · a year ago
Very decent work; please make part 2 and part 3 of this series.
@RahimPasban · 11 months ago
this is one of the greatest videos I have ever watched about transformers, thank you!
@alexandertachkovskyi705 · 2 years ago
You did a great job! Please don't stop!!!
@humanshapedblob · a year ago
I hope you teach for a living, because that was amazing; so much better than everything else I've read and seen on this topic.
@wolfwalker_ · 2 years ago
Well explained. Clearer than most university online lectures. Rare Burmese talent. Looking forward to more videos.
@ruchikmishra5177 · 2 years ago
This tutorial is probably the best I have seen so far on this topic. Really appreciate it.
@nikhilanjpv8377 · a year ago
Thank you so much! I went through dozens of videos before finding this one, and I don't need any other video to understand attention anymore!
@arkaung · a year ago
This video is all you need :D
@BuddingAstroPhysicist · a year ago
Wow, this is one of the most intuitive explanations I have found of transformers. Please make the second part as well; eagerly awaiting. Thanks a lot for this. :)
@MrFurano · a year ago
I have watched many videos explaining deep learning concepts. This one is without doubt one of the best. Keep up the great work! You have just earned another subscriber.
@brenorb · a year ago
I've been looking for explanations like this one for a long time! Please continue this work. Great job!
@aayushsaxena1316 · a year ago
Best video describing the attention mechanism in transformers so far!!
@RM-bv6dl · a year ago
The best explanation of this topic, and probably one of the best explanations of a complicated topic in general. Hats off to you, sir.
@yingguo3683 · a year ago
This is the only video that makes me feel like I understand, at least some part of it. Please make parts 2, 3...
@Phobos221B · a year ago
Please make a 2nd part. This is the most detailed and simple explanation I have seen of multi-head attention and its intuition
@paveltolmachev1898 · 9 months ago
By the way, I have watched about 10 videos on attention; this is the best video so far. Trust me
@raul825able · a year ago
Such a complex topic explained so effortlessly. Thanks, man!!!
@punitpatel5565 · a year ago
this guy has nailed down how to explain a complex subject in easy terms. I liked the RASA series, but this video is so easy to understand. This video has achieved the level of 3Blue1Brown.
@srinivasansubramaniam5876 · 4 years ago
One of the best introductory explanations
@tribaldose · 2 years ago
This explanation deserves at least 1 million views. Amazing! THANKS FOR IT
@zaidengokhool8085 · a year ago
Beautiful! Congrats to Ark; this video is wonderful. I've read many papers and seen different videos, but this one is a head above the rest in explaining each component AND the intuitions about WHY we use them, which is the part often skipped by other videos that just cover structure and formulas and miss the big-picture simplicity of what each component is for. Please keep up this good work!
@bandaralrooqi5459 · 3 years ago
I have seen many vids explaining self-attention already, but this one is the best. Thank you :)
@albertlee9592 · 2 years ago
Oh my God. I finally understand how the transformer works now. Thank you so much for this amazing tutorial
@173_sabbirhossain9 · a year ago
You are great, you are amazing at teaching, and you totally know how to teach. Really appreciable.
@trin1721 · 2 years ago
Dude, you need to make more videos. You have a gift. If you do a full series on some key deep learning concepts and things take off, you could have a very lucrative channel with a lot of social good
@rollingstone1784 · 8 months ago
@arkaung, @ark_aung: there is an error at 13:00. s_1 is a row vector, so it should be written in bold (just like v_1); s_1 represents the first row in the histogram. Its components s_11, ..., s_1n (the small boxes in the histogram) are scalars (normal font).
14:00: again, s_1 is a vector.
15:00: the weights w_i are vectors as well.
17:45: the y_i are vectors.
20:50: maybe matrix notation would help here. V, the set of all vectors v_i, is a matrix of dimension 3x50. The vector v_2 has dimension 1x50. The matrix multiplication v_2 * V^T gives (1x50)*(50x3) = s_2 (1x3). Normalization gives w_2 (1x3). The matrix multiplication y_2 = w_2 * V gives (1x3)*(3x50) = (1x50).
Remark: it should be noted that the last step is a "right multiplication" (matrix times vector), so in matrix notation it is V^T * w_2^T, resulting in a vector y_2^T of dimension (50x3)*(3x1) = 50x1. Transposing this vector gives y_2 (1x50).
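The dimension walk-through in the comment above can be checked with a short sketch in plain Python. The sizes here are hypothetical toy values (n = 3 vectors of dimension d = 4 rather than the video's 50, and made-up numbers); the shape bookkeeping is identical.

```python
import math

# Toy check of the shapes in the comment above: n vectors of dimension d.
n, d = 3, 4
V = [[0.1 * (i + j) for j in range(d)] for i in range(n)]  # n x d matrix, one vector per row

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# scores for the 2nd vector: s_2 = v_2 * V^T -> (1 x d)*(d x n) = (1 x n)
v2 = V[1]
s2 = [dot(v2, V[i]) for i in range(n)]

# normalization (softmax) -> weights w_2, shape (1 x n), summing to 1
m = max(s2)
e = [math.exp(s - m) for s in s2]
w2 = [x / sum(e) for x in e]

# weighted combination of rows: y_2 = w_2 * V -> (1 x n)*(n x d) = (1 x d)
y2 = [sum(w2[i] * V[i][j] for i in range(n)) for j in range(d)]

print(len(s2), len(w2), len(y2))  # n, n, d
```

The same arithmetic, written as V^T * w_2^T, produces the transposed column vector y_2^T of shape (d x 1), matching the remark above.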
@gemini_537 · a year ago
This is absolutely the best explanation of the self-attention mechanism! Keep up the great work!
@vaibhavhiwase5462 · a year ago
Hey!! This is by far the best explanation. Please create a series.
@zahrafayyaz9539 · 2 years ago
Amazing video. Please make part 2. That explanation saved me a lot of time and head-scratching
@anujlahoty8022 · 2 years ago
Each and every single word of this video is a pearl, even the stop words (just kidding). I would encourage everyone to view this video. The best explanation ever.
@mytechnotalent · a year ago
Finally, someone who actually explains this with a real, functional example. Thank you!
@TTTrouble · a year ago
This was absolutely one of the better explanations I've come across, and I feel like I've watched a hundred different videos explaining attention. Thanks so much for putting in the time to make it; I look forward to the next one if you can get around to it!
@Alex-uc8co · a year ago
Awesome video! This is really the best video on the topic. I truly hope you will make the second video asap.
@kristianmamforte4129 · a year ago
the best explanation of transformers on all of YouTube, indeed!
@YorukaValorant · 2 years ago
Thank you; after 2 hours of looking for the right explanation, I finally found it. Thank you. Now I understand how self-attention and attention work.
@millenniumbismay382 · 2 years ago
Truly the most awesome explanation. You made sure "attention is all you get"! Waiting for more videos... Cheers!
@anjumanoj2131 · 2 years ago
Superb explanation... I have watched other videos about self-attention, but this video stands out... thanks for making it
@osamutsuchiyatokyo · a year ago
I believe it is at least one of the clearest presentations of multi-head attention.
@Predre_Amrna · a year ago
Wait! This video of yours is in the playlist of MSA (a huge YouTube channel) 🤯🤯🤯 I found it from there
@binig.4591 · a year ago
We need part 2 of this video. Good job
@jaiminjariwala5 · a year ago
Thank you so much, sir. You genuinely explained this much better than any other video I have seen on the attention mechanism in transformers!
@NadaaTaiyab · a year ago
This is truly the best video I've seen on this topic. Thank you so much. And please make more videos for us!
@AbubakerMahmoudshangab · a year ago
The first time I have understood transformer self-attention. A million thanks, bro
@DanteNoguez · 2 years ago
Wow, you're the only one who has managed to make it truly simple. Thanks a lot for this!
@ababoo99 · a year ago
What an excellent explanation. Thank you. I really like how you carefully trace the meaning and structure of each term through the process.
@programmingwithmangaliso · a year ago
This is perhaps the best I have seen. Elegant!
@gihan_liyanage · 2 years ago
Best explanation of the self-attention mechanism on the internet. Please explain the other concepts in the paper if possible. Thanks for the intuition!
@emcpadden · 2 years ago
Great explanation!!! This really helped. I would love to see part 2!
@shyjukt · a year ago
Absolutely the best explanation of the self-attention mechanism I have come across! Keep up the great work!
@digitalorca · a year ago
Excellent. The best concise explanation I've seen to date.