The guy in the video not only understands the concept, but also understands what others' understanding might lack, so he can fill in the gaps.
@joliver19814 жыл бұрын
I have watched tons of videos and finally an original video that actually teaches these concepts. There are so many YouTubers that simply make a video regurgitating something they read somewhere, but they don't really teach anything because they themselves don't really understand the idea. Bravo. Well done. I actually learned something. Thank you!
@RasaHQ4 жыл бұрын
(Vincent here) I just wanted to mention that I certainly sympathize. It's hard to find proper originality out there when it comes to teaching data science.
@WIFI-nf4tg3 жыл бұрын
@@RasaHQ Hi Rasa, can you also explain "how" we should express words as numbers for the vector v? For example, is there a preferred word embedding?
@RasaHQ3 жыл бұрын
@@WIFI-nf4tg Rasa doesn't prefer a particular word embedding, but a common choice is spaCy. Note that, technically, count vectors are also part of the feature space that goes into DIET.
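For anyone who wants to see what per-token vectors look like in practice, here is a minimal sketch using spaCy. It assumes the `en_core_web_md` model (which ships with word vectors) has been installed; the featurizers Rasa actually uses in a pipeline may differ.

```python
import spacy

# Assumption: the model was installed with `python -m spacy download en_core_web_md`
nlp = spacy.load("en_core_web_md")

doc = nlp("bank of the river")
for token in doc:
    # each token gets a fixed, context-independent embedding vector
    print(token.text, token.vector.shape)
```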
@briancase6180 Жыл бұрын
Yeah, and some of those videos use a script generated by an AI that "read" the relevant sections of a paper. Get used to having to wade through tons of AI-generated content. We need laws that require AI-generated content to be labeled as such. But it's probably unenforceable. How much original content is enough to avoid the "AI-generated" label? ☺️
@homeboundrecords6955 Жыл бұрын
TOTALLY agree, most 'education' vids really add to confusion and just regurgitate jargon over and over
@edouardthomasset6683 Жыл бұрын
When the student understands the teacher, it means the teacher understood what he explains. I understood everything, unlike with the majority of other YouTubers on the same topic. Thanks!
@MrLazini10 ай бұрын
I love how you use different colors to represent different dynamics between relationships. Such a simple idea, yet so good at conveying meaning
@guanxi994 жыл бұрын
After dozens of papers and videos studied, this is the first one that really made me understand the concept. Many thanks for that!!! It also highlighted one fact for me: self-attention is a smart idea, but the real magic sauce is the way the word embeddings are created. That decides whether the contexts created by self-attention make sense or not.
@deoabhijit59353 жыл бұрын
agree
@davidlanday26473 жыл бұрын
A good thought! That makes me wonder if there are metrics we could add to a loss function to assess how well a mechanism "attends" to words in a sentence. Like, if we look at the embedding space, we would probably want to see contextually proximal/similar words cluster close together. So I guess, some metric to assess how well an attention mechanism captures all contexts.
@parryhotter18 Жыл бұрын
This. If a bit late 😊. Yes, the creation of an embedding, i.e. the creation of a vector for each word, seems to be the main storage of semantics for each word. This video IS the best I have seen so far in that he always explains firstly WHY and then HOW the next step works. Great approach!
@stanislawcronberg3271 Жыл бұрын
Only 4 minutes in and I can tell this series will be a banger. Don't stop teaching, this is pure quality content, much appreciated!
@gomogovo4966 Жыл бұрын
I've been looking for a clear explanation for so so so long. First one I've found. I think all the people that made explanatory videos so far have zero understanding of the subject. Thank you.
@alirezakashani30922 жыл бұрын
mind blowing how simple self-attention is explained - thank you
@DaveJ6515 Жыл бұрын
Yes sir, this is it. You have nailed it: not only do you know the subject; you also know the art of creating the conditions for everyone else to get into it gradually and logically. Great.
@rommeltito1233 жыл бұрын
I had so many doubts about the actual operation that happens in self attention. This video just cleared it. Excellent delivery in such a short time.
@fallingintofilm3 жыл бұрын
This was absolutely eye-opening. Congratulations sir! You win the Internet for a while.
@SiliconValleyRunner4 жыл бұрын
Best ever explanation of "self-attention". Awesome job.
@foxwalks5883 жыл бұрын
This is the best explanation of the attention mechanism so far for a regular person like me! I came here after going through the Coursera NLP specialization and several papers, but only now am I actually able to see how it works. Seems like the embeddings themselves are the secret sauce indeed. Thank you.
@azurewang4 жыл бұрын
The most intuitive explanation I have ever seen!!! Excellent drawing and accent.
@mohammedmaamari92102 жыл бұрын
The clearest explanation of attention mechanisms I've ever seen. Thank you
@ferneutron3 жыл бұрын
Thank you so much for your explanation! When you said: "This is known as SELF ATTENTION". I just thought: BAM! Awesome job Rasa!
@thongnguyen12924 жыл бұрын
I've read dozens of papers and blog posts about this topic, but all they did was mostly walk through the math without giving any intuition. This video is the best I've ever seen, thank you very much!
@binishjoshi11264 жыл бұрын
I've known self-attention for some time; this is by far the most intuitive video I've ever seen, thank you.
@seanh15912 жыл бұрын
This is the best explanation of Self-Attention mechanism I've encountered after combing through the internet! Thank you!
@timohear3 жыл бұрын
Can't believe I've only stumbled upon this now. Fantastic original explanation.
@johnhutton54913 жыл бұрын
Since there isn't a like/dislike ratio anymore, for those wondering, this video is great
@avinashpaul16654 жыл бұрын
One of the best examples on the web explaining the attention mechanism. After reading many blogs I still had my doubts; the way attention is explained for both time series and text data is brilliant and helped me understand better.
@mmpcse4 жыл бұрын
Have gone through some 10-12 videos on self-attention. This Attention Series 1, 2 & 3 is by FAR THE BEST EVER. Many thanks for these videos. [came back and updated this comment ;-) ]
@magpieradio3 жыл бұрын
This is the best video I have seen so far as to explain things so clearly. Well done.
@brianyoon8152 жыл бұрын
This is incredible. After like 10 attention explanations, I finally get it here.
@vijayabhaskar-j4 жыл бұрын
This attention series is the clearest and most intuitive explanation of self-attention out there! Great work!
@trantandat26993 жыл бұрын
I have read a lot about this: papers, Medium, videos; this one gave me the best understanding. Very nice!
@galenw6833 Жыл бұрын
At 11:29, the presenter says "cross product", but I think it's the dot product, so that each of the weights (W_11, etc.) is a number (otherwise, using the cross product, they would be vectors). Thus we can build a new vector from W_11, W_12, ... Great videos, exactly what I was looking for.
@timholdsworth13 жыл бұрын
Why did you use cross product at 11:31? Wouldn't that be making the weights small when the word embedding vectors are similar, which would then mean the related words in the sequence would be unable to influence the current state?
@Erosis2 жыл бұрын
I think he meant dot product? I don't know.
@briancase61803 жыл бұрын
OMG, this helped me immeasurably. Thanks so much. I just couldn't quite get it from the other explanations I've seen. Now I can go back and probably understand them better. Yay!
@tatiana7581 Жыл бұрын
Thank you sooo much for this video! Finally, someone explained what self-attention is!
@akashmalhotra47873 жыл бұрын
This is really an amazing explanation! Liked how you build up from time-series and go to text. Keep up the good work :)
@blochspin2 жыл бұрын
best video hands down on the self attention mechanism. Thank you!!!
@TheGroundskeeper Жыл бұрын
Still the best explanation 3 years later
@sowmiya_rocker2 жыл бұрын
Beautiful explanation sir. I'm not sure if I got it all, but I can tell you that I got a better idea of self-attention from your video compared to the other ones I watched. Thanks a lot 🙏
@mokhtarawwad6291 Жыл бұрын
Thanks for sharing. I watched based on a recommendation from a friend on Facebook; I will watch the whole playlist. God bless you 🙏 😊
@dinoscheidt3 жыл бұрын
Love the style. The more talent takes the time to teach new talent, the better. Very appealing style! Subscribed 🦾
@DrJohnnyStalker3 жыл бұрын
Best Self Attention Intuition i have ever seen. Andrew Ng Level stuff!
@benjaminticknor29673 жыл бұрын
Incredible video! Did you mean to say dot product instead of cross product at 11:30?
@ParniaSh3 жыл бұрын
Yes, I think so
@hiteshnagothu8874 жыл бұрын
Never have I ever seen such a great concept explanation. You just made my life easier, @Vincent!!
@dan10400 Жыл бұрын
This is an exceptionally good explanation! Thank you so much. It is easy to see why the thumbs-up count is so high wrt views.
@simranjoharle4220 Жыл бұрын
This is the best explanation of the topic I have come across! Thanks!
@pi5549 Жыл бұрын
Your whiteboarding is beautiful. How are you doing it? I'd love to be able to present in this manner for my students.
@oritcoh4 жыл бұрын
Best Attention explanation, by far.
@Anushkumar-lq6hv Жыл бұрын
The best video on self-attention. No debates
@skauddy755 Жыл бұрын
By far the most intuitive explanation of self-attention. However, DISAPPOINTED with the number of likes :(
@sebastianp40233 жыл бұрын
Please link this video in the TF docs. I tried for a whole day to get my head around the concept of attention, and this explanation is just beautiful!
@luisvasquez5015 Жыл бұрын
Finally somebody explicitly saying that the distributional hypothesis makes no linguistic sense
@siemdual80263 жыл бұрын
This video is the KEY for my QUERY! Pun intended. Thank you so much!
@maker724602 жыл бұрын
Awesome explanation! It takes great skill to explain such concepts. Looking forward to more!
@uniqueaakash142 жыл бұрын
Best video I have found on self-attention.
@jhumdas46132 жыл бұрын
Amazing explanation!! The best I have come across to date. Thank you so much!
@giyutamioka94372 жыл бұрын
Best explanation I have seen so far.... Thanks!
@pranjalchaubey4 жыл бұрын
This is one of the best videos on the topic, if not the best!
@adrianramirez97292 жыл бұрын
Amazing explanation! I didn't find too much sense in the comparison with time series, but the second part was really good :)
@suewhooo73903 жыл бұрын
Best explanation of the attention mechanism out there!! Thanks a lot!
@punk39007 ай бұрын
The best explanations you can get in the world. Thanks! BTW, were you aware at the time of making these videos that transformers would be so revolutionary?
@Tigriszx2 жыл бұрын
SOTA explanation. That's exactly what I was looking for. [tr] If anyone is reading this: follow this guy, his explanations are legendary.
@ArabicCompetitiveProgramming4 жыл бұрын
Great series about attention!
@andyandurkar78142 жыл бұрын
A very simple explanation .. the best one!
@shivani404sheth43 жыл бұрын
This was so interesting! Thank you for this amazing video.
@louiseti48832 жыл бұрын
Great stuff in here. Super clear and efficient for beginners! Thanks
@timmat4 ай бұрын
Hi. This is a really great visualisation of weightings - thank you! I have a question though: at 11:30 you say you're going to calculate the cross product between the first token's vector and all the other vectors. Should this instead be the dot product, given that you are looking for similarity?
@QuangNguyen-jz5nl3 жыл бұрын
Thank you for sharing, great tutorial, looking forward to watching more and more great ones.
@hanimahdi72443 жыл бұрын
Thanks a lot! Really amazing, awesome and very clear explanation.
@devanshamin55544 жыл бұрын
Very informative and simple explanation of a complicated topic. 👍🏻
@fadop3156 Жыл бұрын
11:28 Is it really the cross product, and not the dot product, of the vectors?
@alexanderskusnov5119 Жыл бұрын
To filter (in signals (low-frequency filtering) and in programming (a filter predicate over a vector)) means to keep, not to throw away.
@vikramsandu60542 жыл бұрын
Loved it. Very clear explanation.
@zzzyout Жыл бұрын
11:25 stumble? Cross product or dot product?
@bootagain4 жыл бұрын
Thank you for posting this educational and useful video. Though I can't understand everything yet, I'll keep watching the rest of the series and trying to understand :) I mean it.
@norhanahmed51164 жыл бұрын
Thanks a lot, that was very simple and useful. Wishing all the best for you.
@zeroheisenburg34803 жыл бұрын
At 11:33, do you mean dot product instead of cross product? If it's the cross product, wouldn't W11*V1 be 0 since they are perpendicular?
@RasaHQ3 жыл бұрын
(Vincent here) In general w_ij and v_k are not perpendicular. But you are correct that the multiplications here could be written more explicitly as a dot product.
@zeroheisenburg34803 жыл бұрын
@@RasaHQ Appreciate the reply. I brought it up since the video "verbally" said it was doing a cross product. So each w_ij value should be a scalar in this case? Thanks for the clarification.
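To make the exchange above concrete, here is a minimal NumPy sketch of the parameter-free self-attention the video describes: each weight w_ij is the scalar dot product v_i · v_j, and each output y_i is a weighted sum of the token vectors. The row-wise softmax normalization is an assumption for illustration; the exact normalization used in the video may differ.

```python
import numpy as np

def simple_self_attention(V):
    """Parameter-free self-attention over token vectors V (shape: n_tokens x dim)."""
    W = V @ V.T  # scalar weights w_ij = v_i . v_j (dot products, not cross products)
    W = np.exp(W) / np.exp(W).sum(axis=1, keepdims=True)  # row-wise softmax (assumed normalization)
    Y = W @ V    # y_i = sum_j w_ij * v_j, a context-mixed vector per token
    return W, Y

# toy example: 4 tokens with 3-dimensional embeddings
V = np.random.rand(4, 3)
W, Y = simple_self_attention(V)
print(W.shape, Y.shape)  # (4, 4) (4, 3)
```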
@ashh30513 жыл бұрын
You are a great teacher. Thanks for this content.
@sgt3913 жыл бұрын
Crazy useful video!
@sachinshelar88103 жыл бұрын
amazing stuff . Thanks so much Rasa Team :)
@mohammadelghandour16142 жыл бұрын
Thanks for the easy and thorough explanation. I just have one question: how is "Y" now more representative or useful (more context) than "V"? Can you give an example?
@kevind.shabahang4 жыл бұрын
Excellent introduction
@raunakkbanerjee9016 Жыл бұрын
at 11:28 you say cross product but do you mean dot product?
@williamstorey5024 Жыл бұрын
What is the reweighting method that you used in the beginning? I would like to look that up and get more details on it.
@RaoBlackWellizedArman Жыл бұрын
Fantastic explanations ^_^ Already subscribed!
@yacinerouizi8444 жыл бұрын
Thank you for sharing, great tutorial!
@gainai_r273 жыл бұрын
this is awesome. what tool do you use to create this whiteboard?
@mohajeramir3 жыл бұрын
this was an excellent explanation. Thank you
@vulinhle83434 жыл бұрын
amazing video, thank you very much
@ashokkumarj5944 жыл бұрын
I love your tutorial 😙😙 Best explanation
@clivefernandes54354 жыл бұрын
So this is different from the one where we use a feedforward network, right? (The one used by Bahdanau.)
@subhamkundu5043 Жыл бұрын
I have a query: in the video there is the sentence "Bank of the river"; now suppose there is another sentence, "I love youtube videos a lot". Here the number of words is larger, so does the number of words matter?
@jmarcio513 жыл бұрын
I got the idea, thanks for the explanation.
@kumardeepankar2 жыл бұрын
@Rasa Wouldn't the curve go up and down instead of increasing continuously?
@Deddiward2 жыл бұрын
Wow this video is so well done
@SubhamKumar-eg1pw4 жыл бұрын
But in general the weights are trained right?
@krishnachauhan28504 жыл бұрын
First time I'm seriously getting the intuition of attention... but sir, I am confused: are people using this only in speech analysis, since it's time series data, like you introduced?
@roncahlon Жыл бұрын
Why do you need to multiply by the word vectors a second time? I.e., why couldn't you just say Y1 = w11 + w12 + w13 + w14? What is the value of having it normalized?
@TimKaseyMythHealer Жыл бұрын
Trying to wrap my brain around LLM processing. It would be great if someone were to create a 3D flow chart of all layers, all attention heads. Zooming into each section as a single word and/or sentence is being processed.
@ishishir3 жыл бұрын
Brilliant explanation
@23232323rdurian Жыл бұрын
The content of the word vectors is all the OTHER words seen to statistically co-occur in corpora, weighted by their frequencies. Stopwords [the, a, is] are so frequent that they don't contribute much topic/semantics, while content words are less frequent and so contribute more. The word vector ('meaning') for CAT is just the N words most frequently observed near CAT in corpora, discounted for frequency. That works great for cases like [king, queen] because they occur in similar contexts in corpora, but not for [Noah, cat], because that association is peculiar/local to this instance, and also not for co-references [cat, she], which are harder to resolve: you have to keep a STORY context, where presumably you might have already seen some reference to ...... And for the co-reference, well, they're just harder to resolve, though in this example 'she' HAS to resolve to either Noa or the cat, because those are the ONLY choices, and by chance (we assume) all three co-refer. ==> After all, there's a legit chance that 'she' isn't the cat in the example, but the cat's MOM, who can be an ANNOYING MOM, yet nevertheless Noa is still a great cat.....
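A toy illustration of the distributional idea described in the comment above: build word vectors from raw co-occurrence counts and compare them with cosine similarity. The tiny corpus and the sentence-level "window" here are made up for illustration; real embeddings (word2vec, GloVe, spaCy vectors) are learned rather than raw counts.

```python
import numpy as np
from itertools import combinations

corpus = [
    "the king rules the realm".split(),
    "the queen rules the realm".split(),
    "noa the cat sleeps".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

# count co-occurrences within each sentence (a crude context window)
for sent in corpus:
    for a, b in combinations(sent, 2):
        counts[idx[a], idx[b]] += 1
        counts[idx[b], idx[a]] += 1

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

print(cosine(counts[idx["king"]], counts[idx["queen"]]))  # higher: similar contexts
print(cosine(counts[idx["king"]], counts[idx["cat"]]))    # lower: different contexts
```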