The guy in the video not only understands the concept, he also understands what might be missing in other people's understanding, so he can fill in the gaps.
@joliver1981 4 years ago
I have watched tons of videos and finally an original video that actually teaches these concepts. There are so many YouTubers who simply make a video regurgitating something they read somewhere, but they don't really teach anything because they themselves don't really understand the idea. Bravo. Well done. I actually learned something. Thank you!
@RasaHQ 4 years ago
(Vincent here) I just wanted to mention that I certainly sympathize. It's hard to find proper originality out there when it comes to teaching data science.
@WIFI-nf4tg 3 years ago
@@RasaHQ Hi Rasa, can you also explain "how" we should turn words into numbers for the vector v? For example, is there a preferred word embedding?
@RasaHQ 3 years ago
@@WIFI-nf4tg Rasa doesn't prefer a particular word embedding, but a common choice is spaCy. Note that, technically, count vectors are also part of the feature space that goes into DIET.
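For anyone who wants to try this, here is a minimal sketch of both options mentioned above (dense spaCy vectors and sparse count vectors). It assumes the `en_core_web_md` model is installed and is only an illustration, not Rasa's actual featurization pipeline:

```python
import spacy
from sklearn.feature_extraction.text import CountVectorizer

# Dense per-token vectors from spaCy's pretrained embeddings
nlp = spacy.load("en_core_web_md")         # assumes this model has been downloaded
doc = nlp("bank of the river")
vectors = [token.vector for token in doc]  # one vector v per token
print(vectors[0].shape)                    # (300,) for this model

# Sparse bag-of-words counts, in the spirit of the count vectors that go into DIET
bow = CountVectorizer().fit_transform(["bank of the river"])
print(bow.toarray())
```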
@briancase6180 a year ago
Yeah, and some of those videos use a script generated by an AI that "read" the relevant sections of a paper. Get used to having to wade through tons of AI-generated content. We need laws that require AI-generated content to be labeled as such. But it's probably unenforceable. How much original content is enough to avoid the "AI-generated" label? ☺️
@homeboundrecords6955 a year ago
TOTALLY agree. Most 'education' vids really add to the confusion and just regurgitate jargon over and over.
@guanxi99 4 years ago
After dozens of papers and videos studied, this is the first one that really made me understand the context. Many thanks for that!!! It also highlighted one fact for me: self-attention is a smart idea, but the real magic sauce is the way the word embeddings are created. That decides whether the contexts created by self-attention make sense or not.
@deoabhijit5935 3 years ago
agree
@davidlanday2647 3 years ago
A good thought! That makes me wonder if there are metrics we can add to a loss function to assess how well a mechanism "attends" to words in a sentence. Like, if we look at the embedding space, we would probably want to see words that are contextually proximal/similar and cluster close together. So I guess, some metric to assess how well an attention mechanism captures all contexts.
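To make that concrete, here is a rough sketch (with made-up toy embeddings, not an established loss term) of the kind of contextual-similarity score such a metric could build on, using cosine similarity:

```python
import numpy as np

def cosine(u, v):
    # cosine similarity: 1.0 for identical directions, near 0.0 for unrelated ones
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

river = np.array([0.9, 0.1, 0.0])  # toy embeddings, purely illustrative
bank  = np.array([0.8, 0.3, 0.1])
piano = np.array([0.0, 0.2, 0.9])

print(cosine(river, bank))   # high -> we'd expect these tokens to attend to each other
print(cosine(river, piano))  # low  -> weak attention between them
```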
@parryhotter18 a year ago
This. If a bit late 😊. Yes, the creation of an embedding, i.e. the creation of a vector for each word, seems to be the main storage of each word's semantics. This video IS the best I have seen so far, in that he always explains WHY first and then HOW the next step works. Great approach!
@edouardthomasset6683 a year ago
When the student understands the teacher, it means the teacher understood what he was explaining. I understood everything, contrary to the majority of other YouTubers on the same topic. Thanks!
@MrLazini 11 months ago
I love how you use different colors to represent different dynamics between relationships. Such a simple idea, yet so good at conveying meaning
@stanislawcronberg3271 2 years ago
Only 4 minutes in and I can tell this series will be a banger. Don't stop teaching, this is pure quality content, much appreciated!
@alirezakashani3092 2 years ago
mind blowing how simple self-attention is explained - thank you
@gomogovo4966 a year ago
I've been looking for a clear explanation for so so so long. This is the first one I've found. I think all the people who made explanatory videos so far have 0 understanding of the subject. Thank you.
@azurewang 4 years ago
The most intuitive explanation I have ever seen!!! Excellent drawing and accent.
@SiliconValleyRunner 4 years ago
Best ever explanation of "self-attention". Awesome job.
@DaveJ6515 a year ago
Yes sir, this is it. You have nailed it: not only do you know the subject; you also know the art of creating the conditions for everyone else to get into it gradually and logically. Great.
@mohammedmaamari9210 2 years ago
The clearest explanation of attention mechanisms I've ever seen. Thank you
@timohear 3 years ago
Can't believe I've only stumbled upon this now. Fantastic original explanation.
@fallingintofilm 4 years ago
This was absolutely eye-opening. Congratulations sir! You win the Internet for a while.
@ferneutron 3 years ago
Thank you so much for your explanation! When you said: "This is known as SELF ATTENTION". I just thought: BAM! Awesome job Rasa!
@rommeltito123 4 years ago
I had so many doubts about the actual operations that happen in self-attention. This video just cleared them up. Excellent delivery in such a short time.
@brianyoon815 3 years ago
This is incredible. After like 10 attention explanations, I finally get it here.
@foxwalks588 3 years ago
This is the best explanation of attention mechanism so far for a regular person like me! I came here after going through Coursera NLP spec and several papers, but only now I am actually able to see how that works. Seems like embeddings themselves are the secret sauce indeed. Thank you.
@magpieradio 3 years ago
This is the best video I have seen so far as to explain things so clearly. Well done.
@seanh1591 2 years ago
This is the best explanation of Self-Attention mechanism I've encountered after combing through the internet! Thank you!
@binishjoshi1126 4 years ago
I've known about self-attention for some time; this is by far the most intuitive video I've ever seen, thank you.
@blochspin 3 years ago
best video hands down on the self attention mechanism. Thank you!!!
@thongnguyen1292 4 years ago
I've read dozens of papers and blog posts about this topic, but all they mostly did was walk through the math without showing any intuition. This video is the best I've ever seen, thank you very much!
@simranjoharle4220 a year ago
This is the best explanation of the topic I have come across! Thanks!
@mokhtarawwad6291 2 years ago
Thanks for sharing. I watched based on a recommendation from a friend on Facebook, and I will watch the whole playlist. Thanks for sharing, God bless you 🙏 😊
@galenw6833 a year ago
At 11:29, the presenter says "cross product", but I think it's the dot product, so that each of the weights (W_11, etc.) is a number (with a cross product they would be vectors). Thus we can build a new vector from W_11, W_12, ... Great videos, exactly what I was looking for.
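A tiny numeric sketch of that point (toy 2-D embeddings, my own numbers, not the video's): with a dot product, each weight W_1j comes out as a single scalar.

```python
import numpy as np

v = np.array([
    [1.0, 0.0],   # token 1
    [0.9, 0.1],   # token 2, similar to token 1
    [0.0, 1.0],   # token 3, unrelated to token 1
])

w_1 = v @ v[0]    # dot products of token 1 with every token: [W_11, W_12, W_13]
print(w_1)        # [1.  0.9 0. ] -- plain numbers that can be normalized and reused
```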
@DrJohnnyStalker 3 years ago
Best self-attention intuition I have ever seen. Andrew Ng-level stuff!
@johnhutton5491 3 years ago
Since there isn't a like/dislike ratio anymore, for those wondering, this video is great
@dinoscheidt 3 years ago
Love the style. The more talent takes the time to teach new talent, the better. Very appealing style! Subscribed 🦾
@trantandat2699 3 years ago
I have read a lot about this: papers, Medium posts, videos; this one gave me the best understanding. Very nice!
@mmpcse 4 years ago
Have gone through some 10-12 videos on self-attention. This attention series (parts 1, 2 & 3) is by FAR THE BEST EVER. Many thanks for these videos. [came back and updated this comment ;-) ]
@avinashpaul1665 4 years ago
One of the best examples on the web that explain the attention mechanism. After reading many blogs I still had my doubts; the way attention is explained for both time series and text data is brilliant and helped me understand better.
@tatiana7581 a year ago
Thank you sooo much for this video! Finally, someone explained what self-attention is!
@briancase6180 3 years ago
OMG, this helped me immeasurably. Thanks so much. I just couldn't quite get it from the other explanations I've seen. Now I can go back and probably understand them better. Yay!
@akashmalhotra4787 3 years ago
This is really an amazing explanation! Liked how you build up from time-series and go to text. Keep up the good work :)
@oritcoh 4 years ago
Best Attention explanation, by far.
@vijayabhaskar-j 4 years ago
This attention series is the clearest and most intuitive explanation of self-attention out there! Great work!
@sowmiya_rocker 2 years ago
Beautiful explanation sir. I'm not sure if I got it all, but I can tell you that I've got a better idea of self-attention from your video compared to the other ones I watched. Thanks a lot 🙏
@dan10400 a year ago
This is an exceptionally good explanation! Thank you so much. It is easy to see why the thumbs-up count is so high wrt views.
@giyutamioka9437 2 years ago
Best explanation I have seen so far.... Thanks!
@hiteshnagothu887 4 years ago
Never have I ever seen such a great concept explanation. You just made my life easier,@Vincent!!
@punk3900 9 months ago
The best explanations you can get in the world. Thanks! BTW, were you aware at the time of making these videos that transformers would be so revolutionary?
@pranjalchaubey 4 years ago
This is one of the best videos on the topic, if not the best!
@TheGroundskeeper a year ago
Still the best explanation 3 years later
@uniqueaakash14 2 years ago
Best video I have found on self-attention.
@jhumdas4613 3 years ago
Amazing explanation!! The best I have come across to date. Thank you so much!
@benjaminticknor2967 3 years ago
Incredible video! Did you mean to say dot product instead of cross product at 11:30?
@ParniaSh 3 years ago
Yes, I think so
@andyandurkar7814 2 years ago
A very simple explanation ... the best one!
@ArabicCompetitiveProgramming 4 years ago
Great series about attention!
@suewhooo7390 3 years ago
Best explanation of the attention mechanism out there!! Thanks a lot!
@Anushkumar-lq6hv a year ago
The best video on self-attention. No debates
@maker72460 2 years ago
Awesome explanation! It takes great skill to explain such concepts. Looking forward to more!
@sebastianp4023 3 years ago
Please link this video in the TensorFlow docs. I spent a whole day trying to get my head around the concept of attention, and this explanation is just beautiful!
@pi5549 a year ago
Your whiteboarding is beautiful. How are you doing it? I'd love to be able to present in this manner for my students.
@luisvasquez5015 2 years ago
Finally somebody explicitly saying that the distributional hypothesis makes no linguistic sense
@shivani404sheth4 4 years ago
This was so interesting! Thank you for this amazing video.
@adrianramirez9729 2 years ago
Amazing explanation! I didn't find the comparison with time series all that illuminating, but the second part was really good :)
@vikramsandu6054 2 years ago
Loved it. Very clear explanation.
@sgt391 3 years ago
Crazy useful video!
@louiseti4883 2 years ago
Great stuff in here. Super clear and efficient for beginners! Thanks
@timholdsworth1 3 years ago
Why did you use the cross product at 11:31? Wouldn't that make the weights small when the word embedding vectors are similar, which would then mean the related words in the sequence would be unable to influence the current state?
@Erosis 2 years ago
I think he meant dot product? I don't know.
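For what it's worth, with the usual dot-product formulation (a sketch with toy vectors, not the video's exact numbers), similar embeddings get larger weights after the softmax, so related words influence the output more, not less:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

v1 = np.array([1.0, 0.0])   # current token
v2 = np.array([0.9, 0.1])   # similar token
v3 = np.array([0.0, 1.0])   # unrelated token

scores = np.array([v1 @ v1, v1 @ v2, v1 @ v3])  # dot products, all scalars
print(softmax(scores))  # the similar token gets a larger share than the unrelated one
```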
@devanshamin5554 4 years ago
Very informative and simple explanation of a complicated topic. 👍🏻
@skauddy755 a year ago
By far, the most intuitive explanation of self-attention. Disappointed, however, with the number of likes :(
@sachinshelar8810 3 years ago
Amazing stuff. Thanks so much, Rasa team :)
@QuangNguyen-jz5nl 3 years ago
Thank you for sharing, great tutorial, looking forward to watching more and more great ones.
@ashh3051 3 years ago
You are a great teacher. Thanks for this content.
@siemdual8026 3 years ago
This video is the KEY for my QUERY! Pun intended. Thank you so much!
@hanimahdi7244 3 years ago
Thanks a lot! Really amazing, awesome and very clear explanation.
@distrologic2925 a year ago
love the format
@kevind.shabahang 4 years ago
Excellent introduction
@Tigriszx 3 years ago
SOTA explanation. That's exactly what I was looking for. [tr] If anyone is reading this, follow this guy; his explanations are legendary.
@Deddiward 2 years ago
Wow this video is so well done
@timmat 6 months ago
Hi. This is a really great visualisation of weightings - thank you! I have a question though: at 11:30 you say you're going to calculate the cross product between the first token's vector and all the other vectors. Should this instead be the dot product, given that you are looking for similarity?
@yacinerouizi844 4 years ago
Thank you for sharing, great tutorial!
@bootagain 4 years ago
Thank you for posting this educational and useful video. Though I can't understand everything yet, I'll keep watching the rest of the series and trying to understand :) I mean it.
@ishishir 3 years ago
Brilliant explanation
@RaoBlackWellizedArman 2 years ago
Fantastic explanations ^_^ Already subscribed!
@norhanahmed5116 4 years ago
Thanks a lot, that was very simple and useful. Wishing you all the best!
@saianishmalla2646 2 years ago
This was extremely helpful !!
@offthepathworks9171 a year ago
Solid gold, thank you.
@mohajeramir 4 years ago
this was an excellent explanation. Thank you
@ashokkumarj594 4 years ago
I love your tutorial 😙😙 Best explanation
@nurlubanu 3 years ago
Well explained! Thank you!
@arvindu9344 8 months ago
Best explanation, thank you so much.
@MohamedSayed-et7lf 4 years ago
Perfectly explained
@vulinhle8343 4 years ago
amazing video, thank you very much
@alexanderskusnov5119 a year ago
To filter (both in signal processing, e.g. a low-pass filter, and in programming, e.g. a filter predicate over a vector) means to keep, not to throw away.
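A tiny illustration of that point in Python (my own example, not from the video): `filter` keeps the elements for which the predicate holds.

```python
kept = list(filter(lambda x: x > 0, [-2, -1, 0, 1, 2]))
print(kept)  # [1, 2] -- the values that pass the predicate are kept, not discarded
```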
@mohammadelghandour1614 2 years ago
Thanks for the easy and thorough explanation. I just have one question: how is "Y" now more representative or useful (more context) than "V"? Can you give an example?
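Not an official answer, but here is a toy numeric sketch of the parameter-free self-attention described in the video (made-up 2-D embeddings): each y_i is a similarity-weighted mixture of all the v_j, so the output vector for an ambiguous word like "bank" gets pulled toward its context ("river"), whereas the original v for "bank" is the same in every sentence.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

tokens = ["bank", "of", "the", "river"]
V = np.array([
    [0.5, 0.5],   # "bank"  (toy embedding, deliberately ambiguous)
    [0.1, 0.1],   # "of"
    [0.1, 0.1],   # "the"
    [0.9, 0.1],   # "river"
])

# y_i = sum_j softmax(v_i . v_j) * v_j
Y = np.vstack([softmax(V @ V[i]) @ V for i in range(len(tokens))])
print(V[0], "->", Y[0])  # the "bank" vector now carries information from its context
```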
@luisluiscunha 2 years ago
very well explained: thank you
@ankitmars a year ago
Best Explanation
@jmarcio51 3 years ago
I got the idea, thanks for the explanation.
@rayaay3095 3 years ago
Just Awesome... thank you
@injysarhan672 3 years ago
Great video! thanks
@williamstorey5024 a year ago
What is the reweighing method that you used in the beginning? I would like to look that up and get more details on it.
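I can't say for certain which method the video uses at that point, but it sounds like a kernel-weighted (Nadaraya-Watson style) average over neighboring time points, which has the same shape as attention. A guess sketched below, not a confirmed answer:

```python
import numpy as np

t = np.arange(10, dtype=float)
x = np.sin(t) + 0.3 * np.random.randn(10)           # a noisy toy time series

W = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2)   # similarity between time points
W = W / W.sum(axis=1, keepdims=True)                # normalize so each row sums to 1
smoothed = W @ x                                    # each point becomes a weighted average
print(smoothed.round(2))
```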
@thelastone1643 3 years ago
You are the best ....
@fadop3156 2 years ago
At 11:28, is it really the cross product and not the dot product of the vectors?
@ansharora3248 3 years ago
Beauty!
@zeroheisenburg3480 3 years ago
At 11:33, do you mean the dot product instead of the cross product? If it's a cross product, wouldn't W11*V1 be 0, since they are perpendicular?
@RasaHQ 3 years ago
(Vincent here) In general w_ij and v_k are not perpendicular. But you are correct that the multiplications here could be more explicitly written as a dot product.
@zeroheisenburg3480 3 years ago
@@RasaHQ Appreciate the reply. I brought it up since the video "verbally" said it was doing a cross product. So each w_ij value should be a scalar in this case? Thanks for the clarification.
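A small check of what that means in practice (toy vectors, my own example): the weights are scalars, w_11 = v_1 · v_1 is only zero for a zero vector, and perpendicularity only zeroes out the raw cross terms between unrelated tokens (before any softmax normalization).

```python
import numpy as np

v1 = np.array([1.0, 0.0])
v3 = np.array([0.0, 1.0])   # perpendicular to v1

print(v1 @ v1)              # 1.0 -> w_11 is a nonzero scalar
print(v1 @ v3)              # 0.0 -> raw (pre-softmax) weight between unrelated tokens
print((v1 @ v1) * v1)       # scalar weight times vector, i.e. w_11 * v_1
```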
@gainai_r27 3 years ago
This is awesome. What tool do you use to create this whiteboard?