The guy in the video not only understands the concept, but also understands what others' understanding might lack, so he can fill in the gaps.
@joliver19814 жыл бұрын
I have watched tons of videos and finally an original video that actually teaches these concepts. There are so many YouTubers that simply make a video regurgitating something they read somewhere, but they don't really teach anything because they themselves don't really understand the idea. Bravo. Well done. I actually learned something. Thank you!
@RasaHQ4 жыл бұрын
(Vincent here) I just wanted to mention that I certainly sympathize. It's hard to find proper originality out there when it comes to teaching data science.
@WIFI-nf4tg3 жыл бұрын
@@RasaHQ Hi Rasa, can you also explain "how" we should express words as numbers for the vector v? For example, is there a preferred word embedding?
@RasaHQ3 жыл бұрын
@@WIFI-nf4tg Rasa doesn't prefer a particular word embedding, but a common choice is spaCy. Note that, technically, count vectors are also part of the feature space that goes into DIET.
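For anyone who wants to see what per-token vectors look like in practice, here is a minimal sketch using spaCy. It assumes the `en_core_web_md` model (which ships with word vectors) has been installed; the featurizers Rasa actually uses in a pipeline may differ.

```python
import spacy

# Assumption: the model was installed with `python -m spacy download en_core_web_md`
nlp = spacy.load("en_core_web_md")

doc = nlp("bank of the river")
for token in doc:
    # each token gets a fixed, context-independent embedding vector
    print(token.text, token.vector.shape)
```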
@briancase6180 Жыл бұрын
Yeah, and some of those videos use a script generated by an AI that "read" the relevant sections of a paper. Get used to having to wade through tons of AI-generated content. We need laws that require AI-generated content to be labeled as such. But it's probably unenforceable. How much original content is enough to avoid the "AI-generated" label? ☺️
@homeboundrecords6955 Жыл бұрын
TOTALLY agree, most 'education' vids really add to confusion and just regurgitate jargon over and over
@edouardthomasset6683 Жыл бұрын
When the student understands the teacher, it means the teacher understood what he explains. I understood everything, unlike with the majority of other YouTubers on the same topic. Thanks!
@MrLazini10 ай бұрын
I love how you use different colors to represent different dynamics between relationships. Such a simple idea, yet so good at conveying meaning
@guanxi994 жыл бұрын
After dozens of papers and videos studied, this is the first one that really made me understand the concept. Many thanks for that!!! It also highlighted one fact for me: self-attention is a smart idea, but the real magic sauce is the way the word embeddings are created. That decides whether the contexts created by self-attention make sense or not.
@deoabhijit59353 жыл бұрын
agree
@davidlanday26473 жыл бұрын
A good thought! That makes me wonder if there are metrics we could add to a loss function to assess how well a mechanism "attends" to words in a sentence. Like, if we look at the embedding space, we would probably want to see contextually proximal/similar words cluster close together. So I guess, some metric to assess how well an attention mechanism captures all contexts.
@parryhotter18 Жыл бұрын
This. If a bit late 😊. Yes, the creation of an embedding, i.e. the creation of a vector for each word, seems to be the main storage of semantics for each word. This video IS the best I have seen so far in that he always explains firstly WHY and then HOW the next step works. Great approach!
@stanislawcronberg3271 Жыл бұрын
Only 4 minutes in and I can tell this series will be a banger. Don't stop teaching, this is pure quality content, much appreciated!
@gomogovo4966 Жыл бұрын
I've been looking for a clear explanation for so so so long. First one I've found. I think all the people that made explanatory videos so far have zero understanding of the subject. Thank you.
@alirezakashani30922 жыл бұрын
mind blowing how simple self-attention is explained - thank you
@DaveJ6515 Жыл бұрын
Yes sir, this is it. You have nailed it: not only do you know the subject; you also know the art of creating the conditions for everyone else to get into it gradually and logically. Great.
@rommeltito1233 жыл бұрын
I had so many doubts about the actual operation that happens in self attention. This video just cleared it. Excellent delivery in such a short time.
@fallingintofilm3 жыл бұрын
This was absolutely eye-opening. Congratulations sir! You win the Internet for a while.
@SiliconValleyRunner4 жыл бұрын
Best ever explanation of "self-attention". Awesome job.
@foxwalks5883 жыл бұрын
This is the best explanation of the attention mechanism so far for a regular person like me! I came here after going through the Coursera NLP specialization and several papers, but only now am I actually able to see how it works. Seems like the embeddings themselves are the secret sauce indeed. Thank you.
@azurewang4 жыл бұрын
The most intuitive explanation I have ever seen!!! Excellent drawing and accent.
@mohammedmaamari92102 жыл бұрын
The clearest explanation of attention mechanisms I've ever seen. Thank you
@ferneutron3 жыл бұрын
Thank you so much for your explanation! When you said: "This is known as SELF ATTENTION". I just thought: BAM! Awesome job Rasa!
@thongnguyen12924 жыл бұрын
I've read dozens of papers and blog posts about this topic, but all they did was mostly walk through the math without giving any intuition. This video is the best I've ever seen, thank you very much!
@binishjoshi11264 жыл бұрын
I've known self-attention for some time; this is by far the most intuitive video I've ever seen, thank you.
@seanh15912 жыл бұрын
This is the best explanation of Self-Attention mechanism I've encountered after combing through the internet! Thank you!
@timohear3 жыл бұрын
Can't believe I've only stumbled upon this now. Fantastic original explanation.
@johnhutton54913 жыл бұрын
Since there isn't a like/dislike ratio anymore, for those wondering, this video is great
@avinashpaul16654 жыл бұрын
One of the best examples on the web explaining the attention mechanism. After reading many blogs I still had my doubts; the way attention is explained for both time series and text data is brilliant and helped me understand better.
@mmpcse4 жыл бұрын
Have gone through some 10-12 videos on self-attention. This Attention Series 1, 2 & 3 is by FAR THE BEST EVER. Many thanks for these videos. [came back and updated this comment ;-) ]
@magpieradio3 жыл бұрын
This is the best video I have seen so far as to explain things so clearly. Well done.
@brianyoon8152 жыл бұрын
This is incredible. After like 10 attention explanations, I finally get it here.
@vijayabhaskar-j4 жыл бұрын
This attention series is the clearest and most intuitive explanation of self-attention out there! Great work!
@trantandat26993 жыл бұрын
I have read a lot about this: papers, Medium, videos; this one gave me the best understanding. Very nice!
@galenw6833 Жыл бұрын
At 11:29, the presenter says "cross product", but I think it's the dot product, so that each of the weights (W_11, etc.) is a number (otherwise, using the cross product, they would be vectors). Thus we can build a new vector from W_11, W_12, ... Great videos, exactly what I was looking for.
@timholdsworth13 жыл бұрын
Why did you use cross product at 11:31? Wouldn't that be making the weights small when the word embedding vectors are similar, which would then mean the related words in the sequence would be unable to influence the current state?
@Erosis2 жыл бұрын
I think he meant dot product? I don't know.
@briancase61803 жыл бұрын
OMG, this helped me immeasurably. Thanks so much. I just couldn't quite get it from the other explanations I've seen. Now I can go back and probably understand them better. Yay!
@tatiana7581 Жыл бұрын
Thank you sooo much for this video! Finally, someone explained what self-attention is!
@akashmalhotra47873 жыл бұрын
This is really an amazing explanation! Liked how you build up from time-series and go to text. Keep up the good work :)
@blochspin2 жыл бұрын
best video hands down on the self attention mechanism. Thank you!!!
@TheGroundskeeper Жыл бұрын
Still the best explanation 3 years later
@sowmiya_rocker2 жыл бұрын
Beautiful explanation sir. I'm not sure if I got it all, but I can tell you that I got a better idea of self-attention from your video compared to the other ones I watched. Thanks a lot 🙏
@mokhtarawwad6291 Жыл бұрын
Thanks for sharing. I watched based on a recommendation from a friend on Facebook; I will watch the whole playlist. God bless you 🙏 😊
@dinoscheidt3 жыл бұрын
Love the style. The more talent takes the time to teach new talent, the better. Very appealing style! Subscribed 🦾
@DrJohnnyStalker3 жыл бұрын
Best Self Attention Intuition i have ever seen. Andrew Ng Level stuff!
@benjaminticknor29673 жыл бұрын
Incredible video! Did you mean to say dot product instead of cross product at 11:30?
@ParniaSh3 жыл бұрын
Yes, I think so
@hiteshnagothu8874 жыл бұрын
Never have I ever seen such a great concept explanation. You just made my life easier, @Vincent!!
@dan10400 Жыл бұрын
This is an exceptionally good explanation! Thank you so much. It is easy to see why the thumbs-up count is so high wrt views.
@simranjoharle4220 Жыл бұрын
This is the best explanation of the topic I have come across! Thanks!
@pi5549 Жыл бұрын
Your whiteboarding is beautiful. How are you doing it? I'd love to be able to present in this manner for my students.
@oritcoh4 жыл бұрын
Best Attention explanation, by far.
@Anushkumar-lq6hv Жыл бұрын
The best video on self-attention. No debates
@skauddy755 Жыл бұрын
By far the most intuitive explanation of self-attention. However, DISAPPOINTED with the number of likes :(
@sebastianp40233 жыл бұрын
Please link this video in the TF docs. I tried for a whole day to get my head around the concept of attention, and this explanation is just beautiful!
@luisvasquez5015 Жыл бұрын
Finally somebody explicitly saying that the distributional hypothesis makes no linguistic sense
@siemdual80263 жыл бұрын
This video is the KEY for my QUERY! Pun intended. Thank you so much!
@maker724602 жыл бұрын
Awesome explanation! It takes great skill to explain such concepts. Looking forward to more!
@uniqueaakash142 жыл бұрын
Best video I have found on self-attention.
@jhumdas46132 жыл бұрын
Amazing explanation!! The best I have come across to date. Thank you so much!
@giyutamioka94372 жыл бұрын
Best explanation I have seen so far.... Thanks!
@pranjalchaubey4 жыл бұрын
This is one of the best videos on the topic, if not the best!
@adrianramirez97292 жыл бұрын
Amazing explanation! I didn't find too much sense in the comparison with time series, but the second part was really good :)
@suewhooo73903 жыл бұрын
Best explanation of the attention mechanism out there!! Thanks a lot!
@punk39007 ай бұрын
The best explanations you can get in the world. Thanks! BTW, were you aware at the time of making these videos that transformers would be so revolutionary?
@Tigriszx2 жыл бұрын
SOTA explanation. That's exactly what I was looking for. [tr] If anyone is reading this: follow this guy, his explanations are legendary.
@ArabicCompetitiveProgramming4 жыл бұрын
Great series about attention!
@andyandurkar78142 жыл бұрын
A very simple explanation .. the best one!
@shivani404sheth43 жыл бұрын
This was so interesting! Thank you for this amazing video.
@louiseti48832 жыл бұрын
Great stuff in here. Super clear and efficient for beginners! Thanks
@timmat4 ай бұрын
Hi. This is a really great visualisation of weightings - thank you! I have a question though: at 11:30 you say you're going to calculate the cross product between the first token's vector and all the other vectors. Should this instead be the dot product, given that you are looking for similarity?
@QuangNguyen-jz5nl3 жыл бұрын
Thank you for sharing, great tutorial, looking forward to watching more and more great ones.
@hanimahdi72443 жыл бұрын
Thanks a lot! Really amazing, awesome and very clear explanation.
@devanshamin55544 жыл бұрын
Very informative and simple explanation of a complicated topic. 👍🏻
@fadop3156 Жыл бұрын
11:28 Is it really the cross product, and not the dot product, of the vectors?
@alexanderskusnov5119 Жыл бұрын
To filter (in signals (low-frequency filtering) and in programming (a filter predicate over a vector)) means to keep, not to throw away.
@vikramsandu60542 жыл бұрын
Loved it. Very clear explanation.
@zzzyout Жыл бұрын
11:25 stumble? Cross product or dot product?
@bootagain4 жыл бұрын
Thank you for posting this educational and useful video. Though I can't understand everything yet, I'll keep watching the rest of the series and trying to understand :) I mean it.
@norhanahmed51164 жыл бұрын
Thanks a lot, that was very simple and useful. Wishing all the best for you.
@zeroheisenburg34803 жыл бұрын
At 11:33, do you mean dot product instead of cross product? If it's the cross product, wouldn't W11*V1 be 0 since they are perpendicular?
@RasaHQ3 жыл бұрын
(Vincent here) In general w_ij and v_k are not perpendicular. But you are correct that the multiplications here could be written more explicitly as a dot product.
@zeroheisenburg34803 жыл бұрын
@@RasaHQ Appreciate the reply. I brought it up since the video "verbally" said it was doing a cross product. So each w_ij value should be a scalar in this case? Thanks for the clarification.
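To make the exchange above concrete, here is a minimal NumPy sketch of the parameter-free self-attention the video describes: each weight w_ij is the scalar dot product v_i · v_j, and each output y_i is a weighted sum of the token vectors. The row-wise softmax normalization is an assumption for illustration; the exact normalization used in the video may differ.

```python
import numpy as np

def simple_self_attention(V):
    """Parameter-free self-attention over token vectors V (shape: n_tokens x dim)."""
    W = V @ V.T  # scalar weights w_ij = v_i . v_j (dot products, not cross products)
    W = np.exp(W) / np.exp(W).sum(axis=1, keepdims=True)  # row-wise softmax (assumed normalization)
    Y = W @ V    # y_i = sum_j w_ij * v_j, a context-mixed vector per token
    return W, Y

# toy example: 4 tokens with 3-dimensional embeddings
V = np.random.rand(4, 3)
W, Y = simple_self_attention(V)
print(W.shape, Y.shape)  # (4, 4) (4, 3)
```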
@ashh30513 жыл бұрын
You are a great teacher. Thanks for this content.
@sgt3913 жыл бұрын
Crazy useful video!
@sachinshelar88103 жыл бұрын
amazing stuff . Thanks so much Rasa Team :)
@mohammadelghandour16142 жыл бұрын
Thanks for the easy and thorough explanation. I just have one question: how is "Y" now more representative or useful (more context) than "V"? Can you give an example?
@kevind.shabahang4 жыл бұрын
Excellent introduction
@raunakkbanerjee9016 Жыл бұрын
at 11:28 you say cross product but do you mean dot product?
@williamstorey5024 Жыл бұрын
What is the reweighting method that you used in the beginning? I would like to look that up and get more details on it.
@RaoBlackWellizedArman Жыл бұрын
Fantastic explanations ^_^ Already subscribed!
@yacinerouizi8444 жыл бұрын
Thank you for sharing, great tutorial!
@gainai_r273 жыл бұрын
this is awesome. what tool do you use to create this whiteboard?
@mohajeramir3 жыл бұрын
this was an excellent explanation. Thank you
@vulinhle83434 жыл бұрын
amazing video, thank you very much
@ashokkumarj5944 жыл бұрын
I love your tutorial 😙😙 Best explanation
@clivefernandes54354 жыл бұрын
So this is different from the one where we use a feedforward network, right? (The one used by Bahdanau.)
@subhamkundu5043 Жыл бұрын
I have a query: in the video there is the sentence "Bank of the river"; now suppose there is another sentence, "I love youtube videos a lot". Here the number of words is larger, so does the number of words matter?
@jmarcio513 жыл бұрын
I got the idea, thanks for the explanation.
@kumardeepankar2 жыл бұрын
@Rasa Wouldn't the curve go up and down instead of increasing continuously?
@Deddiward2 жыл бұрын
Wow this video is so well done
@SubhamKumar-eg1pw4 жыл бұрын
But in general the weights are trained right?
@krishnachauhan28504 жыл бұрын
First time I'm seriously getting the intuition of attention... but sir, I am confused: are people using this only in speech analysis, since it's time series data, like you introduced?
@roncahlon Жыл бұрын
Why do you need to multiply by the word vectors a second time? I.e., why couldn't you just say Y1 = w11 + w12 + w13 + w14? What is the value of having it normalized?
@TimKaseyMythHealer Жыл бұрын
Trying to wrap my brain around LLM processing. It would be great if someone were to create a 3D flow chart of all layers, all attention heads. Zooming into each section as a single word and/or sentence is being processed.
@ishishir3 жыл бұрын
Brilliant explanation
@23232323rdurian Жыл бұрын
The content of the word vectors is all the OTHER words seen to statistically co-occur in corpora, weighted by their frequencies. Stopwords [the, a, is] are so frequent that they don't contribute much topic/semantics, while content words are less frequent and so contribute more. The word vector ('meaning') for CAT is just the N words most frequently observed near CAT in corpora, discounted for frequency. That works great for cases like [king, queen] because they occur in similar contexts in corpora, but not for [Noah, cat], because that association is peculiar/local to this instance, and also not for co-references [cat, she], which are harder to resolve: you have to keep a STORY context, where presumably you might have already seen some reference to ...... And for the co-reference, well, they're just harder to resolve, though in this example 'she' HAS to resolve to either Noa or the cat, because those are the ONLY choices, and by chance (we assume) all three co-refer. ==> After all, there's a legit chance that 'she' isn't the cat in the example, but the cat's MOM, who can be an ANNOYING MOM, yet nevertheless Noa is still a great cat.....
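A toy illustration of the distributional idea described in the comment above: build word vectors from raw co-occurrence counts and compare them with cosine similarity. The tiny corpus and the sentence-level "window" here are made up for illustration; real embeddings (word2vec, GloVe, spaCy vectors) are learned rather than raw counts.

```python
import numpy as np
from itertools import combinations

corpus = [
    "the king rules the realm".split(),
    "the queen rules the realm".split(),
    "noa the cat sleeps".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

# count co-occurrences within each sentence (a crude context window)
for sent in corpus:
    for a, b in combinations(sent, 2):
        counts[idx[a], idx[b]] += 1
        counts[idx[b], idx[a]] += 1

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

print(cosine(counts[idx["king"]], counts[idx["queen"]]))  # higher: similar contexts
print(cosine(counts[idx["king"]], counts[idx["cat"]]))    # lower: different contexts
```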