Attention Is All You Need: Attentional Neural Network Models | Łukasz Kaiser | Masterclass

165,151 views

Pi School

Comments: 75
@autripat · 4 years ago
Starting @ 15:45, in well under 2 minutes, attention explained! Only a true master can do it. Love.
@Scranny · 4 years ago
K is a matrix representing the T previously seen words and V is the matrix representing the full dictionary of words of the target language, right? But what are K and V exactly? What values do these matrices hold? Are they learned?
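In the Transformer's self-attention, K and V are not a dictionary of the target language: queries, keys, and values are all linear projections of the same sequence representations, computed with weight matrices that are learned during training. A minimal NumPy sketch of scaled dot-product attention, with toy sizes and random stand-ins for the learned weights (illustrative only, not the talk's actual code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

T, d = 5, 8                          # T tokens, model dimension d
rng = np.random.default_rng(0)
X = rng.normal(size=(T, d))          # sequence representations (e.g. embeddings)

# W_q, W_k, W_v are learned parameters; random stand-ins here.
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

Q = X @ W_q                          # queries: what each position is looking for
K = X @ W_k                          # keys: what each position can be matched on
V = X @ W_v                          # values: the content that actually gets retrieved

scores = Q @ K.T / np.sqrt(d)        # (T, T): similarity of every query to every key
weights = softmax(scores, axis=-1)   # each row sums to 1: a soft, differentiable lookup
output = weights @ V                 # weighted average of values, shape (T, d)
```

So the matrices hold nothing dictionary-like by themselves; their contents are whatever the training procedure shapes the projection weights (and the representations feeding them) into.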
@lmaes · 3 years ago
The passion that he transmits is priceless
@tylersnard · 3 years ago
I love how excited he is.
@Marcos10PT · 4 years ago
This is the best explanation of attention I have seen so far! And I have been looking :)
@ksrajavel · 1 year ago
Because he is one of the co-authors of the revolutionary paper that introduced it.
@mosicr · 6 years ago
Great lecture. Best explanation of attention in just a few words.
@itshgirish · 5 years ago
Great presentation, he's having fun explaining the bits... great camera work; it was more fun watching a moving cam than a boring still view.
@igorcherepanov4765 · 5 years ago
"there is this guy, he never got his bachelor but he wrote most of these papers" - appreciation
@threeMetreJim · 5 years ago
Where experience and 'thinking outside the box' can beat education in some cases. He should be getting an 'honorary' bachelor's degree, if he hasn't already.
@MucciciBandz · 5 years ago
Excuse me? That's fake news! Even his LinkedIn profile says Duke 1998 (yes, it's the same Noam Shazeer from this exact paper)... "Noam Shazeer is an Engineer at Google. He graduated from Duke in 1998 with a double major in Mathematics and Computer Science"
@MrLacker · 4 years ago
I think he meant that Noam doesn't have a PhD. Noam does have a bachelor's degree, but he started working at Google pretty soon after graduating (literally decades ago) and has contributed to many important Google technologies in his time there. Noam was a Google old-timer back when I started working there in 2005.
@brandomiranda6703 · 3 years ago
Where is the library he talks about that gets the details of training the DL "right"?
@kvsnoufal · 3 years ago
31:55
@brandomiranda6703 · 3 years ago
@kvsnoufal Is there one for PyTorch?
@nabinchaudhary73 · 2 years ago
Does the embedding get trained, or do the key, query, and value get trained? I am confused, please help.
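Both, in the standard setup: the embedding table and the query/key/value projections are all learnable parameters, updated jointly by backpropagation. A small PyTorch sketch (hypothetical toy sizes) that lists the trainable pieces:

```python
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64
embedding = nn.Embedding(vocab_size, d_model)            # the embedding table is trained
attention = nn.MultiheadAttention(d_model, num_heads=4)  # so are the Q/K/V and output projections

tokens = torch.randint(0, vocab_size, (7, 1))            # (seq_len, batch)
x = embedding(tokens)                                    # (seq_len, batch, d_model)
out, attn_weights = attention(x, x, x)                   # self-attention: Q, K, V all come from x

# Every parameter printed here has requires_grad=True and is updated by the optimizer:
for name, p in list(embedding.named_parameters()) + list(attention.named_parameters()):
    print(name, tuple(p.shape), p.requires_grad)
```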
@FranckDernoncourt · 4 years ago
Thanks for sharing! It'd be great if the video could pay more attention to the slides though.
@pischool6210 · 4 years ago
Thank you for your comment, Franck! You can download the slides here: picampus-school.com/open-day-2017-presentations-download/
@FranckDernoncourt · 4 years ago
@pischool6210 Perfect, thanks!
@elliotwaite · 5 years ago
Great talk, Łukasz.
@CharlesVanNoland · 1 year ago
I just wish he hadn't stood right in front of what he was trying to show people, but I love his passion for explaining what he's talking about.
@TheGodSaw · 7 years ago
Is there a way to get the slides?
@pischool6210 · 7 years ago
You can download them here: picampus-school.com/open-day-2017-presentations-download/
@khanzorbo · 6 years ago
Pi School, I have just checked and it seems the slide deck linked to the presentation is "tensorflow workshop"; can you please double-check?
@pischool6210 · 6 years ago
Dear Vladimir, have a look here: drive.google.com/file/d/0B8BcJC1Y8XqobGNBYVpteDdFOWc/view
@kadamparikh8421 · 4 years ago
Great content in this video. Would love it if you had the multi-headed devil covered! Still, a great video to get the overall view.
@KartoffelnSalatMitAlles · 6 years ago
What model is that at the beginning? Can I somehow get the machine-produced texts that were shown at the beginning of the presentation?
@Cropinky · 7 months ago
very interesting of him to call deep learning a trade :)
@rinkagamine9201 · 6 years ago
Can I somehow get the machine-produced texts that were shown at the beginning of the presentation?
@ahmedb2559 · 1 year ago
Thank you !
@yacinebenaffane6535 · 5 years ago
Nice explanation of position and multi-head...
@jayantpriyadarshi9266 · 4 years ago
Great talk. Something very useful.
@mrvishwjeetkumar · 6 years ago
Very nice lecture... enjoyed it a lot.
@threeMetreJim · 5 years ago
"He didn't put a trophy into the suitcase because it was too small." is an ambiguous statement: "it" could refer to either the trophy or the suitcase. It seems like the answer is mainly decided by probability from past experience, rather than the intended (ambiguous) meaning, similar to a survey or experiment with too small a sample size. It is also possible that he didn't want to put too small a trophy into the suitcase in case it ended up being jostled about too much and became damaged; that is a less likely but still possible explanation, and it would need a thought process, or some further context, to arrive at that intended meaning.

People on the autistic spectrum (HFA / Asperger's) have that same problem when phrasing thoughts (ambiguous meaning), and are often misunderstood because of it. When a statement has two (or more) possible meanings, it's probably unfair to judge the performance of a system on 'getting the answer right', as there isn't a definite correct answer to begin with, just a more likely one.

A word-for-word translation with grammatical correction applied would probably achieve a better result in a case like this. Google Translate seems to somewhat agree.
Original: He didn't put a trophy into the suitcase because it was too small
Google Translate: Er hat keine Trophäe in den Koffer gesteckt, weil er zu klein war.
Back to English: He did not put a trophy in his suitcase because he was too small.
Word-for-word translation (incorrect, but probably still understandable if you speak German): er nicht stellen ein Trophäe in das koffer da es was auch klein.
Google Translate of the word-for-word version back to English (much better but still wrong; where did the 'also' come from?): he does not put a trophy in the suitcase as it is also small.
@nsuryapa1 · 5 years ago
Nice explanation!!!!
@someone_518 · 1 year ago
ChatGPT gave me a link to this video :)
@RobertElliotPahel-Short · 4 years ago
Math majors / graduate math students: skip to 15:36.
@HimanshuGhadigaonkar · 4 years ago
Best explanation!!
@intelligenttrends8935 · 5 years ago
Now I get it. Thank you.
@vast634 · 4 years ago
They should invent a device that can always tell the time of day when the user wants.
@ramyaneekashyap4356 · 4 years ago
Is there any way I could get the PPTs for reference?
@pischool6210 · 4 years ago
Hi, sure! You can download it here: picampus-school.com/open-day-2017-presentations-download/
@ramyaneekashyap4356 · 4 years ago
@pischool6210 Thank you so much!!!!
@rishabhshirke1175 · 5 years ago
Nothing beats GPT-2's TL;DR summarization trick.
@homeroni · 5 years ago
Are the talks he is referring to (as the previous talks) available on YouTube?
@pischool6210 · 5 years ago
Hello! Sure. You can find all the Masterclasses from our Open Day here 👉kzbin.info/aero/PLU3hjga27ZUiuL8V0CVlidBK27CDxWf-F
@IExSet · 1 year ago
Strange thing: he mentions the term "attention" before explaining what it is. What is the EXACT meaning of this Query-Key-Value magic? I suspect speakers just copy other people's thoughts mechanically, without understanding the real meaning of the operations!
@gilgarad1 · 6 years ago
Nice lecture. I enjoyed it
@louerleseigneur4532 · 3 years ago
Thanks buddy
@pankajtiwari12 · 4 years ago
Great explanation!
@pankajtiwari12 · 4 years ago
@27:14 multitasking
@josy26 · 5 years ago
Slides?
@SubhamKumar-eg1pw · 5 years ago
drive.google.com/file/d/0B8BcJC1Y8XqobGNBYVpteDdFOWc/view
@TheAIEpiphany · 4 years ago
47:55 "We tried it on images; it didn't work so well." 2020, Vision Transformer: am I a joke to you?
@souhamghosh8714 · 4 years ago
In ViT, it is clearly stated that a "small" dataset like ImageNet doesn't show promising results, but a larger dataset like JFT gives amazing results; so this may be a start, but it is far from perfect. Btw, I am not contradicting your statement 😁. Also, JFT is not an open-source dataset (yet).
@TheAIEpiphany · 4 years ago
@souhamghosh8714 True, Google folks ^^
@souhamghosh8714 · 4 years ago
"Hi, I am from Google, you know what I got? TPUs... more than you can imagine" 😂
@kingenking9303 · 3 years ago
The video image quality is too poor; you need to fix it.
@sajjadayobi688 · 4 years ago
Transformers learned translation without language dependency O_o
@alexandrogomez5493 · 1 year ago
Homework 6
@uhmerikuhn · 3 years ago
...comes from Google? Check. ...TensorFlow T-shirt? Check. Most viewers therefore rate this lecture highly? Check. This is very hand-wavy throughout, with virtually no rigor shown. There are many lectures/presentations online that actually explain the nuts and bolts and wider use cases of attention mechanisms. Maybe the title of this video should be something else, like "Our group's success with one use case (language translation) of attention." Frankly, the drive-by treatment of the technical details of the language translation case was almost terrible and should probably have been omitted.
@georgemaratos1122 · 3 years ago
Which lectures do you like that explain attention mechanisms and their wider use?
@ytubeanon · 3 months ago
Is this guy one of the fathers of modern AI? Is he a primary reason for ChatGPT?
@mathforai-j5y · 3 months ago
Yes.
@ShadowD2C · 8 months ago
Good video, but his placement and the camera's are suboptimal.
@clray123 · 4 years ago
The most I gather from this talk is that "attention" is a pretty terrible term. Something like "fuzzy lookup" or "matching" or "mapping" would have been much more descriptive, but oh well; which researcher needs to think about terminology before unleashing it on the world?
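The "fuzzy lookup" reading can be made concrete: a hash map returns the value of the single exactly matching key, while attention returns a similarity-weighted blend of all values. A tiny illustrative sketch of the contrast (made-up numbers, not any library's API):

```python
import numpy as np

keys = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # one key vector per stored item
values = np.array([[10.0], [20.0], [30.0]])            # one value per stored item
query = np.array([1.0, -1.0])                          # what we are looking for

# Hard lookup: take only the best-matching key's value.
hard = values[np.argmax(keys @ query)]                 # -> [10.]

# Soft lookup (attention): blend all values by match strength.
scores = keys @ query                                  # dot-product match scores
weights = np.exp(scores) / np.exp(scores).sum()        # softmax over the scores
soft = weights @ values                                # ~[15.8]: mostly item 1, some of the rest
print(hard, soft)
```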
@aojing · 6 years ago
Can't believe this guy was one of the authors of the Transformer. He just cannot explain what he was doing!
@mauricet910 · 6 years ago
I thought it was a really insightful talk. I'm preparing a talk about the Transformer myself, and this talk was super inspiring :)
@haiyangsun8344 · 5 years ago
I also couldn't understand it. The architecture diagram is not very intuitive, and I was expecting some elaboration; however, the explanation was not clear...
@NicholasAmpazis · 5 years ago
If you don’t already know something about attention then it’s impossible to follow the presentation. Everything is explained very poorly...
@clray123 · 4 years ago
His communication skills are like a runner tripping over his shoelaces. Unfortunately, it seems to be quite a common ailment of even "brilliant" coders (or shall I say, scientists) that they can't explain their ideas to others clearly using natural language. It's as if they have no model of someone else's knowledge and take so many things for granted that their attempts at "explanation" just sound like gobbledygook to those who expect to be taught something. That's why we have technical writers, teachers, popular science books, etc.
@clray123 · 2 years ago
@Yancy Stevens Yes, to communicate you have to model in your head whoever you are communicating with: what they know, don't know, and foremost what they want to know. Otherwise it's just a fail, no matter how much knowledge you have.