MIT 6.S191 (2022): Recurrent Neural Networks and Transformers

253,148 views

Alexander Amini

1 day ago

MIT Introduction to Deep Learning 6.S191: Lecture 2
Recurrent Neural Networks
Lecturer: Ava Soleimany
January 2022
For all lectures, slides, and lab materials: introtodeeplearning.com
Lecture Outline
0:00​ - Introduction
1:59​ - Sequence modeling
4:16​ - Neurons with recurrence
10:09 - Recurrent neural networks
11:42​ - RNN intuition
14:44​ - Unfolding RNNs
16:43 - RNNs from scratch
19:49 - Design criteria for sequential modeling
21:00 - Word prediction example
27:49​ - Backpropagation through time
30:02 - Gradient issues
33:53​ - Long short term memory (LSTM)
35:35​ - RNN applications
40:22 - Attention fundamentals
43:12 - Intuition of attention
44:53 - Attention and search relationship
47:16 - Learning attention with neural networks
54:52 - Scaling attention and applications
56:09 - Summary
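As a companion to the "RNNs from scratch" segment in the outline above, here is a minimal NumPy sketch of a single recurrent step unfolded over a sequence. The weight shapes, initialization scale, and names are illustrative, not the lecture's actual code:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # One recurrence step: the new hidden state depends on the
    # current input x_t and the previous hidden state h_prev.
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 5, 8, 10
W_xh = rng.standard_normal((input_dim, hidden_dim)) * 0.1
W_hh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for x_t in rng.standard_normal((seq_len, input_dim)):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # unfold over the sequence
print(h.shape)  # (8,)
```

Note that the same weight matrices are reused at every timestep, which is exactly the "unfolding" idea covered at 14:44.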
Subscribe to stay up to date with new deep learning lectures at MIT, or follow us @MITDeepLearning on Twitter and Instagram to stay fully-connected!!

Comments: 111
@andreas.karatzas 2 years ago
I'm still trying to figure out how you managed to perfectly describe the logic behind attention mechanisms in 10 minutes...
@MahJohn 2 years ago
It was absolutely amazing, especially how she connected the notion of search to attention. Thanks for the intuitive connection.
@jacktrainer4387 2 years ago
Everything that comes out of MIT is pure gold. You'd think the concepts would be described at a high, inaccessible level, but that's not the case. The lectures are student-friendly, and the homework is challenging yet doable.
@kevinbissinger 2 years ago
Yeah, unlike the Harvards of the world, MIT always seemed to focus on having the richest education, not the richest students.
@usrehman5046 1 year ago
Just wanted to know: is this purely theoretical, or do they also walk through code for the implementation?
@chiranjeevisagi 2 years ago
Thank you to Alexander Amini and Ava Soleimany for making this course accessible to everyone. Learning such high-quality content would otherwise be a distant dream for many people like myself.
@mohammadalaaelghamry8010 2 years ago
Just amazing how well these two lectures are laid out, structured, and explained; nothing comes close to them in my experience so far. Thank you so much, Alexander and Ava. Heading for the first lab now.
@SteveSperandeo 2 years ago
Who needs GPT-3 when we have Ava? Amazingly clear, succinct, and enjoyable presentation. Thank you Ava!
@robertoooooooooo 1 year ago
This is genius. This lecture is pure gold. Explaining concepts as difficult as transformers in 15 minutes seems impossible, but she did it. Thank you, MIT!
@laminsesay1435 2 years ago
If you are watching, learning from, and practicing with this video, you have been granted a visa to the future. Alexander Amini, Ava Soleimany, and the rest of the team, thanks. You guys are amazing.
@mtamjidhossain 2 years ago
Set a reminder, patiently waiting. It's a great initiative; can't thank the organizers and instructors enough!
@argc 1 year ago
This is by far the best explanation of Transformers that I have ever seen. It all makes more sense now. Thanks
@tantzer6113 2 years ago
Excellent lecture. Very well designed, clear, intuitive, well balanced. A lot was accomplished in one hour! I learned a lot.
@ImtithalSaeed 1 year ago
Unable to describe how amazing this is... thank you, Ava.
@user-th7jy6zy1f 2 years ago
The best explanation of attention mechanism I have ever seen. It is very intuitive and easy to understand.
@alekseimikhalev1807 1 year ago
This is just the best, most brilliant explanation of RNNs and the attention mechanism I've ever seen! Thank you guys for such hard work!
@arnavraina2615 2 years ago
Feels like I'm waiting for a much-awaited movie trailer! This is quality.
@bobbymajor8122 2 years ago
God bless MIT
@vantagepointmoon 1 year ago
It takes a really bright mind and a lot of practice, together with hands-on experience, to pack so much information about a complex subject so briefly and elegantly. I bet it took some backpropagation through time to arrive at that.
@HarshithBachimanchi 1 year ago
Excellent explanation! This is perhaps the best description of the roots of the attention mechanism and the intuition behind it. People who follow the route of CNNs -> GANs -> ViTs in their deep learning journey have trouble understanding self-attention (without much knowledge of RNNs). This is an excellent "bridge" video that fills all the gaps! Great effort by Ava!
@hyewoncho4797 1 year ago
This is definitely the best video for describing attention mechanisms and the logic behind them. Many videos only try to review the mechanism as it is written in the paper. Thank you so much! It really helped me understand attention even more clearly!
@AAmini 1 year ago
Thank you very much!
@asokakarunananda 1 year ago
Excellent presentation on the transition from RNNs to attention-based Transformer networks. Thank you.
@281019641 1 year ago
Precise and very well explained. Thank you for making this course accessible.
@dianakapralova6025 2 years ago
Thanks for the detailed explanations, especially of attention! And in the end, attention is all we need, and we now understand it thanks to you :-)
@jovijsdsilva 1 year ago
A really intuitive way of teaching. The concepts are explained really well.
@TheKuhlschrank 1 year ago
Amazing intuition behind Transformers, thank you!
@doctorrobot9757 2 years ago
I am happy to be able to access this course. Your work is much appreciated. Thanks!
@livesinging3924 1 year ago
Figuring out the attention mechanism in minutes is super cool and intuitive. Thanks, Ava, for the clarity of your train of thought throughout the lecture. God bless.
@SinkingPoint 2 years ago
This is by far the best explanation of attention that I've seen. It definitely deserves its own video. Maybe a video on transformers that covers attention and more detail on the other components of the architecture?
@tantzer6113 2 years ago
Yes, I too would love to see another lecture on attention and transformers, with a more detailed treatment of the Q, K, V matrices and how they function. I understand what was said but can't derive it on my own.
@siak2910 1 year ago
God bless you, Alexander and co
@tricialobo9233 1 year ago
Ava is a very talented lecturer - thanks for the cogent explanation of RNNs.
@abdullahalsefat9439 2 years ago
Yes, I really appreciated the intuition for attention very much.
@ShaidaMuhammad 2 years ago
Finally, I understood the self-attention mechanism completely.
@Fordance100 2 years ago
Good explanation of self-attention. It gives me better intuition on the topic.
@XBJASHITHABATULA 2 years ago
This is one of the best lectures ever.
@tantzer6113 2 years ago
Ava Khanum, you put together an excellent lecture. Thank you very much!
@soheilparsa1 1 year ago
Such a great explanation! Thanks for sharing!!!
@kenedymatiassoportella808 1 year ago
What a lecture! Thanks a lot and keep up the great work.
@shubham-pp4cw 2 years ago
Nice and clear video explanation of RNNs and attention.
@user-tc7gu8yo2e 1 year ago
Thank you for this lecture!
@abdjahdoiahdoai 2 years ago
This is very well made. Thanks a lot!
@arielcarvalhodados 1 year ago
Thanks for the content, all the details were amazing!
@Freddychao 1 year ago
Oh, Ava became your wife! Congrats!
@mihaitudor8924 2 years ago
Nice... Glad to see you guys are back.
@seth.photongraphy2673 1 year ago
The "deep learning couple" is at it again! and congrats!
@anikettayade6000 2 years ago
Great explanation, completely understood. Thanks to the MIT team 😁
@jzhuo0119 1 year ago
This is so well explained. Thanks a lot!
@Gerin-Benoit 1 year ago
Impressive presentation, thanks a lot for sharing!
@ZorroNumber1 2 years ago
Passion for science and technology is just oozing out of Ava!! Persian passion.
@user-vq8bk1qi5z 2 years ago
The single most fascinating and entertaining video for me... let me just express my appreciation: YYDS (meaning "best ever").
@ansariyusuf4774 2 years ago
I struggled to wrap my head around neural networks for sequential data. The intuition provided for each concept is perfect, it was so engaging that the timeline of this lecture is now embedded in my head (pun intended ;))
@josedemetriosanchezchamorr8247 2 years ago
Excellent lecture, I'm excited!
@dougb70 1 year ago
43:26 - The Transformers: More than Meets the Eye. Missed it on the news, but saw it on Netflix.
@waibeeYB 1 year ago
Thanks a lot for uploading these videos. I have already done Andrew Ng's amazing Deep Learning Specialization, but this video is a perfect summary and revision. By the way, at 32:30, it's the derivative that is assigned the value 1 whenever the argument is greater than 0. I guess that's a speaking error, but I just wanted to point it out.
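The derivative correction in the comment above is easy to check numerically; a minimal NumPy sketch (the sample values are illustrative):

```python
import numpy as np

def relu(x):
    # ReLU activation: max(0, x) elementwise
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 where the *input* is positive, 0 elsewhere
    # (this is the value the commenter notes, not the activation itself)
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.5, 3.0])
print(relu(x))       # negative inputs clipped to 0
print(relu_grad(x))  # 0 for negative inputs, 1 for positive inputs
```

This all-or-nothing gradient is also why ReLU avoids the vanishing-gradient behavior discussed in the "Gradient issues" segment at 30:02.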
@floopybits8037 2 years ago
Loved the explanation of Transformers; you made such a complicated topic so elegant and easy to understand.
@style24_7 2 years ago
Reminder on.
@subashchandrapakhrin3537 6 months ago
Wonderful!!! Ava
@carlotonydaristotile7420 1 year ago
Very cool lecture.
@yegounkim1840 1 year ago
Great lecture
@jijie133 1 year ago
Great video!
@naveenrs7742 2 years ago
Awesome, loved this knowledge shower. Greetings from India.
@poojadurgi5650 2 years ago
This is goldddd
@dariovicenzo8139 1 year ago
Ava forces herself to be clear, splitting complex concepts into little pieces so that they're understandable by us.
@masoudakhgar8544 1 year ago
Well done. Proud when I see such Persian-speaking people.
@ahmedchaoukichami9345 2 years ago
Good work, thanks a lot. But how can we get the linear layer, or how do we calculate it, to get attention?
@ibrahimadiallo1020 2 years ago
Thank you.
@MarcJSchmidt 1 year ago
43:00 "Long memory" is a bit misleading, as the transformer has O(n^2) cost in its attention layer, which is not scalable either (there are variants with better complexity, but they are less powerful). So while it might work better than an RNN/LSTM, it is not a silver bullet and only slightly increases the context size (to a few thousand tokens). We still can't have "long memory" in the sense of hundreds of thousands or millions of tokens.
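The quadratic cost mentioned above comes from the n-by-n score matrix that self-attention builds. A minimal NumPy sketch of scaled dot-product self-attention makes this visible (dimensions and the shared Q=K=V choice are illustrative):

```python
import numpy as np

def self_attention(Q, K, V):
    # Scores: one entry per (query, key) pair -> an (n, n) matrix.
    # This intermediate is the O(n^2) compute/memory cost in sequence length n.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Row-wise softmax over the keys (shifted by the max for stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 256, 64  # sequence length, embedding dimension
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))
out = self_attention(X, X, X)  # self-attention: Q, K, V from the same input
print(out.shape)  # (256, 64), but the score matrix in between was (256, 256)
```

Doubling n quadruples the score matrix, which is why context length, not model quality, is the binding constraint the comment describes.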
@caiomar 2 years ago
I love this series! Thank you for sharing the knowledge! I am listening to every word! Now I'm getting Instagram ads for an MIT full AI course at the hefty price of $3,300 USD; I wish I could afford it :/
@thosedreams 1 year ago
Which course is this?
@shashankrao8791 10 months ago
The attention mechanism is being used widely in computer vision (ViT). In an image-classification example, what parts of the image data are treated as the query, key, and value?
@leonardooliveirapaschoal552 1 year ago
Awesome explanation of the attention mechanism in a very short time, hahaha!
@mechanicsmechatronicsandro3514 1 year ago
Bravo!
@ummantekutti7479 2 years ago
I'm waiting ....
@Jkauppa 2 years ago
You can directly apply FIR and IIR filter methods, and Laplace transforms, to neural networks.
@Jkauppa 2 years ago
You just have multiple outputs and inputs, maybe like image filtering.
@Jkauppa 2 years ago
Or video filtering, with the time data from the frames included.
@andrewlane7233 1 year ago
This is absolutely amazing! Does anyone know of any programs that can be used to generate such great-looking neural network/code slides? Or the one that was used to create these?
@AAmini 1 year ago
Thanks! We use PowerPoint for the slides.
@aidanmclaughlin5279 1 year ago
Why does the naive concatenated approach lose the notion of sequence? (I'm confused about the red X next to "No Order" at 42:40.)
@jingxinzhao878 2 years ago
Has anyone tried Lab 1 Part 2? I wonder whether anyone got any songs in the last cell; even the sample_song does not work.
@mahdica5879 2 years ago
Is it possible to get a certificate from the related website? I'd be grateful if you could guide me.
@vent_srikar7360 1 year ago
At timestamp 42:00, is she referring to the encoder-decoder model?
@thosedreams 1 year ago
What's really "self" in self-attention? If it's the "input" as mentioned in this video, how is it different from plain attention?
@newbie8051 1 year ago
30:28 What exactly is gradient clipping?
@allenhsu2033 2 years ago
Hi, I still can't understand the encoding bottleneck among the limitations of RNNs. What exactly does "encoding bottleneck" mean? Can anyone explain it more clearly?
@MahJohn 2 years ago
Since you got some intuition here, consider this lecture: kzbin.info/www/bejne/iWOaoXuBd6qjaZI. I found his explanation of the "encoding bottleneck" shortcoming of LSTMs, which motivated the need for the attention mechanism, useful. I hope you find it helpful.
@OBHACKER 2 years ago
Where can I find the link to the attention mechanism lab? I did check the link...
@forheuristiclifeksh7836 1 month ago
57:51
@satviksrivastava6632 2 years ago
Why do we have the same weights for different inputs?
@hamidkhan1433 1 year ago
I need some help with the process of getting admitted to MIT EECS for undergraduate study: how can I apply, and what are the requirements? If someone has any idea, kindly help me out; I will really appreciate your effort.
@avoidprogress6002 2 years ago
I don't really understand the "many to many" example in the sequence modeling applications. It looks to me like the output sequence always has the same length as the input sequence, which is rare for language translation. Also, the first element of the output sequence is produced without any feedback from later timestamps and is not reevaluated later on.
@avoidprogress6002 2 years ago
Maybe as an example: the network translates English to German, and the input sequence starts with "The". The network might then produce "Das" as the first output element, which is fine. Then the second element of the input sequence is "Human", and the network outputs "Mensch" as the second element. Now it would have to change the first output from "Das" to "Der", which I don't think would ever happen with the presented modeling.
@saurabhchopra 2 years ago
16:19 Shouldn't the last loss be Lt, rather than L3?
@MahJohn 2 years ago
I suppose it's just a typo.
@AAmini 2 years ago
Yes, thanks for pointing that out!
@brown8350 2 years ago
Too good to be true! But why? Thanks a ton!
@mertbozkurt329 2 years ago
💯
@JPwyckoff 1 year ago
10:46
@helloansuman 2 years ago
Very fast videos. They need to slow down and explain the key concepts clearly; otherwise it's like a sweet story.
@adarshdayalan9056 1 year ago
How is this available for free?
@abdikafihashihersi4728 2 years ago
What is the linear layer at 50:03?
@AAmini 2 years ago
A dense (i.e., fully connected) layer with a linear activation function (i.e., no activation function).
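In code terms, the answer above amounts to a single affine transform; a minimal NumPy sketch (the shapes and names here are illustrative):

```python
import numpy as np

def linear_layer(x, W, b):
    # Dense (fully connected) layer with no activation function:
    # output = x @ W + b, an affine map and nothing more.
    return x @ W + b

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # batch of 4 examples, input dim 8
W = rng.standard_normal((8, 3))   # projects dim 8 -> dim 3
b = np.zeros(3)
y = linear_layer(x, W, b)
print(y.shape)  # (4, 3)
```

This is the kind of learned linear projection attention uses to produce queries, keys, and values from the input.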
@abdikafihashihersi4728 2 years ago
@@AAmini Got it. By the way, I was not expecting such a prompt response. Thank you, Mr. Amini.
@AAmini 2 years ago
Thanks for the question!
@mosca204 2 years ago
Can I give this video more than 1 like?
@amielwexler1165 1 year ago
Comment for algorithm
@ajaytaneja111 2 years ago
"Attention Is All You Need": the intuition for Query, Key, and Value here is one of the best among everything I've read or watched (in other courses) until now... Excellent job, Ava Soleimany, thank you!
@keshavmaheshwari521 2 years ago
I'm waiting ....