Self Attention in Transformer Neural Networks (with Code!)

112,485 views

CodeEmporium

Comments: 170
@CodeEmporium
@CodeEmporium Жыл бұрын
If you think I deserve it, please consider liking the video and subscribing for more content like this :)
@tomoki-v6o
@tomoki-v6o Жыл бұрын
Do you have any idea how transformers generate new data?
@15jorada
@15jorada Жыл бұрын
You are amazing man! Of course you deserve it! You are building transformers from the ground up! That's insane!
@vipinsou3170
@vipinsou3170 Жыл бұрын
@@tomoki-v6o Using the decoder 😮😮😊
@radityawaliulu
@radityawaliulu 18 күн бұрын
Yeah, of course.
@marktahu2932
@marktahu2932 Жыл бұрын
I have learnt so much between yourself, ChatGPT, and Alexander & Ava Amini at MIT 6.S191. Thank you all.
@jeffrey5602
@jeffrey5602 Жыл бұрын
What's important is that for every token generation step we always feed the whole sequence of previously generated tokens into the decoder, not just the last one. So you start with the [start] token and generate a new token, then feed [start] + [new token] into the decoder, basically just appending each generated token to the sequence of decoder inputs. That might not have been clear in the video. Otherwise great work. Love your channel!
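A minimal sketch of that loop, assuming a hypothetical decode_step callable standing in for a trained decoder (illustrative only, not the video's code):

```python
# Autoregressive decoding sketch: the decoder is always given the
# whole sequence generated so far, and each new token is appended.
def generate(decode_step, start_token, end_token, max_len=50):
    tokens = [start_token]                 # begin with the [start] token
    for _ in range(max_len):
        next_token = decode_step(tokens)   # decoder sees ALL tokens so far
        tokens.append(next_token)          # append, never overwrite
        if next_token == end_token:
            break
    return tokens

# Dummy decoder that just emits the current length, then stops.
print(generate(lambda toks: len(toks) if len(toks) < 4 else "[end]", "[start]", "[end]"))
```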
@tonywang7933
@tonywang7933 Жыл бұрын
Thank you so much. I searched so many places, and this is the first one where a nice person is finally willing to spend the time to really dig in step by step. I'm going to value this channel as highly as Fireship now.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for the compliments and glad you are sticking around!
@MaksymLeshchenko-d7k
@MaksymLeshchenko-d7k Жыл бұрын
I usually don't write comments, but this channel really deserves one! Thank you so much for such a great tutorial. I watched your first video about Transformers and the Attention mechanism, which was really informative, but this one is even more detailed and useful.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much for the compliments! This is the first in a series of videos called “Transformers from scratch”. Hope you’ll check out the rest of the playlist!
@nikkilin4396
@nikkilin4396 10 ай бұрын
It's one of the best videos I have watched. The concepts are explained very well, especially with code.
@shullepys
@shullepys 20 сағат бұрын
I'm finally understanding self attention thanks to this video! Thank you!
@softwine91
@softwine91 Жыл бұрын
What can I say, dude! God bless you. This is the only content on the whole of YouTube that really explains the self-attention mechanism in a brilliant way. Thank you very much. I'd like to know if the key, query, and value matrices are updated via backpropagation during the training phase.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for the kind words. These matrices I mentioned in the code represent the actual data. So no. However, the 3 weight matrices that map a word vector to Q,K,V are indeed updated via backprop. Hope that lil nuance makes sense
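To illustrate that nuance, a minimal NumPy sketch (sizes and the W_q, W_k, W_v names are illustrative, not the notebook's actual code): the word vectors are data, while the projection matrices are the trainable parameters that backprop updates.

```python
import numpy as np

d_model, d_k = 8, 8                      # illustrative sizes
W_q = np.random.randn(d_model, d_k)      # trainable in a real model (updated by backprop)
W_k = np.random.randn(d_model, d_k)
W_v = np.random.randn(d_model, d_k)

x = np.random.randn(4, d_model)          # 4 word vectors, e.g. "My name is Ajay" (data, not trained)
q, k, v = x @ W_q, x @ W_k, x @ W_v      # the q, k, v fed into attention
print(q.shape, k.shape, v.shape)         # (4, 8) each
```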
@picassoofai4061
@picassoofai4061 Жыл бұрын
I definitely agree.
@EngineeredFemale
@EngineeredFemale Жыл бұрын
I was legit searching for self-attention concept vids and thinking it sucked that you hadn't covered it yet. And voilà, here we are. Thank you so much for uploading!!
@CodeEmporium
@CodeEmporium Жыл бұрын
Glad I could deliver. Will be uploading more such content shortly :)
@simonebonato5881
@simonebonato5881 Жыл бұрын
One video to understand them all! Dude, thanks. I've tried to watch like 10 other videos on transformers and attention; yours was really super clear and much more intuitive!
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much for this compliment! Means a lot :)
@rainmaker5199
@rainmaker5199 Жыл бұрын
This is great! I've been trying to learn attention but it's hard to get past the abstraction in a lot of the papers that mention it, much clearer this way!
@rajpulapakura001
@rajpulapakura001 Жыл бұрын
This is exactly what I needed! Can't believe self-attention is that simple!
@kotcraftchannelukraine6118
@kotcraftchannelukraine6118 Жыл бұрын
I still don't understand how to perform a backward pass through self-attention.
@muskanmahajan04
@muskanmahajan04 Жыл бұрын
The best explanation on the internet, thank you!
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much for the comment. Glad you liked it :)
@radityawaliulu
@radityawaliulu 18 күн бұрын
Great! I hope you give more than one hour; you explain it better than my lectures.
@srijeetful
@srijeetful 10 ай бұрын
Extremely well explained. Kudos !!!!
@becayebalde3820
@becayebalde3820 Жыл бұрын
This is pure gold man! Transformers are complex but this video really gives me hope.
@pratyushrao7979
@pratyushrao7979 Жыл бұрын
What are the prerequisites for this video? Do we need to know about the encoder-decoder architecture beforehand? The video feels like I jumped right into the middle of something without any context. I'm confused.
@cv462-l4x
@cv462-l4x 9 ай бұрын
@pratyushrao7979 There are playlists for different topics.
@arunganesan1559
@arunganesan1559 Жыл бұрын
Thanks!
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for the donation! And you are very welcome!
@shivamkaushik6637
@shivamkaushik6637 Жыл бұрын
With all my heart, you deserve a lot of respect. Thanks for the content. Damn, I missed my metro station because of you.
@CodeEmporium
@CodeEmporium Жыл бұрын
Hahahaha your words are too kind! Please check the rest of the “Transformers from scratch” playlist for more (it’s fine to miss the metro for education lol)
@shailajashukla5841
@shailajashukla5841 10 ай бұрын
Excellent, how well you explained it. No other video on YouTube explained it like this. Really good job.
@debjanidas5786
@debjanidas5786 8 ай бұрын
search CampusX
@jazonsamillano
@jazonsamillano Жыл бұрын
Great video. Thank you very much.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much!
@dataflex4440
@dataflex4440 Жыл бұрын
This has been a most wonderful series on this channel so far.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks a ton! Super glad you enjoyed the series :D
@varungowtham3002
@varungowtham3002 Жыл бұрын
ನಮಸ್ಕಾರ ಅಜಯ್, ನೀವು ಕನ್ನಡಿಗ ಎಂದು ತಿಳಿದು ತುಂಬ ಸಂತೋಷವಾಯಿತು! ನಿಮ್ಮ ವಿಡಿಯೋಗಳು ತುಂಬ ಚನ್ನಾಗಿ ಮೂಡಿಬರುತ್ತಿವೆ. (Hello Ajay, I was very happy to learn that you are a Kannadiga! Your videos are turning out really well.)
@CodeEmporium
@CodeEmporium Жыл бұрын
Glad you liked this and thanks for watching! :)
@prashantlawhatre7007
@prashantlawhatre7007 Жыл бұрын
Waiting for your future videos. This was amazing, especially the masked attention part.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much! Will be making more over the coming weeks
@noahcasarotto-dinning1575
@noahcasarotto-dinning1575 Жыл бұрын
Best video explaining this that I've seen by far.
@pranayrungta
@pranayrungta Жыл бұрын
Your videos are way better than the Stanford CS224N lectures.
@CodeEmporium
@CodeEmporium Жыл бұрын
Words I am not worthy of. Thank you :)
@ayoghes2277
@ayoghes2277 Жыл бұрын
Thanks a lot for making the video!! This deserves more views.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for watching. Hope you enjoy the rest of the playlist as I code the entire transformer out !
@pocco8388
@pocco8388 Жыл бұрын
Best content I've ever seen. Thanks for this video.
@giovannibianco5996
@giovannibianco5996 2 ай бұрын
Best video I've found on the topic, great.
@CodeEmporium
@CodeEmporium 2 ай бұрын
Thanks a ton for watching
@nexyboye5111
@nexyboye5111 3 ай бұрын
thanks, this is the only video I found useful on attention.
@Wesker-he9cx
@Wesker-he9cx 5 ай бұрын
Bro You're The Best, Mad Respect For You, I'm Subscribing
@MahirDaiyan7
@MahirDaiyan7 Жыл бұрын
Great! This is exactly what I was looking for in all of the other videos of yours
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for the comment! There is more to come :)
@junior14536
@junior14536 Жыл бұрын
My god, that was amazing, you have a gift my friend; Love from Brazil :D
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks a ton :) Hope you enjoy the channel
@ChrisCowherd
@ChrisCowherd Жыл бұрын
Fantastic explanation! Wow! You have a new subscriber. :) Keep up the great work
@lawrencemacquarienousagi789
@lawrencemacquarienousagi789 Жыл бұрын
Wonderful work you've done! I really love your video and have studied it twice. Thank you so much!
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much for watching! More to come :)
@JBoy340a
@JBoy340a Жыл бұрын
Great walkthrough of the theory and then relating it to the code.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much! Will be making more of these over the coming weeks
@ishwaragoudapatil9654
@ishwaragoudapatil9654 4 ай бұрын
Nice explanation. Thanks a lot Kannadiga :)
@AI-xe4fg
@AI-xe4fg Жыл бұрын
Good video, bro. I've been studying Transformers this week and was still a little confused before I found your video. Thanks.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for the kind words. I really appreciate it :)
@deepalisharma1327
@deepalisharma1327 Жыл бұрын
Thank you for making this concept so easy to understand. Can’t thank you enough 😊
@CodeEmporium
@CodeEmporium Жыл бұрын
My pleasure. Thank you for watching
@ganesha4281
@ganesha4281 2 ай бұрын
ನಮಸ್ಕಾರ ಅಜಯ್, ನೀವು ಕನ್ನಡಿಗ ಎಂದು ತಿಳಿದು ತುಂಬ ಸಂತೋಷವಾಯಿತು! (Hello Ajay, I was very happy to learn that you are a Kannadiga!)
@Slayer-dan
@Slayer-dan Жыл бұрын
Huge respect ❤️
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much!
@shaktisd
@shaktisd Жыл бұрын
Excellent video. If you can, please make a "hello world" of self-attention, e.g. showing a PCA representation of the embeddings before and after self-attention, to show how context impacts the overall embedding.
@nandiniloomba
@nandiniloomba Жыл бұрын
Thank you for teaching this.❤
@CodeEmporium
@CodeEmporium Жыл бұрын
My pleasure! Hope you enjoy the series
@chrisogonas
@chrisogonas Жыл бұрын
Awesome! Well illustrated. Thanks
@bradyshaffer3302
@bradyshaffer3302 Жыл бұрын
Thank you for this very clear and helpful demonstration!
@CodeEmporium
@CodeEmporium Жыл бұрын
You are so welcome! And be on the lookout for more :)
@rajv4509
@rajv4509 Жыл бұрын
Absolutely brilliant! Thumba chennagidhay (very well done) :)
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks a ton! Super glad you like this. I hope you like the rest of this series :)
@PraveenHN-zj3ny
@PraveenHN-zj3ny 9 ай бұрын
Very happy to see Kannada here. Great 😍 Love from Kannadigas.
@klam77
@klam77 Жыл бұрын
"query" , "key" , and "value" terms come from the world of databases! So how do individual words in "My name is Ajay" each map to their own query and key and value semantically? that remains a bit foggy. i know you've shown random numbers in the example, but is there any semantic meaning to it? is this the "embeddings" of the LLM?
@maximilianschlegel3216
@maximilianschlegel3216 Жыл бұрын
This is an incredible video, thank you!
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much for watching and commenting!
@bhavyageethika4560
@bhavyageethika4560 Жыл бұрын
Why is it d_k in both Q and K in the np.random.randn call?
@PaulKinlan
@PaulKinlan Жыл бұрын
This is brilliant, I've been looking for a bit more of a hands-on demonstration of how the process is structured.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much! Happy you liked it :)
@Danpage04
@Danpage04 Жыл бұрын
how do we actually choose the dimensions of Q, K and V? Also, are they parameters that are fixed for each word in the English language, and do we get them from training the model? That part is a little confusing since you just mentioned that Q, V and K are initialized at random, so I assume they have to change in the training of the model.
@kotcraftchannelukraine6118
@kotcraftchannelukraine6118 Жыл бұрын
Q - query, V - value and K - key
@jamesjang8389
@jamesjang8389 Жыл бұрын
Amazing video! Thank you😊😊
@CodeEmporium
@CodeEmporium Жыл бұрын
You are very welcome
@chessfreak8813
@chessfreak8813 Жыл бұрын
Thanks! You really deserve it, and you're underrated!
@picassoofai4061
@picassoofai4061 Жыл бұрын
Mashallah, man you are a rocket.
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks for the kind words :)
@prasadhadkar1775
@prasadhadkar1775 4 ай бұрын
I have a question: since we generated q, k and v randomly, how does the output you're getting in your Jupyter notebook have correct matrix values? Like, how is the value corresponding to "my" and "name" high in the matrix, without any training?
@elunsgrace
@elunsgrace 8 күн бұрын
Usually masking is done with the queries on the columns and the keys as the rows (the attention pattern), so the masking matrix will be an upper triangular matrix of ones. Am I wrong here, or can it be done this way as well?
@rameshsimhadri4884
@rameshsimhadri4884 17 күн бұрын
Here q, k, v are randomly generated, but they should really be generated by multiplying the input embeddings with the Wq, Wk, Wv matrices. How are these matrices generated? What are the inputs and outputs of these matrices?
@radityawaliulu
@radityawaliulu 18 күн бұрын
8:34 How come the formula has a + M term? The paper doesn't have + M.
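For context, + M is how a causal mask is usually folded into scaled dot-product attention: M is 0 at allowed positions and -inf at future positions, so the masked scores vanish after the softmax. A rough NumPy sketch (sizes and the softmax helper are illustrative, not the video's code):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

L, d_k = 4, 8                                    # illustrative sizes
q, k, v = (np.random.randn(L, d_k) for _ in range(3))

# M: 0 where attention is allowed, -inf strictly above the diagonal (future positions)
M = np.triu(np.full((L, L), -np.inf), k=1)
scores = q @ k.T / np.sqrt(d_k) + M              # the "+ M" in the masked formula
out = softmax(scores) @ v                        # masked self-attention output
print(out.shape)                                 # (4, 8)
```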
@mamo987
@mamo987 Жыл бұрын
Amazing work! Very glad I subscribed
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks so much for commenting!
@virtualphilosophyjourney8897
@virtualphilosophyjourney8897 Жыл бұрын
In which phase does the model use the pretrained info to decide the output?
@erfanemami1595
@erfanemami1595 29 күн бұрын
Well explained!!!
@rameshsimhadri4884
@rameshsimhadri4884 17 күн бұрын
Regarding the advantage of the transformer architecture over RNNs, namely that parallel processing is possible: we do parallel processing in the encoder, but processing in the decoder is still sequential. May I request you to clarify?
@sockmonkeyadam5414
@sockmonkeyadam5414 Жыл бұрын
u have saved me. thank u.
@ParthivShah
@ParthivShah 6 ай бұрын
Really Appreciate Your Efforts. Love from Gujarat India.
@sriramayeshwanth9789
@sriramayeshwanth9789 Жыл бұрын
you made me cry brother
@imagiro1
@imagiro1 Жыл бұрын
Got it, thank you very much, but one question: What I still don't understand: We are talking about neural networks, and they are trained. So all the math you show here, how do we (know|make sure) that it actually happens inside the network? You don't train specific regions of the NN to specific tasks (like calculating a dot product), right?
@FelLoss0
@FelLoss0 Жыл бұрын
Dear Ajay. Thank you so much for your videos! I have a quick question here. Why did you transpose the values in the softmax function? Also... why did you specify axis=-1? I'm a newbie at this and I'd like to have strong and clear foundations. have a lovely weekend :D
@faiazahsan6774
@faiazahsan6774 Жыл бұрын
Thank you for explaining in such an easy way. It would be great if you could upload some code on the GCN algorithm.
@CodeEmporium
@CodeEmporium Жыл бұрын
I shall explore that possibility!
@7_bairapraveen928
@7_bairapraveen928 Жыл бұрын
Why do we need to stabilise the variance of the attention vector with the query and key vectors?
@McMurchie
@McMurchie Жыл бұрын
Hi, I noticed this has been added to the transformer playlist, but there are 2 unavailable videos. Do I need them in order to get the full end-to-end grasp?
@CodeEmporium
@CodeEmporium Жыл бұрын
You can follow the order of “transformers from scratch” playlist. This should be the first video in the series. Hope this helps and thanks for watching ! (It’s still being created so you can follow along :) )
@ritviktyagi9221
@ritviktyagi9221 Жыл бұрын
How do we get the values of the q, k and v vectors after initializing them randomly? Great video btw. Waiting for more such videos.
@CodeEmporium
@CodeEmporium Жыл бұрын
The weight matrices that map the original word vectors to these 3 vectors are trainable parameters. So they would be updated by back propagation during training
@ritviktyagi9221
@ritviktyagi9221 Жыл бұрын
@@CodeEmporium Thanks for clarification
@paull923
@paull923 Жыл бұрын
Thx for your efforts!
@CodeEmporium
@CodeEmporium Жыл бұрын
Super welcome :)
@dataflex4440
@dataflex4440 Жыл бұрын
Brilliant Mate
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks a ton! :)
@yijingcui7736
@yijingcui7736 Жыл бұрын
this is very helpful
@CodeEmporium
@CodeEmporium Жыл бұрын
Glad! And thank you!
@yonahcitron226
@yonahcitron226 Жыл бұрын
this is amazing!
@CodeEmporium
@CodeEmporium Жыл бұрын
Thanks a lot!
@gabrielnilo6101
@gabrielnilo6101 Жыл бұрын
I sometimes stop the video and roll it back a few seconds to hear you explain something again, and I'm like: "No way this works, this is insane." Some explanations of AI techniques just aren't enough, but yours are truly simple and easy to understand, thank you. Do you collab with anyone when making these videos, or is it all done by yourself?
@CodeEmporium
@CodeEmporium Жыл бұрын
Haha yea. Things aren’t actually super complicated. :) I make these videos on my own. Scripting, coding, research, editing. Fun stuff
@TechTrendSpectrum
@TechTrendSpectrum Ай бұрын
Sir, I have an assignment to write a report on "Large Language Models are Few-Shot Clinical Information Extractors", and I have to build such an LLM, and I reached your video. Sir, can you please guide me? I will always be thankful!
@govindkatyura7485
@govindkatyura7485 Жыл бұрын
I have a few doubts. 1. Do we use multiple FFNNs after the attention layer? Suppose we have 100 input words for the encoder, will 100 FFNNs get trained, one for each word? I checked the source code, but they were using only one, so I'm confused how one FFNN can handle multiple embeddings, especially with a batch size. 2. In the decoder, do we also pass multiple inputs, just like in the encoder layer, especially during training?
@pulkitmehta1795
@pulkitmehta1795 Жыл бұрын
Simply wow..
@li-pingho1441
@li-pingho1441 Жыл бұрын
you save my life!!!!!
@CodeEmporium
@CodeEmporium Жыл бұрын
It’s what I do best :)
@anwarulislam6823
@anwarulislam6823 Жыл бұрын
How could someone hack my brain wave and convoluted this by evaluate inner voice? May I know this procedure? #Thanks
@EngineeredFemale
@EngineeredFemale Жыл бұрын
Haha ikr. I felt the same. Was looking for a good Self attention video.
@rujutaawate5412
@rujutaawate5412 Жыл бұрын
Thanks, @CodeEmporium / Ajay for the great explanation! One quick question- can you please explain how the true values of Q, K, and V are actually computed? I understand that we start with random initialization but do these get updated through something like backpropagation? If you already have a video of this then would be great if you can state the name/redirect! Thanks once again for helping me speed up my AI journey! :)
@CodeEmporium
@CodeEmporium Жыл бұрын
That's correct, backprop will update these weights. For exact details, you can continue watching this playlist "Transformers From Scratch" where we will build a working transformer. This video was the first in that series. Hope you enjoy it :)
@naziadana7885
@naziadana7885 Жыл бұрын
Thank you very much for this great video! Can you please upload a video on Self Attention code using Graph Convolutional Network (GCN)?!
@CodeEmporium
@CodeEmporium Жыл бұрын
I’ll look into this at some point. Thanks for the tips.
@ayush_stha
@ayush_stha Жыл бұрын
In the demonstration, you generated the q, k & v vectors randomly, but in reality, what will the actual source of those values be?
@CodeEmporium
@CodeEmporium Жыл бұрын
Each of the q, k, v vectors will be a function of each word (or byte-pair encoding) in the sentences. I say a “function” of the sentences since, to the word vectors, we add position encoding and then convert them into q, k, v vectors via feed forward layers. Some of the later videos in this “Transformers from scratch” playlist show code on exactly how it’s created. So you can check those out for more intel :)
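A small NumPy sketch of that pipeline (illustrative only: the sizes are made up, and the sinusoidal encoding shown here is one common choice, not necessarily the exact form used in the videos):

```python
import numpy as np

seq_len, d_model = 4, 8                            # illustrative sizes
word_vectors = np.random.randn(seq_len, d_model)   # stand-ins for learned embeddings

# Sinusoidal position encoding, added to the word vectors.
pos = np.arange(seq_len)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
pe = np.where(i % 2 == 0, np.sin(angle), np.cos(angle))
x = word_vectors + pe                              # position-aware word vectors

# Per-position linear maps (the "feed forward layers") produce q, k, v from x.
W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
q, k, v = x @ W_q, x @ W_k, x @ W_v
```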
@SnehaSharma-nl9do
@SnehaSharma-nl9do 10 ай бұрын
Kannada Represent!! 🖐
@CodeEmporium
@CodeEmporium 10 ай бұрын
Haha! Yes 🙌
@Tomcat342
@Tomcat342 6 ай бұрын
Ayo yellaru hegiddra? (Hey, how is everyone doing?)
@philhamilton3946
@philhamilton3946 Жыл бұрын
What is the name of the textbook you are using?
@klam77
@klam77 Жыл бұрын
If you watch the video carefully, the URL shows the books are "online" free-access bibles of the field.
@sometimesdchordstrikes...7876
@sometimesdchordstrikes...7876 9 ай бұрын
@1:41 Here you said that you want the context of the words that come in the future, but in the masking part of the video you said that it would be cheating to know the context of the words that come in the future.
@goodmusic284
@goodmusic284 Ай бұрын
This is a great tutorial. I think the links to your paid courses are broken.
@jonfe
@jonfe Жыл бұрын
I still don't understand the difference between Q, K, and V. Can someone explain?
@ajaytaneja111
@ajaytaneja111 Жыл бұрын
Ajay, I don't think capturing the context of the words that come 'after' has any significance in language modelling; in language modelling you are predicting only the next word. For a task like machine translation, yes. Thus I don't think bidirectional RNNs have anything better to offer for language modelling than the regular (one-way) RNNs. Let me know what you think.
@josephpark2093
@josephpark2093 Жыл бұрын
I watched the video around 3 times but I still don't understand. Why are these awesome videos so unknown?
@SIADSrikanthB
@SIADSrikanthB 9 ай бұрын
I really like how you use Kannada language examples in your explanations.
@Slayer-dan
@Slayer-dan Жыл бұрын
Ustad 🙏
@CodeEmporium
@CodeEmporium Жыл бұрын
too kind :)
@ProdbyKreeper
@ProdbyKreeper 4 ай бұрын
appreciate!
@YT-yt-yt-3
@YT-yt-yt-3 Жыл бұрын
I felt the q, k, v parameters were not explained very well. A search analogy would give a better intuition for these parameters than explaining them as 'what I can offer' vs 'what I actually offer'.
@wishIKnewHowToLove
@wishIKnewHowToLove Жыл бұрын
thx
@CodeEmporium
@CodeEmporium Жыл бұрын
My pleasure :)
@kotcraftchannelukraine6118
@kotcraftchannelukraine6118 Жыл бұрын
You forgot to show the most important thing: how to train self-attention with backpropagation. You forgot about the backward pass.
@CodeEmporium
@CodeEmporium Жыл бұрын
This is the first video in a series of videos called “Transformers from scratch”. Later videos show how the entire architecture is training. Hope you enjoy the videos
@kotcraftchannelukraine6118
@kotcraftchannelukraine6118 Жыл бұрын
@@CodeEmporium thank you, i subscribe
@ChethanaSomeone
@ChethanaSomeone Жыл бұрын
Seriously, are you from Karnataka? Your accent is so different, dude.
@commonguy7
@commonguy7 2 ай бұрын
wow
@bkuls
@bkuls Жыл бұрын
Guru aarama? Nanu kooda Kannada ne! (Guru, how are you? I'm a Kannadiga too!)
@CodeEmporium
@CodeEmporium Жыл бұрын
Doin super well ma guy. Thanks for watching and commenting! :)