The Narrated Transformer Language Model

323,597 views

Jay Alammar


Comments: 228
@parthchokhra948 · 4 years ago
Your blog on The Illustrated Transformer was my intro to Deep Learning with NLP. Thanks for the amazing contributions to the community.
@jc_777 · 3 years ago
Yeah it is being referenced in my DL class too. Truly great content for new learners!
@ahmeterdonmez9195 · 3 months ago
@jc_777 Gemini also refers to Mr. Alammar's blog post👍
@andresjvazquez · 3 years ago
Dear Teacher Alammar, thanks to this video I was able to be accepted into a BYU lab as an external researcher (even though I didn't finish college) and have been invited by my professor to participate with the lab in CASP15. You really changed the course of my life by demystifying such complex topics for non-traditional learners like me. I'm eternally in your debt.
@ans1975 · 4 years ago
The Illustrated Transformer blog is a masterpiece!
@Roshan-xd5tl · 3 years ago
Your ability to explain and break down complex topics into simpler, intuitive sections is legendary. Thank you for your contribution!
@bighit7596 · 3 years ago
You have a gift for explaining complex material... many other technical talks assume the audience is very knowledgeable and is attending the session just for networking.
@ayush612 · 3 years ago
I remember seeing your Transformer blog, Jay.. it was legendary!! It was referred to by other YouTubers as well... and thanks a lot for the wonderful explanation here too!
@quietkael7349 · 4 years ago
Thank you so much for all the tireless work you do for us visual learners out there! I’m looking forward to videos where you get into your excellent visualizations of the underlying matrix operations. Your visual abstractions both at the flow chart level and matrix/vector level have really shaped my mental model for what I think about when I’m engineering models. I’m so grateful and so excited to see what you come out with next (this library you hint at looks wonderful!)
@arp_ai · 4 years ago
Thanks Jack!
@nileshkikle8112 · a year ago
Outstanding job demystifying the inner working details of the Transformer model architecture! All the illustrations and animations of the inference process are awesome. Thank you for taking the time and sharing your understanding with all of us. Kudos! 👍
@maruthiprasad8184 · a year ago
Amazing explanation. My search to understand transformers ended here; you did a wonderful job. Thank you so much for the simplest explanation I've ever seen.
@drt-on-ai · 4 years ago
Never been more excited by a YouTuber than when I saw this guy had a channel.
@kazimafzal · a year ago
You sir are an amazing teacher! I'm absolutely flabbergasted by how well you've explained it; to think it's all mathematics at the end of the day! Thank you for taking the time to put together such a concise yet complete guide to transformers!
@jacakopl · 4 years ago
This is the best video I have seen by far in this domain. You strike a perfect balance in assuming the audience's level of understanding :)
@arp_ai · 4 years ago
Awesome! Glad you found it useful!
@diogo.magalhaes · 4 years ago
Jay, as a PhD student, I'm a fan of your ability to explain complex topics in a very simple, illustrated and didactic way! I always recommend your 'Illustrated' posts to my colleagues. Thanks again for this great video, keep up the good work!
@arp_ai · 4 years ago
Thanks Diogo!
@perpetuallearner8257 · a year ago
Which university?
@curiouspie1264 · a year ago
One of the most comprehensive video and blog overviews of Transformers I've seen. Thank you. 🙏
@tachyon7777 · 3 years ago
It would be nice to have a step-by-step walkthrough of the training process, and why each of those steps makes sense intuitively.
@goelnikhils · a year ago
I haven't seen such a clear explanation of Transformers and decoder LM models. Amazing work, Jay.
@kalinda619 · 4 years ago
A phenomenal extension of your blog post. Commenting for that bump in the recommendation algorithm!
@arp_ai · 4 years ago
Thank you! Much appreciated!
@JimBob-lq1db · a year ago
Thank you for this great explanation. Visualize, visualize, visualize: the best way to understand how it works.
@ultraviolenc3 · 3 years ago
I’ve just read your “The illustrated transformer” article and I wanted to say that you made very smart and simple visual representations. It seems you put a lot of thought into that.
@raminbakhtiyari5429 · 3 years ago
I don't know how to say thank you; I can just say please continue uploading your amazing videos. I live in a constrained country and these videos are my only hope for learning like other people. Yours sincerely, Ramin Bakhtiyari.
@nisalbandara · 3 years ago
I'm doing Twitter sentiment analysis and couldn't wrap my head around BERT, then I came across this video. Perfectly explained. Thanks a lot.
@ishandindorkar2846 · a year ago
Jay, many thanks for your work. These videos help me a lot to understand key concepts in NLP domain through visualization.
@OslecVardeven · 9 months ago
Jay, I was recently in an AI course, but you presented NLP very well, in a didactic way... I learned a lot from you. Thank you. Keep being this wonderful guy.
@sudzam · a year ago
Wow! One of THE best explanations of Transformers.. Thanks @Jay!!
@jesuslopez3306 · 2 years ago
It's definitely easier to understand laid out vertically. Thanks for everything!
@stephenngumbikiilu3988 · 2 years ago
Your blog was referred to me by my lecturer Julia Kreutzer of Google Translate; it's just an amazing piece of work. It has really helped my understanding of these concepts. Thanks.
@studmatze958 · a year ago
Thank you so much for your work on attention and transformers. Your posts and videos are the best I have encountered so far in terms of visualization and explanation. And you did it way better than my professor. Again, thank you :)
@NarkeEmpire · a year ago
You are a great teacher!!! If you check the EQ settings and lower the music at the beginning, the video is perfect!!! Thanks a lot for sharing your knowledge in this very understandable way.
@abugigi · 5 months ago
Great video, and perhaps just as important, great selection of albums
@KlimovArtem1 · 4 years ago
27:56 - this explains a lot, thank you so much!
@AdityPai · 4 years ago
Thank you for writing the blog. It has helped me.
@1Kapachow1 · 3 years ago
Really enjoyed your blog post and video, super clear - thank you very much for this amazing resource :)
@jpmarinhomartins · 3 years ago
Dude I freakin love your blog, keep up with the good work! Thanks for everything!
@a.e.5054 · 4 years ago
The best explanation of the Transformer and GPT models!!
@o_felipe_reis · 4 years ago
Great video! Best regards from Brazil!
@nmstoker · 4 years ago
Watching it now, thanks so much! It's really helpful to go through these kinds of things with clear examples and explanations. My only preference would've been to reduce the volume of the background music in the intro. So many podcasts do this and it's an annoying trend!
@arp_ai · 4 years ago
Thanks Neil! Noted on the audio!
@Halterofilic · 8 months ago
2024, still a great reference on Transformers. A million thanks for the amazing work!
@tehseenzia3135 · 4 years ago
Amazing illustration. Keep going Jay.
@Opinionman2 · 2 years ago
Awesome stuff. Your blog really helped clarify my deep learning class.
@javierechevarria1548 · 4 years ago
You are really good (excellent) at explaining a complex topic in a simple way. Congratulations!!!!
@IyadKhuder · 2 years ago
I've ended up here to familiarize myself with NLP transformers. Your video was the optimal choice for me, as it explains the concept in an understandable scientific manner. Thanks.
@zongmianli9072 · a year ago
Thanks for the very clear and concise explanation, Jay!
@sharkeyryan · 2 years ago
Thanks for creating this content. Your explanation is quite easy to follow, especially for someone like me who is just beginning to explore these areas of AI/ML.
@exxzxxe · a year ago
Maybe the best video on this subject.
@sachinr3823 · 3 years ago
Omg, thanks a lot for these amazing videos. Your lectures and blogs are so easy to understand.
@sachinr3823 · 3 years ago
Small request: please pin the BGM you used in the video.
@yudiguzman8926 · 3 years ago
I really appreciate your explanation about this topic. One more time, I check that DL is my new passion. Thanks a lot.
@KlimovArtem1 · 4 years ago
14:15 - so, the Self-Attention layer is actually the thing that's trying to understand the meaning of the whole sequence? How does it work and how can it be trained? How long a sequence can it analyze?
@HelenTueni · 2 years ago
Amazing video. Thank you very much for making this topic accessible.
@gergerger53 · 4 years ago
Amazing video. Have to admit that every time I heard the wrong pronunciation of "Shawshank" it did feel a bit like nails on a blackboard but easily forgivable. Jay, your resources and videos are phenomenal :) Thank you for putting in the work to help us all out.
@arp_ai · 4 years ago
Haha! Wrong how? Am I overpronouncing the shaWshank? Thank you!
@gergerger53 · 4 years ago
@arp_ai The "Shaw" is pronounced like "sure/shore", but in the video you use the vowel that's in "how/cow". Anyway, I only meant this as a tiny point :) Take-home message is that you are an incredible ML/NLP teacher!!
@romulodrumond3526 · 3 years ago
One of the best videos on the subject.
@damonandrews1887 · 3 years ago
I found this a very helpful visual explainer, thanks so much for your time, and thanks for chopping it up into sections for easy revision 🤓!
@tiborsaas · a year ago
This video has really aged well. It came out just after GPT-3 and before ChatGPT. I love how it gives massive insight into how current generative AI works behind the scenes (though obviously in a simplified way).
@itall9025 · 4 years ago
Great explanation! Please keep doing this format.
@jemma-joon · 4 months ago
This is amazing. One thing I didn't understand is the matrix: how it is generated and used in the processing to return the probability (how "the" turns into a big array of inputs).
@niundisponible · 2 years ago
I see a Miles Davis vinyl, Kind of Blue. Awesome album, and thanks for the video!
@rsilveira79 · 4 years ago
Nice collection of albums, man! Miles Davis, Radiohead, John Coltrane, very classy! 👏👏👏
@kumarvikas_134 · 4 years ago
Spot-on observation, kind of ironic to be listening to OK Computer and teaching about artificial intelligence :D
@ygorgallina2691 · 3 years ago
Thank you so much for your work! The illustrations help to clearly understand these models!!
@utsavshukla7516 · 3 years ago
Great explanation! Also love all the pop culture references in your room :p
3 years ago
Just a personal comment on the format of the videos: I personally find the constant change of scene (like in "The architecture of the transformer" section), where the camera keeps switching to you, then to the computer screen, and then back to you, extremely annoying. The content of the video itself was informative.
@omarsultan827 · 3 years ago
Thank you for this awesome introduction!
@tsadigov1 · a year ago
I am trying to understand how the transformer works, and you explain it in a much more accessible way. One small thing: I wish the video had fewer transitions between the two cameras.
@TusharKale9 · 3 years ago
A masterpiece explanation of NLP in a real-life scenario. Thank you.
@Alex-oo5rt · a year ago
6:13 Note that GPT-2 and GPT-3 are actually decoder-only models. The encoder-decoder architecture is a common framework in natural language processing (NLP), particularly for sequence-to-sequence models such as the original Transformer, but GPT-2 and GPT-3 drop the encoder entirely and generate text autoregressively with a single stack of decoder blocks.
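For anyone who wants to check this directly, here is a minimal sketch (assuming the Hugging Face transformers and torch packages are installed; this is not code from the video) that loads GPT-2, shows it is a single stack of decoder blocks, and generates text autoregressively:

```python
# Minimal sketch: GPT-2 as a decoder-only language model.
# Assumes the `transformers` and `torch` packages; not code from the video.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

print("decoder blocks:", model.config.n_layer)      # 12 for the small model, no encoder stack
print("context window:", model.config.n_positions)  # 1024 tokens

# Autoregressive generation: the model repeatedly predicts the next token
# and appends it to its own input.
input_ids = tokenizer("The Shawshank", return_tensors="pt").input_ids
with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=10, do_sample=False,
                                pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output_ids[0]))
```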
@Udayanverma · a year ago
Loved it, thanks. Got some new neurons created in my head by this video.
@armingh9283 · 3 years ago
Thanks for the explanation. Good music taste in the background, by the way 👍
@arp_ai · 3 years ago
Thank you!
@amirhosseinfereidooni1798 · 3 years ago
Thanks for the great explanation. MLP (at 11:35) stands for multilayer perceptron :)
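For readers wondering what that multilayer perceptron looks like inside each block, here is a rough sketch of the typical GPT-2-style feed-forward sublayer (two linear layers with a GELU in between and a 4x expansion); the sizes below are illustrative, not taken from the video:

```python
# Rough sketch of the feed-forward (MLP) sublayer in a GPT-2-style Transformer block.
# The 4x expansion and GELU follow the usual GPT-2 recipe; exact sizes are illustrative.
import torch
import torch.nn as nn

class TransformerMLP(nn.Module):
    def __init__(self, d_model: int = 768):
        super().__init__()
        self.fc_in = nn.Linear(d_model, 4 * d_model)   # expand 768 -> 3072
        self.act = nn.GELU()
        self.fc_out = nn.Linear(4 * d_model, d_model)  # project back 3072 -> 768

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); applied to each position independently
        return self.fc_out(self.act(self.fc_in(x)))

mlp = TransformerMLP()
out = mlp(torch.randn(1, 5, 768))
print(out.shape)  # torch.Size([1, 5, 768])
```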
@tusharkhustule3316 · a year ago
One minute into the video and I already subscribed.
@thecutestcat897 · a year ago
Thanks, your blog is so clear!
@maxbeber · 4 years ago
Thank you so much for the clear and concise explanation. Keep up the great work.
@junlinguo77 · 2 years ago
I like the way you are teaching!!!
@KlimovArtem1 · 4 years ago
15:30 - when it was trained on huge amounts of text, how did they decide how to tokenize it? Is it based on some linguistic objects? Syllables?
3 years ago
If you're using pre-trained word embeddings, you have to tokenize in the exact fashion the word embeddings were tokenized. Other than that, if you aren't using pre-trained embeddings (which is usually not the case), you can just keep going over the entire corpus and create a list of distinct words or n-grams or whatever way you have chosen to define a token.
@KlimovArtem1 · 3 years ago
Why are tokens needed at all? Why not use letters?
3 years ago
@KlimovArtem1 All a model understands is numbers.
@KlimovArtem1 · 3 years ago
Letters are numbers too. Again, I asked why not use letters? When words are split into other constructs instead, what are they from a linguistic point of view?
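A toy sketch of the tokenization trade-off discussed in this thread: a tokenizer only maps text to integer IDs, and the choice is where to cut. Character-level and word-level are the two extremes; GPT-2 actually uses byte-pair-encoding (BPE) subwords, which sit in between. The corpus string below is just an illustration:

```python
# Toy sketch: a tokenizer maps text to integer IDs; the question is only where to cut.
corpus = "a robot must obey the orders given it by human beings"

# Character-level: tiny vocabulary, but very long sequences.
char_vocab = {ch: i for i, ch in enumerate(sorted(set(corpus)))}
char_ids = [char_vocab[ch] for ch in corpus]

# Word-level: short sequences, but a huge vocabulary and no way to handle unseen words.
word_vocab = {w: i for i, w in enumerate(sorted(set(corpus.split())))}
word_ids = [word_vocab[w] for w in corpus.split()]

print(len(char_vocab), "characters ->", len(char_ids), "tokens")
print(len(word_vocab), "words      ->", len(word_ids), "tokens")
# Subword (BPE) tokenizers like GPT-2's (~50k entries) sit between these two extremes.
```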
@FabioAlmeida-k6t · 7 months ago
Excellent explanation, Thanks!
@ankitmaheshwari7310 · 2 years ago
Helpful.. you forgot to import torch in your GitHub code.
@pypypy4228 · 9 months ago
A huge thank you for this explanation!
@hongkyulee9724 · 3 years ago
You are my hero. You give me a reason for my life :D
@wu_kenny · 3 years ago
Does anyone know what the lm.trace_tokens at 26:30 in the final un-shared notebook is? Is it something open-sourced for visualization? Looks pretty cool!
@akremgomri9085 · 6 months ago
Same thing I am looking for.
@NilaMasrourisaadat · a year ago
Amazinnnng illustration of language model transformers
@parmarsuraj99 · 4 years ago
❤️ That library!!!!
@arp_ai · 4 years ago
It's been my entire focus the last few months. Stay tuned!
@jackdavidweber · 3 years ago
This is really great! Highly recommend!
@yoonyamm · a year ago
Thank you for sharing wonderful insight!
@josephsueke · 10 months ago
Really clear. Amazing job!
@Udayanverma · a year ago
Towards the end, at 27:30, how are you printing the words with probabilities? Please share the code, I couldn't find any.
@KlimovArtem1 · 4 years ago
23:20 - so, the whole sentence is processed as one thing? How long can such a sentence be? Is every Transformer block unique from the others? Is it just a single layer of perceptrons there, or a few?
@asheeshkumar5650 · 3 years ago
Yes, the whole sentence is processed as one, but small models take only 1024 tokens (sometimes padded if needed). Every block has the same structure within the decoder, and likewise within the encoder. To predict the outputs (like token IDs) we need to project to the vocabulary size and convert the scores into probabilities. Correct me if I'm wrong about anything.
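A small illustrative sketch of that last step (plain PyTorch, random weights, GPT-2-small sizes assumed): the final hidden state at the last position is projected to vocabulary size and turned into a probability distribution over the next token. This is also roughly how one could print the top next-token candidates with their probabilities, as asked a few comments above:

```python
# Illustrative sketch of the projection + softmax step (random weights, GPT-2-small sizes).
import torch

vocab_size, d_model, seq_len = 50257, 768, 10
hidden_states = torch.randn(1, seq_len, d_model)    # output of the final decoder block
lm_head = torch.randn(d_model, vocab_size) * 0.02   # in GPT-2 this matrix is tied to the token embeddings

logits = hidden_states[0, -1] @ lm_head              # (vocab_size,) scores for the next token
probs = torch.softmax(logits, dim=-1)                # scores -> probabilities that sum to 1

top = torch.topk(probs, k=5)
print(top.indices.tolist())  # the 5 most likely next-token IDs
print(top.values.tolist())   # their probabilities
```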
@hasanb2312 · 4 years ago
Great video Jay, thank you so much!
@hunorszegi4007 · a year ago
Thank you for your videos and blog posts. These were my inspiration to create a Java GPT-2 implementation for learning purposes. I can't use a link here, but as huplay I uploaded it to the biggest hosting site, and it is called gpt2-demo.
@peterkahenya · a year ago
Wow! 🎉 Awesome intro.
@MsFearco · 2 years ago
I just found this now. It's super. Thanks!
@yuchenyang4394 · 4 years ago
Great content! can't wait for more.
@arp_ai · 4 years ago
Thank you Yuchen!
@rupakgoyal1611 · 3 years ago
Loved the music in the background...
@RupertBruce · a year ago
"We have ways to calculate the error..." - there are a lot of 'ways'; the chosen way would be interesting!
@hailongle · 3 years ago
Fantastic teacher. Thanks Jay!
@WanderNatureDaily · 3 years ago
absolutely amazing video
@vijayko-e9f · a year ago
Great work 👍👍👍
@majorminus66 · 2 years ago
You, sir, have some good music taste!
@spacewaves94 · 3 years ago
Haha, the chicken was a man. Thanks for all the work breaking this down!
@evertonlimaaleixo1084 · 3 years ago
Amazing! Thank you for sharing!
@akshikaakalanka · 2 years ago
Thank you very much! This is awesome and easy to understand.
@mrityunjayupadhyay7332 · a year ago
Great explanation
@RK-fr4qf · a year ago
Impressive. Thank you.
@akremgomri9085 · 6 months ago
Hello all, I am coming from the future (4 years after the video was released), and I hope I am not late XD. I love the way you teach.