Thanks for the great video. Shouldn't we use `residue = x.clone()` at 1:22:30 for the residual connection? Otherwise the residue variable will get updated as well.
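(For anyone else wondering: whether `clone()` is needed depends on whether `x` is later mutated in place. If the block only reassigns `x`, e.g. `x = self.norm(x)`, the plain `residue = x` is safe, since reassignment rebinds the name rather than mutating the tensor. A minimal sketch of the aliasing issue, assuming in-place ops:)

```python
import torch

x = torch.ones(3)
residue = x          # aliasing: residue and x share the same storage
x.add_(1)            # in-place update mutates the shared tensor
print(residue)       # tensor([2., 2., 2.]) -- residue changed too

x = torch.ones(3)
residue = x.clone()  # independent copy with its own storage
x.add_(1)            # in-place update no longer touches residue
print(residue)       # tensor([1., 1., 1.])
```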
@RishabhMishra-h5g · 1 day ago
For the first minibatch in off-policy learning, the ratio of the offline and online log probs would be 1, right? It's only after the first minibatch pass that the online policy would start producing different log probs for the action tokens.
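(A tiny sketch of that importance ratio with made-up numbers, assuming the usual PPO-style setup where the per-token ratio is exp(online log-prob minus offline log-prob):)

```python
import torch

# Made-up log-probs of the sampled action tokens under the frozen (offline) policy.
offline_logprobs = torch.tensor([-1.2, -0.7, -2.3])
# Before any gradient update, the online policy is identical to the snapshot.
online_logprobs = offline_logprobs.clone()

ratio = torch.exp(online_logprobs - offline_logprobs)
print(ratio)  # tensor([1., 1., 1.]) -- first minibatch: no divergence yet

# After an optimizer step the online policy drifts, the ratio leaves 1,
# and PPO-style clipping starts to have an effect.
online_logprobs = torch.tensor([-1.0, -0.9, -2.3])
print(torch.exp(online_logprobs - offline_logprobs))  # ~tensor([1.2214, 0.8187, 1.0000])
```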
@xugefu · 1 day ago
Thanks!
@praveensoni1119 · 1 day ago
"This explaination is all I need" to crack my DS interviews. I really liked the smooth buildup of Transformer concepts in this video. Thanks a lot man!!
@akaskmsskssk6927 · 1 day ago
One random afternoon last year I decided to watch the whole video, and now I have my own 1B-parameter LLM built with your code. Thank you so much. Don't ever stop inspiring new AI programmers! Greetings from the Philippines.
@chatpatey7282 · 2 days ago
Sir, I understood most of the content, but I'm struggling to grasp how the decoder block of the UNet was designed. I tried to understand it and even attempted to write it on my own, but I couldn't manage. Could you please guide me on where I should focus, or share any resources that can help me understand this better?
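(In case it helps, here is a minimal sketch of a typical UNet decoder block; the class name, channel sizes, and layer choices are illustrative, not the exact code from the video. The idea: upsample the coarse features, concatenate the encoder's skip connection at the same resolution, then refine with convolutions.)

```python
import torch
import torch.nn as nn

class UNetDecoderBlock(nn.Module):
    """One decoder step: upsample, fuse the encoder skip connection, refine."""
    def __init__(self, in_channels: int, skip_channels: int, out_channels: int):
        super().__init__()
        # Transposed conv doubles the spatial resolution and halves the channels.
        self.upsample = nn.ConvTranspose2d(in_channels, in_channels // 2,
                                           kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels // 2 + skip_channels, out_channels,
                      kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        x = self.upsample(x)             # e.g. 16x16 -> 32x32
        x = torch.cat([x, skip], dim=1)  # fuse encoder features at the same scale
        return self.conv(x)

# Usage with illustrative shapes:
block = UNetDecoderBlock(in_channels=128, skip_channels=64, out_channels=64)
x = torch.randn(1, 128, 16, 16)    # coarse (bottleneck) features
skip = torch.randn(1, 64, 32, 32)  # matching encoder features
out = block(x, skip)               # shape: (1, 64, 32, 32)
```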
@Levi-AckermanYT · 3 days ago
Excellent video! Could you provide the notes? I would greatly appreciate it. ❤
@Levi-AckermanYT · 3 days ago
I can't find the slides in the GitHub repo.
@nursami7842 · 3 days ago
Next, please explain the Vision Transformer 🙏
@waniubaid7718 · 3 days ago
Please make a Video-LLaMA explanation in the same way ❤❤❤
@waniubaid7718 · 3 days ago
❤
@vanerk_ · 3 days ago
Great, as always, thank you sir!
@omerayklc900 · 4 days ago
Hello Umar, I'm always amazed by your videos. Can you make a video about how to run Hugging Face models on Google's TPUs? It's hard to understand torch_xla, and the documentation is not great. When working with billion-parameter models like LLaMA 7B etc., it would be easier to run them on Colab's TPUs. Also, Google has a program called the TPU Research Cloud (TRC), which gives us researchers a great opportunity to train or fine-tune these billion-parameter models. It would be great if there were a tutorial on how to utilize the TPUs. Have a great day!
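(Not a substitute for a full video, but here is a minimal sketch of the basic torch_xla flow, assuming a TPU runtime with torch_xla installed; "gpt2" is just a small example model:)

```python
# Run a Hugging Face model on a TPU core via torch_xla.
import torch_xla.core.xla_model as xm
from transformers import AutoModelForCausalLM, AutoTokenizer

device = xm.xla_device()  # the XLA/TPU device visible to this process
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

inputs = tokenizer("Hello, TPU!", return_tensors="pt").to(device)
outputs = model(**inputs)  # ops are traced lazily into an XLA graph

xm.mark_step()  # forces compilation + execution of the pending graph
print(outputs.logits.shape)
```

(During training, `xm.optimizer_step(optimizer)` is the usual replacement for `optimizer.step()`, since it also handles the XLA graph sync.)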
@Mohitdadhich-fn8ix · 4 days ago
What a great contribution, Mr. Umar… Deep respect for your work ❤
5 days ago
thank you so much, great work
@hamedmc7938 · 5 days ago
Bro, build a GPT backend as a training course.
@xian623 · 5 days ago
worth every second
@binfos7434 · 5 days ago
Completed it today. One word: `Amazing`. Looking forward to the Flash Attention with Triton video.
@Mohammed.1471 · 5 days ago
Thank you so much... Best explanation on YT I could find on this topic... 🙌
@ShiouTianHsu · 5 days ago
Thanks!
@guneeshvats46 · 6 days ago
Best explanation of transformers on YouTube!!
@tiagojc · 6 days ago
Thanks!
@tiagojc · 6 days ago
Great video, Umar, thank you so much. Any chance you'll release a fine-tuning video showing how to fit this model to my own dataset?
@Peaceful-er4vf · 6 days ago
3:39:42 Hahaha, so cute
@赵品学 · 7 days ago
This is ABSOLUTELY AMAZING!
@jamesx708 · 7 days ago
The best video for learning RLHF.
@millenniumbismay382 · 7 days ago
Thank you! It has been a lifesaver :) It is definitely the best explanation of such important concepts, by a long way! Thank you so much.
@Sathyam_a31 · 7 days ago
The best video explanation I have ever gotten on Mistral!! Thank you so much for your efforts.
@magnetpest2k7 · 7 days ago
Thanks!
@tunicorn3551 · 8 days ago
The AI Bible
@GvK-wb2nc · 8 days ago
Great!!! Thanks so much!!!
@arminbiglari1043 · 9 days ago
You are my lifesaver, thank you very much!
@Allen-TAN · 9 days ago
Thanks for the masterpiece. Can you make a video about the recently famous model Recurrent Memory Transformers (RMT), and how to make this new model compatible with transformers in HF?
@Army-qs5fu · 10 days ago
How can I train a transformer model from scratch for PDF summarization... Is it the same procedure???? Can you please reply? I'm a student who knows only the fundamentals and wishes to learn more.
@parichehresmailian · 10 days ago
How can I start to deep-dive into RAG? I studied software engineering at university for my BSc, and these days I'm doing an MSc in e-commerce, which is completely intertwined with AI... and I really want to move on to RAG 🥺
@fortuneolawale9113 · 11 days ago
thanks
@ruijian5693 · 11 days ago
If you only have time to watch one video about flash attention, this one is the one.
@janigiovanni6075 · 11 days ago
Actually watching it for the second time now, because there is so much valuable information in here :D
@janigiovanni6075 · 11 days ago
Great video, thank you very much for this!
@mlworks · 11 days ago
Brilliant video on transformers with key math explanations.
@AnushkaMehta-c6b · 11 days ago
thanks
@piz-qg9xp · 11 days ago
So it's the cat that is behind the scenes kzbin.info/www/bejne/f4SxlYSZhc2mqtU. Thanks, Dr. Kitty
@maitreyimandal8910 · 12 days ago
Such great content!
@Ask0ldd · 12 days ago
Thank you very much for all your hard work. Your channel is a goldmine. 👑
@cbr250-p6v · 13 days ago
Please bring a "Vision Transformer architecture explained" video
@umarjamilai · 13 days ago
It’s explained in the first hour of my “Coding a Vision Language Model” video
@yanghelena · 14 days ago
Thank you for your selfless sharing and hard work! This video helps me a lot!
@AD-zj7ck · 15 days ago
Thanks for the amazing video. Can you make a video explaining and proving the universal approximation theorem?
@Cryptic3.0 · 15 days ago
This is gold!
@jaewanpark2570 · 16 days ago
This absolutely is gold. Actually, it's closer to diamond than gold. My favorite parts are 2:44:50 and 2:47:50, when the cat also feels the content is wonderful.
@harshalhirpara4589 · 16 days ago
Thank you Umar, your video made me connect all the dots!