Layer Normalization in Transformers | Layer Norm Vs Batch Norm

18,080 views

CampusX

A day ago

Comments: 127
@abhisheksaurav
@abhisheksaurav 5 ай бұрын
This playlist is like a time machine. I’ve watched you grow your hair from black to white, and I’ve seen the content quality continuously improve video by video. Great work!
@animatrix1631
@animatrix1631 5 ай бұрын
I feel the same as well but I guess he's not that old
@zerotohero1002
@zerotohero1002 4 ай бұрын
Courage comes at a price ❤
@sachink9102
@sachink9102 4 ай бұрын
Thank you, Nitish Ji. Eagerly waiting to attend your Transformers sessions. Please complete this series.
@RamandeepSingh_04
@RamandeepSingh_04 4 ай бұрын
Another student added to the waiting list, asking for the next video. Thank you, sir.
@muhammadsheraz177
@muhammadsheraz177 5 ай бұрын
Please finish this playlist as early as possible.
@ayushrathore2570
@ayushrathore2570 5 ай бұрын
This whole playlist is the best thing I have discovered on YouTube! Thank you so much, sir.
@nvnurav1892
@nvnurav1892 4 ай бұрын
Sir, one small suggestion: you could apply speech-to-speech translation to your videos to convert them into English and upload them on Udemy/YouTube. It would help a lot of people who don't know Hindi and would help your hard work get more and more attention. 🙂🙂 We are really very lucky to be getting such rich content for free. God bless you.
@yashshekhar538
@yashshekhar538 5 ай бұрын
Respected Sir, your playlist is the best. Kindly increase the frequency of videos.
@sahil5124
@sahil5124 5 ай бұрын
This is a really important topic. Thank you so much. Please cover everything about the Transformer architecture.
@sharangkulkarni1759
@sharangkulkarni1759 3 ай бұрын
Fantastic! Loved it. The way the padding zeroes were brought into the picture was great fun.
@akeshagarwal794
@akeshagarwal794 5 ай бұрын
Congratulations on building a 200k family; you deserve even more reach 🎉❤ We love you, sir ❤
@ShivamSuri-lz5it
@ShivamSuri-lz5it 3 ай бұрын
Excellent deep learning playlist, highly recommended!
@PrathameshKhade-j2e
@PrathameshKhade-j2e 5 ай бұрын
Sir, please try to complete this playlist as early as possible. You are the best teacher, and we want to learn the deep learning concepts from you.
@SBhupendraAdhikari
@SBhupendraAdhikari 3 ай бұрын
Thanks a lot, sir. Really enjoying learning about Transformers.
@udaysharma138
@udaysharma138 2 ай бұрын
Thanks a lot, Nitish Sir, best explanation.
@praneethkrishna6782
@praneethkrishna6782 Ай бұрын
@campusx Hi Nitish, thanks a lot for the elaborate explanation. But I have a query: is the value '0' representing the padding tokens really the reason (or the only reason) that stops us from using Batch Normalization? It could be handled internally so that the '0's are not considered when calculating the mean and standard deviation across features. On the other hand, I think this technique (Batch Norm) clubs together the embeddings of different sentences while calculating z, which seems a little odd to me, and that may be the real reason for not using it. Please correct me if I am wrong.
@AKSHATSHAW-tf3ow
@AKSHATSHAW-tf3ow 14 күн бұрын
Same doubt; I think both reasons play a part.
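
For anyone weighing the two concerns in this thread, here is a minimal NumPy sketch (my own illustration, not from the video) with 2 sentences of 4 tokens and 3-dimensional embeddings, where the second sentence ends in two padded (all-zero) tokens. Batch-norm-style statistics are computed per feature across every token of every sentence, so both the padding zeros and the other sentence's values leak into each mean and variance; layer-norm-style statistics are computed per token across its own features only.

    import numpy as np

    # toy batch: 2 sentences, 4 tokens each, 3-dimensional embeddings
    x = np.random.randn(2, 4, 3)
    x[1, 2:, :] = 0.0                        # last two tokens of sentence 2 are padding

    # batch-norm style: one mean/variance per feature, pooled over all tokens
    bn_mean = x.reshape(-1, 3).mean(axis=0)  # shape (3,), padding included
    bn_var  = x.reshape(-1, 3).var(axis=0)

    # layer-norm style: one mean/variance per token, over its own features
    ln_mean = x.mean(axis=-1, keepdims=True)  # shape (2, 4, 1)
    ln_var  = x.var(axis=-1, keepdims=True)

    print("batch-norm mean (mixes sentences and padding):", bn_mean)
    print("layer-norm mean of a padded token:", ln_mean[1, 3, 0])  # 0.0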
@vinaykumar-xh5pi
@vinaykumar-xh5pi 4 ай бұрын
Please release the next video; very curious to complete the series... loved your content as always.
@ryannflin1285
@ryannflin1285 Ай бұрын
Brother, I literally can't understand how I'm understanding this so easily; how can someone teach this well? Love you, sir (from IITJ).
@AidenDsouza-ii8rb
@AidenDsouza-ii8rb 4 ай бұрын
Your DL playlist is like a thrilling TV series - can't wait for the next episode! Any chance we could get a season finale soon? Keep up the awesome work!
@just__arif
@just__arif 2 ай бұрын
Top-quality content!
@Xanchs4593
@Xanchs4593 4 ай бұрын
Can you please explain what the 'Add' in the Add & Norm layer is?
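
Not from this video, but as a rough sketch of what "Add & Norm" usually refers to: "Add" is the residual (skip) connection that adds the sub-layer's input back to its output, and "Norm" is the layer normalization applied to that sum (assuming the post-norm arrangement of the original paper).

    import torch
    import torch.nn as nn

    d_model = 512
    norm = nn.LayerNorm(d_model)
    sublayer = nn.Linear(d_model, d_model)   # stand-in for attention or the FFN

    x = torch.randn(2, 10, d_model)          # (batch, tokens, features)
    out = norm(x + sublayer(x))              # "Add" (residual) then "Norm"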
@ai_pie1000
@ai_pie1000 5 ай бұрын
Congratulations, brother, on the 200k-member family... 👏👏👏
@GanitSikho-xo2yx
@GanitSikho-xo2yx 4 ай бұрын
Well, I am waiting for your next video. It's a gem of learning!
@Fazalenglish
@Fazalenglish 3 ай бұрын
I really like the way you explain things ❤
@taseer12
@taseer12 5 ай бұрын
Sir, I can't describe your efforts. Love from Pakistan.
@saurabhbadole821
@saurabhbadole821 5 ай бұрын
I am glad that I found this channel! Can't thank you enough, Nitish Sir! One more request: could you create one-shot revision videos for machine learning, deep learning, and natural language processing (NLP)? 🤌
@WIN_1306
@WIN_1306 4 ай бұрын
At 46:10, why is it zero? Since beta is added, shouldn't that prevent it from becoming zero?
@dilippokhrel4009
@dilippokhrel4009 Ай бұрын
Initially gamma is kept at 1 and beta at 0, so the value is zero at the start. During training, though, it may become something other than zero.
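
That matches the defaults in PyTorch, for instance. A quick check (my own sketch, assuming nn.LayerNorm): the learnable scale (gamma) starts at 1 and the shift (beta) starts at 0, so an all-zero padding row still comes out as all zeros before any training has happened.

    import torch
    import torch.nn as nn

    ln = nn.LayerNorm(4)
    print(ln.weight)          # gamma: initialized to ones
    print(ln.bias)            # beta:  initialized to zeros

    pad = torch.zeros(1, 4)   # an all-zero padding row
    print(ln(pad))            # still zeros before training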
@Schrödinger3
@Schrödinger3 4 ай бұрын
Please complete this playlist and add the Transformer tutorials as soon as possible.
@shibrajdeb5177
@shibrajdeb5177 5 ай бұрын
Sir, please upload videos regularly. These videos help me a lot.
@ghousepasha4172
@ghousepasha4172 4 ай бұрын
Please, sir, update videos regularly; we wait a long time for your videos.
@Shisuiii69
@Shisuiii69 2 ай бұрын
Question: Sir, what if the value of B¹ (beta) for the padding vector becomes something other than 0, since it keeps getting updated during training? Then the padding vector will no longer be 0; won't that affect the model?
@slaypop-b5n
@slaypop-b5n 29 күн бұрын
Bro, did you find the answer? Had the same doubt.
@Ishant875
@Ishant875 21 күн бұрын
Same doubt
@slaypop-b5n
@slaypop-b5n 14 күн бұрын
@Ishant875 Any updates, bro? Did you get the answer?
@AmitBiswas-hd3js
@AmitBiswas-hd3js 4 ай бұрын
Please cover this entire Transformer architecture as soon as possible
@1111Shahad
@1111Shahad 5 ай бұрын
Thank you, Nitish. Waiting for your next upload.
@ESHAANMISHRA-pr7dh
@ESHAANMISHRA-pr7dh 4 ай бұрын
Respected sir, I request you to please complete the playlist. I am really thankful for your amazing videos in this playlist. I have recommended it to many of my friends, and they loved it too. Thanks for providing such content for free 🙏🙏
@gurvgupta5515
@gurvgupta5515 5 ай бұрын
Thanks for this video, sir. Could you also make a video on Rotary Positional Embeddings (RoPE), which are used in Llama as well as other LLMs for enhanced attention?
@AkashSingh-oz7qx
@AkashSingh-oz7qx Ай бұрын
Please also cover generative and diffusion models.
@muhammadsheraz177
@muhammadsheraz177 5 ай бұрын
Sir, could you kindly tell us when this playlist will be completed?
@shreeyagupta5720
@shreeyagupta5720 5 ай бұрын
Congratulations for 200k sir 👏 🎉🍺
@bmp-zz9pu
@bmp-zz9pu 4 ай бұрын
Sir, please complete this playlist!
@mayyutyagi
@mayyutyagi 5 ай бұрын
Amazing series full of knowledge...
@krisharora2959
@krisharora2959 4 ай бұрын
Next video is awaited more than anything
@arunkrishna1036
@arunkrishna1036 4 ай бұрын
Sir, what if the beta value is updated during the learning process? Then it will get added to the padded zeros, making them non-zero in later iterations.
@Shisuiii69
@Shisuiii69 2 ай бұрын
Same confusion, did you find the answer?
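
One way to see the concern concretely (a toy illustration, not an answer from the video): layer norm maps an all-zero row to exactly beta, so once training moves beta away from zero, the padded positions do become non-zero vectors.

    import torch
    import torch.nn as nn

    ln = nn.LayerNorm(4)
    with torch.no_grad():
        ln.bias.fill_(0.5)    # pretend training has moved beta away from 0

    pad = torch.zeros(1, 4)
    print(ln(pad))            # every entry is now ~0.5, i.e. equal to beta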
@physicskiduniya8054
@physicskiduniya8054 4 ай бұрын
Bhaiya! Awaiting the upcoming videos of your course. Please try to complete this playlist ASAP, bhaiya.
@znyd.
@znyd. 5 ай бұрын
Congrats on the 200k subs, love from Bangladesh ❤.
@darkpheonix6592
@darkpheonix6592 4 ай бұрын
Please upload the remaining videos quickly.
@anonymousman3014
@anonymousman3014 4 ай бұрын
Sir, is the Transformer architecture portion complete? I want to cover it ASAP; I have covered the topics up to the attention mechanism and want to cover the rest in one go. Please tell us, sir. Also, I request you to upload all the videos ASAP; I want to learn a lot. Thanks for the amazing course at zero cost. God bless you.
@WIN_1306
@WIN_1306 4 ай бұрын
Sir, can you tell us roughly how many topics are left, and which ones?
@princekhunt1
@princekhunt1 4 ай бұрын
Sir, please complete this series.
@SulemanZeb.
@SulemanZeb. 5 ай бұрын
Please start the MLOps playlist; we are desperately waiting for it.
@SaurabhKumar-t4s
@SaurabhKumar-t4s 3 ай бұрын
At 46:04, if sigma_4 is 0, how can we divide by it?
@Shisuiii69
@Shisuiii69 2 ай бұрын
I had the same confusion. I checked with ChatGPT and it said that a small error term, very close to zero, is added to the denominator, which is why we simply write zero after normalization.
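
That matches the usual formulation: a small epsilon is added inside the square root so the division is always defined, y = gamma * (x - mu) / sqrt(var + eps) + beta. A tiny sketch (my own, not from the video) for an all-zero row:

    import numpy as np

    x = np.zeros(4)                              # e.g. a padding row, so sigma is 0
    eps = 1e-5
    y = (x - x.mean()) / np.sqrt(x.var() + eps)
    print(y)                                     # [0. 0. 0. 0.], no division by zero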
@ishika7585
@ishika7585 5 ай бұрын
Kindly make a video on regex as well.
@WIN_1306
@WIN_1306 4 ай бұрын
what is regex?
@himansuranjanmallick16
@himansuranjanmallick16 3 ай бұрын
Thank you, sir.
@UCS_B_DebopamDey
@UCS_B_DebopamDey Күн бұрын
thank you sir
@rajnishadhikari9280
@rajnishadhikari9280 5 ай бұрын
Thanks for this amazing series.
@peace-it4rg
@peace-it4rg 4 ай бұрын
Sir, my doubt is: if I use batch norm in the Transformer architecture, every value in the matrix has its own learnable scale and bias factor, so because of the bias the value won't stay at zero anyway; then why layer norm? Since we compute y = gamma * (x - mu) / sigma + beta, the bias term by itself will keep it from being zero. Please help, sir.
@RamandeepSingh_04
@RamandeepSingh_04 4 ай бұрын
Still, it will be a very small number; it will affect the result and won't represent the true picture of the feature under batch normalization.
@WIN_1306
@WIN_1306 4 ай бұрын
@RamandeepSingh_04 Compared to the tokens without padding it will be small, but sir wrote zero; it won't actually be exactly zero, though.
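
A quick numeric illustration of that point (my own sketch): under batch norm a padded zero is standardized against the feature's batch-wide mean and variance, so it comes out as a small non-zero value, and the zeros themselves drag the statistics away from what the real tokens alone would give.

    import numpy as np

    feature = np.array([2.0, 3.0, 4.0, 0.0, 0.0])    # last two entries are padding
    mu, var = feature.mean(), feature.var()           # 1.8 and 2.56, pulled down by the zeros

    normalized = (feature - mu) / np.sqrt(var + 1e-5)
    print(normalized[-2:])                             # padded zeros map to about -1.1, not 0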
@technicalhouse9820
@technicalhouse9820 5 ай бұрын
Sir love you so much from Pakistan
@intcvn
@intcvn 4 ай бұрын
Please finish it quickly, sir; waiting eagerly.
@rb4754
@rb4754 5 ай бұрын
Congratulations on 200k subscribers!!!
@sagarbhagwani7193
@sagarbhagwani7193 5 ай бұрын
Thanks, sir. Please complete this playlist ASAP.
@hassan_sid
@hassan_sid 5 ай бұрын
It would be great if you make a video on RoPE
@anonymousman3014
@anonymousman3014 4 ай бұрын
Sir, is the Transformer architecture portion complete? I want to cover it ASAP; I have covered the topics up to the attention mechanism and want to cover the rest in one go. Please tell us, sir. Also, I request you to upload all the videos ASAP; I want to learn a lot. Thanks for the amazing course at zero cost.
@WIN_1306
@WIN_1306 4 ай бұрын
I am the 300th person to like this video. Sir, please upload the next videos; we are eagerly waiting.
@29_chothaniharsh62
@29_chothaniharsh62 5 ай бұрын
Sir can you please continue the 100 interview questions on ML playlist?
@shubharuidas2624
@shubharuidas2624 5 ай бұрын
Please also continue with vision transformer
@advaitdanade7538
@advaitdanade7538 5 ай бұрын
Sir, please finish this playlist quickly; placement season is nearby 😢
@arpitpathak7276
@arpitpathak7276 5 ай бұрын
Thank you sir I am waiting for this video ❤
@SuperRia33
@SuperRia33 11 күн бұрын
Thanks
@virajkaralay8844
@virajkaralay8844 5 ай бұрын
Absolute banger video again. Appreciate the efforts you're taking for transformers. Cannot wait for when you explain the entire transformer architecture.
@virajkaralay8844
@virajkaralay8844 5 ай бұрын
Also, congratulations for 200k subscribers. May you reach many more milestones
@vinayakbhat9530
@vinayakbhat9530 2 ай бұрын
excellent
@not_amanullah
@not_amanullah 5 ай бұрын
This is helpful 🖤
@oden4013
@oden4013 4 ай бұрын
Sir, please upload the next video; it has been almost a month.
@rose9466
@rose9466 5 ай бұрын
Can you give an estimate of when this playlist will be completed?
@khatiwadaAnish
@khatiwadaAnish 3 ай бұрын
Awesome 👍👍
@manojprasad6781
@manojprasad6781 5 ай бұрын
Waiting for the next video💌
@not_amanullah
@not_amanullah 5 ай бұрын
Thanks ❤
@barryallen5243
@barryallen5243 5 ай бұрын
Just ignoring the padded rows while performing batch normalization should also work; I feel the padded zeros are not the only reason we use layer normalization instead of batch normalization.
@WIN_1306
@WIN_1306 4 ай бұрын
How would you ignore the padding columns in batch normalization?
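
It could be done with a mask, at least in principle. Here is a rough sketch (my own, not something shown in the video) that computes the batch-norm statistics only over non-padded positions. It works mechanically, but it still pools tokens from different sentences into one statistic, which is the other objection raised in this thread.

    import numpy as np

    x = np.random.randn(2, 4, 3)              # (sentences, tokens, features)
    mask = np.ones((2, 4), dtype=bool)
    mask[1, 2:] = False                        # last two tokens of sentence 2 are padding
    x[~mask] = 0.0

    valid = x[mask]                            # real tokens only, shape (num_valid, 3)
    mu  = valid.mean(axis=0)                   # per-feature mean over real tokens
    var = valid.var(axis=0)

    x_norm = (x - mu) / np.sqrt(var + 1e-5)    # padding rows are transformed too,
    x_norm[~mask] = 0.0                        # so zero them out again explicitly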
@SANJAYTYAGI-bk6tx
@SANJAYTYAGI-bk6tx 5 ай бұрын
Sir, in batch normalization, in your example we have three means and three variances along with the same number of betas and gammas, i.e. 3. But in layer normalization we have eight means and eight variances along with 3 betas and 3 gammas. That means the number of betas and gammas is the same in both batch and layer normalization. Is that correct? Please elaborate.
@campusx-official
@campusx-official 5 ай бұрын
Yes
@WIN_1306
@WIN_1306 4 ай бұрын
Mean and variance are used for normalization; beta and gamma are used for scaling and shifting.
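
In framework terms the parameter shapes agree with that answer. A small check (my own sketch, using PyTorch): over an embedding dimension d_model, both normalizations keep one gamma and one beta per feature, i.e. vectors of shape (d_model,); they differ only in the axis over which the mean and variance are computed.

    import torch.nn as nn

    d_model = 3
    bn = nn.BatchNorm1d(d_model)   # normalizes each feature over the batch
    ln = nn.LayerNorm(d_model)     # normalizes each token over its features

    print(bn.weight.shape, bn.bias.shape)   # torch.Size([3]) torch.Size([3])
    print(ln.weight.shape, ln.bias.shape)   # torch.Size([3]) torch.Size([3])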
@teksinghayer5469
@teksinghayer5469 5 ай бұрын
When will you code a Transformer from scratch in PyTorch?
@adarshsagar9817
@adarshsagar9817 5 ай бұрын
Sir, please complete the NLP playlist.
@WIN_1306
@WIN_1306 4 ай бұрын
which one? how many videos does it have?
@vikassengupta8427
@vikassengupta8427 4 ай бұрын
Sir next video ❤❤
@gauravbhasin2625
@gauravbhasin2625 5 ай бұрын
Nitish, please take another look at your covariate shift fundamentals... yes, you are partially correct, but the way you explained covariate shift is actually incorrect. (Example: imagine training a model to predict whether someone will buy a house based on features like income and credit score. If the model is trained on data from a specific city with a certain average income level, it might not perform well when used in a different city with a much higher average income. The distribution of "income" (the covariate) has shifted, and the model's understanding of its relationship to house buying needs to be adjusted.)
@WIN_1306
@WIN_1306 4 ай бұрын
I guess the explanation sir gave and your explanation are the same, just with different examples of covariate shift.
@dharmendra_397
@dharmendra_397 5 ай бұрын
Very nice video
@zerotohero1002
@zerotohero1002 4 ай бұрын
It has been a month, sir. Please upload; eagerly waiting 🥺🥺🥺
@harshgupta-w5y
@harshgupta-w5y 5 ай бұрын
Please upload the next video soon, sir.
@MrSat001
@MrSat001 5 ай бұрын
Great 👍
@turugasairam2886
@turugasairam2886 Ай бұрын
Sir, why don't you translate the videos into English and upload them on a new channel, like CampusX English? I am sure it will attract a bigger audience and more reach. I am sure you have thought of this already.
@titaniumgopal
@titaniumgopal 4 ай бұрын
Sir, please update the PDF.
@aksholic2797
@aksholic2797 5 ай бұрын
200k🎉
@bmp-zz9pu
@bmp-zz9pu 5 ай бұрын
A video after 2 weeks in this playlist... don't be so cruel... please work a bit faster, sir ji.
@ghousepasha4172
@ghousepasha4172 4 ай бұрын
Sir, please complete the playlist; I will pay 5000 for that.
@faizack
@faizack 3 ай бұрын
😂😂😂🎉
@space_ace7710
@space_ace7710 5 ай бұрын
Yeah!!
@not_amanullah
@not_amanullah 4 ай бұрын
🖤🤗
@DarkShadow00972
@DarkShadow00972 5 ай бұрын
Bring some coding examples, bro.
@ashutoshpatidar3288
@ashutoshpatidar3288 5 ай бұрын
Please be a little faster!
@Amanullah-wy3ur
@Amanullah-wy3ur 4 ай бұрын
this is helpful 🖤
@Amanullah-wy3ur
@Amanullah-wy3ur 4 ай бұрын
thanks ❤