Tutorial 12- Stochastic Gradient Descent vs Gradient Descent

Views: 234,115

Krish Naik

1 day ago

Comments: 104
@shashanktripathi3034 4 years ago
Krish sir, your YouTube channel is just like the GITA for me: as one gets all the answers to life in the GITA, I get all my doubts cleared on your channel. Thank you, Sir.
@kartikdave659 4 years ago
After becoming a member, how can I get the data science material? Can you please tell me?
@BalaguruGupta 3 years ago
Amazing explanation Sir! You'll always be the hero for the AI Enthusiasts. Thanks a lot!
@saurabhnigudkar6115 4 years ago
Best Deep Learning playlist on YouTube
@ravindrav1895 2 years ago
Whenever I am confused about a topic, I come back to this channel and watch your videos, and it helps me a lot, sir. Thank you, sir, for an amazing explanation.
@archanamaurya89 4 years ago
This video is such a light bulb moment for me :D Thank you so very much!!
@nitayg1326 5 years ago
My God! Finally I am clear about GD, SGD and mini-batch SGD!
@nagesh866 4 years ago
What an amazing teacher you are. Crystal clear.
@lakshminarasimhanvenkatakr3754 4 years ago
This is an excellent explanation; anyone can understand it with such a granular level of detail.
@ajithtolroy5441 5 years ago
I saw many videos, but this one is quite comprehensible and informative.
@fedisalhi6320 5 years ago
Excellent explanation, it was really helpful, thank you.
@guytonedhai 1 year ago
How are you so good at explaining 😭😭😭😭😭 Thanks a lot ♥♥♥
@funpoint3966 10 months ago
Please work out your camera issue; it seems to be set to autofocus, which results in a little disturbance.
@VVV-wx3ui 5 years ago
Superb...simply superb. Understood the concept now from the loss function. Well done Krish.
@OmerFarukUcer 15 days ago
Really nice explanation! Thanks for the video
@tsharmi919 3 months ago
Best video explanation on this so far
@khuloodnasher1606 4 years ago
Really, this is the best video I've ever seen; it explains the concept better than famous schools.
@Anand-uw2uc 4 years ago
Good explanation! But you did not say much about when to use SGD, although you clarified GD and mini-batch SGD better.
@vishaldas6346 4 years ago
There is not much to explain about SGD when you are talking about 1 data point at a time while considering a dataset of 1000 data points.
@SandeepKashyap-ek2hx 2 years ago
You are a HERO sir
@RishikeshGangaDarshan 3 years ago
Good, good, clearly explained. Nobody can explain it like this.
@user-wd2xh3vj2v 1 month ago
Thank you sir for your valuable information.. ❤
@ArthurCor-ts2bg 4 years ago
Krish, you make the subject concise in the most meaningful way.
@gayathrijpl 1 year ago
Such a clean way of explaining.
@goodnewsdaily-tamil1990 2 years ago
1000 likes for you man👏👍
@sandipansarkar9211 4 years ago
Thanks Krish. Good video. I want to use all this knowledge in my next batch of deep learning by iNeuron.
@tonyzhang2501 3 years ago
Thank you, it is a clear explanation. I got it!
@yukeshnepal4885 4 years ago
8:58, using GD it converges quickly, while using mini-batch SGD it follows a zigzag path. How??
@kannanparthipan7907 4 years ago
In the case of mini-batch SGD, we consider only some of the points, so the calculation deviates somewhat from ordinary gradient descent, where we consider all the values. A simple analogy: GD is like the total population and mini-batch SGD is like a sample of it. The two will never be exactly equal, and the sample's distribution will always deviate somewhat from the population's. We can't use GD everywhere because of its computation time; mini-batch SGD gives an approximately correct result.
@bhargavpotluri5147 4 years ago
@kannanparthipan7907 That explains a deviation in the final output, or in the final converged result. The question is why there is deviation during the process of convergence. If we consider different samples in every epoch, then I understand there can be zigzag results during convergence. But if only one sample of k records is considered, why is there a zigzag during convergence?
@bhargavpotluri5147 4 years ago
OK, now I got it. For every iteration, samples are picked at random, hence the zigzag. Just went through other articles.
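The zigzag discussed in this thread can be illustrated with a small sketch. It assumes a one-parameter model y = m * x with mean squared error and a made-up noisy dataset (none of this comes from the video); the function name `gradient` and the batch size of 32 are illustrative choices only. The full-batch gradient is the same every time it is computed at a given m, while each randomly drawn mini-batch gives a slightly different estimate of it, which is what makes the path wobble.

```python
import random

def gradient(m, points):
    """Gradient of the mean squared error of y = m * x over the given points."""
    return sum(-2 * x * (y - m * x) for x, y in points) / len(points)

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical data: y is roughly 3 * x plus Gaussian noise
xs = [random.uniform(-1, 1) for _ in range(1000)]
data = [(x, 3 * x + random.gauss(0, 1)) for x in xs]

m = 0.0
full = gradient(m, data)  # uses all 1000 points
batches = [gradient(m, random.sample(data, 32)) for _ in range(5)]

print("full-batch gradient: ", round(full, 3))
print("mini-batch gradients:", [round(g, 3) for g in batches])
```

The mini-batch values scatter around the full-batch value, so every update points roughly, but not exactly, towards the minimum.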
@rohitsaini8480 1 year ago
Sir, please solve my problem. In my view, we do gradient descent to find the best value of m (the slope in linear regression, taking b = 0). If we use all the points, then we must already know at which value of m the loss is lowest, so why do we have to use a learning rate to update the weight when we already know the best value?
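One way to look at the question above, as a sketch under simple assumptions: the gradient computed from all points only gives the slope of the loss at the current m, not the location of the minimum, so the weight still has to be moved in small steps scaled by a learning rate. For linear regression with b = 0 a closed-form answer happens to exist, and the example below compares the two; the data, the learning rate of 0.01 and the function name `mse_gradient` are made up for illustration. Neural networks generally have no such closed form, which is why the iterative update is used there.

```python
def mse_gradient(m, xs, ys):
    """Slope of the mean squared error at the current m (uses every point)."""
    return sum(-2 * x * (y - m * x) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data lying exactly on y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

m = 0.0
learning_rate = 0.01
for step in range(200):
    # The gradient says which way is downhill; the learning rate sets the step size
    m -= learning_rate * mse_gradient(m, xs, ys)

# Closed-form least-squares slope for a line through the origin
closed_form = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

print("gradient descent:", round(m, 6))
print("closed form:     ", closed_form)
```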
@allaboutdata2050 5 years ago
What an explanation 🧡. Great!! Awesome!!
@ukc2704 5 years ago
Great video man 👍👍. Please keep it up. I am waiting for the next videos.
@chinmaybhat9636 4 years ago
Awesome @KrishNaik Sir.
@koustavdutta5317 3 years ago
Hi Krish, one request: like this playlist, please make long videos for the ML playlist covering the loss functions and optimizers used in various ML algorithms, mainly classification algorithms.
@aditisrivastava7079 5 years ago
Just wanted to ask if you could also suggest some good online resources that we can read for more clarity.
@minakshiboruah1356 3 years ago
@12:02 Sir, it should be mini-batch stochastic G.D.
@Kurtmind 2 years ago
Excellent explanation Sir!
@severnsevern1445 4 years ago
Great explanation. Very clear. Thanks!
@vinuvarshith6412 1 year ago
Top notch explanation!
@soheljagirdar8830 4 years ago
4:17 SGD has a minimum of 256 records to find the error/minima? You said it's 1 record at a time.
@pramodyadav4422 4 years ago
I read a few articles which say that in SGD "one data point is picked at random from the whole data set at each iteration". The 256 records you're talking about may be mini-batch SGD: "It is also common to sample a small number of data points instead of just one point at each step and that is called “mini-batch” gradient descent."
@tejasvigupta07 4 years ago
@pramodyadav4422 Yeah, even I have read that in SGD only one data point is selected and used for the update in each iteration, instead of all of them.
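The distinction drawn in this thread comes down to how many records feed each weight update. The sketch below is an illustration under assumptions, not code from the video: it uses a one-parameter model y = m * x, a made-up dataset of 1000 points, and a hypothetical helper `run_epoch`. The same loop performs GD, mini-batch SGD or SGD depending only on `batch_size`.

```python
import random

def run_epoch(data, m, learning_rate, batch_size):
    """One pass over the shuffled data; returns the new slope and the update count."""
    shuffled = data[:]
    random.shuffle(shuffled)
    updates = 0
    for start in range(0, len(shuffled), batch_size):
        batch = shuffled[start:start + batch_size]
        # Mean squared error gradient computed on this batch only
        grad = sum(-2 * x * (y - m * x) for x, y in batch) / len(batch)
        m -= learning_rate * grad
        updates += 1
    return m, updates

random.seed(1)
# Hypothetical noise-free data on y = 2x
data = [(i / 1000, 2 * i / 1000) for i in range(1, 1001)]

for name, batch_size in [("GD", len(data)), ("mini-batch SGD", 256), ("SGD", 1)]:
    _, updates = run_epoch(data, 0.0, 0.1, batch_size)
    print(name, "| batch size:", batch_size, "| updates per epoch:", updates)
```

With 1000 records this gives 1 update per epoch for GD, 4 for a batch size of 256, and 1000 for SGD with one record at a time.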
@gauravsingh2425 5 years ago
Thanks Krish!!! Very nice explanation
@sathishkumar6076 2 months ago
Please explain FedAvg, FedAvgM, FedProx, Scaffold, etc.
@rababmaroc3354 4 years ago
Thank you very much for your efforts. Please, how can we solve a portfolio allocation problem using this algorithm? Please answer me.
@Skandawin78 5 years ago
Your videos are an excellent reference to brush up on these concepts.
@sreejus8218 4 years ago
If we use a sample of the outputs to find the loss, will we use its derivative to change all the weights, or only the weights of the respective outputs?
@lj123-g9d 6 months ago
So simply explained
@jiayuzhou6051 8 months ago
The only video that explains
@nikkitha92 4 years ago
Sir, your videos are amazing. Can you please explain the latest methodologies such as BERT and ELMo?
@bhavanapurohit2627 4 years ago
Hi, is it completely theoretical or will you code in further sessions?
@ankitbiswas8380 2 years ago
When you mentioned that SGD takes place in linear regression, I didn't understand that comment. Even in your linear regression videos, the mean squared error is a sum of squares over all data points. So how is SGD linked to linear regression?
@rabidub733 10 months ago
Thanks for this! Great explanation
@jsverma143 5 years ago
Negative and positive slopes are best explained like this: on the left side of the curve the angle of the tangent is more than 90 degrees, so the slope is negative; on the other side it is less than 90 degrees, so the slope is positive.
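The tangent-angle argument in the comment above can be checked numerically. This is a minimal sketch assuming a hypothetical convex loss (w - 3)^2 with its minimum at w = 3; the functions `loss` and `slope` and the learning rate are illustrative, not taken from the video. Left of the minimum the slope is negative, so subtracting it increases w; right of the minimum the slope is positive, so subtracting it decreases w. Either way the update moves towards the minimum.

```python
def loss(w):
    """Hypothetical convex loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def slope(w):
    """Derivative of the loss, i.e. the slope of the tangent at w."""
    return 2.0 * (w - 3.0)

learning_rate = 0.1
results = {}
for w in (1.0, 5.0):  # one point left of the minimum, one point right of it
    new_w = w - learning_rate * slope(w)  # standard gradient descent update
    results[w] = new_w
    side = "left" if w < 3.0 else "right"
    print(side, "| slope:", slope(w), "| w:", w, "->", round(new_w, 2))
```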
@bikkykumar6312 1 month ago
Hello sir, I am stuck with gradient descent, mini-batch and SGD. Sir, can you recommend a textbook or material for these topics? Any help will be appreciated. Thank you.
@muhammedsahalot8683 8 months ago
Which has the faster convergence speed, SGD or GD?
@NaveenKumar-ts1om 6 months ago
Awesome KRISHHHHHH
@nansonspunk 2 years ago
Yes, I really liked this explanation, thanks
@taranilakshmi9680 5 years ago
Explained very well. Thank you.
@achrafkmout9398 3 years ago
Very good explanation
@AjanUnderscore 2 years ago
Thank u sir 🙏🙏🙌🧠🐈
@samiabidah4197 3 years ago
Please, what is the difference between GD and batch GD?
@siddharthachatterjee9959 4 years ago
Good attempt 👍. Please record with camera on manual focus.
@a.sharan8876 1 year ago
py:28: RuntimeWarning: overflow encountered in scalar power
cost = (1/n)*sum([value**2 for value in (y-y_predicted)])
Hey bro, I am stuck here with this error. I could not understand the error itself; could you suggest a solution? I have just started practicing an ML algorithm.
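The full script behind this warning is not shown, so the following is only a guess at one common cause: a learning rate that is too large for the scale of the inputs makes each update overshoot, the predictions grow without bound, and squaring the errors eventually overflows. The sketch below reproduces the pattern with made-up data and a hypothetical helper `gradient_descent`; it stops after 20 steps, before any overflow, and simply shows the cost growing with a large learning rate and shrinking with a small one.

```python
def gradient_descent(xs, ys, learning_rate, steps):
    """Fit y = m * x by gradient descent; returns the final m and the cost history."""
    m = 0.0
    n = len(xs)
    costs = []
    for _ in range(steps):
        predicted = [m * x for x in xs]
        cost = sum((y - p) ** 2 for y, p in zip(ys, predicted)) / n
        costs.append(cost)
        grad = -(2 / n) * sum(x * (y - p) for x, y, p in zip(xs, ys, predicted))
        m -= learning_rate * grad
    return m, costs

# Hypothetical data on y = 2x with fairly large input values
xs = [10.0, 20.0, 30.0, 40.0]
ys = [20.0, 40.0, 60.0, 80.0]

_, diverging = gradient_descent(xs, ys, learning_rate=0.01, steps=20)
m, converging = gradient_descent(xs, ys, learning_rate=0.0001, steps=200)

print("large learning rate, cost first/last:", diverging[0], diverging[-1])
print("small learning rate, cost first/last:", converging[0], converging[-1])
print("fitted m with small learning rate:", round(m, 6))
```

If this is the cause, lowering the learning rate or scaling the input features usually keeps the cost finite.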
@r7918 3 years ago
I have one question regarding this topic. This concept is applicable to linear regression, right?
@syedsaqlainabatool3399 4 years ago
This is what I was looking for
@ruchikalalit1304 5 years ago
Have you made videos of the practical implementation of all this work? If so, please share the links.
@vineetagarwal18 2 years ago
Great Sir
@bijaynayak6473 5 years ago
Hello Sir, could you share the link for the code you explained? This video series is very nice; in a short period we can cover so many concepts. :)
@akfvc8712 4 years ago
Great video, excellent effort. Appreciated!!
@pareesepathak7348 3 years ago
Can you share the paper for reference, and also resources for deep learning for image processing?
@alsabtilaila1923 3 years ago
Great one!
@manojsalunke2842 4 years ago
At 9:28 you said SGD will take more time to converge than GD. Then which is faster, SGD or GD????
@response2u 2 years ago
Thank you, sir!
@muralimohan6974 4 years ago
How can we take k inputs at the same time?
@rdf1616 4 years ago
Good explanation! Thanks
@vishaljhaveri7565 3 years ago
Thank you sir.
@abhrapuitandy3327 4 years ago
Please do tell us about stochastic gradient ascent also.
@_JoyshreeMozumder 4 years ago
What is the source of the data points?
@khushboosoni2788 1 year ago
Sir, can you please explain the SPGD algorithm to me?
@codewithbishal895 4 months ago
Excellent
@percyjardine5724 4 years ago
Thanks Krish
@aminuabdulsalami4325 5 years ago
Great guy.
@RaviRanjan_ssj4 5 years ago
Great video!!
@louerleseigneur4532 3 years ago
Thanks buddy
@rameshthamizhselvan2458 5 years ago
Excellent!
@ting-yuhsu4229 4 years ago
You are AWESOME! :)
@thanicssubakar6303 5 years ago
Nice bro
@phaneendra3700 4 years ago
Hats off man
@sathvikambati3464 2 years ago
Thanks
@praneethcj6544 5 years ago
Perfect..!!!
@shubhangiagrawal336 4 years ago
Good video
@atchutram9894 5 years ago
Switch off the autofocus feature in your camera. It is distracting.
@devaryan2201 3 years ago
Do change your method of teaching; it seems like someone has read a book and is just trying to copy that content from one side... use your own ideas for it :)
@shekharkumar1902 5 years ago
Confusing one!
@chalapathinagavarmabhupath8432 5 years ago
Your videos are good but the camera was bad.
@KKKK-jr1nm 5 years ago
Why don't you buy him a new one?
@chalapathinagavarmabhupath8432 5 years ago
Pora eri poka