Second Order Optimization - The Math of Intelligence #2

84,361 views

Siraj Raval


Gradient descent and its variants are very useful, but there exists an entirely different class of optimization techniques that isn't as widely understood. We'll learn about second-order method variants, how they compare to first-order methods, and implement our own in Python.
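As a taste of the difference, here is a minimal sketch (illustrative code, not the linked repo's): Newton's method divides the gradient by the curvature (second derivative) to pick its step, while gradient descent uses only the slope and a hand-tuned learning rate.

```python
# Minimize f(w) = w^4 - 3w^3 + 2, whose minimum is at w = 9/4 = 2.25.
def f_prime(w):        return 4 * w**3 - 9 * w**2
def f_double_prime(w): return 12 * w**2 - 18 * w

def newton_minimize(w, steps=20):
    # Second-order update: step size comes from the curvature itself.
    for _ in range(steps):
        w = w - f_prime(w) / f_double_prime(w)
    return w

def gradient_descent(w, lr=0.01, steps=2000):
    # First-order update: a fixed learning rate stands in for curvature.
    for _ in range(steps):
        w = w - lr * f_prime(w)
    return w

# From a good starting point, both approach w = 2.25,
# but Newton gets there in far fewer iterations.
print(newton_minimize(3.0), gradient_descent(3.0))
```

Note the starting point matters: Newton's method can misbehave where the second derivative vanishes (here, near w = 0).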
Code for this video (with challenge):
github.com/llS...
Alberto's Winning Code:
github.com/alb...
Ivan's Runner up Code:
github.com/Pia...
Please Subscribe! And like. And comment. That's what keeps me going.
Course Syllabus:
github.com/llS...
More learning resources:
web.stanford.e...
www.cs.toronto...
www.quora.com/...
en.wikipedia.o...
• (ML 15.1) Newton's met...
• (ML 15.2) Newton's met...
Join us in the Wizards Slack channel:
wizards.herokua...
And please support me on Patreon:
www.patreon.co...
Follow me:
Twitter: / sirajraval
Facebook: / sirajology
Instagram: / sirajraval
Signup for my newsletter for exciting updates in the field of AI:
goo.gl/FZzJ5w
Hit the Join button above to sign up to become a member of my channel for access to exclusive content!
Join my AI community: chatgptschool.io/
Sign up for my AI sports betting bot, WagerGPT (500 spots available):
www.wagergpt.co

Comments: 205
@abhiwins123 · 7 years ago
epic intro :D
@SirajRaval · 7 years ago
Thanks abhijith!
@FilipeSilva1 · 7 years ago
Bro, new mic, new rhythm of teaching... This is quality!
@SirajRaval · 7 years ago
thanks Filipe!
@BizTechLife · 7 years ago
who needs college when you have Siraj
@SirajRaval · 7 years ago
I always got bored after ~15 minutes of almost every lecture I attended in college. I sometimes thought something was wrong with me. But the real problem was the system: boring teachers, content that wasn't dense enough. Education should be as fun as cinematic movies. We are all a part of a new wave; education is getting disrupted hard.
@st101k · 7 years ago
Siraj Raval I completely agree.
@akmonra · 7 years ago
For grammar lessons. Oooh!
@JustFun-li2dk · 7 years ago
That's exactly how I feel!! Learn with fun!
@Apreston14 · 7 years ago
@Siraj Raval couldn't agree more! Starting my industrial placement at an AI company next week; I can't tell you enough how much your tutorials have helped me prepare for it.
@nisargbhatt4512 · 7 years ago
Your videos are really awesome. I actually get bored while reading research papers. This is cool; keep it up.
@rajumondal4283 · 7 years ago
I find your lower speed great; please keep this speed.
@SirajRaval · 7 years ago
thanks k
@UToob157 · 7 years ago
Had to crank the speed up to 1.25x to get the normal Siraj speed.
@wh33lers · 7 years ago
Cloak111 did this too
@SirajRaval · 7 years ago
Cloak111 hm, ok. Perhaps this was the first video I've made where I actually went too slow. Well, now I know the lower bound for speed. Thanks!
@kmrfrom · 7 years ago
I watch your videos at 0.75x speed. I watched this one and was stunned that normal speed felt right. Please maintain this speed. Can I expect a machine learning book from you in a "write what you speak/teach" style, not an "I want to show off my English literature skills on the ML ocean" style? I still love Rich F. Raposa's Java learning book style. (Of course I found it after my graduation :( )
@avijitmitra · 7 years ago
Cloak111 I was gonna comment the same 😂
@konstaConstant · 7 years ago
Too slow > too fast. You can always change the speed of the video, and speeding up a video degrades audio quality less than slowing it down, because no interpolation is needed.
@PabloMartinez-ut8on · 7 years ago
You are the best! Thanks for everything! From Montevideo, Uruguay :)
@BizTechLife · 7 years ago
outstanding!
@SirajRaval · 7 years ago
thank you!
@karanpatel2866 · 7 years ago
Awesome introduction to the math of ML!!
@marcusstoica · 1 year ago
In the 2-dimensional case, I believe it should be Hessian(X, Y, W) instead of Hessian(X, N, W). Thanks for the video. Using it to implement an iterated GMM estimator.
@amartyasingh09 · 7 years ago
What's the classical music remix played at the intro?
@SirajRaval · 7 years ago
Black Swan Hip Hop Beat/Instrumental
@Fightingbadgers · 7 years ago
I like this "chill" Siraj :) I know it's not totally related to the video, but what are your thoughts on spiking neural networks? Would love to see those covered at some point, even though they are not so mature yet.
@SirajRaval · 7 years ago
thanks, will do one in the future
@avishcheetaram1807 · 7 years ago
Just awesome as always, bro. Cheers!
@pinkiethesmilingcat2862 · 7 years ago
I want to ask a favor: please add your script to the subtitles in English, so that it is easier for me and people of other languages to translate. For my part I want to translate it into Spanish so that people in Latin America and Spain can access your good content. Thanks
@pinkiethesmilingcat2862 · 7 years ago
this should be for all!!!!
@SirajRaval · 7 years ago
copies of both scripts so far docs.google.com/document/d/1oacFxbgoROXCj3Eofyalhlnnzc_kE3vs9WYs80Rk9is/edit?usp=sharing
@pinkiethesmilingcat2862 · 7 years ago
Siraj Raval Thanks!! I'll keep working on this
@hiral.shah. · 7 years ago
Is this script still available? Please share. Thanks
@diptinemade9259 · 6 years ago
Taylor Swift lolz... This is the first time I understand what the Taylor series actually is! And I have studied engineering mathematics 😯. I wish your videos existed back then; life would have been so much easier. THANK YOU!!
@vinayhegde714 · 7 years ago
Really loved it. Keep it up, Siraj.
@BharathLaxman · 7 years ago
Did you decide to switch to regular speed? Now I have to manually dial up to 1.5x 😑
@SirajRaval · 7 years ago
yes, it is the lower bound, now I know
@nikhilchigali · 6 years ago
What's the relation between running anomaly detection on 500 TB of data and using a second-order optimization technique? Didn't quite understand the point there.
@CristiVladZ · 7 years ago
I don't know why I stopped following your videos a while ago. Silly me. Keep doing what you're doing. I'm glad I clicked this on FB.
@SirajRaval · 7 years ago
glad to see u back cristi
@christianpeters3743 · 7 years ago
Did you ever try a second-order optimization algorithm with a neural network? Sure, it would probably require more computational effort, but there are some variations of Newton's method, for example, where you do not have to evaluate the whole Hessian matrix. It would be interesting to see a comparison between first-order and second-order optimization methods in machine learning.
@Promixro · 7 years ago
10/10 content
@McMurchie · 7 years ago
FORTRAN!!!!!! arrrrrghhhhhh!!!!!!! *antiquity intensifies*
@SirajRaval · 7 years ago
hahah
@luck3949 · 7 years ago
Hold on: the Jacobian is a matrix only for R^n -> R^m functions, and the Hessian only exists for R^n -> R functions, so saying the Hessian is a second-order analog of the Jacobian is incorrect. It seems the Hessian is a second-order analog of the gradient, not of the Jacobian. Or am I wrong?
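Both views can be reconciled: the Hessian of a scalar field is exactly the Jacobian of its gradient. A quick check with sympy (the function here is illustrative, not from the video):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + sp.sin(y)  # scalar field f: R^2 -> R

# Gradient as a column vector, then the Jacobian of that vector field.
grad = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])
jac_of_grad = grad.jacobian([x, y])

# sympy's built-in Hessian of f.
hess = sp.hessian(f, (x, y))

# The two matrices agree entry by entry.
print(sp.simplify(jac_of_grad - hess))  # zero 2x2 matrix
```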
@avatar098 · 7 years ago
This is like Bill Nye the Science Guy, but for machine learning :D
@SirajRaval · 7 years ago
awesome to hear!
@khandarwilliam5439 · 7 years ago
why weren't you my lecturer?
@SirajRaval · 7 years ago
i am now
@ShankhaShubhra · 7 years ago
At first I thought I was watching at 0.5x speed. Did someone point out that you explain at a very fast pace?? 😝 Anyway, great video as always. Keep rocking! :)
@pavanlulla · 7 years ago
Siraj, I understand you have to entertain us, but do you really have to gesticulate so much? It's frustrating, because you are talking about complex stuff and we keep getting distracted by what your hands are doing. It's almost like sign language. Why do you torture us so?
@complovr2971 · 7 years ago
What's the abs(0 - f(x)) for in Newton's method? Why can't it just be abs(f(x))? Does it matter? Can someone answer this?
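The two are identical: abs(0 - f(x)) just spells out "distance from the target value 0", and |0 - a| = |a| for any a. A minimal Newton-Raphson root-finding sketch (hypothetical code, not the repo's exact implementation) using the simpler form:

```python
def newton_root(f, f_prime, x, tol=1e-10, max_iter=100):
    # Stopping test abs(f(x)) < tol is equivalent to abs(0 - f(x)) < tol.
    for _ in range(max_iter):
        if abs(f(x)) < tol:
            break
        x = x - f(x) / f_prime(x)  # standard Newton-Raphson update
    return x

# Example: the positive root of x^2 - 2 is sqrt(2).
root = newton_root(lambda x: x**2 - 2, lambda x: 2 * x, 1.0)
print(root)
```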
@k2s3p · 7 years ago
simple explanation + clear illustrations = intelligence :-) Take a bow, (Sir)aj.
@jasneetsingh4018 · 7 years ago
Bible for machine learning??
@hammadshaikhha · 7 years ago
I thought the pace in this video was perfect, and the content was very clear as well; great work! On another note, I was looking through your channel to see if you had any video on GitHub and Jupyter notebooks, but I didn't find anything. I will look elsewhere for now so I can start submitting the coding challenges.
@SirajRaval · 7 years ago
thanks! Hmm, good point. I will make a GitHub vid soon
@marceloguaycurus · 7 years ago
If you don't want to wait, this Udacity course is very complete (and free): br.udacity.com/course/how-to-use-git-and-github--ud775/
@maryguty1705 · 3 years ago
How is second-order optimization used in machine learning and neural networks?
@RELU254 · 2 months ago
This makes me feel like I don't know anything in machine learning.
@karimabdultankian28 · 7 years ago
2 years of boring maths explained in 10 minutes. Well done.
@amr3162 · 7 years ago
When you try to run "newtons_method_optimization.py" you get the error "'Float' has no attribute sqrt". This is because numpy (np.linalg.norm) is unable to deal with sympy ('sympy.matrices.dense.MutableDenseMatrix'); the value changes from np.ndarray to a sympy matrix on line 71. Tried with both Python 3.6 / numpy 1.12.1 and Python 2.7 / numpy 1.10.4. I tried to solve it, but it was apparently way too hard.
@parambhavsar1573 · 7 years ago
I also have the same problem. If you find a solution, please let me know...
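A likely fix for the error described above (a sketch under the assumption that the value on line 71 becomes a sympy Matrix; the variable names here are illustrative, not the repo's): convert it back to a plain float numpy array before handing it to np.linalg.norm.

```python
import numpy as np
import sympy as sp

# Stand-in for the value that turns into a sympy Matrix mid-script.
sym_vec = sp.Matrix([3, 4])

# np.linalg.norm chokes on sympy Float objects, so convert first:
num_vec = np.array(sym_vec).astype(np.float64)
print(np.linalg.norm(num_vec))  # 5.0
```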
@damedane913 · 7 years ago
Taylor Swift.... I died.
@SirajRaval · 7 years ago
lol thanks
@harshvardhan7546 · 6 years ago
The root-finding method is the Newton-Raphson method.
@hanman5195 · 6 years ago
You're awesome; I'm searching for the best words to praise you.
@devangsanghani6765 · 7 years ago
That's a cool white shirt!
@isaacnewton1545 · 4 years ago
None of these methods finds the global optimal solution.
@larryteslaspacexboringlawr739 · 7 years ago
Thanks for the math and intelligence video.
@frenchpet · 7 years ago
wtf did I just watch 😥
@ricardodiaz3999 · 7 years ago
Siraj, please can you make a video or live session about Faster R-CNN? I'd really appreciate it; there's no information about it online.
@SirajRaval · 7 years ago
hmm, considering it, thx
@RandhirRawatlal · 7 years ago
Fantastic enthusiasm and dedication to your field - an inspiration to all teachers! I really liked how you simplified the explanation of the Hessian.
@novovires5625 · 7 years ago
She's a gorgeous lady :) lol
@hanyuliangchina · 7 years ago
Please find interesting real-world applications for the boring theory. I do not like theory by itself; that is not your style.
@SirajRaval · 7 years ago
+libai tony hm interesting. Alright thanks.
@sam_joshua_s · 3 years ago
Understandable video
@rajantechie · 7 years ago
Why is the video quality bad, Siraj??
@SirajRaval · 7 years ago
will improve
@syn3rman65 · 6 years ago
You need to do more math videos
@kmeeraj · 3 years ago
Thank you
@osinachi6887 · 3 years ago
Wow wow wow. I'm working on an arc welding robot and I have to do the inverse kinematics for 6 DOF. I never knew anything about optimisation, so I downloaded some articles and studied them. I started with Damped Least Squares for the inverse kinematics. I wanted to study the quasi-Newton method, and then I encountered this guy. Your short lecture connected all the dots I have been studying for a week. Thanks🙏.
@RudolfEremyan · 7 years ago
Thanks!))
@ilyajob0524 · 7 years ago
Grace 😀
@griesrt · 6 years ago
dang you the best.
@onefulltimeequivalent1230 · 7 years ago
Do you mind saying "minimum" (sg.) instead of "minima" (pl.) next time, in order to optimize the professionalism function? :)
@MONUKUMAR-jk8xl · 7 years ago
Which country are you from, Siraj? I guess you are Indian! Just asking out of curiosity; wanted to know about you.
@absharkhan9954 · 6 years ago
Wow!
@ryanbach937 · 7 years ago
Do you have a BTC address for donations?
@SirajRaval · 7 years ago
yes, see the about-me section; Patreon preferred
@pickarpit · 7 years ago
I came here just to see Siraj dance!!
@SirajRaval · 7 years ago
haha thanks
@vinodp8577 · 6 years ago
Ripping off content from mathematicalmonk but not giving credit? Wew Siraj, shame!
@mrvzhao · 7 years ago
First viewing: WTF?? Second viewing: wait, I think I'm onto something. Third viewing: the epiphany-moment guy @ 9:00
@suraj2085 · 7 years ago
I was kicked out of a polyglot group for saying mathematics is a language.
@srivishnuk3451 · 7 years ago
Hi Siraj, this is my entry to this week's contest: github.com/Sri-Vishnu-Kumar-K/MathOfIntelligence/tree/master/second_order_optimization_newtons_method I used the same dataset I used for week #1 and compared the performance of the second-order optimizer to gradient descent; there is a decrease in error. The dataset, however, was too small to reveal changes in execution time. I found the function derivatives from first principles. I hope you like it. Thanks for the fun tutorials; this is serving as a great learning medium for me.
@SirajRaval · 7 years ago
great work
@srivishnuk3451 · 7 years ago
Siraj Raval thanks a lot, Siraj. This served as a great motivator :)
@gijsvermarien · 7 years ago
The number of bicycles in the background of shots confirms that this video was shot in the Netherlands. I suppose you will not be speaking in the Netherlands anytime soon?
@SirajRaval · 7 years ago
+Gies yup, in Amsterdam, where I now live. I'd be down to speak, possibly. Email me the venue :)
@gijsvermarien · 7 years ago
Awesome, I am planning on starting my master's in applied data science at UvA in 2018 :) About the speaking part, I was just wondering whether you'd be speaking in the Netherlands soon. Really liked your talk at TNW Amsterdam. Unfortunately I'm not hosting anything, sorry!
@kushajveersingh9973 · 7 years ago
We are using gradient descent, and the line we fit to the data is y = mx + c. How will you compute its second-order derivative, knowing that it is always zero? Or am I confused: is it the activation function that we actually take the second derivative of? If so, the second derivative would always be zero for ReLU. Thanks
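One way to resolve the confusion above (a sketch with illustrative data, not from the video): in second-order optimization the derivatives are taken with respect to the parameters m and c, not x. The squared-error loss is quadratic in (m, c), so its Hessian is constant but nonzero.

```python
import numpy as np

# Illustrative x-values (an assumption, not the video's dataset).
x = np.array([0.0, 1.0, 2.0, 3.0])

# For L(m, c) = sum_i (m*x_i + c - y_i)^2 the second derivatives are:
#   d2L/dm2 = 2*sum(x^2),  d2L/dm dc = 2*sum(x),  d2L/dc2 = 2*n
# The targets y drop out entirely: the Hessian is constant in (m, c).
H = 2 * np.array([[np.sum(x**2), np.sum(x)],
                  [np.sum(x),    float(len(x))]])
print(H)
```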
@hiral.shah. · 7 years ago
Mother of 2, wanting to resume my career after a long gap, and I landed up here. Wow! You have given me confidence that a mother can learn math too :) After watching your videos I feel I made a wise decision in not joining college and self-learning instead. :) God bless you, Siraj.
@sushantahuja9067 · 7 years ago
cool!
@brooklynground7090 · 6 years ago
You just killed 98 percent of the professors at my college. Amazing.
@williamchamberlain2263 · 6 years ago
... one of the three dudes who invented calculus, I think ...
@tbass134 · 7 years ago
Silly question, but what version of Python should we be using, 2 or 3?
@rishitshah3908 · 6 years ago
Hey Siraj, I know it's been a while since you uploaded this video, but I have a quick question. Can it happen that the minimum of the function estimated using the Taylor series lies on the objective function itself? In that case our program would be stuck in an infinite loop. Or, from a second perspective, could it be that the minimum of the estimated and objective functions coincide only when we reach the minimum of the objective function? (Not claiming this is true, but it seemed intuitive in the above example, so I assumed a generalization of it.) If this is true, what is the mathematical explanation for it? Thanks a lot. Your videos rock, by the way. Great work
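A sketch of the math behind the question above (standard Newton's-method reasoning, not from the video): the update comes from minimizing the second-order Taylor model around the current iterate $x_k$.

```latex
% Quadratic Taylor model of f around x_k:
m_k(\Delta) = f(x_k) + f'(x_k)\,\Delta + \tfrac{1}{2} f''(x_k)\,\Delta^2
% Minimize over \Delta by setting m_k'(\Delta) = 0:
f'(x_k) + f''(x_k)\,\Delta = 0
\quad\Longrightarrow\quad
\Delta = -\frac{f'(x_k)}{f''(x_k)}
```

So the model's minimum sits at the current point ($\Delta = 0$) exactly when $f'(x_k) = 0$, i.e. only at a stationary point of the objective (assuming $f''(x_k) > 0$). The iteration therefore stalls only where it should, rather than looping forever at a non-minimum.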
@victorarias9688 · 7 years ago
Oh my god, is this serious? You are the best professor in AI. Greetings from Colombia.
@soumyadeepdebnath4063 · 7 years ago
Hey Siraj, I really like the way you teach. You are doing awesome. I wanted to say that conjugate gradient descent is more powerful than traditional steepest descent, so why don't we use it? Could you please make a video on solving supervised learning problems with the conjugate gradient method!!
@izzyellisonxx1459 · 7 years ago
Were you flying Ryanair yesterday from Morocco?
@SirajRaval · 7 years ago
no lol
@chasegraham246 · 7 years ago
Siraj, why are you talking sooooooooo sloooooooooooooooow? Also, you forgot to delete your red circle-slash thing from your Brook Taylor Swift joke. Siraj, are you on drugs now? :/
@SirajRaval · 7 years ago
+Chase Graham it was on purpose. Thx. No. I will go a bit faster next time
@chasegraham246 · 7 years ago
Sorry, the joke went over my head then. Guess I'll be putting this pipe down now.
@jdjdjdjjdjdjdj7863 · 7 years ago
Hi, a newbie here! Where is the code to draw that cool function? And a second question: where is the code for the moments you show in the video, for instance 6:21? Thank you!
@angellove8641 · 4 years ago
kissing the edges of the optimization curve :*
@cryptverse · 7 years ago
Loving your energy, brother!
@SirajRaval · 7 years ago
thx brosicle
@guitarheroprince123 · 7 years ago
Correction: it's the Newton-Raphson method.
@AkashMishra23 · 7 years ago
You have great pick-up skills lol
@SirajRaval · 7 years ago
thx, anything is possible
@ShahidulAbir · 7 years ago
Siraj Raval, did you get the silver play button for 100k subs?
@suryasikharej8116 · 5 years ago
Where in the code are we computing the Taylor series? And what is the output of that Hessian matrix?
@ruxiz2007 · 4 years ago
So clearly explained; not a single word is wasted! Thanks!
@sau002 · 7 years ago
Wow! If only somebody had spoken to me like this 25 years ago!
@ELStalky · 7 years ago
"Minima" has a singular: it's "minimum". #GrammarNazi
@정성엽-f1y · 6 years ago
It is really good but difficult. :( I have to repeat it!!
@jsandmann · 7 years ago
The sound recording sucks
@SirajRaval · 7 years ago
Jonathan Sandmann will fix, thx
@interstellarsurfer · 7 years ago
You're a lucky man, Siraj. Many men would gladly discuss second-order regression with your lady friend... if they even had a clue. :-)
@SirajRaval · 7 years ago
haha thanks
@thecodingproject7882 · 7 years ago
Wow, great editing and a great way to convey information.
@lisetteelgueta5473 · 6 years ago
lol who's the guy at minute 9:04
@CausticCatastrophe · 6 years ago
I appreciate everything you have done here.
@guavacupcake · 7 years ago
I saw Taylor Swift coming from a mile away. I was like, "He's gonna say Taylor Swift now, isn't he."
@SirajRaval · 7 years ago
u know me too well lmao
@shivajidutta8472 · 7 years ago
Really enjoyed the video!! Really liked the math perspective on things.
@SirajRaval · 7 years ago
thanks!
@luvsuneja · 7 years ago
Lol... was that a Leibniz joke at 4:50?
@SirajRaval · 7 years ago
yes
@soumyadrip · 4 years ago
What an intro... XD
@franciscomelojunior2535 · 7 years ago
Normal time: 1.25x
@Aryeh-o · 7 years ago
siraj is a newtype
@kamalnayan9157 · 6 years ago
Newton-Raphson
@thebluebloo · 7 years ago
That was great! Loved it!!
@ego_sum_liberi · 7 years ago
Great Video!