Regularization - Explained!

13,948 views

CodeEmporium

1 year ago

We will explain Ridge, Lasso and a Bayesian interpretation of both.
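If you want to see the two penalties side by side before watching, here is a minimal scikit-learn sketch (the synthetic data and the alpha values are illustrative assumptions, not taken from the video): the L1 penalty tends to drive some coefficients exactly to zero, while the L2 penalty only shrinks them.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Illustrative synthetic data: 5 informative features and 5 pure-noise features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, -1.0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Ridge (L2 penalty) shrinks every coefficient toward zero but rarely to exactly zero.
ridge = Ridge(alpha=1.0).fit(X, y)

# Lasso (L1 penalty) can set some coefficients to exactly zero (implicit feature selection).
lasso = Lasso(alpha=0.5).fit(X, y)

print("ridge coefficients:", np.round(ridge.coef_, 2))
print("lasso coefficients:", np.round(lasso.coef_, 2))
print("lasso zeroed out", int(np.sum(lasso.coef_ == 0.0)), "coefficients")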
ABOUT ME
⭕ Subscribe: kzbin.info...
📚 Medium Blog: / dataemporium
💻 Github: github.com/ajhalthor
👔 LinkedIn: / ajay-halthor-477974bb
RESOURCES
[1] Graphing calculator to plot nice charts: www.desmos.com
[2] Refer to Section 6.2 on "Shrinkage Methods" for mathematical details: hastie.su.domains/ISLR2/ISLRv...
[3] Karush-Kuhn-Tucker conditions for constrained optimization with inequality constraints: en.wikipedia.org/wiki/Karush-...
[4] stat exchange discussions on [3]: stats.stackexchange.com/quest...
[5] Proof of ridge regression: stats.stackexchange.com/quest...
[6] Laplace distribution (or double exponential distribution) used for lasso prior: en.wikipedia.org/wiki/Laplace...
[7] ‪@ritvikmath‬ 's amazing video for the bayesian interpretation of lasso and ridge regression: • Bayesian Linear Regres...
[8] Distinction between Maximum "Likelihood" Estimations and Maximum "A Posteriori" Estimations: agustinus.kristia.de/techblog...
MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: imp.i384100.net/MathML
📕 Calculus: imp.i384100.net/Calculus
📕 Statistics for Data Science: imp.i384100.net/AdvancedStati...
📕 Bayesian Statistics: imp.i384100.net/BayesianStati...
📕 Linear Algebra: imp.i384100.net/LinearAlgebra
📕 Probability: imp.i384100.net/Probability
OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: imp.i384100.net/Deep-Learning
📕 Python for Everybody: imp.i384100.net/python
📕 MLOps Course: imp.i384100.net/MLOps
📕 Natural Language Processing (NLP): imp.i384100.net/NLP
📕 Machine Learning in Production: imp.i384100.net/MLProduction
📕 Data Science Specialization: imp.i384100.net/DataScience
📕 Tensorflow: imp.i384100.net/Tensorflow

Comments: 26
@ashishanand9642 17 days ago
Why is this so underrated? This should be on everyone's playlist for linear regression. Hats off man :)
@data_quest_studio4944 1 year ago
My man looks sharp and dapper
@CodeEmporium 1 year ago
Haha. Thanks! I think this shirt looked better on camera than in person. :)
@ajaytaneja111 1 year ago
Hi Ajay, great video, as always. One suggestion, with your permission ;) I think it might be worthwhile to introduce the concept of regularization by comparing feature elimination (which is equivalent to making the weight zero) vs. reducing the weight (which is regularization), elaborate on this, and then drift towards Lasso and Ridge. ;)
@lucianofloripa123 2 days ago
Good explanation!
@paull923 1 year ago
I had to watch it twice to truly digest it, but I like your approach to the contour plot in particular. I hope to boost your channel with my comments a tiny bit ;). tyvm! What I was taught and what is helpful to know imo: 1) Speaking on an abstract level about what regularization achieves: it punishes high-dimensional terms. 2) The notion of L1 and L2 regularization; and when you talk about the "Gaussian" for Ridge, you could also talk about the "Laplace" distribution instead of the double exponential distribution for Lasso regression.
@CodeEmporium 1 year ago
Thanks so much for your comments Paul! And yea, I feel like I have seen similar contour plots in books but never truly understood “why” they were like that until I started diving into details myself. Hopefully in the future I can explain it in a way that you’d be able to get it in a single pass through the video too :)
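On the Gaussian vs. Laplace point above: in the MAP view (see references [6]-[8] in the description), the prior placed on the weights is exactly what selects the penalty. A quick sketch in LaTeX notation:

\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\; \log p(y \mid X, \theta) + \log p(\theta)

% Gaussian prior \theta_j \sim \mathcal{N}(0, \tau^2):
\log p(\theta) = -\frac{1}{2\tau^2}\sum_j \theta_j^2 + \mathrm{const} \quad\Rightarrow\quad \text{L2 (Ridge) penalty}

% Laplace (double exponential) prior \theta_j \sim \mathrm{Laplace}(0, b):
\log p(\theta) = -\frac{1}{b}\sum_j |\theta_j| + \mathrm{const} \quad\Rightarrow\quad \text{L1 (Lasso) penalty}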
@blairnicolle2218 1 month ago
Excellent videos! Great graphing for intuition of L1 regularization where parameters become exactly zero (9:45) as compared with behavior of L2 regularization.
@cormackjackson9442 2 months ago
Such an awesome video! Can't believe I hadn't made the connection between ridge and Lagrangians; it literally has a lambda in it lol!
@cormackjackson9442 2 months ago
With the lasso intuition and the stepwise function you get for theta, how do you get the conditions on the right, i.e. yi < lambda/2? I thought perhaps instead of writing theta < 0, you are just using the implied relationship between yi and lambda. E.g. if theta < 0, and therefore |theta| = -theta, then after optimising that gives theta = y - lambda/2, i.e. y = lambda/2 + theta, but then I get the opposite conditions to yours... i.e. as theta is negative in this case, wouldn't that give y = lambda/2 + theta < lambda/2?
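For anyone stuck on the same step, one way to recover those conditions is to solve each sign case of the one-dimensional problem and then check when the solution is consistent with the assumed sign (this sketch assumes the per-observation objective (y - \theta)^2 + \lambda|\theta|; the constants shift if a 1/2 factor is used):

\min_{\theta}\; (y-\theta)^2 + \lambda|\theta|

\text{Case } \theta > 0:\quad -2(y-\theta) + \lambda = 0 \;\Rightarrow\; \theta = y - \tfrac{\lambda}{2},\ \text{consistent with } \theta > 0 \text{ only if } y > \tfrac{\lambda}{2}

\text{Case } \theta < 0:\quad -2(y-\theta) - \lambda = 0 \;\Rightarrow\; \theta = y + \tfrac{\lambda}{2},\ \text{consistent with } \theta < 0 \text{ only if } y < -\tfrac{\lambda}{2}

\text{Otherwise } (|y| \le \tfrac{\lambda}{2})\ \text{neither case is feasible and the minimum sits at } \theta = 0 \text{ (soft thresholding)}

Note the sign flip relative to the question: for theta < 0 the derivative of lambda*|theta| is -lambda, so theta = y + lambda/2 rather than y - lambda/2, which is why the condition comes out as y < -lambda/2 instead of y < lambda/2.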
@NicholasRenotte 1 year ago
Well hello everyone right back at you Ajay! These are fire, the live viz is on point!
@CodeEmporium 1 year ago
Thank you for noticing ma guy. I will catch up to the 100K gang soon. Pls wait for me 😂
@NicholasRenotte 1 year ago
@CodeEmporium 😂 you're one hunnit in my eyes 🙏
@fujinzhou7150 1 year ago
Love your awesome videos! Salute! Thank you so much!
@CodeEmporium 1 year ago
You are so welcome! I am happy this helps
@TheRainHarvester 1 year ago
Great content on your channel. I just found it! Heh, I used Desmos to debug/visualize too! I just added a video explaining easy multilayer backpropagation. The book math with all the subscripts is confusing, so I did it without any. Much simpler to understand.
@CodeEmporium 1 year ago
Thank you! And solid work on that explanation :)
@sivakrishna5530 11 months ago
Always find interesting things here. Keep going. Good luck.
@CodeEmporium 11 months ago
Hah! Glad that is the case. I am here to pique that interest :)
@kakunmaor 1 year ago
AWESOME!!!!! thanks!
@chadx8269 1 year ago
Nice explanation of the Bayesian view. Isn't regularization just the Lagrange multiplier? The optimum point is where the gradient of the constraint is proportional to the gradient of the cost function.
@abhirajarora7631 3 months ago
It is mathematically written in the same way, but they are not the same. Lagrange multipliers are used when you need to min/max a given function subject to a constraint, and then you solve for the value of lambda; in regularisation, we set the lambda value ourselves. Regularisation gives us a penalty if we take steps in a non-minimum direction and thus allows us to go back toward the correct direction in the following iteration.
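Put concretely, a standard sketch of the two formulations for the ridge case (see references [3]-[5] in the description):

\text{Constrained form:}\quad \min_{\theta}\; \|y - X\theta\|_2^2 \;\;\text{s.t.}\;\; \|\theta\|_2^2 \le t,\qquad
\mathcal{L}(\theta, \lambda) = \|y - X\theta\|_2^2 + \lambda\big(\|\theta\|_2^2 - t\big),\ \lambda \ge 0 \text{ fixed by the KKT conditions}

\text{Penalized form:}\quad \min_{\theta}\; \|y - X\theta\|_2^2 + \lambda\|\theta\|_2^2,\qquad \lambda \text{ chosen by the practitioner (e.g. cross-validation)}

For every constraint level t there is a lambda that yields the same minimizer, which is why the two views agree mathematically even though lambda plays a different role in each.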
10 months ago
Nice video, thanks! The only thing I think is slightly incorrect is describing polynomials of increasing degree as "complex". Since you are talking about maths, I was expecting to see the imaginary unit when I first heard "complex".
@alexandergeorgiev2631 1 year ago
How does Gauss-Newton for nonlinear regression change with (L2) regularization?
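A sketch of the standard answer (not from the video): with residuals r(\theta) = y - g(\theta), Jacobian J = \partial g / \partial\theta, and objective \|r(\theta)\|^2 + \lambda\|\theta\|^2, linearizing g around the current \theta turns the Gauss-Newton step \delta into the solution of damped normal equations:

\big(J^{\top}J + \lambda I\big)\,\delta = J^{\top} r(\theta) - \lambda\,\theta, \qquad \theta \leftarrow \theta + \delta

If the penalty is instead placed on the step \delta itself (i.e. \lambda\|\delta\|^2), the -\lambda\theta term drops and this reduces to the familiar Levenberg-Marquardt update (J^{\top}J + \lambda I)\,\delta = J^{\top}r.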
@lijinhui6902 9 months ago
thx !