Gradient Boosting and XGBoost in Machine Learning: Easy Explanation for Data Science Interviews

46,143 views

Emma Ding

1 day ago

Comments: 47
@anand3064
@anand3064 1 year ago
Beautifully written notes
@TrendingTelugu-1975
@TrendingTelugu-1975 2 months ago
use loop.
@jennyhuang7603
@jennyhuang7603 2 years ago
For 5:10, why is the MSE gradient r_i equal to Y - F(X) instead of 2*(Y - F(X))? Or does the coefficient not matter?
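(For reference: the coefficient indeed doesn't matter, because any constant scaling of the gradient is absorbed into the step size alpha. With the common convention of halving the squared error, the factor of 2 vanishes entirely:

$$L(Y, F(X)) = \tfrac{1}{2}\big(Y - F(X)\big)^2 \quad\Longrightarrow\quad r = -\frac{\partial L}{\partial F(X)} = Y - F(X)$$

Using L = (Y - F(X))^2 instead just doubles every residual, which a halved alpha compensates for exactly.)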
@emma_ding
@emma_ding 2 years ago
Many of you have asked me to share my presentation notes, and now… I have them for you! Download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!
@SanuSatyam
@SanuSatyam 2 years ago
Thanks a lot. Can you please make a video on Time Series Analysis? Thanks in advance!
@emmafan713
@emmafan713 2 years ago
I am confused about the notation. h_i is a function trained to predict r_i, and r_i is the gradient of the loss function w.r.t. the last prediction F(X). So h_i should be similar to r_i; why is h_i said to be similar to the gradient of r_i?
@Heinz3792
@Heinz3792 10 months ago
I believe there is an error in this video. r_i is the gradient of the loss function w.r.t. the CURRENT F(X), i.e. F_i(X). The NEXT weak model h_{i+1} is then trained to be able to predict r_i, the PREVIOUS residual. Alternatively, all of this could be written with i-1 instead of i, and i instead of i+1. TL;DR: Emma should have called the first step "compute residual r_{i-1}", not r_i, and in the gradient formula she should have written r_{i-1}.
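(With explicit indices, one self-consistent way to write the whole update, assuming the halved squared-error loss, is:

$$r_i = -\left.\frac{\partial L(Y, F)}{\partial F}\right|_{F = F_{i-1}(X)} = Y - F_{i-1}(X), \qquad h_i \approx r_i, \qquad F_i(X) = F_{i-1}(X) + \alpha_i h_i(X)$$

so each weak learner h_i is fit to the residual of the ensemble built so far; whether the video's indexing is off by one is then purely a matter of which convention it commits to.)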
@vishwamgupta-n6k
@vishwamgupta-n6k 3 months ago
Is there a link to this resource?
@nihalnetha96
@nihalnetha96 8 months ago
Is there a way to get the Notion notes?
@PhucHoang-ng4vh
@PhucHoang-ng4vh 1 year ago
Just reads it out loud, no explanation at all.
@Leopar525
@Leopar525 26 minutes ago
Be smarter
@PhucHoang-ng4vh
@PhucHoang-ng4vh 6 minutes ago
@Leopar525 Oh, ok, dude. So you're very talented, but not everyone is like you, just watching tons of formulas and thinking they understand them.
@zhenwang5872
@zhenwang5872 1 year ago
I usually watch Emma's videos when I'm doing revision.
@elvykamunyokomanunebo1441
@elvykamunyokomanunebo1441 1 year ago
Hi Emma, I'm struggling to understand how to build a model on residuals:
1) Do I predict the residuals and then get the MSE of the residuals? What would be the point/use of that?
2) Do I somehow re-run the model considering some factor that accounts for more of the variability, e.g. adding more (important) features that reduce the MSE/residual? Then re-run the model, adding a new feature to account for the remaining residual, until there is no more reduction in MSE/residual?
@poshsims4016
@poshsims4016 1 year ago
Ask ChatGPT every question you just typed. Preferably GPT-4.
@Heinz3792
@Heinz3792 10 months ago
It's important to understand what the residual is. The residual is a vector giving the magnitude of the prediction error AND its direction, i.e. the gradient. Thus, regarding your questions:
1) We predict the residual with a weak model, h, in order to know in which direction to move the prediction of the overall model F_i(X) so that the loss is reduced. We assume h makes a decent prediction, and thus we treat it like the gradient.
2) We then calculate alpha, the weighting parameter, to know HOW FAR to move in the direction of the gradient that h provides, i.e. how much weight to give model h. Minimizing the loss function gives us this value and keeps us from over- or undershooting the step size.
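A minimal from-scratch sketch of this loop (assuming a halved squared-error loss, a fixed step size in place of a line-searched alpha, and sklearn's DecisionTreeRegressor as the weak learner; all names are illustrative):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

alpha = 0.1                       # fixed step size (learning rate)
F = np.full(len(y), y.mean())     # F_0: start from a constant prediction
trees = []

for _ in range(100):
    r = y - F                                          # residual = negative gradient of 1/2*(y - F)^2
    h = DecisionTreeRegressor(max_depth=3).fit(X, r)   # weak model fit to the residuals
    F = F + alpha * h.predict(X)                       # step the ensemble along the predicted direction
    trees.append(h)

print("training MSE:", np.mean((y - F) ** 2))

So for question 1), predicting the residuals is what supplies the direction of the next update, and for question 2), each round's tree plays the role of the "new feature" that soaks up the remaining error.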
@wallords
@wallords 1 year ago
How do you add L1 regularization to a tree?
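In XGBoost specifically, L1 regularization is applied not to feature coefficients, as in a linear model, but to the leaf weights of each tree, through the alpha hyperparameter; larger values shrink leaf scores toward zero. A minimal usage sketch with illustrative values, via the sklearn wrapper's reg_alpha/reg_lambda parameters:

import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

model = xgb.XGBRegressor(
    n_estimators=200,
    learning_rate=0.1,
    reg_alpha=1.0,     # L1 penalty on leaf weights
    reg_lambda=1.0,    # L2 penalty on leaf weights
).fit(X, y)

The penalty enters the leaf-value and split-scoring formulas, so it prunes and shrinks the trees rather than zeroing out input features.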
@jet3111
@jet3111 2 years ago
Thank you for the very informative video. It came up in my interview yesterday. I also got a question on time series forecasting and preventing data leakage. I think it would be great to have a video about it.
@annialevko5771
@annialevko5771 1 year ago
Hi! I have a question: how does the parallel tree building work? Based on gradient boosting, it needs the error from the previous model in order to create the new one, so I don't really understand in what way this is parallelized.
@shashizanje
@shashizanje 11 months ago
It's parallelized in such a way that, during the formation of a single tree, it can work on multiple independent features in parallel to reduce computation time. For example, to find the root node it has to check the information gain of every single feature and then decide which feature is best for the root; instead of calculating the information gain one feature at a time, it can calculate the IG of multiple features in parallel.
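A toy illustration of that feature-parallel split search, using squared-error impurity and a stdlib thread pool; this is a simplification for intuition, not XGBoost's actual presorted-block, OpenMP-based implementation:

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def best_split_for_feature(X, y, j):
    # Scan one feature's candidate thresholds; return (feature, impurity, threshold).
    best_score, best_t = np.inf, None
    for t in np.unique(X[:, j])[:-1]:          # exclude the max so both children are non-empty
        left, right = y[X[:, j] <= t], y[X[:, j] > t]
        score = len(left) * left.var() + len(right) * right.var()  # weighted child variance
        if score < best_score:
            best_score, best_t = score, t
    return j, best_score, best_t

rng = np.random.default_rng(0)
X, y = rng.random((1000, 8)), rng.random(1000)

# Each feature's scan is independent of the others, so they can run concurrently.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda j: best_split_for_feature(X, y, j), range(X.shape[1])))

j, score, t = min(results, key=lambda r: r[1])
print(f"best split: feature {j} at threshold {t:.3f} (impurity {score:.1f})")

The sequential dependency between boosting rounds remains; only the work inside one round (split finding across features) runs in parallel.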
@Leo-xd9et
@Leo-xd9et 1 year ago
Really like the way you use Notion!
@emma_ding
@emma_ding 1 year ago
Thanks for the feedback, Leo! I tried out a bunch of different presentation methods before this one, so I'm glad to hear you're finding this platform useful! 😊
@kandiahchandrakumaran8521
@kandiahchandrakumaran8521 8 months ago
Excellent video, many thanks! Could you kindly make a video on time-to-event analysis with survival SVM, RSF, or XGBLC?
@ermiaazarkhalili5586
@ermiaazarkhalili5586 2 years ago
Any chance to have slides?
@NguyenSon-ew9wn
@NguyenSon-ew9wn 2 years ago
Agree. Hope to have those notes.
@emma_ding
@emma_ding 2 years ago
Yes! Download all the PDFs of my Notion pages at emmading.com/resources by navigating to the individual posts. Enjoy!
@riswandaayu5930
@riswandaayu5930 1 year ago
Hello Miss, thank you for the knowledge. Miss, can I request the file for this presentation?
@aaronsayeb6566
@aaronsayeb6566 7 months ago
There is a mistake in the representation of the algorithm: the equations for r_i, L(Y, F(X)), and grad r_i = Y - F(X) can't all hold true at the same time. I think r_i = Y - F(X), and the gradient should be something else (right?)
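(The statements reconcile if r_i is defined as the negative gradient of the halved squared-error loss rather than the gradient itself:

$$L(Y, F(X)) = \tfrac{1}{2}\big(Y - F(X)\big)^2, \qquad r_i = -\frac{\partial L}{\partial F(X)} = Y - F(X)$$

Without the minus sign and the 1/2, stray factors of -1 and 2 appear, which is likely the inconsistency noticed here; constant factors are harmless anyway, since they are absorbed into the step size.)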
@MahdiShayanNasr
@MahdiShayanNasr 8 months ago
An excellent video
@objectobjectobject4707
@objectobjectobject4707 9 months ago
Okay subscribed !
@faisalsal1
@faisalsal1 11 months ago
She just reads the text with zero knowledge of the content. Not good.
@redflipper992
@redflipper992 3 months ago
If anyone wanted to read through a Notion page full of notes, they could do it themselves, lady.
Related videos:

Gradient Boosting : Data Science's Silver Bullet
15:48
ritvikmath
74K views
XGBoost Made Easy | Extreme Gradient Boosting | AWS SageMaker
21:38
Prof. Ryan Ahmed
42K views
Visual Guide to Gradient Boosted Trees (xgboost)
4:06
Econoscent
183K views
Gradient Boost Machine Learning|How Gradient boost work in Machine Learning
14:11
When to Use XGBoost
7:08
Super Data Science: ML & AI Podcast with Jon Krohn
6K views
XGBoost Part 1 (of 4): Regression
25:46
StatQuest with Josh Starmer
696K views
681: XGBoost: The Ultimate Classifier - with Matt Harrison
1:09:56
Super Data Science: ML & AI Podcast with Jon Krohn
6K views
Why Does Diffusion Work Better than Auto-Regression?
20:18
Algorithmic Simplicity
440K views