Gradient Boosting Method and Random Forest - Mark Landry

32,242 views

H2O.ai


Comments: 12
@geoffreyanderson4719 · 8 years ago
Mr. Landry reviewed accuracy at 27:20 based on the validation dataset, which was used during training to tune the model. That makes it an unrealistic, overly optimistic error estimate when hit_ratio_table is examined. It is better to estimate error on new data rather than on data that was used to tune the model.
@marklandry2140 · 8 years ago
Hi @Geoffrey Anderson. A final/new test set is used, actually. This is introduced at about 18:00, and discussed at more length at 32:00, where it is scored for the first time. It trains on 60%, uses 20% for an internal validation set (early stopping), and the final 20% to evaluate when all tuning is complete.
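The 60/20/20 train/validation/test split Mark describes can be sketched with two successive splits. This is not the H2O code from the talk; it is a scikit-learn illustration, with the iris dataset standing in for whatever data the demo uses:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First split: 60% for training, 40% held out.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.4, random_state=42, stratify=y)

# Second split: the held-out 40% becomes 20% validation
# (used during training, e.g. for early stopping) and a
# final 20% test set scored only once tuning is complete.
X_valid, X_test, y_valid, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, random_state=42, stratify=y_hold)

print(len(X_train), len(X_valid), len(X_test))  # 90 30 30
```

The key point from the thread: the validation fold participates in tuning, so only the untouched test fold gives an honest error estimate.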
@pbharadwaj3 · 6 years ago
The coding starts at 16:38
@poojawalavalkar355 · 7 years ago
Beautifully explained. Thanks Mark!
@nomadjoy · 6 years ago
Hi Mark, this was extremely helpful. Could you please share the GitHub link for the code? Thanks.
@sarthakyadav371 · 4 years ago
You are awesome Mark!
@kojikitagawa7333 · 6 years ago
Could someone please elaborate a little more on the hit ratio table starting at 23:45? I am a little confused about what the score represents at k >= 2.
@anitamishra04 · 6 years ago
The best explanation
@hleljihen2007 · 5 years ago
Thank you for the video, but could you please talk more slowly?
@Coral_dude · 4 years ago
You can control the playback speed yourself with YouTube's controls.
@chsuresh009 · 7 years ago
Hi Mark, I could not find anywhere how to determine the optimal number of rounds for GBM. With xgboost's cv we learn at which iteration the model reached its optimal loss, but in H2O, even when I supply a validation set, a stopping metric (logloss), stopping rounds (150), and a stopping tolerance of 0.0001, it does not seem to stop early; the number of trees is always whatever is set in ntrees.
@marklandry2140 · 7 years ago
Hi @Suresh Chinta. A stopping_rounds of 150 is quite high. It may be valid in your case, but H2O will wait until the average of 150 consecutive rounds is within the stopping tolerance (0.0001, it seems) of the prior 150 consecutive rounds. A round spans score_tree_interval trees (by default this varies, based on scoring-time estimation). For reference, I typically use 2 for stopping_rounds. I usually set ntrees to a nearly unattainable number (e.g. 2000 or 10000), drop the tolerance to 0, and set score_tree_interval to somewhere between 2 and 5. Those models typically stop well before the ntrees limit. In case it helps (since the demo is intended to be fast for the audience, which makes it a little less indicative of typical modeling), this is the latest model I've run this week: gbm
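Mark's recipe (a huge tree budget, a small patience window, zero tolerance) can be illustrated outside H2O. This is not H2O's API; it is a hedged scikit-learn analogue, where n_estimators, n_iter_no_change, validation_fraction, and tol play roughly the roles of ntrees, stopping_rounds, the internal validation split, and stopping_tolerance:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)

clf = GradientBoostingClassifier(
    n_estimators=500,         # ~ ntrees: far above what you expect to need
    n_iter_no_change=2,       # ~ stopping_rounds: small patience window
    validation_fraction=0.2,  # internal holdout that scores early stopping
    tol=0.0,                  # ~ stopping_tolerance dropped to 0
    random_state=0,
)
clf.fit(X, y)

# n_estimators_ reports how many trees were actually built before
# early stopping kicked in -- typically well below the budget.
print(clf.n_estimators_)
```

The design point is the same in both libraries: make the tree count effectively unbounded and let the validation-based stopping rule, not ntrees, decide when the model is done.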