Thanks Julia! Love the inclusion of a linear model for imputing speed and angle!
@mattm9069 · 3 years ago
Your blogs have helped me so much. Tidymodels for life!
@mkklindhardt · 3 years ago
Thank you Julia! I have been waiting for this unknowingly for too long. It's a great pleasure to follow your videos, and they're always very insightful! Congratulations on your new space :)
@alexandroskatsiferis · 3 years ago
Another splendid screencast Julia!
@pabloormachea3404 · 3 years ago
Impressive! Thanks so much for the educational video; it makes tidymodels very appealing!
@deannanuboshi1387 · 2 years ago
Great video! Do you know how to get a prediction or confidence interval in R? Thanks~~
@JuliaSilge · 2 years ago
An algorithm like xgboost doesn't involve math that can produce one natively (unless I am mistaken), but you can use resampling to create those kinds of intervals: markjrieke.github.io/workboots/
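A rough sketch of the resampling approach Julia links to, using the workboots package. The workflow, data names, and `n` value here are illustrative placeholders; check the workboots documentation for the exact argument names.

```r
# Sketch (untested): bootstrap prediction intervals for an xgboost workflow
# via the workboots package linked above.
library(tidymodels)
library(workboots)

wf <- workflow() %>%
  add_formula(mpg ~ .) %>%
  add_model(boost_tree(mode = "regression") %>% set_engine("xgboost"))

set.seed(123)
preds <- wf %>%
  # fit the workflow on many bootstrap resamples, predict on new data each time
  predict_boots(n = 200, training_data = mtcars, new_data = mtcars) %>%
  summarise_predictions()  # summarize into point estimates plus interval bounds
```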
@gkuleck · 1 year ago
Hi Julia, nice video on a topic that I find intrinsically interesting as a baseball AND tidymodels fan. I did run into an error when executing tune_race_anova(): `Error in test_parameters_gls(): ! There were no valid metrics for the ANOVA model.` I am not sure how to fix this, and I have been careful to follow the scripts. Any idea what might be causing the error?
@JuliaSilge · 1 year ago
When you see an error like that, it usually means your models are not able to fit/train. If you ever run into trouble with a workflow set or racing method like this, I recommend trying to just plain _fit_ the workflow on your training data one time, or use plain old `tune_grid()`. You will likely get a better understanding of where the problems are cropping up.
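The debugging steps Julia describes might look something like this. The `wf`, `training_data`, and `folds` objects are placeholders standing in for your own workflow, training set, and resamples.

```r
# Sketch: when a racing method errors, first check that the workflow fits at all.
library(tidymodels)

fitted <- fit(wf, data = training_data)  # does a single plain fit succeed?

# If that works, try an ordinary grid search before racing:
res <- tune_grid(
  wf,
  resamples = folds,
  grid = 5,
  control = control_grid(verbose = TRUE)  # log progress per resample
)

collect_notes(res)  # inspect any warnings/errors captured during tuning
```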
@juliantagell1891 · 3 years ago
Cheers Julia, great video! I've been wondering about xgboost a bit lately, in regard to using tidymodels vs. using the underlying xgboost package directly with xgb.train(). I've heard mention that xgb.train() has an "automatic stop" that limits the number of trees when no more improvement is detected. This seems pretty helpful (and a great processing-time saver) rather than having to pre-specify the number of trees used. But I'm certainly not a pro at xgboost, so I was just wondering your opinion. I like that tidymodels can be applied to all models, but I was wondering if, in doing so, this comes at a cost (for xgboost tuning, specifically).
@JuliaSilge · 3 years ago
Yes, you can specify this (and even tune it to find the best value) in tidymodels. We call this early stopping parameter `stop_iter`: parsnip.tidymodels.org/reference/details_boost_tree_xgboost.html I used it in the last episode of SLICED I was on (with the Spotify dataset) if you want to see it in action, but I'll try to put together a tutorial/blog post demoing it soon.
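A minimal sketch of what Julia describes: setting `stop_iter` on a `boost_tree()` spec with the xgboost engine. The specific numbers here are illustrative, not recommendations.

```r
# Sketch: early stopping for xgboost within tidymodels via stop_iter.
library(tidymodels)

xgb_spec <- boost_tree(
  mode = "regression",
  trees = 500,      # upper bound on boosting iterations
  stop_iter = 10    # stop early if no improvement for 10 iterations
) %>%
  # xgboost needs a holdout to monitor improvement; `validation` holds out 20%
  set_engine("xgboost", validation = 0.2)
```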
@hansmeiser6078 · 3 years ago
Thank you Julia! I was asking myself what the benefit would be. Can you tell us something about the advantages of tune_sim_anneal() too? And when is it better to fill the param grid with a grid rather than an integer?
@JuliaSilge · 3 years ago
When you use an integer, the tune package uses a space-filling design rather than a regular grid design for the possible parameters to try. You can read about these two kinds of grids here: www.tmwr.org/grid-search.html#grids We write a bit about iterative search with simulated annealing here: www.tmwr.org/iterative-search.html#simulated-annealing
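For concreteness, here is a sketch of the two grid types Julia contrasts, built explicitly with dials functions. The parameters chosen are arbitrary examples.

```r
# Sketch: regular vs. space-filling grids in dials.
library(tidymodels)

# Regular grid: every combination of 3 levels per parameter (3 x 3 = 9 rows)
grid_regular(trees(), learn_rate(), levels = 3)

# Space-filling grid: 9 semi-random points spread across the parameter space
grid_latin_hypercube(trees(), learn_rate(), size = 9)
```

Passing either of these to the `grid` argument of a tuning function behaves the same as passing the grid that an integer would have generated for you.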
@hansmeiser6078 · 3 years ago
@@JuliaSilge But when I fill the grid param with grid_latin_hypercube() or grid_max_entropy(), that would be space-filling too, or do I misunderstand this?
@hansmeiser6078 · 3 years ago
Simulated annealing is hard to... hope you make a video of it.
@JuliaSilge · 3 years ago
@@hansmeiser6078 Yes, that's right. If you put an integer, then it uses `grid_latin_hypercube()` to make a semi-random space-filling grid as a default: tune.tidymodels.org/reference/tune_grid.html#parameter-grids
@hansmeiser6078 · 3 years ago
@@JuliaSilge In a regression case, what is better for tune_bayes(), tune_sim_anneal(), and tune_race_anova()? Providing an externally built grid (maybe grid_latin_hypercube() or grid_regular()), or an integer? Where is the benefit? Could we avoid overhead or some redundancy?
@AndreaDalseno · 3 years ago
Thank you very much once more for your videos, Julia. Another question for you: is there a way to have a progress bar or something like that to monitor the tuning process (that may take a long time to run)?
@JuliaSilge · 3 years ago
We don't have support for a progress bar due to how we use parallel workers (considering using the future package for this, though, which may open up other options) but you can set various `verbose` options in `control_race()` that may give you some of what you want: finetune.tidymodels.org/reference/control_race.html
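The `verbose` options Julia mentions for `control_race()` might be used like this; `wf` and `folds` are placeholders for your workflow and resamples.

```r
# Sketch: verbosity settings in finetune's control_race().
library(finetune)

race_ctrl <- control_race(
  verbose = TRUE,       # log progress (may be muted by parallel workers)
  verbose_elim = TRUE   # announce when parameter combinations are eliminated
)

# res <- tune_race_anova(wf, resamples = folds, grid = 20, control = race_ctrl)
```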
@AndreaDalseno · 3 years ago
@@JuliaSilge Thank you very much for your kind reply. I tried using control_grid(verbose = TRUE) in the random forest example, just before fitting the grid, but I couldn't get any output (with parallel processing). Can you kindly give me an example? I will check out the future package.
@JuliaSilge · 3 years ago
@@AndreaDalseno Ah I'm sorry I wasn't more clear; we are considering adding support for the future package which will likely allow for better progress messaging in the... future. I'm not sure if the `verbose` option will work right now. Here is an example to try: github.com/tidymodels/tune/issues/377
@AndreaDalseno · 3 years ago
@@JuliaSilge thank you very much for your hint. I did: regular_res
@JuliaSilge · 3 years ago
@@AndreaDalseno Yes, you can read more about the current status of how parallel processing works here: tune.tidymodels.org/articles/extras/optimizations.html#parallel-processing-1
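For reference, the conventional foreach-based setup described on that page looks roughly like this; tune picks up whatever parallel backend is registered.

```r
# Sketch: registering a doParallel backend that tune detects automatically.
library(doParallel)

cl <- makePSOCKcluster(parallel::detectCores(logical = FALSE))
registerDoParallel(cl)

# ... run tune_grid() / tune_race_anova() here; work is spread across workers,
# which is also why per-worker console output (like verbose logs) can be lost ...

stopCluster(cl)
```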
@recordyao · 3 years ago
Hi Julia. Great tutorial! I think it's a great time-saving solution for tuning random grid points. It would be awesome if tune_race_anova() could work with tune_bayes(), in that once random grid points are selected by tune_race_anova(), they could be passed as "initial" into tune_bayes() to fine-tune the best. But currently this does not work, as tune_race_anova() only finishes one point that fits all folds, and tune_bayes() needs at least as many as there are tuning parameters. Is there a way around this? Again, great work! : )
@JuliaSilge · 3 years ago
Ah no, this doesn't currently work, as the infrastructure for tune_bayes() expects all the tuning parameter combinations to have been evaluated completely on the resamples. You could post an issue on the repo asking whether tune_bayes() could be changed to accept the subset, and we could discuss it there: github.com/tidymodels/tune/issues
@recordyao · 3 years ago
@@JuliaSilge Thanks for pointing me to the right place. It'll be awesome if the two can be combined. But of course, it'll be a lot of work for the developers. We users take things for granted, haha.
@jacquesboubou · 1 year ago
Thank you so much! Great presentation. I have learned a lot. New subscriber!