TidyTuesday: Feature Elimination with TidyModels

  2,979 views

Andrew Couch

Days ago

In this week's #TidyTuesday video, I cover common methods for handling data with a large number of correlated features. Using #TidyModels, I walk through general feature elimination methods with recipes. I then explain the problems that correlated features cause and how to decide which correlated components to keep. I cover #MachineLearning algorithms that handle high-dimensional correlated data well. I then show how to apply domain knowledge and intuition with the help of model variable importance scores. Finally, I show a brute-force approach: recursive feature elimination.
Connect with me on LinkedIn: / andrew-couch
Code for this video: github.com/and...
TidyTuesday: github.com/rfo...

Comments: 10
@prod.kashkari3075 3 years ago
This gotta be the most underrated video ever. Mans dropped KNOWLEDGE
@teegnas 3 years ago
Woah ... this was some really awesome content ... so glad to have subscribed to your TidyTuesday project last year ... helped me a lot in my Data Science job ... thanks a lot, Andrew!
@tighthead03 3 years ago
great job on this video, there's so much info in here
@rashawnhoward564 3 years ago
Just finished reading the tidymodels book. This video is a great addition.
@Rodr51zx 3 years ago
Hey, can you send me the link for the book?
@rashawnhoward564 3 years ago
@Rodr51zx The link is: www.tmwr.org. I made a PDF version so I could read it on my Kindle.
@mattm9069 3 years ago
This is awesome. Thank you! Have you tried building a random forest without dummy encoding? I'm curious about the model performance in that case. Apparently, the ranger package can handle the raw columns. On the other hand, xgboost needs the dummy encoding.
@AndrewCouch 3 years ago
I have done it without dummying/one-hot encoding and it generally will not make a difference. For me, I like to do dummy encoding so I can try different models with the same recipe. Thanks for watching!
@邵扬 3 years ago
welcome back!
@517127 2 years ago
I miss your videos