Scikit-Learn Model Pipeline Tutorial

  28,558 views

Greg Hogg

Comments: 50
@GregHogg 1 year ago
Take my courses at mlnow.ai/!
@TheCsePower 1 year ago
Thanks Greg. This made me realise how non-standard my code is. I learnt:
- Use copy or deepcopy and not assignment.
- Always perform preprocessing on the train and test separately.
- sklearn pipelines have nothing to do with ETL pipelines from Data Engineering.
- sklearn transformers have nothing to do with NLP Transformers.
- sklearn estimators have nothing to do with Statistics estimators.
@GregHogg 1 year ago
Super glad you got some useful pointers!!
@crepantherx 3 years ago
Keep posting, Greg. I am a Data Analyst by profession and your videos certainly help a lot.
@GregHogg 3 years ago
That's awesome! Thank you 😄
@hansenmarc 2 years ago
Great stuff! I’m curious why you used FunctionTransformer instead of ColumnTransformer, which could run the two scalers in parallel? Also, since FunctionTransformer is stateless, the documentation says that fit just checks the input rather than actually fitting the scaling parameters. Doesn’t that lead to data leakage since applying transform to test data won’t use parameters learned from fitting on the training data?
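As a rough illustration of the alternative raised in this comment, here is a minimal sketch of a ColumnTransformer inside a Pipeline. The data is synthetic and the column names (`income`, `age`) are hypothetical stand-ins, not the columns used in the video. Because the scalers are real fitted transformers, their parameters are learned from the training data only and then reused on the test data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data; the column names are illustrative assumptions.
X_train = pd.DataFrame({"income": rng.normal(5, 2, 80), "age": rng.uniform(1, 50, 80)})
X_test = pd.DataFrame({"income": rng.normal(5, 2, 20), "age": rng.uniform(1, 50, 20)})
y_train = 3 * X_train["income"] + 0.1 * X_train["age"] + rng.normal(0, 0.1, 80)

# Each scaler is applied only to the columns listed for it.
preprocess = ColumnTransformer([
    ("standard", StandardScaler(), ["income"]),
    ("minmax", MinMaxScaler(), ["age"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", LinearRegression()),
])

# fit() learns the scaling parameters from X_train only.
model.fit(X_train, y_train)

# predict() reuses those training-set parameters on X_test.
test_predictions = model.predict(X_test)
```

ColumnTransformer also accepts an `n_jobs` argument if the per-column transformers should run in parallel.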
@kyleGrealis 5 months ago
thanks, Greg. really good explanation and structured example. this makes it easy to create a template for easy reuse!
@brandonn8166 1 year ago
Just out of curiosity, is there a reason you don't use train_test_split to get X and y values?
@NikitaShilyaev 1 year ago
Yes, and why does he use X_train for train_predictions instead of a separate dataset such as X_valid?
@AmitabhSuman 2 years ago
A very practical video on Pipelines that I came across. Thank you for this video!
@GregHogg 2 years ago
Awesome that's great to hear. You're very welcome ☺️☺️
@ilanyutsis9653 5 months ago
When you do the StandardScaler().fit on the dataframe, what is the meaning of this operation? what is happening?
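For readers with the same question, a small sketch of what `fit` does, using made-up numbers rather than the video's dataframe: it computes and stores the per-column mean and standard deviation, and `transform` later applies those stored values to whatever data it is given.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Tiny made-up arrays, one feature column each.
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[4.0]])

# fit() only learns statistics from X_train; it does not change the data.
scaler = StandardScaler().fit(X_train)
learned_mean = scaler.mean_    # per-column mean of X_train
learned_scale = scaler.scale_  # per-column (population) standard deviation of X_train

# transform() computes (x - mean) / scale using the training statistics.
X_test_scaled = scaler.transform(X_test)
```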
@alexrook5604 1 year ago
I understand what you are doing here, but I have two questions whose answers would make it easier to follow along and replicate your steps. 1) Where did you get the data? I can't find the california_housing dataset already in train/test form. 2) Why not use scikit-learn tooling rather than doing it yourself? You could have used train_test_split or pipelines (or a column transformer, or similar). That has me confused.
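A minimal sketch of the scikit-learn tooling mentioned here, using `train_test_split` on synthetic arrays. The data below is a made-up stand-in so the example runs offline; scikit-learn also ships a `fetch_california_housing` loader in `sklearn.datasets`, though whether that matches the files used in the video is an assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a feature matrix and target vector.
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 100)

# Hold out 20% of the rows; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```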
@JJGhostHunters 2 years ago
Great tutorial! I use the MinMaxScaler with the option to scale from -1 to 1 instead of 0 to 1 when I am dealing with values that can be positive and negative. Seems to be fine, but I may need to reconsider going forward. I have never noticed any issues though.
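The option described in this comment is the `feature_range` parameter of MinMaxScaler. A short sketch with made-up values: the training minimum maps to -1 and the training maximum to 1, so zero stays at zero only when the data happens to be symmetric around it, as it is here.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up signed values, one feature column.
X_train = np.array([[-4.0], [0.0], [2.0], [4.0]])

# Scale to [-1, 1] instead of the default [0, 1].
scaler = MinMaxScaler(feature_range=(-1, 1))
X_scaled = scaler.fit_transform(X_train)
```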
@rahiiqbal1294 1 year ago
This was very helpful, thank you :)
@JJGhostHunters 2 years ago
I would love to see a tutorial that covers using pipelines with multilayer perceptron models (MLPs), CNNs and LSTMs.
@lythien390 2 years ago
Thank you Greg! It's a great video!
@GregHogg 2 years ago
Glad to hear it!
@TheFrankyguitar 1 year ago
Thanks for this amazing video! Would that work also with a statsmodels model?
@GregHogg 1 year ago
Thanks so much!! And I'm not sure, haven't tried :)
@junaidlatif2881 2 years ago
How do I transform the y variable and then fit the model? And afterwards, how do I reverse the transform for the scatter plot?
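One way to approach this question, as a sketch rather than the method from the video, is scikit-learn's TransformedTargetRegressor: it transforms y before fitting and applies the inverse on predictions, so the predicted values come back in the original units and can be plotted directly. The data and the choice of a log transform below are illustrative assumptions.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic data with a positive, skewed target (illustrative only).
X = rng.uniform(0, 2, size=(100, 1))
y = np.exp(1.5 * X[:, 0] + rng.normal(0, 0.05, 100))

# func is applied to y before fitting; inverse_func is applied to predictions.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log1p,
    inverse_func=np.expm1,
)
model.fit(X, y)
predictions = model.predict(X)  # already back in the original units of y

# The same steps done by hand, for comparison.
manual = LinearRegression().fit(X, np.log1p(y))
manual_predictions = np.expm1(manual.predict(X))
```

If the target needs a fitted transformer such as a scaler instead of a plain function, the `transformer` argument can be used in place of `func` and `inverse_func`.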
@marcofogale9719 11 months ago
Perfect explanation. Thanks a lot
@GregHogg 11 months ago
Very welcome 😁
@talyb7383 2 years ago
Thanks for the great tutorial! what do I need to change to create a pipeline for an image classification model? like the cifar10 model?
@GregHogg 2 years ago
Well, everything. You probably won't be using scikit for that. And you're very welcome!
@talyb7383 2 years ago
@@GregHogg I didn't explain myself clearly... I want to create a pipeline that receives a trained cifar10 model and also does preprocessing on the data set. So I can't use your way?
@Nadia-db6nb 2 years ago
Thanks for the great tutorial. Can you make a video on how to combine multiple feature selection methods and feature extraction using python?
@fabio336ful 2 years ago
Did you say pipelines don't work for classification problems? At 1:07.
@GregHogg 2 years ago
Does, not doesn't
@fabio336ful 2 years ago
@@GregHogg thanks 🙏🏼
@adriandiazNY 1 year ago
Great Video!
@GregHogg 1 year ago
Thank you Adrian!
@krzysztofzaucha3592 9 months ago
nice video Greg
@GregHogg 9 months ago
Thanks so much!!
@nabanitadasgupta 1 year ago
Thank you for the video!
@00SeijiHan00 1 year ago
TYSM bro really appreciate this
@GregHogg 1 year ago
Very welcome!!
@tareq8109 3 years ago
Bro, can you show how to make a downloader for YouTube and other videos in Python?
@juampaaa90 2 years ago
awesome ty
@allanmachado2011 9 months ago
Thank you!
@Supernyv 1 year ago
Awesome !
@GregHogg 1 year ago
Thank you!
@m18293 1 year ago
Can you share this notebook?
@GregHogg 1 year ago
dang i think i lost it, sorry
@AceOnBase1 1 year ago
Bro you literally just copied this out of a textbook lmao but I respect the grind.
@MrAhsan99 3 years ago
you are ❤
@GregHogg 3 years ago
❤️
@johnspivack 1 year ago
Too confusing. Too many tangents, doesn't cover the main idea clearly. Downvoted.
@GregHogg 1 year ago
Well I upvoted it to counter you
@n8trh 3 months ago
What tangents? This video was not only to the point from the start, but it also went into depth with useful examples. If you thought those were tangents, I recommend watching again, maybe with more care this time.