Building a Machine Learning Pipeline with Python and Scikit-Learn | Step-by-Step Tutorial

4,769 views

Ryan Nolan Data

10 months ago

Welcome to our comprehensive tutorial on building powerful machine learning pipelines using Python and Scikit-Learn! In this video, we will guide you through the entire process of creating a robust machine learning pipeline, from data preprocessing to model evaluation.
Interested in discussing a Data or AI project? Feel free to reach out via email or simply complete the contact form on my website.
📧 Email: ryannolandata@gmail.com
🌐 Website & Blog: ryannolandata.com/
🍿 WATCH NEXT
Scikit-Learn and Machine Learning Playlist: • Scikit-Learn Tutorials...
Simple Imputer: • Handling Missing Data ...
Column Transformer: • Simplify Data Preproce...
Voting Classifier: • Mastering Voting Class...
MY OTHER SOCIALS:
👨‍💻 LinkedIn: / ryan-p-nolan
🐦 Twitter: / ryannolan_
⚙️ GitHub: github.com/RyanNolanData
🖥️ Discord: / discord
📚 *Practice SQL & Python Interview Questions: stratascratch.com/?via=ryan
WHO AM I?
As a full-time data analyst/scientist at a fintech company specializing in combating fraud within underwriting and risk, I've transitioned from my background in Electrical Engineering to pursue my true passion: data. In this dynamic field, I've discovered a profound interest in leveraging data analytics to address complex challenges in the financial sector.
This YouTube channel serves as both a platform for sharing knowledge and a personal journey of continuous learning. With a commitment to growth, I aim to expand my skill set by publishing 2 to 3 new videos each week, delving into various aspects of data analytics/science and Artificial Intelligence. Join me on this exciting journey as we explore the endless possibilities of data together.
*This is an affiliate program. I may receive a small portion of the final sale at no extra cost to you.

Comments: 17
@RyanNolanData 10 months ago
import numpy as np
d2 = {
    'Genre': ['Rock', 'Metal', 'Bluegrass', 'Rock', np.nan, 'Rock', 'Rock', np.nan, 'Bluegrass', 'Rock'],
    'Social_media_followers': [1000000, np.nan, 2000000, 1310000, 1700000, np.nan, 4100000, 1600000, 2200000, 1000000],
    'Sold_out': [1, 0, 0, 1, 0, 0, 0, 1, 0, 1],
}
@dsmn92 8 months ago
This is by far the best tutorial I’ve come across on YT on pipelines and column transformers. Thank you Ryan
@RyanNolanData 8 months ago
Means a lot. Thanks for checking it out
@shahbazKHAN-wf9yn 1 day ago
Best tutorial I've come across... love from India
@RyanNolanData 14 hours ago
Appreciate it
@modhua4497 1 month ago
Excellent demo! Nice job
@user-iu5nz2gy6l 3 months ago
Thanks again. This is one great video: very informative, and it demonstrates how to get it done with a pipeline. I summarized your content and have a few questions. Let me know if I misunderstood any of it. Using the cat/num pipeline example for the summary:
1) train_test_split the raw data, NaN values included
2) Define num_cols and cat_cols for num_pipeline and cat_pipeline
3) Make num_pipeline with Pipeline
4) Make cat_pipeline with Pipeline
5) Use a column transformer to combine num_pipeline and cat_pipeline
6) Use make_pipeline to combine the column transformer (preprocessing) and DTC (classifier)
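The six steps in this summary can be sketched roughly as follows. This is a minimal illustration, not the code from the video: it reuses the sample data from the pinned comment, and the imputation strategies, the scaler, the 70/30 split, and the step names are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Sample data from the pinned comment.
d2 = {
    "Genre": ["Rock", "Metal", "Bluegrass", "Rock", np.nan,
              "Rock", "Rock", np.nan, "Bluegrass", "Rock"],
    "Social_media_followers": [1000000, np.nan, 2000000, 1310000, 1700000,
                               np.nan, 4100000, 1600000, 2200000, 1000000],
    "Sold_out": [1, 0, 0, 1, 0, 0, 0, 1, 0, 1],
}
df = pd.DataFrame(d2)
X = df.drop(columns="Sold_out")
y = df["Sold_out"]

# 1) Split the raw data first; NaN values are still present here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# 2) Column groups for the two sub-pipelines.
num_cols = ["Social_media_followers"]
cat_cols = ["Genre"]

# 3) Numeric pipeline (assumed: mean imputation, then standard scaling).
num_pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# 4) Categorical pipeline (assumed: most-frequent imputation, then one-hot).
#    handle_unknown="ignore" avoids errors on genres unseen during fit.
cat_pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("one_hot", OneHotEncoder(handle_unknown="ignore")),
])

# 5) Combine both pipelines. Passing n_jobs=-1 here would ask joblib to
#    run the transformers in parallel on all available CPU cores.
col_transformer = ColumnTransformer(transformers=[
    ("num", num_pipeline, num_cols),
    ("cat", cat_pipeline, cat_cols),
])

# 6) Preprocessing plus classifier in one estimator.
pipefinal = make_pipeline(col_transformer, DecisionTreeClassifier(random_state=0))
pipefinal.fit(X_train, y_train)
preds = pipefinal.predict(X_test)
```

Calling fit on the final pipeline fits every preprocessing step on the training split only, and predict applies the already-fitted steps to the test split.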
@antonietakuz5636 1 month ago
Thank you, teacher!!
@RyanNolanData 1 month ago
No problem
@arnabmukherjee3129 28 days ago
First of all, I want to thank you for making such a beautiful, informative video on this topic, with a very neat and clean explanation. I also want to know how to implement multiple ML algorithms in a pipeline and choose the best algorithm according to the problem statement and dataset. Is there any way to do different tasks like 'nullifying multicollinearity (VIF)' and 'dimensionality reduction' within the pipeline?
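One possible way to approach this question, sketched under assumptions: dimensionality reduction can be a regular pipeline step (PCA is used here), and GridSearchCV can treat the final step itself as a parameter to compare several algorithms. The synthetic data from make_classification, the candidate models, and the step names are stand-ins for illustration, not something shown in the video. VIF-based filtering is not covered by this sketch.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in numeric dataset (assumption for illustration only).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# Dimensionality reduction sits inside the pipeline like any other step.
pipe = Pipeline(steps=[
    ("scale", StandardScaler()),
    ("reduce", PCA(n_components=4)),
    ("clf", DecisionTreeClassifier(random_state=0)),  # placeholder, replaced by the grid
])

# The "clf" step is itself searched over: each candidate replaces it in turn.
param_grid = {
    "clf": [
        DecisionTreeClassifier(random_state=0),
        RandomForestClassifier(n_estimators=50, random_state=0),
        LogisticRegression(max_iter=1000),
    ]
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_name = type(search.best_estimator_.named_steps["clf"]).__name__
print(best_name, round(search.best_score_, 3))
```

Because scaling and PCA are inside the pipeline, they are refit on each cross-validation training fold, so no information from the validation fold leaks into preprocessing.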
@Guidussify 24 days ago
Do we need to scale for a decision tree?
@rishidixit7939 1 month ago
What is the difference between make_pipeline and Pipeline when importing?
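As a short illustration of the difference: Pipeline takes explicit (name, estimator) pairs, while make_pipeline generates the step names from the lowercased class names. The two steps used here are arbitrary choices for the example.

```python
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler

# Pipeline: every step is named explicitly.
named = Pipeline(steps=[
    ("impute", SimpleImputer()),
    ("scale", StandardScaler()),
])

# make_pipeline: names are derived from the estimator class names.
auto = make_pipeline(SimpleImputer(), StandardScaler())

named_step_names = [name for name, _ in named.steps]
auto_step_names = [name for name, _ in auto.steps]

print(named_step_names)  # ['impute', 'scale']
print(auto_step_names)   # ['simpleimputer', 'standardscaler']
```

The step names matter mainly when referring to a step later, for example in a parameter grid such as scale__with_mean versus standardscaler__with_mean.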
@modhua4497 1 month ago
Where can I find the notebook with all the Python scripts from the video?
@user-iu5nz2gy6l 3 months ago
Here are my few questions:
1) Do we train_test_split first, before we do anything else?
2) We used fit_transform for the imputer in previous videos. Do we just need to fit the data in the pipeline, and the imputer will transform it inside the pipeline?
3) What is n_jobs = -1 in the column_transformer?
4) The pipeline is going to save us time retyping all the code, so if I want to use a different classifier (like random forest), do I just need to modify step 6 like this?
    RF = RandomForestClassifier()
    pipefinal = make_pipeline(col_transformer, RF)
5) Also, I wonder: if I run a different classifier, does that mean I also redo all the preprocessing steps and then use a different classifier?
6) Can you explain how I can use the saved pipeline?
Thanks again, and sorry for so many questions, but you really provide a lot of good info on this topic for beginners. Thank you very much.
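Questions 4 to 6 can be illustrated with a small sketch. It assumes the sample data from the pinned comment and the same illustrative preprocessing choices; build_pipeline is a hypothetical helper, and saving with joblib to a temporary file is an assumption about how the pipeline was persisted. Calling fit on a pipeline runs fit_transform on each preprocessing step internally, so each newly built pipeline refits its preprocessing as part of fit.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "Genre": ["Rock", "Metal", "Bluegrass", "Rock", np.nan,
              "Rock", "Rock", np.nan, "Bluegrass", "Rock"],
    "Social_media_followers": [1000000, np.nan, 2000000, 1310000, 1700000,
                               np.nan, 4100000, 1600000, 2200000, 1000000],
    "Sold_out": [1, 0, 0, 1, 0, 0, 0, 1, 0, 1],
})
# For brevity this sketch fits on all rows; a real workflow would split first.
X, y = df[["Genre", "Social_media_followers"]], df["Sold_out"]


def build_pipeline(classifier):
    """Hypothetical helper: same preprocessing, interchangeable classifier."""
    num_pipeline = Pipeline(steps=[
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ])
    cat_pipeline = Pipeline(steps=[
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("one_hot", OneHotEncoder(handle_unknown="ignore")),
    ])
    col_transformer = ColumnTransformer(transformers=[
        ("num", num_pipeline, ["Social_media_followers"]),
        ("cat", cat_pipeline, ["Genre"]),
    ])
    return make_pipeline(col_transformer, classifier)


# Swapping the classifier only changes the last step; fit() redoes the
# preprocessing for each pipeline automatically.
tree_pipe = build_pipeline(DecisionTreeClassifier(random_state=0)).fit(X, y)
forest_pipe = build_pipeline(
    RandomForestClassifier(n_estimators=50, random_state=0)
).fit(X, y)

# Save the fitted pipeline, then load it and predict on new raw rows.
path = os.path.join(tempfile.mkdtemp(), "pipeline.joblib")
joblib.dump(forest_pipe, path)
loaded = joblib.load(path)

new_acts = pd.DataFrame({
    "Genre": ["Rock", np.nan],
    "Social_media_followers": [np.nan, 2500000],
})
loaded_preds = loaded.predict(new_acts)
```

The loaded pipeline accepts raw rows with missing values, because the fitted imputers, scaler, and encoder travel with the classifier inside the saved object.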
@princendukwe1627 10 months ago
Awesome 👏 I learnt new tricks
@RyanNolanData 10 months ago
Great! Thanks for checking it out
@user-xn8wg6yw7g 4 months ago
Good video, thanks. One of the best on this topic. It would help if you explained the main idea more, though. It always seemed mysterious how the output of one procedure/function/pipeline component flowed into the next one. Also, what conditions are required to be confident this process turns out right?