Difference Between fit(), transform(), fit_transform() and predict() methods in Scikit-Learn

93,460 views

Krish Naik

A day ago


Comments: 145
@jammerules80 6 months ago
Thank you for the clear explanation. I spent $10K to learn ML/AI at UC Berkeley and still could not understand this concept before this video. Job well done!
@survivor9367 3 years ago
I was actually searching for this in other videos, as it was not available in your playlist. You just uploaded it. Thank you so much, sir.
@alessandrodf888 3 years ago
Batman, Superman and Krish Naik
@nguyenthituyetnhung1780 11 months ago
Such a clear explanation that even I can understand the machine learning process. Thanks a lot.
@pranaypakhale 3 years ago
Can you please make a video on the different types of transformation, viz. StandardScaler, MinMaxScaler, etc., and when to use which?
@theforester_ 2 years ago
Awesome video, thanks very much. Big shout-out to all the Indians out there helping out the world. Big greetings from Brazil.
@nigamaveena4211 3 years ago
I always get confused about fit(), transform(), fit_transform()... Thank you, sir. You are like a saviour to many people like me.
@cocgamingstar6990 a year ago
Still didn't get it.
@pratikghute2343 a year ago
@cocgamingstar6990 See bro, we use .fit in two scenarios. The first is scaling, which is part of the data preprocessing step: scaler.fit_transform(xtrain) and scaler.transform(xtest). The second is model training, model.fit(xtrain, ytrain), where fit learns parameters such as the slope and y-intercept.
@TheKumarAshwin 22 days ago
@cocgamingstar6990 It happens. Here it is in summary: fit() learns from the data (training). transform() applies the learned transformation to data. fit_transform() combines fit() and transform() in one step.
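The three-line summary above can be sketched in plain Python. This is a toy stand-in written for illustration, not scikit-learn's implementation; the class name `ToyStandardScaler` and the sample numbers are assumptions. Like scikit-learn's `StandardScaler`, it divides by the population standard deviation.

```python
from statistics import fmean, pstdev


class ToyStandardScaler:
    """Illustrative single-feature scaler (hypothetical helper)."""

    def fit(self, values):
        # fit() only learns parameters; the data itself is not changed.
        self.mean_ = fmean(values)
        self.scale_ = pstdev(values)
        return self

    def transform(self, values):
        # transform() reuses whatever fit() stored earlier.
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        # fit_transform() is fit() followed by transform() on the same data.
        return self.fit(values).transform(values)


x_train = [10.0, 20.0, 30.0]
x_test = [20.0, 40.0]

scaler = ToyStandardScaler()
x_train_scaled = scaler.fit_transform(x_train)  # learn mean/std, then scale
x_test_scaled = scaler.transform(x_test)        # scale with the TRAIN mean/std
```

Because `transform()` reads the stored `mean_` and `scale_`, the test value 20.0 maps to 0: it equals the training mean, whatever the test set's own mean happens to be.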
@aishwaryanarkar2954 3 years ago
THANK YOU KRISH AMAZINGGGG BLESSINGS TO YOU
@adipurnomo5683 3 years ago
Classifier algorithms that rely on distances usually need the dataset normalized before it is passed to the model.
@hanscesa5678 2 years ago
So where should you use fit(), transform(), fit_transform() during K-Fold cross-validation? Before CV or during CV?
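One common answer to the question above is "during CV": put the scaler and the model in a pipeline so the scaler is refitted on the training part of every fold. A minimal sketch, assuming a synthetic dataset from `make_classification` and a logistic regression model (neither comes from the video):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used here only so the example is self-contained.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# The pipeline calls fit_transform on each training fold and only
# transform on the matching validation fold.
model = make_pipeline(StandardScaler(), LogisticRegression())
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
```

Scaling the whole dataset once before the CV loop would let every validation fold influence the mean and standard deviation used for training.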
@priyadarshanr296 3 years ago
Sir, I understand that it puts the test data in the same format as the training data, but how does the transform function help to overcome overfitting?
@shashikarathnasinghe1241 3 years ago
Thank you so much, I was struggling to understand this concept. Super well explained.
@sugavananv 2 months ago
Best video! One question: Where is y_test used?
@shaelanderchauhan1963 3 years ago
Thanks a lot, you have contributed a lot to this community.
@vineetj9102 10 days ago
Can we apply the transformation to y (the dependent variable) as well, or should it be applied only to X (the independent variables)?
@kiranvanukuri9382 3 years ago
Please make a video on image recognition in a Jupyter notebook and deployment techniques, with a deep explanation.
@rishabhkumar-qs3jb 3 years ago
Awesome explanation :)
@KiranGunda-ph7df 4 months ago
Superbbb explanation brother...
@manishaundale7458 2 years ago
If the unique values in the train data and test data are different, then how can we apply a label encoder with fit and transform?
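The question above comes down to what transform() should do with a category that fit() never saw. scikit-learn's `LabelEncoder` raises an error in that case, while `OrdinalEncoder` can map unknowns to a sentinel through `handle_unknown="use_encoded_value"`. The plain-Python sketch below imitates the sentinel approach; the class `ToyCategoryEncoder` and the colour data are hypothetical.

```python
class ToyCategoryEncoder:
    """Illustrative encoder: learns a category -> integer mapping in fit()."""

    def __init__(self, unknown_value=-1):
        self.unknown_value = unknown_value

    def fit(self, values):
        # Only categories present in the training data receive a code.
        self.classes_ = sorted(set(values))
        self._codes = {c: i for i, c in enumerate(self.classes_)}
        return self

    def transform(self, values):
        # Categories not seen during fit() fall back to the sentinel.
        return [self._codes.get(v, self.unknown_value) for v in values]


train_colors = ["red", "green", "red", "blue"]
test_colors = ["green", "purple"]  # "purple" never appears in training

encoder = ToyCategoryEncoder().fit(train_colors)
train_codes = encoder.transform(train_colors)
test_codes = encoder.transform(test_colors)
```

The mapping is still learned from the training data alone; the sentinel only decides how an unseen value is reported downstream.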
@darshanayenkar 2 years ago
You have cleared up my concept.
@SUMITKUMAR-qi6mz 3 years ago
I have 1 year of experience in customer service at a BPO, but I want to become a data scientist. I am finding it difficult to get a job in the field because they ask for data science experience. Please help me with how to present my resume to get a job.
@merveozdas1193 2 years ago
On which platform did you give this lesson? You can use your pencil properly.
@arshad1781 2 years ago
Nice 👍
@optimusagha3553 2 years ago
Simple and straightforward! Thanks!!👏
@shadmanansari5750 a year ago
Hi, you mentioned that fit_transform() is applied to the training data and only transform() is applied to the test data. So, in the case of StandardScaler, fit_transform(train) will use the mean and standard deviation of the training data, and then we use the same mean and standard deviation on the test data. Shouldn't we apply fit() to the entire data to calculate the mean and standard deviation of the entire data, then transform(train) and transform(test)? Please clarify.
@ece15amritanshusingh22 a year ago
Same doubt.
@anirbanc88 a year ago
superb
@munawarabbasi9683 a year ago
Thanks for making it a complete halwa.
@jlxip 3 years ago
Thank you so much, this helped me a lot :)
@kumarprince5054 3 years ago
Thanks
@rkkcode 2 years ago
Thank you.
@sanyamsharma3940 2 years ago
You are amazing!
@nagamohan1412 3 years ago
Hi Krish, I am Naga Mohan. I want to use data science or data analysis for my father's agricultural land, but I don't know how to start; I am very confused. I have no data, and I don't know how to create my own data for my farmland. Can you please give me tips on how to start the project and how to create the data? We have 2 acres of paddy land and 2 acres of banana land.
@milliesadie486 2 years ago
thank you
@stevetrabajo4065 2 years ago
15:34 amazing
@harshagrawal5613 3 years ago
100% clear
@tharallaanil4115 3 years ago
Sir, please share the GitHub link for this program.
@mariuszwiesiolek9340 3 years ago
This was fantastic. I really got the essence of not only when, how, and why to use fit(), transform(), fit_transform() and predict(), but also in the context I was looking for!
@devinpython5555 3 years ago
Summary of this (intuition): for train data, fit creates the formula for all the features in the dataset, and transform transforms the data with that formula. For test data, the formula has already been created, so just transform it accordingly.
@srujankumar637 2 years ago
But fit() has not been applied to the test data, so how does transform() produce values? The mean (mu) and standard deviation have to be calculated for the test data using fit().
@srujankumar637 2 years ago
Or is the data distribution almost the same for both train and test data, and that is why the mean and standard deviation are the same for both? And once we have the values for train from fit(), they are used to transform the test data?
@subhashvarma4551 3 years ago
Sir, if we apply the same mean when transforming the test data as for the training data, this may be a case of data leakage, where we leak information from train to test. That might not be preferable in a real-time scenario, as future data should be totally unknown to the train data. We should also perform fit_transform on the test data in such cases. Need your thoughts on this.
@mouleshm210 3 years ago
No bro, we should only be cautious about data leakage from the test data to the training data: future-data parameters such as the mean or min/max values must not leak in during preprocessing. That's why we call only transform() on the test data.
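The direction of the leak discussed above can be made concrete with a few numbers. The values below are invented for illustration: four ordinary training values and one unusual future value.

```python
from statistics import fmean, pstdev

x_train = [10.0, 12.0, 14.0, 16.0]
x_test = [100.0]  # hypothetical unusual future value

# Train-only statistics: what fit() on the training data would store.
train_mean, train_std = fmean(x_train), pstdev(x_train)
test_scaled = [(v - train_mean) / train_std for v in x_test]

# Leaky statistics: fitting on train + test lets the future value
# shift the parameters that the training data is scaled with.
all_mean = fmean(x_train + x_test)
all_std = pstdev(x_train + x_test)

# Refitting on the test set alone hides the problem instead:
# the unusual value becomes its own mean and looks perfectly average.
test_only_mean = fmean(x_test)
test_only_centered = [v - test_only_mean for v in x_test]
```

With train-only statistics the mean stays at 13 and the future value shows up as an extreme z-score, which is what a deployed model would really face. Fitting on everything moves the mean to 30.4, so the training features already "know" about data that is supposed to be unseen.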
@naiduvinay4911 2 years ago
Thank You, understood
@saurabhbarasiya4721 3 years ago
Thanks for this
@jamalhasanzakarneh9837 3 years ago
Thank you Krish; it is just another beautiful video among your very helpful videos.
@frankdearr2772 2 years ago
Hi, I understood what you explained well, but could you tell me WHY y_train is not scaled like X_train? I think it is because the values are like false or true; if the y_train values were different, like 10, 5, 41, 5.8, etc., I think I would have to scale y_train? Please show me the way on this small question about your video :)) Thanks for your great video on this topic. Laurent
@kavanadeshpande9690 2 years ago
Hi, as far as I know, scaling the dependent feature is not necessary for a classification problem, where it has low cardinality. For regression, if we scale the dependent feature, then the mean squared error automatically gets scaled too.
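The remark about the mean squared error can be checked with a little arithmetic. In the sketch below the target values are the ones from the question (10, 5, 41, 5.8), while the predictions are made up for illustration. Standardizing the target divides the MSE by the variance of y, and predictions have to be mapped back to the original units before they are reported.

```python
from statistics import fmean, pstdev


def mse(actual, predicted):
    return fmean([(a - p) ** 2 for a, p in zip(actual, predicted)])


y_true = [10.0, 5.0, 41.0, 5.8]
y_pred = [12.0, 4.0, 38.0, 7.0]  # hypothetical model outputs

mu, sigma = fmean(y_true), pstdev(y_true)


def scale(values):
    return [(v - mu) / sigma for v in values]


def unscale(values):
    # Inverse of scale(): back to the original units.
    return [v * sigma + mu for v in values]


mse_original = mse(y_true, y_pred)
mse_scaled = mse(scale(y_true), scale(y_pred))  # equals mse_original / sigma**2
y_pred_restored = unscale(scale(y_pred))
```

In scikit-learn, `TransformedTargetRegressor` from `sklearn.compose` wraps this scale-then-invert step around a regressor, if scaling the target turns out to be needed.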
@frankdearr2772 2 years ago
@kavanadeshpande9690 Thanks, great information. That gives me the right way to go ahead. Please have a nice day :) Laurent
@frankdearr2772 2 years ago
@kavanadeshpande9690 Hi, thanks a lot for your answer. I understand better now :) Please have a nice day. Laurent
@MahmouudTolba 2 years ago
Your explanation is top-notch!
@gomathic9557 3 years ago
It is a very useful video, Krish. Now I have clear information about fit and transform. Thanks for giving this useful information, Krish.
@shivu.sonwane4429 3 years ago
fit_transform is used on the training data, but only transform on the testing/new data. This applies the same transformation to both sets of data, which creates consistent columns and prevents data leakage. Data leakage means learning something from the testing data, which is not allowed.
@bhargavikoti4208 3 years ago
Finally😁.. Thanks for uploading
@oyesinghji7910 2 years ago
Hi Krish, can you make a full video on the complete deployment process, including all the steps?
@parikshithh4991 3 years ago
Beautifully Explained
@hinaaqil1774 4 days ago
Let's take the StandardScaler formula: z = (x - mu) / sigma. fit() just calculates the parameters in the formula; here mu and sigma are calculated, but the values are not changed to the new scaled values. For the training data we do both with fit_transform: it calculates mu and sigma and also transforms the data to the new scaled values. For testing, as fit already calculated mu and sigma on the training data above, there is no need to calculate separate ones for the test set. Just transform: it will automatically use the training mu and sigma and produce the new scaled values. The same formula and parameter values that were calculated on the training data during fit_transform need to be applied to the test data. This will save us from overfitting.
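Written out, the standardization described above looks as follows. The subscripts and the numbers are illustrative assumptions, using training values 10, 20, 30 and a single test value 40:

```latex
\mu_{\text{train}} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{10 + 20 + 30}{3} = 20,
\qquad
\sigma_{\text{train}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu_{\text{train}}\right)^2}
                      = \sqrt{\frac{100 + 0 + 100}{3}} \approx 8.165

z_{\text{test}} = \frac{x_{\text{test}} - \mu_{\text{train}}}{\sigma_{\text{train}}}
                = \frac{40 - 20}{8.165} \approx 2.449
```

The test value is scaled with the training mean and standard deviation; no statistic is computed from the test set.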
@ajaykuruba1738 3 years ago
Hi Krish, it would be really helpful if you created a playlist on TensorFlow Serving and TensorFlow Lite.
@prekshamishra9750 3 years ago
Second😀
@prekshamishra9750 3 years ago
Girls are more active here❤️... amazing explanation Krish... kudos to you guys👍👍
@Md.SouravSarker 9 days ago
Very informative. Well explained. Thank you.
@dhivakarsomasundaram21 2 years ago
So to evaluate test data we should not use fit_transform... only transform is required?
@yashub9580 a year ago
Sir, can you please tell me how to resolve this error: "Deprecated distribution is specified in `adstock__tv_pipe__carryover__strength` of param_distributions. Rejecting this because it may cause unexpected behavior. Please use new distributions such as FloatDistribution etc."
@arpankhadka8671 2 months ago
You said that for test data we only do transform, we don't do fit. But can we do the transformation without fit? For standardization, the mean and SD are calculated by fit, according to what I understood from your video. Please explain.
@zaindeen4490 5 months ago
Thank you so much, Krish sir. It was quite informative! I was searching for this kind of video but wasn't able to find one. Thanks for all of your great efforts ❤
@nellitharun8466 3 years ago
Sir, I am unable to access your GitHub files/code. I am learning Python from 12 April, 10:00 am.
@soukainahanafi1685 3 years ago
I understand that, but with a polynomial model we use fit_transform and not only fit. It's hard to understand.
## this is the example that I'm working on
pr = PolynomialFeatures(degree=5)
x_train_pr = pr.fit_transform(x_train[['horsepower']])
x_test_pr = pr.fit_transform(x_test[['horsepower']])
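For the polynomial example above, the usual convention is the same as for a scaler: fit_transform on the training data, transform on the test data. A minimal sketch with made-up numbers and degree 2 instead of the commenter's horsepower column and degree 5:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Made-up single-column data standing in for the 'horsepower' feature.
x_train = np.array([[1.0], [2.0], [3.0]])
x_test = np.array([[4.0]])

pr = PolynomialFeatures(degree=2)
x_train_pr = pr.fit_transform(x_train)  # fit records the input layout, then expands
x_test_pr = pr.transform(x_test)        # reuse the fitted layout: columns 1, x, x^2
```

`PolynomialFeatures` learns no statistics from the data, only the number of input columns, so calling fit_transform on the test set happens to give the same numbers here. Keeping transform for test data avoids having to remember which transformers are stateless and which are not.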
@Trendz-w5d 3 years ago
Why do we only transform x_test and not y_test, and fit_transform only x_train and not y_train? Please help with this. Why not y_train and y_test?
@khaboninamasemola1970 2 years ago
You saved my backside with this video. Thank you.
@Harshpatel-uw2dw 6 months ago
It's an amazing video. I came away with a great understanding, and the concept is very easy to follow. Thank you, sir.
@sherin7444 2 years ago
Why didn't we scale our data before calling train_test_split?
@akashgautam1909 a year ago
I still don't understand why we don't perform fit on test data?
@RAHUDAS 2 years ago
Where should outlier detection go in your data processing chain?
@akashkumar-bq7cl 3 years ago
Hi Krish, what will happen if I apply fit_transform to my test data as well? What will be the outcome? Why shouldn't we do it? Is it because a new mean and SD will be calculated for the test data, whereas we need the same mean, SD and formula from the train data applied to the test data as well? Is that the reason we use only transform? I just did not get this part. As for the rest of the video, I'm so happy to get so much content in just half an hour, and for free. GOD BLESS YOU. PLEASE HELP.
@bluejadoo6912 a year ago
Thank you for clearing my doubts, sir.
@suvamgupta2914 2 years ago
Hats off sir!! Your explanation is of God level 💯 Thank you sir ❤️
@arpandas5974 3 years ago
I was actually searching for this in other videos, as it was not available in your playlist. You just uploaded it. Thank you so much, sir.
@paulkang2806 3 years ago
If you fit and transform for scaling and normalization, and you fitted (mean, stdev) on the training data and then apply it to the test data, isn't that related to data leakage?
@adipurnomo5683 3 years ago
13:46 Sir, what about a real-world application where we don't have test data and instead use unseen data? Does the unseen data need to be normalized before it is put into the model?
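For unseen data the same rule applies as for test data: the scaler fitted during training transforms the new rows, and the trained model then calls predict() on the result. A minimal sketch with invented numbers; the two-feature dataset and the choice of `LogisticRegression` are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Invented training data: two features on very different scales.
X_train = np.array([[1.0, 200.0], [2.0, 180.0], [8.0, 40.0], [9.0, 30.0]])
y_train = np.array([0, 0, 1, 1])

scaler = StandardScaler()
model = LogisticRegression()

# Training time: fit the scaler and the model on the training data.
model.fit(scaler.fit_transform(X_train), y_train)

# Later, on unseen rows: transform only, then predict.
X_new = np.array([[1.5, 190.0], [8.5, 35.0]])
predictions = model.predict(scaler.transform(X_new))
```

In practice the fitted scaler is saved together with the model, so the exact training mean and standard deviation are available wherever predictions are made.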
@bkpusprajkumar8744 3 years ago
Thank you so much, sir, for this lecture.
@BytemeMaybe 10 months ago
amazing explanation, thx bro
@AromonChannel 3 years ago
Thank you so much Krish Naik! I've been trying to understand this and you explain it in a very easy way, so we can easily understand it. Thank you!!!
@ajayjaadu42 a year ago
Sir, you explain so well. Thank you for this.
@heliyahasani6859 2 years ago
I love you man, you are a game changer. God bless you. Please upload more videos.
@pragavipul1563 3 years ago
What is the writing pad you use?
@adityasharma5876 3 years ago
Hi Krish, please make a video on the difference between map(), flat_map() and apply() in tf.data.Dataset.
@harikrishna-harrypth 3 years ago
Krish, you are a LEGEND!!!!!!!!!!!! Thanks much for making these enlightening tutorials!!!!!!!
@RobertoTexis-h5i a year ago
15:28 here is what we all came for
@pavanviswanadhapalli3512 a year ago
Compared to all other channels, your classes are so detailed and very understandable. So sir, can you please make a complete video on PCA? Please, sir.
@notmimul 2 years ago
God bless you!!! Your videos make everything simple.
@abdulrahiman8111 2 years ago
Hi Krish. Your video was really informative and helped me understand the requirement as well as the difference between fit(), transform(), fit_transform() very well. Thank you
@mangkhongsai9029 a year ago
Thank you so much...
@louerleseigneur4532 3 years ago
Thanks Krish
@superfreiheit1 2 months ago
I can't read this text
@tonnysaha7676 3 years ago
Thank you very much sir🙏
@1111Shahad 2 years ago
Thank you Krish
@robertoespinoza199 3 years ago
thanks so much for the value of your videos 💯💯
@subhamsaha2235 3 years ago
Sir, you didn't explain one thing: if we apply fit and transform to X_train, which (for StandardScaler) means fit (calculating mu and sigma) and then transform (applying the z formula to every value), and ONLY transform to X_test, which means mu and sigma are not calculated, then how is it transforming the values? I think there is something else in fit that is used to teach the model? Kindly clear my doubt. Thank you.
@saikiranreddykondapalli279 3 years ago
When transforming the test data we are actually using the mu and sigma values of the training data, and comparing the transformed test data with the predicted data (this is what he actually means). But that is wrong; we can't use the mu and sigma values of other data. So it is always better to split only after the whole dataset is fit and transformed. Then it is quite valid to compare predicted and actual test values.
@shivamshinde9810 3 years ago
very helpful!! Thanks!!
@tatendaVIDZ90 2 years ago
this is beautiful
@hiral9591 2 years ago
It's amazing👍
@Trendz-w5d 3 years ago
Thank you sir
@soajack 3 years ago
Clearly Explained! Thanks a lot!!!
@Fatih9837 a month ago
Great Video
@vlogsbybushra 3 years ago
EDA? Stands for?
@ishanagrawal396 3 months ago
Exploratory data analysis
@naeymaislamph.d9976 2 years ago
Excellent!
@bivasbisht1244 a year ago
amazing
@eitanshirman9072 2 years ago
Thank you so much for such a brilliant explanation!
@muhammadzeerakkhan6300 3 years ago
Great explanation and intuition (Y)