Discussing All The Types Of Feature Transformation In Machine Learning

Views: 81,051

Krish Naik

Comments: 64

@krishnaik06 3 years ago

Please take care everyone.

@sarangtamrakar8723 3 years ago

you too sir.....

@shivu.sonwane4429 3 years ago

Yes, take care, you and your team. For people in home isolation 👇🏻
I've almost recovered from COVID in home isolation. I'm sharing what helped me recover in case it helps someone.
• Steam at least 3 times a day
• Plenty of fluids: water (preferably warm), lemonade, coconut water
• Salt water gargles
• Vitamin C supplement
• Plenty of rest
• Meditation for peace of mind
• Balanced diet
• Regain smell: smell carom seeds (ajwain), camphor (kapoor) and cloves
• Lie on your stomach periodically
Monitor oxygen every 2 hours. Seek medical assistance if it's 92 or below. Please add anything I missed.
Add ajwain and kapoor to the water while taking steam, and drink malvani kadha (tulsi, ginger, jaggery, cloves, black pepper, ajwain, lemongrass, cinnamon).

@shivaragiman 3 years ago

@shivu.sonwane4429 How can we monitor oxygen levels at home?

@sunilsharanappa7721 3 years ago

Using a pulse oximeter. It measures the oxygen level (oxygen saturation). It's not very accurate, but it's good enough for home use.

@SALESENGLISH2020 3 years ago

Pray your team members recover quickly. India needs good teachers.

@teegnas 3 years ago

a very important video to review all feature important techniques at one go ... thanks for uploading!

@mostafakhazaeipanah1085 2 years ago

What a useful and informative video. Most ML courses focus on algorithms and forget the importance of data preparation.

@shivaragiman 3 years ago

Get well soon, we need you people 👍👍👍👍👍

@poojapatil7128 3 years ago

I have completed my 1-year post-graduation program in data science from a leading institute, but the various techniques I learned for free from your videos were not even mentioned in the curriculum. Thank you for your easy and detailed explanation.

@pseudounknow5559 3 years ago

Greetings from Poland

@cherubyGreens 3 years ago

Thanks mate!

@bhargavikoti4208 3 years ago

As usual neatly explained..👍👍thank you for uploading 🙏

@giandenorte 3 years ago

I was looking for these, master Krish! Take care too.

@dheerendrasinghbhadauria9798 3 years ago

krish bhai....please upload a PDF of notes of video summary.... along with each video...

@kiyotube222 3 years ago

Get well soon Sudh!!

@nagrajkaranth123 3 years ago

Sir, Sudhanshu sir tested positive? My god, I hope he gets well soon.

@alihaiderabdi9939 3 years ago

praying for employees of ineuron, inshallah everyone will get well soon.

@prakashkafle454 3 years ago

I pray for a speedy recovery for your team, Krish. We are also getting worse news day by day here in Nepal...

@mdadilhussain2967 3 years ago

I think you should do fit_transform first and then train_test_split. If you split first, the mean is calculated from the train data only and then applied to the test data, so the test data won't have a mean of zero. Please clear up this doubt.

@fintech5816 3 years ago

Hi Adil, did you find the answer to your question? If yes, please share.

@70ME3E 1 year ago

from an SO answer: "Normalization across instances should be done after splitting the data between training and test set, using only the data from the training set. This is because the test set plays the role of fresh unseen data, so it's not supposed to be accessible at the training stage. Using any information coming from the test set before or during training is a potential bias in the evaluation of the performance."
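
The quoted advice can be sketched as follows. This is a minimal illustration on synthetic data (the array shapes, seed, and split ratio are assumptions, not taken from the video): the scaler learns its mean and standard deviation from the training split only, and the test split is transformed with those same statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration only (not the dataset from the video).
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 2))
y = rng.integers(0, 2, size=200)

# Split first, so the test rows never influence the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # mean/std learned from train only
X_test_scaled = scaler.transform(X_test)        # train statistics reused on test

train_means = X_train_scaled.mean(axis=0)
test_means = X_test_scaled.mean(axis=0)
print("train means:", train_means)
print("test means:", test_means)
```

The scaled training columns have a mean of zero up to floating-point error, while the scaled test columns are usually close to zero but not exactly zero. That is expected: the test set stands in for unseen data, and unseen data will not match the training statistics exactly either.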

@sandipansarkar9211 3 years ago

great explanation

@ashiqhussainkumar1391 3 years ago

To be honest, I don't prefer any lecture series except NPTEL. But after seeing 20-25 of your videos, I personally feel this channel is a better resource for practical implementation of ML... Initially I didn't subscribe because I felt you looked young in your profile and might not know the subject the way you teach it 😁😁😁... Subscribed. Thanks to you and to NPTEL.

@satviksaxena3868 3 years ago

Hope the team will recover soon, Take Care !!

@captainmustard1 1 year ago

thank you sir, it is just an amazing video!!

@ValliammaiMuthaiyah 5 months ago

Excellent Sir!

@shubhamkondekar5382 3 years ago

Krish Naik is best

@ajaykushwaha-je6mw 2 years ago

Hi Krish, when doing the transformation, why are we not dividing our data into train and test sets?

@yashpandey5484 3 years ago

Sir, is scaling required after performing a log transformation?

@imtiazali-xu8gw 7 months ago

Sir, please make a video on the Box-Cox transformation.

@wahabali828 2 years ago

thank you very much sir

@tanujajoshi1901 3 years ago

Hey Krish, Can you explain Generative Adversarial Networks (GANs) especially the coding part for a dataset other than an image dataset?? It would be of great help.

@write2ruby 2 years ago

Very Informative

@SomeoneElsesSomeoneElse 2 years ago

With respect to StandardScaler(): if you split the dataset prior to scaling the features, don't you risk having skewed features? Put differently, if you train your model to learn that values of 1 get a certain weight, and in your test set the data isn't standardized around the same mean as the train set, then the model will invariably have worse accuracy unless the train set and test set features have the same mean, right? Shouldn't the test set be samples of the full dataset, removed only to serve as an "out-of-sample" test, rather than a separate dataset?

@sandipansarkar9211 3 years ago

finished watching

@abhishek_dataman6348 3 years ago

Do we need to check these transformation techniques in all binary classification problems?

@mosart03 3 years ago

Are we supposed to scale categorical features along with continuous features?

@sunilsharanappa7721 3 years ago

No, you shouldn't scale categorical data. If the feature is categorical, each value has a separate meaning, so normalizing will turn this feature into something different. There are several ways to deal with categorical data:
a) Integer Encoding: each unique label is mapped to an integer.
b) One Hot Encoding: each label is mapped to a binary vector.
c) Learned Embedding: a distributed representation of the categories is learned.
--Sunil Sharanappa
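
One way to apply this in scikit-learn is to route columns by type, so that only the numeric column is scaled and the categorical column is one-hot encoded. The sketch below is a hedged illustration: the tiny DataFrame and the column names `age` and `city` are made up for the example and do not come from the video.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one numeric and one categorical feature.
df = pd.DataFrame({
    "age": [22, 35, 58, 41],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
})

preprocess = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["age"]),                         # scale numeric only
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),  # encode, do not scale
    ],
    sparse_threshold=0.0,  # always return a dense array for easy inspection
)

encoded = preprocess.fit_transform(df)
print(encoded)
```

The result has one standardized `age` column followed by one 0/1 column per city. Integer encoding would use `OrdinalEncoder` in place of `OneHotEncoder`; it fits best when the categories have a natural order.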

@ayushsingh-qn8sb 3 years ago

If I have applied some encoding technique, do I have to scale the encoded features?

@ashutoshtiwari5222 3 years ago

Sir, please take care of yourself. 🥺😢

@priyayadav3990 3 years ago

In transformation, we transform the distribution into a normal distribution. After the transformation, we also need to perform standardisation (scale down). Please tell me if I am wrong.

@mayurgupta4004 3 years ago

When we use a Gaussian transformation, does it convert our distribution to a Gaussian distribution where mean = median, or to a standard Gaussian distribution where mean = 0 and variance = 1?

@Sivaramakrishnanv7 3 years ago

Under the Join button, I can see a (6 months: ₹283.20) plan. You did not mention this plan in the Join video. Can you please explain it here, sir?

@ishantyagi2701 2 years ago

Should standardization be applied to the whole dataset, or after we split it into train and test data?

@Craeson1 1 year ago

It is generally best to apply standardization to the training set only, and then apply the same scaling to the test set. This is because the test set should represent unseen data, and you want to evaluate the model's performance on the test set as closely as possible to how it would perform on new, unseen data. Applying standardization to the entire dataset before splitting it into training and test sets could result in information leakage, as the model could learn about the test set during training.
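
The same principle carries over to cross-validation. A minimal sketch, assuming synthetic data and a logistic regression model chosen only for illustration: wrapping the scaler and the model in a scikit-learn pipeline means the scaler is refit on the training portion of every fold, so the held-out fold never contributes to the scaling statistics.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features on very different scales (illustrative assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(150, 3)) * [1.0, 50.0, 1000.0]
y = (X[:, 0] + X[:, 1] / 50.0 > 0).astype(int)

# The pipeline refits StandardScaler inside each fold's training portion.
model = make_pipeline(StandardScaler(), LogisticRegression())
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)

print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

Scaling the full array once before calling `cross_val_score` would leak each held-out fold's statistics into training, which is the leakage described above.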

@sarthakphatate4595 3 years ago

good

@MdMahmudulHasanSuzan-- 3 years ago

How can I perform scaling on k-fold data?

@vidulakamat6564 3 years ago

While doing the transformation, do we need to transform both numerical and categorical (encoded) features or only numerical ones? If target is continuous, do we need to transform that as well?

@sunilsharanappa7721 3 years ago

@vidulakamat6564 3 years ago

@sunilsharanappa7721 thank you

@umaanil3344 3 years ago

Sir, what about the 'df_scaled' term? I am getting an error at that point saying df_scaled is not defined... Can you please explain?

@nishanthviswajith1496 3 years ago

I know Python programming, and I'm learning data science by self-study. My problem is that I have a 4-year gap in employment. Will I get a job in the data science field? I need your suggestions. I'm 26 years old.

@anandbihari3135 3 years ago

Same story, bro. Yes, you will get a job as a data scientist; just focus on prep and projects. I took a gap to prepare for UPSC and RBI. In 2016 I got a campus placement at Amazon as an SDE. But after a 4-year break and the COVID situation, I started preparing for DS and was fortunate enough to start with Sky as a data engineer for 10 LPA. So I'm sure you will also get placed.

@nishanthviswajith1496 3 years ago

@anandbihari3135 What skills are required for a data engineer?

@208gamer4 1 year ago

@nishanthviswajith1496 Did you get a job, bro?

@208gamer4 1 year ago

@nishanthviswajith1496 I'm doing an MCA. Is there any scope, bro?

@foreignworker-2163 3 years ago

Pray for your team!

@venkatraaman4509 3 years ago

Hi, for example I have features for age, height and weight, and I want to apply a Gaussian transformation. In my case:
==> a logarithmic transformation gives a good fit for age
==> a reciprocal transformation gives a good fit for height
My question is: may I use both transformed features (age with the log transformation and height with the reciprocal transformation) in my training data? Kindly reply, sir.

@venkatraaman4509 3 years ago

@Krish Naik Sir, kindly reply to me.

@me_debankan4178 2 years ago

Yeah, I have the same question. Do you have any solution?
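
Mechanically, nothing prevents giving each column its own transformation. The sketch below assumes a made-up DataFrame with the `age`, `height` and `weight` columns from the question; whether each transform actually improves the fit is something to check on the real data (for example with a Q-Q plot), and the same functions must later be applied to the test data.

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only to illustrate the mechanics.
df = pd.DataFrame({
    "age": [18.0, 25.0, 40.0, 65.0],
    "height": [150.0, 160.0, 175.0, 190.0],
    "weight": [50.0, 62.0, 80.0, 95.0],
})

transformed = df.copy()
transformed["age"] = np.log(df["age"])      # log transform (requires positive values)
transformed["height"] = 1.0 / df["height"]  # reciprocal transform (requires non-zero values)
# "weight" is deliberately left unchanged.

print(transformed)
```

Note that the reciprocal reverses the ordering of a positive feature (the tallest person gets the smallest value), which matters when interpreting model coefficients.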

@moonSTAR1893 1 year ago

Hello. There is an important mistake in this tutorial, so I had to stop watching. Problem: you use, e.g., MinMaxScaler on the whole X_train, which contains differently scaled variables. Let's assume "age" ranges from 18 to 65 while "fare" ranges from 5 to 2000. Scaling age with the global min/max of the dataset distorts your features. In this case, for age 20 you would get z = (X - Xmin)/(Xmax - Xmin) = (20 - 5)/(2000 - 5) = 15/1995 = 0.0075. With per-feature scaling on age alone you would instead get z = (20 - 18)/(65 - 18) = 0.0426, more than a 5-fold numerical difference. The maximal age of 65 would get z = (65 - 5)/(2000 - 5) = 0.03! That means age would have a maximal value of 0.03 instead of 1!
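
The arithmetic in this comment can be checked directly. The sketch below uses a three-row array built from the age and fare ranges quoted above (the middle fare of 100 is an arbitrary assumption). For reference, scikit-learn's `MinMaxScaler` computes the minimum and maximum separately for each column, so fitting it on a whole feature matrix gives the per-feature result; the global numbers arise only when a single pooled min/max is applied by hand.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Columns: age (18 to 65) and fare (5 to 2000), as in the comment.
X = np.array([
    [18.0, 5.0],
    [20.0, 100.0],   # fare of 100 is an arbitrary illustrative value
    [65.0, 2000.0],
])

# scikit-learn scales each column with its own min and max.
per_feature = MinMaxScaler().fit_transform(X)

# Manual scaling with one global min/max pooled over all columns.
global_scaled = (X - X.min()) / (X.max() - X.min())

print("age 20, per-feature:", round(per_feature[1, 0], 4))
print("age 20, global:     ", round(global_scaled[1, 0], 4))
print("age 65, per-feature:", round(per_feature[2, 0], 4))
print("age 65, global:     ", round(global_scaled[2, 0], 4))
```

With per-feature scaling, age 65 maps to 1, while the pooled calculation compresses the whole age column into a narrow band near zero, which is the distortion the comment describes.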