You are the MVP. When no one has the answer, you do.
@doop9134 1 year ago
I was stuck for days trying to figure out how to predict missing data using ML. This helped me understand so so so much better! 😍 Thank you so much!! 🙏💚
@mohitupadhayay1439 2 years ago
This was such an amazing life saver. I didn't even know I had this question, and the video just popped up. I didn't find this tutorial anywhere else.
@soumikchakraborty90 5 years ago
You are just awesome, bro. Please make a video on AIC, AUC, and the ROC curve.
@aksontv 5 years ago
Finally found the right man to learn data science and ML from. Thank you, sir!
@duvanmartinez8586 5 years ago
Great work, you're awesome, you're the best YouTuber I've found.
@pallabsaha4098 5 years ago
Very well explained. If you could show the same on a dataset with code, that would be very helpful. Thank you, sir, for your videos. Love them all.
@chimadivine7715 3 months ago
This is just awesome, Krish. Thanks so much!
@shivambhayre5056 5 years ago
I have no words to say, just a thanks 🙏
@abinashkumarsinha8958 3 years ago
This helped me a lot in my project work. Very useful and very well explained.
@keshavbansal5148 4 years ago
Started this playlist today, loving it.
@sandeepnallala48 3 years ago
Doing great work, Krish. Thanks a lot. Loved your videos : )
@amedyasar9468 3 years ago
It was quite a short explanation with nice points to understand. Thanks!
@thatguyadarsh 3 years ago
Amazing!! Use an ML model to predict the NaN values. That is clever, sir.
@hv3300 5 years ago
Excellent video, as usual.
@tumul1474 4 years ago
Thank you, sir! Amazing video as always.
@daniellazarolazaro1033 4 years ago
Thank you so much, this video actually helps a lot when you've just got started like me, hahaha. As I was saying, thank you so much for this great, great, great work!!!
@lupen2024-il2vc 2 months ago
Yes, very interesting, but it would be much more interesting if you showed a practical application of this theory in software like Weka.
@andyjackson4563 2 years ago
Thanks for explaining these methods.
@ankurbanerji6605 3 years ago
Great explanation, sir! Can you explain how to handle missing values in multiple columns of a dataset?
@AdventureinMotion-n8y 7 months ago
Clean explanation.
@Geethu_Mohan_DA 2 years ago
Easy to understand. Thank you
@anandacharya9919 5 years ago
Thank you for this video. Please also make a video on how to handle missing values and outliers in continuous variables.
@anuragmishra6262 4 years ago
Can you please show a practical implementation of the same? Thanks 😊
@divyaharshad9985 11 months ago
For technique 3, will it lead to multicollinearity in the data?
@MegaJaivardhan 5 years ago
Love you, bro. Could you make a video on AUC and the ROC curve?
@tahamansoor599 4 years ago
It's great. It would be better if you showed us a hands-on example with the dataset.
@sandyjust 5 years ago
Great explanation of the concept. With the unsupervised technique we might be in a situation where both male and female fall under group 2. What would our approach be then?
@kaustabhmandal7483 4 years ago
I have also observed that in this video. You can fill in the category with the maximum frequency in that cluster.
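A minimal sketch of the suggestion above, assuming the cluster label of every row has already been computed by some clustering step (not shown) and that missing categories are stored as `None`. The function name `fill_with_cluster_mode` and the toy data are hypothetical, for illustration only.

```python
from collections import Counter, defaultdict


def fill_with_cluster_mode(clusters, values):
    """Fill each missing value with the most frequent category in its cluster."""
    # Count the observed (non-missing) categories inside every cluster.
    counts = defaultdict(Counter)
    for cluster, value in zip(clusters, values):
        if value is not None:
            counts[cluster][value] += 1

    filled = []
    for cluster, value in zip(clusters, values):
        # Only touch missing entries, and only if the cluster has any
        # observed category to borrow from; otherwise leave the gap as-is.
        if value is None and counts[cluster]:
            value = counts[cluster].most_common(1)[0][0]
        filled.append(value)
    return filled


# Cluster 2 contains both "M" and "F"; the majority category wins.
clusters = [1, 1, 1, 2, 2, 2, 2]
gender = ["M", None, "M", "M", "F", "F", None]
filled_gender = fill_with_cluster_mode(clusters, gender)
```

When a cluster holds both categories, the majority vote decides. A tie is broken by whichever category was seen first, so a near-even split is a sign that the cluster carries little information about that column.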
@mohiuddinshojib2647 2 years ago
That is really informative.
@Susa270 2 years ago
Hello @Krish Naik, hope you are doing well 🙂 First of all, I would like to thank you for such knowledgeable videos. Most of the time your videos are really a beam of hope. Can you please let me know where I can check the actual code for the concepts mentioned above? It is a little difficult to apply them in a live scenario. Please guide, a humble request.
@out_aloud 3 years ago
Hello sir, maybe I am here too late, but I still hope you will acknowledge this question, as it might be of immense value. My question revolves around the KNN imputer, scaling, and the concept of data leakage. Since the KNN imputer works on the same principles as the KNN algorithm, it shares the pros and cons of KNN, right? So wouldn't it be better to simply scale the data first? Also, if I am separating the train and test data to avoid data leakage, should I split the data and then scale and impute? Or should I impute, and then split and scale? If I split first, which is the most common preference, which statistics should I use for the user input? And lastly, how should I handle label-encoded columns, if any? Nobody is discussing this, even though it is one of the most important problems a person is likely to face. Can you please make a video on this?
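One common way to arrange the steps asked about above is sketched here, assuming scikit-learn is available and all features are numeric. The data is synthetic, and this is one reasonable ordering, not something shown in the video: split first, then fit the scaler and the KNN imputer on the training rows only, and reuse those fitted statistics for the test rows and for later user input.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic numeric data with roughly 10% of the cells knocked out.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
X[rng.random(X.shape) < 0.1] = np.nan
y = rng.integers(0, 2, size=40)

# 1) Split first, so nothing from the test rows leaks into preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 2) Scale, then impute. StandardScaler ignores NaN when fitting and keeps
#    it when transforming, so KNNImputer measures distances on scaled features.
prep = make_pipeline(StandardScaler(), KNNImputer(n_neighbors=3))

# 3) Fit on the training rows only; reuse the fitted pipeline everywhere else.
X_train_prep = prep.fit_transform(X_train)
X_test_prep = prep.transform(X_test)
```

New user input would go through the same `prep.transform(...)` call, so it is scaled and imputed with the training statistics and not with its own.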
@madunishant6052 5 years ago
Thanks! 😊
@saurabhpathare4157 3 years ago
I am always reluctant to delete rows or use the mode for categorical values. This video explains a lot. Good approach! In technique 3, which classifier do you recommend for best efficiency?
@riteshmukhopadhyay6922 2 years ago
KNN. There is no one particular way as such; it depends on the dataset.
@shaileshsahu9551 4 years ago
Please add a video to the Data Science and ML playlist on how to create our own predictor or estimator (classifier algorithm) to predict both categorical and continuous variables.
@CheeseKransky12 4 years ago
Thanks, Krish.
@napoleonx5259 1 year ago
Well done, Krishna ❤
@VikasSharma-ye7pu 4 years ago
Hi Krish, please make a video explaining two Kaggle competition projects.
@Saikrishna-lx9it 5 years ago
Hi bro, can you make an end-to-end chatbot video using Rasa NLU? It would be useful for everyone interested in NLP.
@fahimekheradmand5880 5 years ago
Excellent, thank you.
@ZUBINABRAHAM 3 years ago
Thanks for the video, it was informative. Can we use KNN?
@AmitYadav-ig8yt 5 years ago
One more question: in some datasets we find columns with many categories; for example, a car name column will have many car names. In such a case, if we use this unsupervised technique to create clusters, won't there be too many clusters?
@madhurchaudhary5109 4 years ago
Hi Krish, this is well explained!! I have an ID column with unique values, but for some records the ID is null. How can I handle this type of data?
@Nursin-rg1ey 1 year ago
Thanks very much, sir.
@ashokpalivela311 4 years ago
Thank you 😍
@nasiksami2351 4 years ago
Amazing!
@Raja-tt4ll 5 years ago
Very nice video.
@aronpollner 2 years ago
Is there a multivariate imputer implementation for categorical values, like a class from sklearn?
@AmitYadav-ig8yt 5 years ago
Sir, you took a dataset that has missing values in just one column. You talked about predicting missing values by using the other columns as the training set. Let's say we have a dataset in which every column has some missing values. In such a case, which columns should be used to predict the missing values?
@kannadarecipes-6626 4 years ago
Following
@habilmohammed5127 4 years ago
Following
@leilafakhraei78 4 years ago
Following
@barnadipdey8486 4 years ago
Yes Amit, I have the same query. If you have solved this, please DM me.
@mohammadarif8057 4 years ago
Sir, can you provide a practical approach with a complex dataset? That would be great, thank you.
@1a17890 9 months ago
Sirji, can you kindly show how it's done?
@pankajkar2008 5 years ago
Pure concepts.
@amitjajoo9510 4 years ago
Sir, thanks for making the feature engineering playlist.
@sandipansarkar9211 3 years ago
Finished watching.
@ateamoon 4 years ago
Don't you increase the correlation between features with those methods? If so, what effect will that have on the output model and its predictions?
@neggaznabil7570 1 month ago
How can you apply KNN when the features F2 and F3 are numerical and the output is yes/no? I mean, how do you evaluate the distance? Do you perform an encoding step (yes: 1, no: 0) and then evaluate the Hamming distance or the Euclidean distance?
@sachinborgave8094 5 years ago
Hello sir, please make a video on how to fill missing categories using logistic regression.
@muzamilshah8028 5 years ago
Let's say I want to predict the value for f1 in row 2, as you mentioned. But what if we also have missing values in f2 and f3, though not in the same row? What do we do in that scenario?
@hindajjouri9151 11 months ago
Thank you.
@KiranMadari192 4 years ago
Krish, could you please do this with datasets?
@sadikbilal5149 3 years ago
Nice. Please, do you have code that implements these techniques?
@ele_wings7521 5 years ago
Thank you, sir.
@chandrasekarank8583 4 years ago
Sir, what if I label encode the data and then use a simple imputer to replace the NaN values with the mean or median, as I want? Sir, please tell me whether this is a valid way to do it.
@ashwinkrishnan4285 4 years ago
If we apply a classifier algorithm to predict whether the Gender feature is male or female from the other features, including the output feature, in the training dataset, and use it to get the missing values of the Gender feature (the test dataset), then when we finally build the model to predict the output, I expect it would be influenced, or data leakage would have happened, because we used the output to fill the missing column values. Please clarify this point, Krish.
@chirathabey7729 4 years ago
It won't matter much, because even though we train with the output feature included, the classifier is used ONLY for predicting the missing samples, and there are far fewer missing samples than complete ones. If the share of missing samples is considerably high and they occur in many other features, then it will certainly bias the final prediction.
@AmitYadav-ig8yt 5 years ago
Sir, can we get code for the "create a classifier algorithm" method for missing values?
@abhipraydumka8587 4 years ago
Can you tell me how to assign a unique category, let's say U (undefined), to missing categorical data?
@sriraj8392 3 years ago
Sir, will you teach offline classes?
@theoutlet9300 4 years ago
Since we are using the output to predict our feature and then the feature to predict our output, wouldn't it cause problems in prediction?
@raghavkumar8333 4 years ago
Sir, I have a student attrition dataset where I need to predict the reasons why students who were admitted in the 1st year drop out in the 2nd year. A year consists of 2 terms, and I have the students' grades (a, b, c, d) in 6 different courses for the 1st and 2nd terms. Most of the grade columns for the 6 courses in the 2nd term are missing. Intuitively, I think that could be a reason for dropping out. My questions are: 1) Should I impute missing values in this case? It is possible that the values are not really missing and those students had already dropped out. So should I create dummy variables instead? 2) If I impute the missing values, what technique should I use for those missing categorical variables?
@Analystmind 2 years ago
What if my model's missing values are not categorical but numerical?
@sachinborgave8094 5 years ago
Excellent, sir. Can you please provide Python source code showing how to fill missing categorical data using logistic regression?
@preetnandeshwar5331 4 years ago
Which missing categorical value method suits which dataset, and why? Or do we just have to use trial and error? Please, anyone, help me. I am a beginner.
@Justme-dk7vm 7 months ago
Sir, why do you have the same voice as my college chairman? 😩💓
@analistaremoto 3 years ago
Niiiiiice!
@Gamers_glitch_turn 3 years ago
Why do we fill NaN values with the mean or median? And why doesn't it affect the dataset? Can you explain this a bit?
@mitultank7872 2 years ago
If I have missing values in a numerical column and I want to fill them based on another categorical column, how can I handle that?
@clivefernandes5435 4 years ago
Is method 3 widely used? Never heard of it.
@ommehta4501 2 years ago
If we have a categorical date feature with some missing values, please tell me how to deal with it.
@shivambhayre5056 5 years ago
If it is a quantitative variable, we can replace the missing value with the mean.
@AmitYadav-ig8yt 5 years ago
Is it a question? If yes, then yep, you can use the mean to replace quantitative missing values.
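For a quantitative column, the mean replacement mentioned above can be sketched with the standard library alone. The values below are made up, and `None` is assumed to mark a missing entry.

```python
from statistics import mean, median

values = [10.0, None, 14.0, 12.0, None]

# Compute the statistic from the observed entries only.
observed = [v for v in values if v is not None]
fill_value = mean(observed)  # median(observed) is the usual choice when outliers are present

# Replace every missing entry with that single statistic.
filled = [fill_value if v is None else v for v in values]
```

Filling with one constant keeps the column mean unchanged but shrinks its variance, which is worth keeping in mind when many values are missing.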
@AmitYadav-ig8yt 5 years ago
Just a request: could you please upload the code for this as well? I saw that code is missing for the techniques in many videos. It would be very helpful if you provided the code. Thanks a lot.
@janinajochim1843 4 years ago
Thank you for the video! Would you happen to know what to do in cases where the value is "missing by design"? I have a case where I am using the variable "Father's reaction to pregnancy". It has missing values for participants who did not know the father of the child, because they weren't asked this question :/
@sawradip 4 years ago
Maybe you can treat that as a separate category.
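A small sketch of treating "missing by design" as its own category, assuming pandas. The column name `father_reaction` and the label `"Not asked"` are hypothetical placeholders; any label that does not collide with a real category would do.

```python
import pandas as pd

df = pd.DataFrame(
    {"father_reaction": ["happy", None, "upset", None, "happy"]}
)

# Give the structurally missing answers an explicit label of their own,
# so a downstream encoder treats them like any other category.
df["father_reaction"] = df["father_reaction"].fillna("Not asked")

category_counts = df["father_reaction"].value_counts().to_dict()
```

This keeps the information that the question was never put to those participants, which a mode or model-based fill would erase.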
@chirumadderla8129 3 years ago
If there are several missing values in solar radiation data during the night and early morning hours, how do I handle them? The dataset I am considering covers one year.
@aditya_baser 4 years ago
Here, you only had one categorical column. What if you have multiple categorical columns? How do you go about the missing value treatment in that case?
@AmitYadav-ig8yt 5 years ago
You said to create a classifier to predict the missing values. What should we do if we have a linear regression problem with missing values? Should we create a classifier for that too? Please respond.
@chirathabey7729 4 years ago
Yes, if the missing value you are trying to predict belongs to a categorical variable. When you predict missing values, the variable with the missing values becomes your output variable and the rest of the variables become the input variables. You can think of it as solving an entirely independent problem.
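The idea described above (the column with gaps becomes the target, the other columns become the inputs) can be sketched as follows, assuming pandas and scikit-learn. The toy data, the column names `f2`, `f3` and `gender`, and the choice of logistic regression are illustrative assumptions, not the exact setup from the video.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame(
    {
        "f2": [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 1.1, 5.1],
        "f3": [0.2, 0.1, 0.3, 2.0, 2.2, 1.9, 0.2, 2.1],
        "gender": ["M", "M", "M", "F", "F", "F", None, None],
    }
)
features = ["f2", "f3"]

# Rows where the category is known form the training set;
# rows where it is missing are the ones to predict.
known = df[df["gender"].notna()]
unknown = df[df["gender"].isna()]

clf = LogisticRegression()
clf.fit(known[features], known["gender"])

# Write the predicted categories back into the gaps only.
df.loc[unknown.index, "gender"] = clf.predict(unknown[features])
```

Any classifier could stand in for `LogisticRegression` here. If the input columns have gaps of their own, they would need a simpler fill first, since this estimator does not accept NaN inputs.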
@RAJI11000 4 years ago
Sir, how can I impute a feature whose values look like 100 mbps?
@jaiminshah143 3 years ago
How do I handle missing (NaN) values in a column with binary data, i.e. just 0 or 1?
@bismeetsingh352 4 years ago
What do you do when you have missing values in textual data?
@chinmaybhat9636 4 years ago
Can you take one dataset and demonstrate the same thing on it?
@RishikeshGangaDarshan 4 years ago
How do we handle this in a regression problem?
@shaikhkashif9973 1 year ago
Sir, should we handle outliers first or fill null values first? Please answer.
@RajaKumar-ne9bt 2 years ago
Why are we skipping the output when doing clustering?
@kumarraju2923 4 years ago
How are the initial clusters selected for missing values?
@akshayvilayatkar7985 4 years ago
How can we handle alphanumeric missing values in a dataset? I cannot get past this problem. Please help, Krish.
@dineshkumar-kc7vt 4 years ago
I'm unable to overcome this problem. I first applied get_dummies to the dataset, and now I want to handle the missing values, but I'm getting this error: TypeError: '(slice(None, None, None), slice(0, 2, None))' is an invalid key. Please help me.
@chirathabey7729 4 years ago
Do the missing value treatment first, before you apply one-hot encoding.
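A minimal sketch of that ordering, assuming pandas. The column `color` and the mode fill are placeholders for whichever imputation technique is chosen.

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", None, "blue", "red"]})

# 1) Treat the missing values first (here: fill with the most frequent category).
df["color"] = df["color"].fillna(df["color"].mode()[0])

# 2) Only then one-hot encode, so every row lands in exactly one dummy column.
encoded = pd.get_dummies(df, columns=["color"])
```

If the encoding runs first, `pd.get_dummies` by default turns a missing value into a row of all zeros, and the gap can no longer be filled as a category. Separately, the quoted `TypeError` most likely comes from indexing a DataFrame NumPy-style, as in `df[:, 0:2]`; positional slicing on a DataFrame goes through `df.iloc[:, 0:2]`.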
@archanapereira1333 4 years ago
How do you identify dependent and independent variables in a dataset?
@chirathabey7729 4 years ago
It depends on the problem description, which tells you what the problem is. Your output (dependent) variable is the one that answers the problem. The rest of the features become your independent variables.
@cutyoopsmoments2800 5 years ago
Bro, I want to make my career in machine learning. Kindly guide me.
@arjyabasu1311 5 years ago
Sir, please upload the implementation of these methods!!
@harshtiwari8765 4 years ago
Can you send me the feature engineering notes given by Krish Naik? Help is appreciated.
@vasusharma1773 4 years ago
Sir, if you could just show this in code, it would be very helpful.
@nhprml6324 5 years ago
We can replace missing values with the corresponding feature's mean value.
@jaypatil4786 5 years ago
I have one easy question, but I can't remember the answer right now: please tell me how to view how many missing values there are in a dataset.
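For the question above, a short sketch assuming the dataset is loaded into a pandas DataFrame. The frame below is a made-up stand-in.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"f1": ["a", None, "b"], "f2": [1.0, np.nan, np.nan]}
)

# Missing values per column, then the grand total for the whole frame.
missing_per_column = df.isna().sum()
total_missing = int(missing_per_column.sum())
```

`df.isnull()` is an alias of `df.isna()`, and `df.info()` gives a quick view of the non-null count for each column.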