Applied AI course plus Krish's tutorial! Deadly combination. I am in love with ML, thanks to you. You just changed my perception and gave me a perfect strategy to proceed. God bless you, man!
@ganeshahire8168 5 years ago
Bro, what percentage of the course have you completed?
@unsharma9229 5 years ago
Which course are you talking about? Where can I get it? Please tell me.
@akashsaha3921 4 years ago
@@ganeshahire8168 now doing case studies
@akashsaha3921 4 years ago
@@unsharma9229 search for Applied AI on Google
@tulrose 4 years ago
You are right. I have completed the course recently. I come here for quick revisions.
@shatadruroychowdhury6319 3 years ago
5:00 That condition should have been > 0. If you have a feature with only 1 missing value, you won't be able to capture it.
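The difference between the two thresholds can be sketched on a toy frame. This is a minimal illustration, not the notebook's code; the column values are made up, and only the `isnull().sum()` check comes from the thread.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical data): "Electrical" has exactly one missing value.
dataset = pd.DataFrame({
    "LotFrontage": [65.0, np.nan, np.nan, 60.0],
    "Electrical": ["SBrkr", None, "FuseA", "SBrkr"],
    "LotArea": [8450, 9600, 11250, 9550],
})

missing_counts = dataset.isnull().sum()

# "> 1" skips a column that has a single missing value.
features_gt_1 = [f for f in dataset.columns if missing_counts[f] > 1]
# "> 0" (equivalently ">= 1") keeps it.
features_gt_0 = [f for f in dataset.columns if missing_counts[f] > 0]

print(features_gt_1)  # ['LotFrontage']
print(features_gt_0)  # ['LotFrontage', 'Electrical']
```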
@Ravi-sl5ms 5 years ago
You have such a nice style of explaining things. Waiting eagerly for the next part.
@RahulGupta-kd1cn 5 years ago
On dividing into train and test: if we already have separate train and test data, we can split the train data into train and validation sets, and keep the test data for testing the model.
@astrostudent2302 4 years ago
In the finding missing values part at 05:04, I have a doubt. Why is it not dataset[feature].isnull().sum() >= 1 (I have added the equality operator)? Can you please clarify, sir?
@RavinderSingh-te8vy 3 years ago
Maybe for features having only 1 missing value, the complete row can be deleted.
@c.vinaykumar7737 3 years ago
Hello Krish sir, your videos are brilliant. One slight correction: while calculating the percentage of NaN values in columns, you are using mean. After taking the mean you should multiply by 100 to get a percentage. Pardon me if I am wrong.
@youssefzayn1159 2 years ago
It is not necessary; multiplying by 100 just makes it clearer, but you already know that 0.53 is 53%.
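A short sketch of why `isnull().mean()` gives a fraction and how to turn it into a percentage. The data below is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical columns: 3 of 4 "Alley" values and 1 of 4 "LotFrontage" values are missing.
dataset = pd.DataFrame({
    "Alley": ["Grvl", None, None, None],
    "LotFrontage": [65.0, np.nan, 68.0, 60.0],
})

# isnull() yields booleans; their mean is the fraction of missing rows (0 to 1).
missing_fraction = dataset.isnull().mean()

# Multiply by 100 only if an explicit percentage is wanted for display.
missing_percent = (missing_fraction * 100).round(2)

print(missing_fraction["Alley"])       # 0.75
print(missing_percent["LotFrontage"])  # 25.0
```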
@devadularani6811 5 years ago
Explained well...will be waiting for tomorrow's video
@ocean2738 1 year ago
In the train test split part there should be dataset.drop('SalePrice', axis=1), dataset['SalePrice']
@ananthkumar8901 11 months ago
Yes, you are correct.
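Separating the target from the features before splitting might look like the sketch below. The rows are made up, and the pandas-only hold-out split is a stand-in for scikit-learn's `train_test_split` mentioned in the thread, used here only to keep the example dependency-free.

```python
import pandas as pd

# Hypothetical miniature of the housing data.
dataset = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550, 14260],
    "YearBuilt": [2003, 1976, 2001, 1915, 2000],
    "SalePrice": [208500, 181500, 223500, 140000, 250000],
})

# Features must not contain the target, otherwise the model sees the answer.
X = dataset.drop("SalePrice", axis=1)
y = dataset["SalePrice"]

# Simple 80/20 hold-out split with pandas only (stand-in for train_test_split).
X_train = X.sample(frac=0.8, random_state=0)
X_test = X.drop(X_train.index)
y_train = y.loc[X_train.index]
y_test = y.loc[X_test.index]

print(list(X.columns))            # ['LotArea', 'YearBuilt']
print(len(X_train), len(X_test))  # 4 1
```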
@Mish-333 1 month ago
Good explanation of the code, but if you look closely, he has not explained the logic behind it: he did not explain the real reasons why the particular steps are followed, or say what the ultimate goal of the code is.
@csprusty 4 years ago
Amazing effort and clear explanation. great work Krish!!!
@ijeffking 5 years ago
Very nice tutorial. Thank you very much.
@sahayaajay7684 4 years ago
Thank you for sharing your knowledge. I am waiting for your part 2 of feature engineering.
@sandipansarkar9211 4 years ago
Finished practising this particular code in Jupyter Notebook. thanks.
@rakeshdayalan8049 4 years ago
Krish, you're awesome! Thanks for your videos.
@aashishrana4129 3 years ago
very insightful video, thanks krish
@ishankanodia7477 7 months ago
For train test split, we should pass X and y in the brackets, so why are we using the complete dataset in place of X? (It contains SalePrice too.)
@AmanKumarSharma-de7ft 5 years ago
Great work sir 👏👏👏 simplicity as always. Please upload the ROC curve video, the Docker video, and one on how to write a research paper in ML and DL. Thanks for your support.
@ManishKumar-qs1fm 5 years ago
Really nice video 👍 sir. In the next part, please explain how highly negatively or positively skewed data is made normal in code.
@sarrae100 3 years ago
Excellent, yet simple.
@Abdullahkbc 2 years ago
pretty smart way at 09:58
@ananthkumar8901 11 months ago
Krish Naik video + ChatGPT/Bard is a deadly combination
@arunprabhu1853 5 years ago
Hi Krish, I am in need of guidance from experts like you. My qualification is MSc (IT). My entire work experience is as a "Computer Teacher". I am in my 30s too. Out of self-interest, I am currently pursuing some data science courses online. My question is whether companies will consider me for an IT job, even as a fresher. Please guide me on this. I am planning to change my profession.
@sidnayak4395 3 years ago
Yes. One of my faculty members from engineering college, with 4+ years of experience, is working as an ML Engineer.
@raghupro 4 years ago
Thanks for the video Krish. Would like to understand why for identifying features with missing values you have considered isnull().sum() > 1 and not > 0. If a feature has only 1 missing value, can we omit it?
@tanishajain9073 4 years ago
Did you get the answer?
@prabhatkumarsharma4240 3 years ago
As you said, in the case of many outliers the missing values for the feature should be replaced with the median or mode. Can you elaborate on why? And what should we do if our variables don't have many outliers? Please answer if possible; it may clear up others' doubts as well.
@madhuradhongade7632 2 years ago
It might be because high outliers imply a skewed distribution, and hence the mean would not be the correct measure of central tendency to consider. Take an example: given the set of numbers 10, 15, 17, 20, 95, the mean would be 31.4 although most of the data is centered around 15/16. The reason is 95, the outlier. Hence it makes more sense to consider the median, i.e. 17, as it better represents the given set of numbers. Correct me if I am wrong or missing something.
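The numbers in this reply can be checked directly with Python's standard library. The extra comparison without the outlier is added here only for illustration.

```python
from statistics import mean, median

values = [10, 15, 17, 20, 95]

# The single outlier (95) pulls the mean well above the bulk of the data.
print(mean(values))    # 31.4
# The median depends only on the middle value and barely reacts to the outlier.
print(median(values))  # 17

# Dropping the outlier shows how far the mean was being dragged.
without_outlier = values[:-1]
print(mean(without_outlier))    # 15.5
print(median(without_outlier))  # 16.0
```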
@dipanshuawhad7396 2 years ago
For that you can refer to the 7-day live statistics playlist; Krish covers there which of mean, median, or mode is suitable when outliers are present in the dataset, and why.
@sandipansarkar9211 4 years ago
Great video Krish. Now need to get my hands dirty with coding in Jupyter Notebook. Thanks
@143balug 4 years ago
Hi Krish, I have observed one thing here: the temporal feature "GarageYrBlt" had its missing values replaced while handling the numerical features. Could you please correct me if I am wrong?
@aimeokoko726 5 years ago
Thanks for your videos, very useful. I have a question. To handle missing values, you use "sum()>1". Does it take into account features with only one missing value?
@manojrangera 3 years ago
Use sum() >=1
@manishchouhan6626 3 years ago
@@manojrangera Use sum() > 0
@miteshkumar7739 5 years ago
Hello sir, how does a data scientist work in a company? Please make a practical video.
@peacefullmusic8374 3 years ago
@Krish Naik bro why did use simple imputer for missing values ?
@ayanasalim3318 4 years ago
If the year fields are subtracted from the year sold, is there handling for the situation where a year value is zero, say in the year-modified field? Because 2020 - 0 would give 2020 years since modified.
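One way to guard against that case is sketched below. It assumes, hypothetically, that a 0 in a year column is a placeholder for "unknown"; whether the actual dataset contains such zeros is not stated in the thread, and the rows are invented.

```python
import numpy as np
import pandas as pd

dataset = pd.DataFrame({
    "YrSold": [2008, 2007, 2010],
    "YearBuilt": [2003, 1976, 2001],
    "YearRemodAdd": [2003, 0, 2002],  # 0 = hypothetical "unknown" placeholder
})

for feature in ["YearBuilt", "YearRemodAdd"]:
    # Treat 0 as missing so it does not turn into a bogus ~2000-year gap.
    years = dataset[feature].replace(0, np.nan)
    # Convert the year into "years elapsed at time of sale".
    dataset[feature] = dataset["YrSold"] - years

print(dataset)
```

The placeholder row ends up as NaN, so it can then be handled by whatever missing-value strategy is used for the other numerical features.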
@unezkazi4349 4 years ago
The percentages of missing values that you print at the start are divided by 100, I guess. You need to multiply them by 100.
@unezkazi4349 4 years ago
And how did you handle categorical features? By just replacing NaN with "Missing"?
@vinayakbasavaraddi3135 4 years ago
Why are we not creating dummies for categorical variables, instead of just replacing the null values with "Missing"?
@vignesh7687 3 years ago
Nice Krish. I have one question, why can't we fill the NAs with mode value of that feature in a categorical feature column? Why encoding those as 'Missing'?
@ranganathjoshi1592 3 years ago
Because it can even be used for encoding (if needed).
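The two options discussed here can be compared side by side. This is a rough sketch on invented values: filling with a "Missing" label keeps missingness as its own category that a later encoder can see, while filling with the mode merges those rows into the most frequent category.

```python
import pandas as pd

# Hypothetical categorical column with two missing entries.
dataset = pd.DataFrame({"GarageType": ["Attchd", None, "Detchd", "Attchd", None]})

# Option 1: keep missingness visible as a separate category.
as_missing = dataset["GarageType"].fillna("Missing")

# Option 2: impute with the most frequent category.
mode_value = dataset["GarageType"].mode()[0]
as_mode = dataset["GarageType"].fillna(mode_value)

print(as_missing.value_counts().to_dict())  # {'Attchd': 2, 'Missing': 2, 'Detchd': 1}
print(as_mode.value_counts().to_dict())     # {'Attchd': 4, 'Detchd': 1}
```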
@saipatibandla4049 4 years ago
Hi Krish, when I looked at the description for this dataset, some of the categorical features had 'NA' as one of the categories. I think this conflicts with data not being present vs showing some category as NA. Wouldn't that be like a problem when deciding which data is actually missing?
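The conflict described here comes from pandas treating the string "NA" as missing by default when reading a CSV. A hedged sketch with a made-up three-row file: passing `keep_default_na=False` together with `na_values=[""]` keeps "NA" as a literal category and marks only empty cells as missing. Which columns genuinely use "NA" as a category has to be checked against the dataset's description file.

```python
import io
import pandas as pd

# Hypothetical CSV: "NA" is meant as a real category, the empty cell is truly missing.
csv_text = "Id,Alley,Fence\n1,NA,MnPrv\n2,Grvl,\n3,NA,NA\n"

# Default behaviour: both "NA" and empty cells become NaN.
default = pd.read_csv(io.StringIO(csv_text))

# Keep "NA" as text; only empty strings count as missing.
literal = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=[""])

print(default.isnull().sum().to_dict())  # {'Id': 0, 'Alley': 2, 'Fence': 2}
print(literal.isnull().sum().to_dict())  # {'Id': 0, 'Alley': 0, 'Fence': 1}
```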
@vijendramathur1483 5 years ago
Any idea about Algo Trading?
@SuperChowhan 5 years ago
Please help me with the Anaconda installation. I have reinstalled Anaconda but I cannot find anaconda-navigator or the Anaconda command prompt. No shortcuts related to Anaconda are found. Please help me.
@udaymishra3238 4 years ago
Which OS?
@rohitjaiswal6102 5 years ago
Thank you sir...
@tirumaleshn8504 4 years ago
Krish sir! Why didn't you use the train and test data separately for feature engineering?
@raghavarora5077 4 years ago
What if we use K - fold Cross validation instead of Train-test-split?
@louerleseigneur4532 3 years ago
Thanks Krish
@suryapratap1961 4 years ago
dataset.groupby('YrSold')['SalePrice'].median().plot(). Here median will give one value, so how can we plot from this?
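After a `groupby`, `median()` is computed once per group, so the result is a Series with one value per year rather than a single number; that Series is what `.plot()` would draw. A small sketch with invented prices (plotting omitted):

```python
import pandas as pd

# Hypothetical sales across three years.
dataset = pd.DataFrame({
    "YrSold": [2006, 2006, 2007, 2007, 2007, 2008],
    "SalePrice": [200000, 180000, 150000, 170000, 210000, 160000],
})

# One median per distinct YrSold; the index is the year.
median_by_year = dataset.groupby("YrSold")["SalePrice"].median()

print(median_by_year.to_dict())  # {2006: 190000.0, 2007: 170000.0, 2008: 160000.0}
```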
@mohanramesh3506 4 years ago
Hi Krish, How do I handle a column with 'Text description' i.e, a paragraph of text in it. Please let me know.
@abdurahman1019 1 year ago
I am a beginner at data science and Python as well. Am I expected to be able to write all this code by myself, or is understanding it enough for an interview?
@adityapathania3618 3 years ago
For numerical NaN values, I am getting all of them replaced by 0 and none by 1. Any suggestions? @krish
@saswatleo 4 years ago
Why are you replacing NaN with 0 or 1 when you have already replaced it with the median? Please clarify.
@adityanarendra5886 3 years ago
Which feature is replaced with 0,1,median respectively?
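The 0/1 values and the median discussed in these comments go into two different columns. A minimal sketch, assuming an indicator column named by appending "nan" to the feature name (a hypothetical naming choice here) and invented data. The indicator has to be built before the fill; if the NaNs are filled first, every indicator value comes out as 0.

```python
import numpy as np
import pandas as pd

# Hypothetical numerical feature with two missing values and one outlier.
dataset = pd.DataFrame({"LotFrontage": [65.0, np.nan, 68.0, np.nan, 1000.0]})
feature = "LotFrontage"

# Median of the observed values (robust to the outlier).
median_value = dataset[feature].median()

# Step 1: record where values were missing (1 = was NaN, 0 = observed).
dataset[feature + "nan"] = np.where(dataset[feature].isnull(), 1, 0)

# Step 2: only now fill the original column with the median.
dataset[feature] = dataset[feature].fillna(median_value)

print(dataset)
```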
@manojrangera 3 years ago
Sir, I have one question. Among the categorical features, some have more than 90% of their data missing. So can we remove those features from the dataset? I also saw the EDA video, in which we found that missing data plays an important role in SalePrice. Maybe that is why you used all the features, to keep all the information. Please reply, sir, and clear my doubt. 🙏🙏
@rajsekharrouthu8438 4 years ago
Can we do feature engineering for train and test data together at once?
@rohitjaiswal6102 5 years ago
Please upload the 2nd part.
@amitbudhiraja7498 3 years ago
Sir, you forgot to remove the outliers in the data.
@sumironchatterjee6289 4 years ago
In the last part, my 'YearBuilt' and 'YearRemodAdd' got converted, but 'GarageYrBlt' did not. Someone help, please.
@shreyapande9289 4 years ago
Why replace the NaN values ONLY with the median or mode in numerical features? Is there any specific reason?
@shashireddy7371 4 years ago
Because the feature has outliers. If you take the mean, it will be a completely wrong value.
@mranaljadhav8259 4 years ago
Because the feature has outliers. If you take the mean here, it is influenced by the outliers and the skewed distribution, so the median is a good way to deal with outliers.
@mansisarda2259 4 years ago
Sir, why are the GarageYrBuilt values getting converted to object datatype after handling their missing values?
@mranaljadhav8259 4 years ago
I think you typed something wrong while handling missing values. Check your code; my dataset['GarageYrBuilt'] is of float type.
@ramavathusrinivas8282 4 years ago
Why are we using mean here: dataset[feature].isnull().mean()?
@aashiagarwal9870 4 years ago
I need help, sir. If I run the same code on the test data, which variable should I use in place of SalePrice?
@thisdot3955 3 years ago
I too have the same doubt.
@ranjan4495 4 years ago
Sir, the test.csv dataset does not contain a feature named "SalePrice". So how do we proceed with this dataset?
@Joshua75623 4 years ago
Hi, did you get the answer to this question?
@mranaljadhav8259 4 years ago
Hey, the test data doesn't contain the target variable.
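Since the test file has no SalePrice, one common approach is to apply only the feature-side steps to it, reusing statistics learned from the training data. The sketch below is an assumption about how that could be done, not the video's code; the frames and values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical train frame (with target) and test frame (without target).
train = pd.DataFrame({
    "LotFrontage": [60.0, np.nan, 80.0],
    "SalePrice": [150000, 180000, 200000],
})
test = pd.DataFrame({"LotFrontage": [np.nan, 70.0]})

# Work only on the feature columns; the target is skipped for the test data.
feature_columns = [c for c in train.columns if c != "SalePrice"]

# Learn the fill values from the training data only.
train_medians = train[feature_columns].median()

# Apply the same fill values to both frames.
train[feature_columns] = train[feature_columns].fillna(train_medians)
test[feature_columns] = test[feature_columns].fillna(train_medians)

print(test)
```

Reusing the training medians keeps information from the test rows out of the preprocessing step.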
@lijindurairaj2982 3 years ago
GOD bless you
@Neuraldata 4 years ago
Great video sir, I have also started the initiative to teach Data Science online for knowledge dissemination :)
@manishbolbanda9872 4 years ago
discrete_feature = [feature for feature in num_feature if len(dataset[feature].unique())
@christiansetzkorn6241 1 year ago
nothing advanced even in part 2 )-:
@gulsanafatima419 4 years ago
Sir, please give this same lecture in Urdu as well.
@sandipansarkar9211 3 years ago
code finished
@devleenabanerjee4036 4 years ago
Hi Krish, thanks for such a nice explanation. If I want to share my file with you, where should I send it? Please share your email ID. Thank you.
@krishnaik06 4 years ago
krishnaik06@gmail.com
@devleenabanerjee4036 4 years ago
@@krishnaik06 thank you
@scott.bradley.16940 5 years ago
Put the camera so we can also see your mouth. It makes it easier to understand what you are saying.