Feature Selection in Python | Machine Learning Basics | Boston Housing Data

Views: 50,133

PhD and Productivity


Comments: 71
@chantis1794 4 years ago
Thanks for this video. It is very helpful to me especially for filtering the most important features.
@shannonbytelaar2918 1 year ago
This really is an informative video! Thanks for the concise tutorial on feature selection.
@caraf6562 3 years ago
I skipped this video when it was posted because I wasn't really doing anything related to machine learning, but lately I've been looking more into math modeling and linear regression to find ways it can support my experimental research. Excellent tutorial and I'm excited to try to implement some of these things into my own data sets! Still on the data collection & cleaning phase but making progress :)
@PhDandProductivity 3 years ago
That's great! It seems more and more that, no matter what science (or even subject generally) you're involved with, machine learning techniques will creep in!
@zeeshan3703 3 years ago
Great Insights, and now I have SUBSCRIBED, and can't wait to see more from you. Huge Thank You!!!
@PhDandProductivity 3 years ago
Thanks for subscribing, Muhammad. And thanks for your comment 😊
@vigneshpadmanabhan 8 months ago
We could look into data transformations as a way to enhance the model, right? I mean, by applying the right transformations to each column, we have the potential to maximize their effectiveness, right?
@mengshuangfu8206 3 years ago
Thanks for the detailed explanation! I have some questions. I'm running an unsupervised learning study. What if I have 200+ features, and most of them don't seem to be correlated with each other? How do I reduce the features? And what's the best way to determine how many clusters I should choose?
@adeshinajohn3988 2 years ago
Hi, Fu. I'm sure you will have solved your challenge by now. Can you state what you did? Thanks.
@musicbeast2079 3 years ago
best video I have ever come across
@PhDandProductivity 3 years ago
Extreme compliment 😆
@narothamreddy9310 2 years ago
I have a question: should we remove outliers from the data before the feature selection process or not?
@da_ta 3 years ago
Apart from the low sound, great explanation and clear steps. Thank you for doing this!
@rafidhaque2722 2 years ago
thank you so much for providing the python scripts...your video was very helpful for me
@arindamjain6892 3 years ago
Loved it. Thanks
@PhDandProductivity 3 years ago
Glad you enjoyed it!
@abhinavvivek3259 3 years ago
I really appreciate this tutorial. It's really very helpful. However, I do have one quick question. Why haven't you used the train_test_split function to split the dataset and predict Y on the test dataset?
@PhDandProductivity 3 years ago
Thanks! Because I used cross-validation instead, which splits the data into training and test sets several times and then averages the error over the runs.
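To make the difference concrete, here is a minimal sketch comparing a single hold-out split with 5-fold cross-validation in scikit-learn. The synthetic data from `make_regression`, the `LinearRegression` model, and all variable names are assumptions for illustration, not the code used in the video.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for a housing-style regression dataset (an assumption).
X, y = make_regression(n_samples=200, n_features=8, n_informative=5,
                       noise=10.0, random_state=0)

# Single hold-out split: one R^2 estimate that depends on which rows
# happen to land in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
holdout_r2 = LinearRegression().fit(X_train, y_train).score(X_test, y_test)

# 5-fold cross-validation: five different train/test splits,
# one R^2 per fold, averaged afterwards.
cv_scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")

print(f"hold-out R^2:        {holdout_r2:.3f}")
print(f"cross-validated R^2: {cv_scores.mean():.3f} (+/- {cv_scores.std():.3f})")
```

Every row is used for testing exactly once across the five folds, so the averaged score is usually less sensitive to one lucky or unlucky split.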
@info-dawg 3 years ago
Amazing! Thanks
@PhDandProductivity 3 years ago
Glad you like it!
@muhammadjamalahmed8664 4 years ago
Really helpful.. Thank you..
@data_first 2 years ago
This video is great!
@maythamsaeed533 1 year ago
Thanks very much for this informative post
@markgoh8302 3 years ago
Thanks for the tutorial. I am starting out in ML. I have a question. I noticed in the video that you used R² to score your feature selection. At what point in the process are you concerned that you have overfitted the model?
@chanellioos 2 years ago
This 🔥 Kira
@akashprabhakar6353 4 years ago
Thanks for this video. I have several doubts:
1. Shouldn't we exclude features based on their correlation with other features, and not with the dependent variable?
2. I am working on a similar housing dataset with 61 features, so will these forward and backward wrapper methods work? How will I find the best combination of features to select, as you did here where the dataset has a smaller number of features?
3. Could you please explain the pipeline code line with StandardScaler() and KNeighbors? Is it automatically standardizing the numerical features and then implementing KNN?
4. My dataset has many categorical variables, and after creating dummy variables the number of features has gone up. Selecting individual levels of a category with forward or backward selection will not make any sense, so how do I select important categorical features?
Your reply is highly appreciated :)
@PhDandProductivity 4 years ago
Hi Akash, sorry for the delay.
1. The point of removing variables that have low correlation with the dependent variable is that they do not predict the dependent variable and so won't be useful to the model. The risk of only looking at correlation with other variables is that you could remove a potentially useful variable because of its correlation with a useless variable.
2. It works the same with a larger number of features.
3. The pipeline uses cross-validation with a StandardScaler, so first it will divide the data into 5 splits of training and test data, fit a scaler on the training data and apply it to the test data, then fit the KNN on the training data and test it on the test data.
4. With categorical features, if some feature levels are not useful, I would recommend grouping some together, e.g. levels 1-3, 4, 5-9.
Hope that helps.
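The pipeline behaviour described in point 3 can be sketched roughly as follows. This is an illustration under assumptions: the data is synthetic (`make_regression`), and `KNeighborsRegressor` with `n_neighbors=5` is a guess at the kind of KNN model meant, not the exact line from the video.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the housing dataset itself).
X, y = make_regression(n_samples=200, n_features=8, n_informative=5,
                       noise=10.0, random_state=0)

# Scaler and KNN are chained, so they are always fitted together.
knn_pipeline = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))

# For each of the 5 folds, cross_val_score clones the pipeline, fits the
# scaler and the KNN on the training part only, then applies the fitted
# scaler to the test part before scoring the KNN predictions.
pipeline_scores = cross_val_score(knn_pipeline, X, y, cv=5, scoring="r2")

print("R^2 per fold:", pipeline_scores.round(3))
print("mean R^2:    ", round(pipeline_scores.mean(), 3))
```

Putting the scaler inside the pipeline, rather than scaling the whole dataset up front, keeps statistics from the test fold out of the training step.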
@akashprabhakar6353 4 years ago
@PhDandProductivity thank you very much for this detailed reply
@tommy626 4 years ago
@PhDandProductivity thanks for your video and reply! Just a quick follow-up on this question: for those variables that have low correlation with the dependent variable, is the conclusion to keep or remove them? Thanks!
@JinzhuWANG 4 years ago
Really helpful, Thanks!
@PhDandProductivity 4 years ago
I'm glad!
@zuhramajid 3 years ago
thank you so much😍 pleaseee make more machine learning videos
@PhDandProductivity 3 years ago
Glad you enjoyed
@alifiaz7792 3 years ago
Well Explained
@PhDandProductivity 3 years ago
Thank you!
@xiaodongyang4841 1 year ago
Hi, your video is beneficial for my study. But I have a problem. Do your methods apply to financial data analysis? For instance, financial market predictions.
@voshark7586 2 years ago
Sorry. Can you tell me the role of feature selection, please? Thank you so much.
@xmine9077 2 years ago
Thanks a lot for this
@PhDandProductivity 2 years ago
My pleasure
@manishbolbanda9872 4 years ago
Are EDA, feature engineering and feature selection independent of the machine learning algorithm to be used? Any answer would be appreciated. Thanks.
@PhDandProductivity 4 years ago
manish bolbanda it can depend on which procedure you use, because different features will work better in different learning algorithms. So wrapper feature selection is often preferred over filtering when you are using a particular algorithm.
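As a rough illustration of a wrapper method tied to one algorithm, the sketch below runs forward selection around a KNN pipeline using scikit-learn's `SequentialFeatureSelector` (available from version 0.24). The selector class, the synthetic data, and the choice of three features are assumptions swapped in for illustration; the video may have used a different library.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 8 features, only some of them informative (assumed).
X, y = make_regression(n_samples=150, n_features=8, n_informative=3,
                       noise=5.0, random_state=0)

# The wrapper scores candidate feature subsets with the same model
# that will be used afterwards, here a scaled KNN regressor.
model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))

# Forward selection: start from no features and repeatedly add the one
# feature that most improves the cross-validated R^2.
selector = SequentialFeatureSelector(
    model, n_features_to_select=3, direction="forward", scoring="r2", cv=5)
selector.fit(X, y)

selected = selector.get_support(indices=True)
print("selected feature indices:", selected)
```

Setting `direction="backward"` instead starts from all features and removes one at a time. A filter method would rank features once, without reference to the KNN model at all.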
@m.randayandika3779 3 years ago
Thanks for the video! It's really good. But I have a problem with your code in the wrapper section because I have a different dataset. Can you help me solve my problem? Thanks a lot.
@uzhankocaman7457 4 years ago
very informative. thanks!
@ajaysingh001 4 years ago
Thanks, it's really nice
@PhDandProductivity 4 years ago
I'm glad you like it
@riccardosecci2637 4 years ago
Thank you for these videos :)
@PhDandProductivity 4 years ago
Riccardo Secci glad you like it!
@riccardosecci2637 4 years ago
@PhDandProductivity I am soon to be a PhD student in a field similar to yours (Bioinformatics) 😊 so I'm trying to deepen my understanding of machine learning! Your videos are very clear and explanatory.
@pranavipatel9259 3 years ago
Try to put the code on a webpage and put the link to it in the description.
@harshithbangera7905 3 years ago
Thanks 👍...
@PhDandProductivity 3 years ago
Thank you too
@simon-4530 4 years ago
Great video 😊 thanks
@PhDandProductivity 4 years ago
Glad you enjoyed it :)
@dataaholic 4 years ago
Really nice video. Loved it. Can you make videos on implementing Kernels and Gaussian Processes in python? Thanks in advance
@PhDandProductivity 4 years ago
Shubham Shakya unfortunately that's not something that I'm knowledgeable about.
@hiruki8 2 years ago
@PhDandProductivity is it something you've since become more knowledgeable about? 👀
@finnzhang1323 3 years ago
What should we use instead of R-squared in classification problems?
@siddharthchauhan8285 3 years ago
Will it be helpful for someone who is new to machine learning to join this community?
@bouabdellahtahar5126 3 years ago
How can I select features using swarm in Python?
@dashsingh30095 3 years ago
Doing great job
@hussainsalih3520 2 years ago
amazing keep it up
@akshitmiglani5419 3 years ago
Hi there, thank you for sharing your knowledge. I have a question: when we check the correlation, should we not check it only with continuous variables? I'm not sure what "correlation" tells us with categorical variables. Doubt (in a broad sense): should we not first check whether our variables are continuous or categorical and then choose the methods? For example: t-test/correlation for continuous variables and a chi-squared test for categorical variables after label encoding. This has always bothered me. Looking forward to finding out where the gap in my understanding is. Thank you!
@lionelshaneyfelt5607 3 years ago
you're awesome, thank u
@PhDandProductivity 3 years ago
You're awesome too. Thanks for the comment
@saeedhassiny7454 3 years ago
Please, how can I do feature selection on a 1D feature vector that represents the features extracted from an image?
@saikiranalagatham3555 3 years ago
Can we select features based on gain ratio or information gain? Like a tree-based approach?
@PhDandProductivity 3 years ago
Yes, that would be another option
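One way this option might look in scikit-learn is a filter based on mutual information, which is the quantity that information gain measures between a feature and the target. The sketch below is an assumption-laden illustration, not something shown in the video: it uses synthetic data, keeps an arbitrary three features, and relies on `mutual_info_regression` because the target here is continuous.

```python
from functools import partial

from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, mutual_info_regression

# Synthetic stand-in data (an assumption).
X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=5.0, random_state=0)

# Score each feature by its estimated mutual information with the target.
# The estimator is randomized, so random_state is fixed for repeatability.
score_func = partial(mutual_info_regression, random_state=0)

# Keep the k features with the highest scores (k=3 is an arbitrary choice).
mi_selector = SelectKBest(score_func=score_func, k=3)
X_selected = mi_selector.fit_transform(X, y)

print("mutual information per feature:", mi_selector.scores_.round(3))
print("kept feature indices:", mi_selector.get_support(indices=True))
```

For a classification target, `mutual_info_classif` plays the same role. Gain ratio has no direct scikit-learn equivalent that I am certain of, so it is left out here.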
@andisupriadichan5188 4 years ago
Hi Kira, thank you for your video. I'm Chan. Can I ask a question about your PhD research? I'm still a little confused about novelty in PhD research; my focus is data mining and ML. Can you tell me about this or make a video about it?
@PhDandProductivity 4 years ago
Andi Supriadi Chan hi thanks for commenting. I’m not really sure what you are asking.
@XX-vu5jo 4 years ago
PhD but doing basic stuff lol
@cgmiguel 4 years ago
That’s why she’s teaching. These are not PhD related.