Project 6. Wine Quality Prediction using Machine Learning with Python

Views: 84,962

A day ago

Hi! I will be conducting one-on-one discussions with all channel members. Check out the perks and join the membership if interested: / @siddhardhan
This video is about Wine Quality prediction using Machine Learning with Python. This is one of the important Machine Learning projects.
All presentation files for the Machine Learning course as PDF for as low as ₹200 (INR): Drop a mail to siddhardhans2317@gmail.com
Enroll at One Neuron to learn from 100 courses in one subscription with 5% discount: courses.ineuro...
Hi guys! I am Siddhardhan. I work in the field of Data Science and Machine Learning. It all started with my curiosity to learn about Artificial Intelligence and the ability of AI to solve several Real Life Problems. I worked on several Machine Learning & Deep Learning projects involving Computer Vision.
I am on this journey to empower as many students & working professionals as possible with the knowledge of Machine Learning and Artificial Intelligence.
Hello everyone! I am setting up a donation campaign for my KZbin Channel. If you like my videos and wish to support me financially, you can donate through the following means:
From India 👉 UPI ID : siddhardhselvam2317@oksbi
Outside of India? 👉 Paypal id: siddhardhselvam2317@gmail.com
(No donation is small. Every penny counts)
Thanks in advance!
Let's build a Community of Machine Learning experts! Kindly Subscribe here👉 tinyurl.com/md...
I am making a "Hands-on Machine Learning Course with Python" in KZbin. I'll be posting 3 videos per week. 2 videos on Machine Learning basics (Monday & Wednesday Evening). 1 video on a Machine Learning project (Friday Evening).
Dataset file: www.kaggle.com...
Colab File Link: colab.research...
Download the Course Curriculum File from here: drive.google.c...
LinkedIn: / siddhardhan-s-741652207
Telegram Group: t.me/siddhardhan
Facebook group: www.facebook.c... Instagram: / siddhardhan23

Comments: 123

@thabor7402 3 months ago

Six Projects in and I'm so grateful that you committed to educate us in so much detail and repetitiveness, this is probably gonna change my life for the better and I might never even get to meet you. So Thank you Sidd!🙏

@digigoliath 3 years ago

Wine appreciation through machine learning. Fantastic to know what makes a good quality wine. TQVM. Had great fun with this one!!

@Siddhardhan 3 years ago

Glad you enjoyed it!😅 thanks 😇

@AR-gw2vo 7 months ago

Thank you for helping me a lot about a ML project from scratch. I really appreciate you for your hard work. 🎉

@AkashSharma-my5hv 2 years ago

32:55 this changes everything in the data set, great information.

@abundanceontheway2511 22 hours ago

Great explanation on Random forest brother..Thanks a lot..now I understood everything!

@thirniyaprabaharan5835 4 months ago

Thank you for helping me a lot to learn about a real world application related ML model

@cypher5873 2 years ago

Hi! I've 2 questions after watching this tutorial: How can I do labelling when I need more than 2 quality measures? How can I print the quality value of the output (ML generated) from my given input parameters?
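
One possible way to label more than two quality levels is to bin the score with `pandas.cut`. This is only a sketch: the scores, the cut points, and the names "bad", "medium", and "good" are assumptions for illustration, not values from the video.

```python
import pandas as pd

# Hypothetical quality scores standing in for the dataset's 'quality' column.
quality = pd.Series([3, 4, 5, 6, 7, 8])

# Three bins instead of two: (2, 4] -> bad, (4, 6] -> medium, (6, 8] -> good.
# The cut points are an assumption; pick the ones that suit your problem.
labels = pd.cut(quality, bins=[2, 4, 6, 8], labels=["bad", "medium", "good"])

print(labels.tolist())
```

A classifier trained on such labels returns one label per input row from `predict`, so printing the first element of that result shows the predicted quality class for a single wine.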

@GhoshAnujit 1 year ago

Thank you so much, this was a precise and fluid explanation, helped me a lot.

@renanboaventura9105 3 years ago

Another great video, congratulations. There are two charts that I like to plot:
1) This one shows the distribution of each attribute (i.e. column):
    import matplotlib.pyplot as plt
    import seaborn as sns
    for i, col in enumerate(wine_dataset.columns):
        plt.figure(i)
        sns.distplot(wine_dataset[col])
2) This one shows the comparison for each possible pair of attributes by wine quality. It is worth plotting, and you can take some insights from it:
    sns.pairplot(wine_dataset, hue='quality')
    plt.show()

@oyeleyeolalekan4486 1 year ago

Your videos are very great.... however, you don't fine tune the model....I have watched your hyperparameter tuning but you don't show it in projects. I sincerely love your videos

@alejandrosierra3743 3 years ago

Very cool, but random state is always 42. Do not go against "The Hitchhiker's Guide to the Galaxy". PD: Great job

@rohanshah8129 2 years ago

haha!

@hellothere6983 3 years ago

dude u deserve a sub thanks man helped a lot

@Siddhardhan 3 years ago

Glad I could help😇

@yashwanth3033 1 year ago

I'm still confused about how to choose the correct algorithm for the dataset. Can you help me?

@RojaShankar1303 7 months ago

Even me

@kakashi-yu7ok 5 months ago

I think you should first see whether the target variable calls for regression or classification, and choose the subset of algorithms from that. Then train each candidate model, compare the accuracy (for classification) or the root mean squared error (for regression), and select the best one.

@sagarchaudhary97610 1 month ago

Bro, experiment with everything.

@AddisBelayneh 4 months ago

Great video! Thank you so much!

@nidhi2212 5 months ago

thank you so much... it is very useful for new ideas & learning..

@LoneWolf-rj1px 2 years ago

How do we know which machine learning model is better for which data set? You have shown Logistic Regression in one model, SVM in another, and Random Forest in this model.

@rohanshah8129 2 years ago

Simply create a project where you have tested all the models you have in mind and compare the results. Choose the one with better output and optimize it ahead. Try this video:- kzbin.info/www/bejne/baavq3qIob2Letk
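
The "test all the models and compare" advice above could be sketched with cross-validation. This is a minimal illustration on synthetic data from `make_classification`, not the wine dataset; the three candidate models and their settings are assumptions chosen to mirror the ones named in this thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a dataset with 11 numeric features.
X, y = make_classification(n_samples=300, n_features=11, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Cross-validation is used here instead of a single train/test split so that the ranking depends less on one lucky or unlucky split.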

@karishmasewraj6437 2 years ago

For the splitting of the data can we use the parameter stratify = y to equalize the target data ?

@Aalekh_Chaudhary 1 year ago

Yes
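
For reference, `stratify=y` in scikit-learn's `train_test_split` does not balance the classes; it keeps the class proportions the same in the train and test sets. A small sketch with hypothetical labels (80 "bad" and 20 "good"; the numbers are made up for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 "bad" (0) and 20 "good" (1).
y = np.array([0] * 80 + [1] * 20)
X = np.arange(100).reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Both splits keep the original 20% share of "good" labels.
print(y_train.mean(), y_test.mean())
```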

@sanmatipol3201 9 months ago

Thank you very much !your teaching is really good

@ShortQuikies 8 months ago

thank you sir so much this video helped me alot . i cant define it in terms thank you so much sir .

@leftmpl 3 years ago

Good explanation sir, but your approach has some serious problems. 1. There are a lot of outliers. 2. Accuracy is high but the other metrics are really bad. This is caused by the high imbalance of the dataset: nearly all test data are classified as bad quality wine, and this is why the accuracy is so high. Splitting good and bad quality into the ranges [3,5] and [6,8] would be a better approach for dealing with the imbalance problem. Treating the problem with regression modeling might also be a better solution.
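
The point about accuracy hiding poor performance on the minority class can be illustrated with a degenerate model that always predicts "bad". The labels below are hypothetical (90 bad, 10 good) and are not results from the video.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical test labels: 90 "bad" (0) wines and 10 "good" (1) wines.
y_true = np.array([0] * 90 + [1] * 10)

# A degenerate "model" that always predicts "bad".
y_pred = np.zeros_like(y_true)

accuracy = accuracy_score(y_true, y_pred)
good_recall = recall_score(y_true, y_pred)  # share of good wines that were found

print(accuracy, good_recall)
```

The accuracy looks high only because most labels are "bad"; the recall for the "good" class shows that not a single good wine was identified.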

@afeezlawal5167 2 years ago

How can the outliers problem be solved sir?
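
One common option for outliers (not shown in the video, so treat it as an assumption) is the interquartile-range rule: drop values further than 1.5 × IQR from the quartiles. A minimal sketch on a made-up column:

```python
import pandas as pd

# Hypothetical column with one extreme value.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the values inside the IQR fences.
filtered = values[values.between(lower, upper)]

print(filtered.tolist())
```

Whether removing such rows is appropriate depends on the data; an extreme measurement can also be a genuine wine, so clipping or a robust model are alternatives worth considering.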

@JeevanEG 8 months ago

Thanks for valuable information

@9941521791 2 years ago

Hi Bro, Your videos are great and I really appreciate your effort. I have a question as follows. Do we need to standardize the data mandatorily, whenever there is a different range of values in independent variables? I am asking because we did the data standardization in project#2 but not in project#5 & 6. I personally feel that the data standardization using standardscaler will certainly help the model to improve the prediction accuracy. What do you think? Regards, Prakash

@Siddhardhan 2 years ago

Hi! Standardization is an important process. We don't have to do it if our dataset contains several categorical columns. Standardization should not be performed on categorical columns. I may not have done standardization in few videos. It's purely because of the length of the video. And about ur doubt on whether it will improve ur model's performance, it definitely helps. It's not obvious in certain cases. But in case of certain datasets, you can get a better accuracy and performance when u standardize the data

@9941521791 2 years ago

@@Siddhardhan Thanks for your reply.
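
As a sketch of the standardization discussed in this thread: scaling can be placed in a scikit-learn pipeline so that it is fitted on the training data only. The data here is synthetic and the choice of `SVC` is an assumption; distance- and margin-based models such as SVM tend to be sensitive to feature scale, while tree-based models such as random forest generally are not.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a dataset with 11 numeric features.
X, y = make_classification(n_samples=200, n_features=11, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler learns its mean and variance from X_train only,
# then the same transformation is applied to X_test at scoring time.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(test_accuracy)
```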

@meghnarawat3820 2 years ago

Great video! Thank you so much!

@growingfire 4 months ago

Thank you so much!

@MuhammadKamran-ii4rh 3 years ago

Once again a perfect video. Hats off.

@Siddhardhan 3 years ago

Thank you so much 😀

@tanvibamrotwar 1 year ago

Hi Siddhardhan. Thanks for the video. I want to ask you something. I applied different models to my dataset and built a predictive system, but for every entry it says bad quality only. Can you tell me where I'm going wrong? Is it because I standardized my data?

@nikitasinha8181 1 year ago

Thank you so much sir

@devanshujain4650 3 years ago

Sir, this is a regression project. You changed the dependent values into classes using a lambda function and then applied RandomForestClassifier. How is this possible? I used regression and got an accuracy of 40 percent with random forest. I am not able to understand how you got this much. Plus I applied R2 score as it was a regression model.

@Siddhardhan 3 years ago

hi! I took a classification approach. it depends on our problem statement and the outcome that we want. and R2 score is not percentage value. you need to do some research on that. if you get the R2 value as 0.4 then it's actually a good model. it's not 40 percentage.
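
To make the difference between R2 and accuracy concrete: R2 is 1 minus the ratio of the residual sum of squares to the total sum of squares, so 1 means a perfect fit, 0 means no better than always predicting the mean, and it can be negative. The numbers below are made up for illustration.

```python
from sklearn.metrics import r2_score

# Hypothetical true and predicted quality scores.
y_true = [3, 5, 6, 7]
y_pred = [4, 5, 5, 7]

# R2 computed by hand: 1 - SS_res / SS_tot.
mean_true = sum(y_true) / len(y_true)
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
manual_r2 = 1 - ss_res / ss_tot

library_r2 = r2_score(y_true, y_pred)
print(manual_r2, library_r2)

# Always predicting the mean gives an R2 of zero.
baseline_r2 = r2_score(y_true, [mean_true] * len(y_true))
print(baseline_r2)
```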

@techyreport7992 3 years ago

Siddhardhan sir, please help me differentiate the algorithms which are specifically made for classification and regression respectively.

@ToanvaKhoahocmaytinh 1 year ago

Thank you for sharing

@premalathas623 3 years ago

Very good explanation..

@Siddhardhan 3 years ago

Thanks 😇

@suruchikumari2360 3 years ago

great job ........keep it up ....and thanks a lot

@Siddhardhan 3 years ago

Most welcome😇

@sezermezgil9304 3 years ago

Hey, great tutorial. I have 2 questions. First, why didn't we standardize our data, or should we? Secondly, when we split our data we sometimes use the parameter 'stratify', but here we didn't use it. Could you explain why? Thank you

@techyreport7992 3 years ago

stratify keeps the class proportions almost the same in the train and test sets, so that the model is trained and evaluated on similarly distributed data

@ashoka8929 2 years ago

This is very useful I want this project report

@shyampraveen4203 2 years ago

Can I get The PPT ;-; by the way Your Explanation was Awesome

@Namangen 3 years ago

thank you so much its exactly what I wanted.

@Siddhardhan 3 years ago

Glad I could help!😇

@sachinvithubone4278 3 years ago

This is the classification problem correct? And mostly we did study in classification problem I think.

@Siddhardhan 3 years ago

yeah, we also have Projects in Regression & one clustering Project. there will be separate playlists on those. kindly check.

@sachinvithubone4278 3 years ago

@@Siddhardhan sure, I will check it.. mention in the project title whether it's a classification, clustering or regression problem so people can find it easily on KZbin.. just a suggestion..😌

@sachinvithubone4278 3 years ago

In which use case or dataset can we use RandomForestRegressor and RandomTreesEmbedding?

@nithinkumbam5525 3 years ago

Hey! What about the count values of the output variable y? In the data analysis part you showed a graph of the quality variable where most of the values are between 4 and 6, and in label binarization you took the cut-off as 7, which means most of the quality values are converted to 0 (zeros). There is a chance of an imbalanced dataset! Correct me if I am wrong 🙌

@Siddhardhan 3 years ago

hi! it's upto our consideration. you can take the values from 6 as label 1 as well.

@nithinkumbam5525 3 years ago

Okay thanks for the content.!
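
The effect of the cut-off discussed in this thread can be checked by counting labels for each threshold. The quality scores below are hypothetical, and the lambda is only a sketch of the kind of binarization described in the comments.

```python
import pandas as pd

# Hypothetical quality scores; most values sit in the middle of the scale.
quality = pd.Series([5, 5, 6, 6, 5, 7, 4, 6, 8, 5])

good_counts = {}
for threshold in (7, 6):
    # Label 1 ("good") when the score reaches the threshold, else 0 ("bad").
    labels = quality.apply(lambda score: 1 if score >= threshold else 0)
    good_counts[threshold] = int(labels.sum())
    print(f"threshold {threshold}: {good_counts[threshold]} good out of {len(labels)}")
```

With these made-up scores, lowering the threshold from 7 to 6 moves the labels from a clear imbalance to an even split, which is the trade-off raised above.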

@sameerabanu3115 1 year ago

You might even have worked on outliers

@gauravfamily2209 3 years ago

great. But at first, you should complete all ML algo. theory.

@Siddhardhan 3 years ago

sure

@kanishkagour6356 2 years ago

@26:05. correlation is not working in Jupyter Notebook. Do you have any solution regarding this.

@joe_fu 3 years ago

Very detailed, thanks

@Siddhardhan 3 years ago

my pleasure 😇

@bharatm3195 2 years ago

I am getting 'TypeError: missing 1 required positional argument: 'y'' while training the model... can anyone explain??
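
One likely cause of this error, offered as a guess since the commenter's code is not shown, is calling `fit` with only the features and no labels. Another common cause is calling `fit` on the class itself instead of on an instance created with parentheses. A minimal sketch of the first case, with made-up data:

```python
from sklearn.ensemble import RandomForestClassifier

# Tiny made-up dataset: one feature, two classes.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

model = RandomForestClassifier(n_estimators=10, random_state=0)

caught = None
try:
    model.fit(X)  # labels are missing, so Python raises a TypeError
except TypeError as error:
    caught = error
    print(error)

# Passing both features and labels works.
model.fit(X, y)
prediction = model.predict([[2.5]])
print(prediction)
```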

@srinukomarapuri7441 2 years ago

Can you explain why not doing outliers reduced method in this dataset?

@koushikguptabonthala2429 3 years ago

Very good explanation

@Siddhardhan 3 years ago

thanks 😇

@sachinvithubone4278 3 years ago

Really helpful this project, thanks 😊

@Siddhardhan 3 years ago

You're welcome 😊

@sachinvithubone4278 3 years ago

In train test split, when you did print(x.shape, x_train.shape, x_test.shape), it shows only rows. If I am not wrong, it should show the rows and the feature columns.

@Siddhardhan 3 years ago

hi! here we are not printing x. we are printing only y. y contains only one column which represents the label. kindly check.

@sachinvithubone4278 3 years ago

@@Siddhardhan okay..

@magical5051 2 years ago

What is that green,red,violet representing.is it different bottles of wine

@MuhammadHamza-ki3ze 2 years ago

I want to classify this in three types medium good and bad but I cannot figure it out. If you know what to do please let me know.

@ahmedabid6799 3 years ago

thnx teacher but why the accuracy of trainig data is 100%..?
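
A possible explanation, sketched on synthetic data rather than the wine dataset: by default the trees in a random forest are grown until they fit the training rows almost perfectly, so training accuracy is usually close to 100%. The test accuracy is the number that says how well the model generalizes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data with some label noise (flip_y) so it cannot be learned perfectly.
X, y = make_classification(n_samples=400, n_features=11, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=3
)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

train_accuracy = accuracy_score(y_train, model.predict(X_train))
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(train_accuracy, test_accuracy)
```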

@roshankshirsagar8665 2 years ago

Sir how I get to know which model is suitable for a particular problem?

@rohanshah8129 2 years ago

@myparadise6137 4 months ago

What is the language used for the front-end?

@ashwinizende7923 3 years ago

Very Good video but still i am not able to understand how can we choose which model is for what problem?

@Siddhardhan 3 years ago

hi! watch the videos in 7th module. (intuition behind models)

@sayanaajayan9471 3 years ago

Thank you so much :)

@Siddhardhan 3 years ago

You're welcome!😇

@swastikmohanty7370 2 years ago

I know this is supervising learning...but how can you choose it is random forest but not svm...I am in doubt while choosing the model...can you guide me

@Doraemon67812 2 years ago

He tries all the models and then finds this one helpful; that part is not shown in the video

@Doraemon67812 2 years ago

helpful means high accuracy try to apply all model by yourself you will get your answer

@d_62_sourabhvankudre41 1 year ago

In train test split there is an error (not enough values to unpack (expected 5, got 4)). Please, can somebody help? It would be a great help.
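
A probable cause of "expected 5, got 4" is having five names on the left-hand side of the assignment: `train_test_split` returns two outputs per input array, so passing `X` and `y` yields exactly four. A minimal sketch with made-up arrays:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 10 rows, 2 feature columns, one label per row.
X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2

# Two input arrays produce exactly four outputs, in this order.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# Feature arrays keep (rows, columns); label arrays have rows only.
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```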

@ashoka8929 2 years ago

What about this project report sir??

@rohitgaloth1547 3 years ago

Sir how can we input n values as input and reshape it?
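
A sketch of predicting for a single input: the values are turned into a NumPy array and reshaped to one row with `reshape(1, -1)`, because scikit-learn models expect a 2D array of shape (samples, features). The training data, the input values, and the printed messages are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy model trained on random data with 11 features (stand-in for the real model).
rng = np.random.default_rng(0)
X = rng.random((50, 11))
y = (X[:, 10] > 0.5).astype(int)
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Hypothetical measurements for one wine, one value per feature.
input_data = (0.3, 0.6, 0.0, 0.2, 0.1, 0.5, 0.4, 0.9, 0.3, 0.5, 0.8)

# reshape(1, -1) means: one row, as many columns as there are values.
sample = np.asarray(input_data).reshape(1, -1)
prediction = model.predict(sample)

print("Good Quality Wine" if prediction[0] == 1 else "Bad Quality Wine")
```

The same reshape works for any number of input values, as long as that number matches the number of features the model was trained on.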

@shwetharaju6496 2 years ago

In wine quality prediction by taking the different value its not predicting. im getting error

@siddharthsharma5162 3 years ago

Is this classification or regression ?

@Siddhardhan 3 years ago

classification

@shwetharaju6496 2 years ago

Random Forest Algorithm is not showing . how to fix the error

@bhagyashreenarwade355 3 years ago

can we do the same in some IDE?

@Siddhardhan 3 years ago

Yes, definitely.

@ketanpatil4921 3 years ago

Veryyyy good explanation 👍👍👍

@Siddhardhan 3 years ago

Thank you 😊

@vanshikarathi2356 1 year ago

Hey can someone tell me why we did not standardize the data

@nadhiyakandaswami 2 years ago

48:20 Build a Predictive System

@riyashah2530 2 years ago

Sir pl give me dataset link here

@faizansaqeeb3390 3 years ago

Share resources where to learn ml for this project

@Siddhardhan 3 years ago

hi! watch videos in my machine learning course playlist: kzbin.info/aero/PLfFghEzKVmjsNtIRwErklMAN8nJmebB0I you will be able to understand this project.

@koushikguptabonthala2429 3 years ago

Can we create confusion matrix

@Siddhardhan 3 years ago

hi! yes, you can create
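
A confusion matrix can be built with scikit-learn's `confusion_matrix`. The labels below are made up for illustration; in practice `y_true` would be the test labels and `y_pred` the model's predictions on the test features.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 0 = bad quality, 1 = good quality.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 0]

# Rows are the true classes, columns are the predicted classes.
matrix = confusion_matrix(y_true, y_pred)
print(matrix)
```

For a binary problem the layout is `[[true negatives, false positives], [false negatives, true positives]]`.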

@dimitriskapsis6018 3 years ago

Very nice video! I also tried SVM but it didn't seem to work properly; it always predicted bad quality even though I did standardize the data after I reshaped it.

@Siddhardhan 3 years ago

hi! try changing the model and do some optimizations... in my future videos, I'll cover topics on optimization

@dimitriskapsis6018 3 years ago

@@Siddhardhan Is it normal for the SVM not working right though?

@Siddhardhan 3 years ago

How well a model works also depends on the nature of the dataset. You can search on Google for the pros and cons of SVM and other models. That information will help you choose a better model. There is no exact rule for this all the time.

@dimitriskapsis6018 3 years ago

@@Siddhardhan it works if i label good quality for greater than 6. but i see what you mean. Keep up your great work! thank you very much!

@Siddhardhan 3 years ago

Yes! You can definitely try