House Price Prediction Project | Linear Regression | Machine Learning

7,094 views

InsightsByRish

Comments: 54

@InsightsByRish 7 months ago

Correction: at 12:08, upper bound = Q3 + 1.5 * IQR
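To make the corrected formula concrete, here is a minimal sketch of IQR-based outlier filtering. The toy data and the column name "Price" are assumptions for illustration, and the lower bound uses the standard companion rule Q1 - 1.5 * IQR, which the correction itself does not mention.

```python
import pandas as pd

# Hypothetical toy data; the column name "Price" is an assumption.
df = pd.DataFrame({"Price": [100, 110, 120, 130, 140, 150, 1000]})

q1 = df["Price"].quantile(0.25)
q3 = df["Price"].quantile(0.75)
iqr = q3 - q1

lower_bound = q1 - 1.5 * iqr  # standard companion rule (assumed here)
upper_bound = q3 + 1.5 * iqr  # the corrected formula from the comment

# Keep only the rows that fall inside [lower_bound, upper_bound].
filtered = df[(df["Price"] >= lower_bound) & (df["Price"] <= upper_bound)]
print(lower_bound, upper_bound, len(filtered))
```

In this toy example the value 1000 falls above the upper bound and is dropped.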

@hriupadhyay5763 6 months ago

Please provide a link to this particular dataset.

@ilyeskesri6524 5 months ago

Good tutorial, thanks.

@ansumankumar4797 8 months ago

Very nice explanation.

@InsightsByRish 8 months ago

Thank you. Pleased to know you liked the content.

@ibrahimnazari8376 6 months ago

Nice explanation, thanks.

@latifaharums 2 months ago

Hello, may I ask how to run predictions on test data?

@InsightsByRish 2 months ago

If you're asking how to make predictions on test data, you just need to load your trained model, saved in pickle format, and pass the test data to it.
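A minimal sketch of that workflow follows. The synthetic data, the file name model.pkl, and the temporary directory are assumptions made so the example is self-contained; in practice the model would already be trained and saved, and the test data would need the same preprocessing (scaling, encoding) that was applied to the training data.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Stand-in training step on synthetic data so the sketch is runnable.
rng = np.random.default_rng(0)
X_train = rng.random((50, 3))
y_train = X_train @ np.array([3.0, -2.0, 1.0]) + 5.0
model = LinearRegression().fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")  # hypothetical file name

    # Save the trained model in pickle format.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Later (for example in another notebook): load it back.
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

# Pass the (already preprocessed) test data to the loaded model.
X_test = rng.random((5, 3))
predictions = loaded_model.predict(X_test)
print(predictions)
```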

@hoangha6680 3 months ago

Thanks for the video. But I think filling missing values in the target variable with the mean isn't a good idea, since it can lead to bias or loss of information.

@InsightsByRish 3 months ago

So according to you, how should the missing values in the target column be handled?

@hoangha6680 3 months ago

@InsightsByRish In this case, we can consider dropping those rows.

@InsightsByRish 3 months ago

@hoangha6680 That can only be done when you have a large volume of data. If you drop rows from a dataset that's already small, you'll lose a significant portion of the data, which can eventually hamper your model's performance.
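The two options discussed in this thread can be compared side by side. This is only a sketch on hypothetical toy data (the column names "Area" and "Price" are assumptions); it shows the mechanics and how many rows each option costs, not which option is better for a given dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values in the target column.
df = pd.DataFrame({
    "Area": [1000, 1500, 2000, 2500, 3000],
    "Price": [200000.0, np.nan, 400000.0, np.nan, 600000.0],
})

# Option 1: drop the rows whose target value is missing.
dropped = df.dropna(subset=["Price"])

# Option 2: fill the missing targets with the mean of the observed targets.
filled = df.assign(Price=df["Price"].fillna(df["Price"].mean()))

rows_lost = len(df) - len(dropped)
print(f"Dropping loses {rows_lost} of {len(df)} rows")
print(filled)
```

Checking `rows_lost` against the dataset size is one quick way to judge whether dropping is affordable.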

@frootiiiiiiii 2 months ago

Ma'am, why is y = df['price'] used here instead of y = df[['price']]? Why is a 1-D Series used here?

@InsightsByRish 2 months ago

In y = df['Price'], the single square brackets return the column as a 1-D pandas Series (array-like), whereas in y = df[['Price']], the double square brackets return a one-column DataFrame (2-D). Conventionally, the target column is passed as a 1-D array rather than as a DataFrame. (That's good practice.)
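The difference between the two indexing forms is easy to check directly. A small sketch, assuming a toy frame with hypothetical "Area" and "Price" columns:

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions.
df = pd.DataFrame({
    "Area": [1000, 1500, 2000],
    "Price": [200000, 300000, 400000],
})

y_series = df["Price"]    # single brackets -> pandas Series (1-D)
y_frame = df[["Price"]]   # double brackets -> pandas DataFrame (2-D)

print(type(y_series).__name__, y_series.shape)  # Series (3,)
print(type(y_frame).__name__, y_frame.shape)    # DataFrame (3, 1)
```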

@frootiiiiiiii 2 months ago

@InsightsByRish Ma'am, after applying all these techniques, the Mean Absolute Error is still around 11k. How can I fix this? Could it be that the MinMax Scaler needs to be changed to Standard Scaler or something else? Please suggest a solution.

@ibrahimdenisfofanah6420 2 months ago

@frootiiiiiiii Try changing the MinMax Scaler to the Standard Scaler.
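One way to test this suggestion is to fit the same model with both scalers and compare the MAE. The sketch below uses synthetic data (an assumption, not the video's dataset). Note that for plain, unregularized linear regression with an intercept, rescaling the features is just a linear change of variables, so the two scalers would be expected to give essentially the same error; the choice of scaler matters more for regularized or distance-based models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic features on very different scales (illustrative assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4)) * np.array([1.0, 10.0, 100.0, 1000.0])
y = X @ np.array([5.0, 0.5, 0.05, 0.005]) + rng.normal(scale=2.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the same linear model behind each scaler and record the test MAE.
results = {}
for name, scaler in [("minmax", MinMaxScaler()), ("standard", StandardScaler())]:
    pipe = make_pipeline(scaler, LinearRegression())
    pipe.fit(X_train, y_train)
    results[name] = mean_absolute_error(y_test, pipe.predict(X_test))

print(results)
```

If the two numbers match on the real data as well, the remaining error is more likely to come from the features, the outlier handling, or the model choice than from the scaler.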

@vikashsingh6888 6 months ago

It shows an error when I use select_datatype('object'), and when I change it to int, my column size changes. Please help 🙏 (at line 27)

@InsightsByRish 6 months ago

It's not select_datatype('object'). It is select_dtypes('object').
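For reference, a short sketch of `select_dtypes` on a toy frame. The column names are hypothetical, and the categorical column is created with an explicit object dtype so the example behaves the same across pandas versions.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions.
df = pd.DataFrame({
    "MSZoning": pd.Series(["RL", "RM", "RL"], dtype=object),
    "LotArea": [8450, 9600, 11250],
    "Price": [208500.0, 181500.0, 223500.0],
})

# Categorical (object-dtype) columns, e.g. the ones to pass to an encoder.
cat_cols = df.select_dtypes(include="object").columns.tolist()

# Numeric columns (ints and floats).
num_cols = df.select_dtypes(include="number").columns.tolist()

print(cat_cols)
print(num_cols)
```

Selecting by int instead of object returns a different set of columns, which would explain the change in column count mentioned above.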

@vikashsingh6888 6 months ago

Yes, I tried that one too, but it still shows an issue. When I run the encoder, it shows an error saying that you have to select int or str.

@vikashsingh6888 6 months ago

@InsightsByRish I tried that too, but it still shows an error. When I run the encoder part, encoder.fit(df[cat_col]) shows an error saying that you should select int or str.

@InsightsByRish 6 months ago

@vikashsingh6888 Hey, I can't really tell what's going wrong with your code, but I would suggest restarting your kernel and running the code again from the beginning, line by line, exactly as it is (I've provided the code on GitHub).

@vikashsingh6888 6 months ago

If you could share your Instagram or some other contact, even an email, I will send you the screenshot. It would help me a lot 🙏

@flavourlyfreshasmr151 7 months ago

Can you explain the logic behind how you remove columns and rows from the original dataset?

@InsightsByRish 7 months ago

The logic is that columns that don't help in making predictions are removed, such as the User ID column, which has zero correlation with the target column. Rows are removed if they contain null values or outliers.
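As a rough sketch of that logic, the snippet below inspects each column's correlation with the target, drops an identifier column, and then drops rows containing nulls. The toy data and the column names "Id", "Area", and "Price" are assumptions for illustration, not the video's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: an arbitrary identifier, one feature, and the target.
df = pd.DataFrame({
    "Id": [4, 1, 6, 2, 5, 3, 7],
    "Area": [1000, 1200, 1500, 1800, 2000, 2400, np.nan],
    "Price": [200000, 240000, 310000, 350000, 410000, 470000, 500000],
})

# Inspect how strongly each column correlates with the target.
corr_with_target = df.corr()["Price"].drop("Price")
print(corr_with_target)

# Drop the identifier column, then drop rows that contain null values.
df_model = df.drop(columns=["Id"]).dropna()
print(df_model.shape)
```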

@flavourlyfreshasmr151 7 months ago

@InsightsByRish Can you make a separate video discussing which features we should keep and which we should remove for a particular problem statement? Your videos are so helpful.

@InsightsByRish 7 months ago

Thank you! Will make one on that topic soon.

@flavourlyfreshasmr151 7 months ago

@InsightsByRish Great 👍

@alaa_abdestar 3 months ago

Kaggle dataset link, please?

@InsightsByRish 3 months ago

I downloaded the dataset from GeeksforGeeks: media.geeksforgeeks.org/wp-content/uploads/20240905183434/HousePricePrediction.xlsx

@akshit_trivedi 7 months ago

Can you please make a playlist of all projects?

@InsightsByRish 7 months ago

Yes, sure!

@Luminouss12096 6 months ago

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
Ma'am, this part of the code shows an error. How can I solve this?

@InsightsByRish 6 months ago

What's the error?

@Luminouss12096 6 months ago

@InsightsByRish model.fit(X_train, y_train) Input X contains NaN. LinearRegression does not accept missing values encoded as NaN natively. For supervised learning, you might want to consider sklearn.ensemble.HistGradientBoostingClassifier and Regressor which accept missing values encoded as NaNs natively. Alternatively, it is possible to preprocess the data, for instance by using an imputer transformer in a pipeline or drop samples with missing values.

@InsightsByRish 6 months ago

@Luminouss12096 I think your input data frame (X) still contains missing values. Try removing them before building the model. If you still encounter the error, then email your .ipynb file to insightswithrish@gmail.com.
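The error message itself points to one possible fix: an imputer in a pipeline. Below is a minimal sketch on hypothetical toy arrays; the mean strategy is an assumption, and dropping the affected rows is the alternative the message also mentions.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy features that still contain NaN values.
X_train = np.array([
    [1000.0, 3.0],
    [1500.0, np.nan],
    [np.nan, 4.0],
    [2500.0, 5.0],
])
y_train = np.array([200000.0, 300000.0, 400000.0, 500000.0])

# First confirm that missing values are the problem.
nan_count = int(np.isnan(X_train).sum())
print("NaN values in X_train:", nan_count)

# Impute missing features with the column mean before fitting the model.
model = make_pipeline(SimpleImputer(strategy="mean"), LinearRegression())
model.fit(X_train, y_train)
print(model.predict(X_train))
```

With a pandas DataFrame, `X_train.isna().sum()` gives the same check per column.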