Bigmart Sales Prediction Analysis (Regression) | Machine Learning | Python

45,580 views

Hackers Realm

1 day ago

⭐️ Content Description ⭐️
In this video, I explain Big Mart sales prediction analysis, including data exploration, preprocessing, creating new attributes, exploratory data analysis, and model training. This project focuses mainly on advanced preprocessing techniques that will be useful in various competitions.
Text-based Tutorial: www.hackersrealm.net/post/big...
GitHub Code Repo: bit.ly/mlcoderepo
Dataset: www.kaggle.com/devashish0507/...
🌐 Website: www.hackersrealm.net
🔔 Subscribe: bit.ly/hackersrealm
🗓️ 1:1 Consultation with Me: calendly.com/hackersrealm/con...
📷 Instagram: / aswintechguy
🔣 Linkedin: / aswintechguy
🎯 GitHub: github.com/aswintechguy
🎬 Share: • Bigmart Sales Predicti...
⚡️ Data Structures & Algorithms tutorial playlist: bit.ly/dsatutorial
😎 Hackerrank problem solving solutions playlist: bit.ly/hackerrankplaylist
🤖 ML projects tutorial playlist: bit.ly/mlprojectsplaylist
🐍 Python tutorial playlist: bit.ly/python3playlist
💻 Machine learning concepts playlist: bit.ly/mlconcepts
✍🏼 NLP concepts playlist: bit.ly/nlpconcepts
🕸️ Web scraping tutorial playlist: bit.ly/webscrapingplaylist
Make a small donation to support the channel 🙏🙏🙏:-
🆙 UPI ID: hackersrealm@apl
💲 PayPal: paypal.me/hackersrealm
🕒 Timeline
00:00 Introduction to Big mart Sales Prediction Analysis
02:22 Import modules and load data using pandas
04:19 Statistics data information
06:50 Preprocessing the big mart sales data
26:41 Creation of new attributes for big mart sales
32:57 Exploratory Data Analysis of big mart sales
41:26 Correlation matrix for big mart sales
43:25 Label Encoding for data preprocessing
46:42 One Hot Encoding for data preprocessing
49:39 Splitting the data for training & testing
51:16 Model Training & Testing for big mart sales
#bigmartsalesprediction #machinelearning #hackersrealm #regression #deeplearning #datascience #model #project #artificialintelligence #beginner #analysis #python #tutorial #aswin #ai #dataanalytics #data #bigdata #programming #datascientist #technology #coding #datavisualization #computerscience #pythonprogramming #analytics #tech #dataanalysis #iot #programmer #statistics #developer #ml #business #innovation #coder #dataanalyst

Comments: 166
@HackersRealm 3 years ago
In the train function, I didn't use a train/test split for the prediction. You can see the difference between the metrics obtained by training on the whole data and predicting on it, and the cross-validation score. You should train the model on the whole data only when you are using separate test data to predict!!!
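A minimal sketch of the distinction the pinned comment draws, assuming scikit-learn and synthetic data in place of the Big Mart dataset (the `evaluate` helper and all variable names are hypothetical, not taken from the video). It compares the error measured on the same rows the model was fitted on with the mean 5-fold cross-validated error:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption; not the Big Mart dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=1.0, size=300)


def evaluate(model, X, y):
    """Return (MSE on the rows used for fitting, mean 5-fold CV MSE)."""
    model.fit(X, y)
    train_mse = mean_squared_error(y, model.predict(X))
    # scikit-learn reports negated MSE so that higher is better; flip the sign.
    cv_mse = -cross_val_score(
        model, X, y, scoring="neg_mean_squared_error", cv=5
    ).mean()
    return train_mse, cv_mse


tree_train_mse, tree_cv_mse = evaluate(DecisionTreeRegressor(random_state=0), X, y)
lin_train_mse, lin_cv_mse = evaluate(LinearRegression(), X, y)

print(f"tree:   train MSE={tree_train_mse:.3f}  CV MSE={tree_cv_mse:.3f}")
print(f"linear: train MSE={lin_train_mse:.3f}  CV MSE={lin_cv_mse:.3f}")
```

On data like this, a fully grown tree reproduces its training rows almost exactly, so its training error says little about unseen data; the cross-validated error is the number to compare across models.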
@speedrun7000 3 years ago
Thank you so much mate! You're a hero
@HackersRealm 3 years ago
You're welcome!!!
@SMdAijaz 3 years ago
I will be the 999 + 1th subscriber of this channel and this channel deserves it too...😁😁😁 Congrats on 1k subscribers🎉🎉
@HackersRealm 3 years ago
Thanks for subbing!!! :)
@connormac98 3 years ago
Brilliant and very helpful video
@HackersRealm 3 years ago
Glad you think so!
@pratikzanak8023 3 years ago
Cool!!.. Very Nice And Helpful Video😇
@HackersRealm 3 years ago
welcome!!!
@rishabhtewari4357 3 years ago
Very well explained .. thanks
@HackersRealm 3 years ago
Glad you liked it!!!
@sodiqrafiu9072 3 years ago
Thanks a lot for the good job. More of it please?
@HackersRealm 3 years ago
Yeah sure, I will post updates on a weekly basis
@soumisarkar603 3 years ago
Really appreciate your efforts... Thank you for your explanation. The question says 'find out sales of each product at a particular store'; what is the answer to this part?
@HackersRealm 3 years ago
If you look at the xtest data, the test product and store info are there without the output; that is what we are trying to predict. During deployment, we can use user input for the same attributes
@soumisarkar603 3 years ago
@@HackersRealm ok thank you
@shriyasharma1095 2 years ago
Hey, it would be really helpful if you predicted the sales values too. You did the modelling part (a big thank you!). Can you please do the prediction part too? Thanks in advance!
@HackersRealm 2 years ago
You can see how to do test prediction in a video in the playlist. Check the projects playlist and search for prediction on test data
@harshpreetsingh5474 2 years ago
@@HackersRealm could you please share a link?
@HackersRealm 2 years ago
@@harshpreetsingh5474 kzbin.info/www/bejne/eIHSq2h-l5uWjac
@athanikarammy7476 3 years ago
Hey Ashwin, can you make a video showing how to tackle different types of datasets with different strategies? That would be very helpful to everyone who tries to solve problems on their own and runs into issues. Sincere request.
@HackersRealm 3 years ago
Sure, I'll make one that contains the tips and tricks
@adorablecheetah2930 3 years ago
Hi, thanks for the video, but I don't know why you said LinearRegression is the better option when ExtraTrees gives a much lower MSE and a higher CV score. Can you please explain?
@HackersRealm 3 years ago
Thanks for watching the video. For a regression problem, the CV score should also be lower.
@mrchief3383 2 years ago
If the CV score is much bigger than the MSE, the model is overfitting the training set, so it will predict inaccurately because it fails to generalize
@viralwatch3891 3 years ago
Can you please tell me what the item outlet sales column is about? And for how many days are sales predicted?
@HackersRealm 3 years ago
Item outlet sales represents the sales of an item in a particular outlet. There is no time series in the dataset
@chiragghotikar2398 3 years ago
Hello sir, I just want to ask one thing: why did you remove the Item_Identifier and Outlet_Identifier columns in the model above?
@HackersRealm 3 years ago
An identifier column is simply an ID and doesn't contain useful information; that's why we remove IDs in most projects
@chiragghotikar2398 3 years ago
@@HackersRealm thank you sir
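A small sketch of the step discussed in this thread, assuming pandas and a two-row toy frame (the values are invented for illustration; `Item_MRP` and `Item_Outlet_Sales` are assumed column names, while the two identifier columns are the ones named in the question):

```python
import pandas as pd

# Toy rows (invented values; only the column layout matters here).
df = pd.DataFrame({
    "Item_Identifier": ["FDA15", "DRC01"],
    "Outlet_Identifier": ["OUT049", "OUT018"],
    "Item_MRP": [249.8, 48.3],
    "Item_Outlet_Sales": [3735.1, 443.4],
})

id_cols = ["Item_Identifier", "Outlet_Identifier"]

# Features: everything except the ID columns and the target.
X = df.drop(columns=id_cols + ["Item_Outlet_Sales"])
# Target: the column the model is asked to predict.
y = df["Item_Outlet_Sales"]

print(X.columns.tolist())
```

`drop` returns a new frame, so `df` keeps its identifier columns in case they are needed later, for example to label predictions.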
@utkarshkunwar4319 3 years ago
Nice work Ashwin! Could you please elaborate some more on the code you've written in line 15 (which afterwards becomes 20)? I am not able to comprehend the role of .loc[ ] here, and I really don't want to cram the logic just for the sake of it. TIA!
@HackersRealm 3 years ago
Thanks for watching the video!!! The .loc indexer is used to select particular rows or index labels in the dataframe.
@utkarshkunwar4319 3 years ago
@@HackersRealm thanks for the reply Ashwin. Although I know what .loc[ ] does, I am not able to figure out how we are using it to fill in the values. I hope I have framed my doubt clearly 😅
@spitfirelast8761 2 years ago
@@utkarshkunwar4319 The syntax was mismatched in the previous cell. I'm new to Jupyter.
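To make the question in this thread concrete, here is a hedged sketch of how `.loc` can write values into only the rows that are missing. It assumes pandas and a toy frame; the per-item mean used for filling is an illustrative choice and may differ from the exact code in the video:

```python
import numpy as np
import pandas as pd

# Toy data (invented): item C has no known weight at all.
df = pd.DataFrame({
    "Item_Identifier": ["A", "A", "B", "B", "C"],
    "Item_Weight": [9.0, np.nan, 5.0, np.nan, np.nan],
})

# Mean weight per item; items whose weights are all missing drop out.
mean_weight = df.groupby("Item_Identifier")["Item_Weight"].mean().dropna()

# Boolean mask: True for each row whose weight is missing.
missing = df["Item_Weight"].isnull()
# Fallback for items with no known weight anywhere in the data.
overall_mean = df["Item_Weight"].mean()

# df.loc[rows, column] on the left of "=" assigns to exactly those cells.
df.loc[missing, "Item_Weight"] = (
    df.loc[missing, "Item_Identifier"].map(mean_weight).fillna(overall_mean)
)

print(df)
```

The mask selects the rows and the column label selects the column, so rows that already had a weight are left untouched.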
@spitfirelast8761 1 year ago
I'm getting an error on cell 22, the correlation matrix. It keeps saying could not convert string to float: "name". What am I doing wrong?
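One likely cause of the error above is that pandas 2.0 and later no longer drop text columns silently when computing correlations. A minimal sketch, assuming pandas and an invented three-column frame, that restricts the correlation matrix to numeric columns:

```python
import pandas as pd

# Invented rows; Item_Type is a text column that cannot be correlated.
df = pd.DataFrame({
    "Item_Type": ["Dairy", "Snacks", "Dairy", "Meat"],
    "Item_MRP": [100.0, 150.0, 120.0, 200.0],
    "Item_Outlet_Sales": [1000.0, 1600.0, 1150.0, 2100.0],
})

# Keep only numeric columns before computing pairwise correlations.
numeric = df.select_dtypes(include="number")
corr = numeric.corr()

print(corr)
```

Passing `numeric_only=True` to `DataFrame.corr` is an alternative that should have the same effect in pandas versions that support that parameter.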
@vempatigopi5039 3 years ago
Bro, I am getting an error: Input contains NaN, infinity or a value too large for dtype('float64')
@HackersRealm 3 years ago
In which part are you getting the error?
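A quick way to narrow down an error like the one quoted above is to count missing and infinite values per column before fitting. This is a sketch with invented values, assuming pandas and NumPy; the column names are assumptions:

```python
import numpy as np
import pandas as pd

# Invented feature frame with one NaN and one infinite value.
X = pd.DataFrame({
    "Item_Weight": [9.3, np.nan, 17.5],
    "Item_Visibility": [0.016, 0.019, np.inf],
})

# Missing values per column.
nan_counts = X.isnull().sum()
# Infinite values per numeric column.
inf_counts = np.isinf(X.select_dtypes(include="number")).sum()

print(nan_counts)
print(inf_counts)
```

Any column with a non-zero count needs to be imputed or cleaned before it is passed to a scikit-learn estimator that does not accept such values.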
@minecraftbutmemes4695 3 years ago
Can you tell me what output your model gives: is it a one-month sales prediction or a current sales prediction? Could you elaborate?
@HackersRealm 3 years ago
Sales based on the input attributes
@manognanalluri931 3 years ago
At the 58th cell we have 3 categories, right (high, medium, low)? So can we create a new attribute?
@HackersRealm 3 years ago
It's already a category column. Can you explain a little how we could create a new attribute from it?
@manognanalluri931 3 years ago
@@HackersRealm Can we take (1987, 1997, 1999, 2002, 2004, 2007, 2009) as one category, 1985 as one category, and 1998 as another category, so that we get 3 categories?
@HackersRealm 3 years ago
@@manognanalluri931 I don't think it will improve the score much, because we would just be splitting the same category into different ones. Try it, and if you get an improved score, share it in the comments; it will be a good learning!!!
@WajidAfr 3 years ago
Sir, how do I implement inventory prediction in an ecommerce website? Please help me
@HackersRealm 3 years ago
You need a dataset of past records; then you can do an analysis similar to this one
@WajidAfr 3 years ago
@@HackersRealm ok I will try
@minecraftbutmemes4695 3 years ago
Hi Ashwin, nice work. I tried completing your project, but when I run model.predict and enter the respective values, it gives 8.6 as the answer. Could you explain what this 8.6 is: the current value, next month's sales value, or the required percentage of sales? And why is the value so small if it is a price value? Thanks for this video
@HackersRealm 3 years ago
The value is on a log scale, so you have to convert it back using exponentiation; it then represents the sales value. Each column is an input parameter, and the model predicts the sales based on those
@minecraftbutmemes4695 3 years ago
Can you tell me how many months or years ahead the prediction is for?
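The conversion described in the reply above can be sketched as follows. This assumes the target was transformed with `np.log1p` (that is, log(1 + sales)) before training; if a plain `np.log` was used instead, `np.exp` is the matching inverse. The sales figures are invented:

```python
import numpy as np

# Invented sales values and their log-scale counterparts.
sales = np.array([443.42, 2097.27, 5500.0])
log_sales = np.log1p(sales)

# expm1 undoes log1p, so the round trip recovers the original values.
recovered = np.expm1(log_sales)

# A model output of 8.6 on the log scale, mapped back to sales units.
approx_sales = np.expm1(8.6)

print(recovered)
print(round(float(approx_sales), 2))
```

Under that assumption, a prediction of 8.6 corresponds to a sales value in the low-to-mid five thousands, not to a price of 8.6.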
@sasumsudha 3 years ago
Dear sir, please make a video on how to handle Big Mart sales with the two datasets train.csv and test.csv, and how to do the train/test split. Please, sir.
@HackersRealm 3 years ago
You can use model.predict on the test dataset. Anyway, I will upload a video on how to do that
@sivasathvikpaturi209 2 years ago
Hey, can I use the same approach for any dataset of my choice?
@HackersRealm 2 years ago
No, you have to preprocess based on the dataset
@sagar8460830871 3 years ago
Can we use LSTM and Prophet for this dataset?
@HackersRealm 3 years ago
No, we can't use them for this dataset
@sodiqrafiu9072 3 years ago
I am unable to play the video. Could you please check if it can be fixed?
@HackersRealm 3 years ago
The video is working. Can you check again?
@anonymously809 3 years ago
It would be great if you had shown how to deploy
@HackersRealm 3 years ago
Just covering the basics for now; deployment will be added in a future video
@Nostalgiccplay 3 years ago
Does anyone know how to give input values at runtime in an ML prediction program???
@HackersRealm 3 years ago
You can take the input from the user, convert it to a dataframe or multidimensional array, and get the output from the model
@raviteja-op1nk 3 years ago
@@HackersRealm can you please upload a short video on that
@HackersRealm 3 years ago
@@raviteja-op1nk I explained that in one of my latest videos; still, I will try to upload a separate video for it
@raviteja-op1nk 3 years ago
@@HackersRealm can I talk to you personally
@HackersRealm 3 years ago
@@raviteja-op1nk reach me via Instagram or LinkedIn; links are available in the video description
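A minimal sketch of the approach suggested in this thread: collect the values, turn them into a one-row DataFrame with the training column order, and call `predict`. It assumes scikit-learn and pandas; the two features, the toy training data, and the `predict_from_input` helper are all hypothetical:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Column order the model was trained with (hypothetical features).
FEATURES = ["Item_MRP", "Outlet_Years"]

# Toy training data with an exactly linear target, for illustration only.
train = pd.DataFrame({
    "Item_MRP": [50.0, 100.0, 150.0, 200.0],
    "Outlet_Years": [4.0, 14.0, 26.0, 9.0],
})
train["Sales"] = 10.0 * train["Item_MRP"] + 2.0 * train["Outlet_Years"]
model = LinearRegression().fit(train[FEATURES], train["Sales"])


def predict_from_input(model, raw_values):
    """Build a one-row frame from user-supplied values and predict."""
    # Reorder to the training column order and convert strings to floats.
    row = pd.DataFrame([raw_values])[FEATURES].astype(float)
    return float(model.predict(row)[0])


# Values as they might arrive from input() or a web form (strings, any order).
user_values = {"Outlet_Years": "10", "Item_MRP": "120"}
prediction = predict_from_input(model, user_values)
print(prediction)
```

In a real program the same preprocessing used for training (encoding, scaling, log transforms) would have to be applied to the user's row before calling `predict`.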
@adorablecheetah2930 3 years ago
I have a doubt, please. Are you not training on the complete data and then predicting on the same data?
@HackersRealm 3 years ago
No, we are not training on the complete data; there is a data split for training and testing
@meetjamsutkar394 2 years ago
17:24 My output is coming out as 1463 instead of 0. I've followed the video to a tee; what am I doing wrong?
@HackersRealm 2 years ago
I guess you made a mistake in a previous cell; check it once
@spitfirelast8761 2 years ago
I saw it upon reviewing. It's the " ".
@vikasreddy8575 3 years ago
Is there any software or project for copying captchas and pasting them? Please tell me, sir
@HackersRealm 3 years ago
For getting the captcha text, you can use an OCR project
@vikasreddy8575 3 years ago
@@HackersRealm can you share the link to the OCR project
@katheashokkumar2072 3 years ago
What is feature extraction in the video? Are the attributes features?
@HackersRealm 3 years ago
Yes, the input attributes are features
@katheashokkumar2072 3 years ago
@@HackersRealm Thank you for your quick reply. And how do I do PCA for these attributes?
@HackersRealm 3 years ago
@@katheashokkumar2072 you can directly use the PCA module from sklearn
@katheashokkumar2072 3 years ago
@@HackersRealm Thanks a lot... 💐
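As a hedged illustration of the reply above, this sketch applies `sklearn.decomposition.PCA` to a synthetic numeric feature matrix. Standardizing first and keeping three components are illustrative choices, not something stated in the video:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic numeric features (an assumption); column 3 is made to
# correlate strongly with column 0 so PCA has redundancy to remove.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
X[:, 3] = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

# PCA is scale-sensitive, so standardize each column first.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_scaled)

# Fraction of total variance retained by the kept components.
explained = pca.explained_variance_ratio_.sum()
print(X_reduced.shape, round(float(explained), 3))
```

Categorical attributes would need to be encoded as numbers before this step, since PCA only operates on numeric input.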
@sreeharim5976 3 years ago
Which regression are we using here?
@HackersRealm 3 years ago
All the basic regression algorithms
@karthick2880 3 years ago
How do I get accuracy from CV? Should the CV score be high or low?
@HackersRealm 3 years ago
The cross-validation score is the average of the test accuracies over the different splits. Higher is better
@karthick2880 3 years ago
@@HackersRealm Then the score of the decision tree (0.56) is higher, but you said Linear Regression (0.28) is best, so I'm confused
@HackersRealm 3 years ago
@@karthick2880 Sorry, I meant that for classification problems. For regression, we are looking at the error, and we need a lower error for the best model. In this case, lower is better
@aheshjangala32 3 years ago
@@HackersRealm that means linear regression is good, right? In this project
@HackersRealm 3 years ago
@@aheshjangala32 for this project, yes
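The confusion in this thread comes down to which metric the cross-validation score reports. A sketch assuming scikit-learn and synthetic data, scoring the same model once with R² (higher is better) and once with mean squared error (lower is better):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data (an assumption; not the Big Mart dataset).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=200)

model = LinearRegression()

# R^2 is a goodness-of-fit score: closer to 1 is better.
cv_r2 = cross_val_score(model, X, y, scoring="r2", cv=5).mean()

# scikit-learn negates error metrics so that "higher is better" holds
# for every scorer; flipping the sign gives the MSE, where lower is better.
cv_mse = -cross_val_score(
    model, X, y, scoring="neg_mean_squared_error", cv=5
).mean()

print(f"CV R^2 = {cv_r2:.3f} (higher is better)")
print(f"CV MSE = {cv_mse:.3f} (lower is better)")
```

When comparing models, it helps to state the metric next to the number, because "CV score" alone does not say which direction is good.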
@chiragjangra4071 2 years ago
You nailed it, sir. I really enjoyed attending the video. It would have been nice if you had explained the CV score a little more at the end. But overall it was good.
@surenlmc494 2 years ago
Are you using the lighfm algorithm, or which one?
@HackersRealm 2 years ago
I have used several algorithms, which you can see in the last part of the video
@Syedaquadri-lv2xp 1 year ago
What does "coef is not defined" mean?
@HackersRealm 1 year ago
I think the previous cell was not run. Please run the cells in order to proceed without issues
@aheshjangala32 3 years ago
Will you please provide documentation for this project?
@HackersRealm 3 years ago
Please refer to GitHub for the documentation
@saivikas1125 2 years ago
Bro, do you have the documentation for this project??
@User-nd1qp 1 year ago
What is the accuracy?
@HackersRealm 1 year ago
This is a regression problem; we can't use accuracy for it
@wrathshorts3552 2 years ago
What if we have a date variable too?
@HackersRealm 2 years ago
We can extract the date features and use them as input features
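A short sketch of extracting date features as described in the reply above, assuming pandas and a hypothetical `date` column (the Big Mart dataset itself has no such column):

```python
import pandas as pd

# Invented rows with a hypothetical date column.
df = pd.DataFrame({
    "date": ["2023-01-15", "2023-06-30", "2024-12-25"],
    "sales": [100, 150, 300],
})

# Parse the strings into datetimes, then derive numeric features.
df["date"] = pd.to_datetime(df["date"])
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day_of_week"] = df["date"].dt.dayofweek  # Monday = 0, Sunday = 6
df["is_weekend"] = df["day_of_week"] >= 5

print(df)
```

The derived columns can then be used as model inputs in place of the raw date.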
@vadimsokolov3669 3 years ago
Is this a sales prediction for each product, rather than the total number of sales?
@HackersRealm 3 years ago
Sales of a product for a given store
@vadimsokolov3669 3 years ago
Sales of one product, or the whole list of sales of over 1000 products from the categories in each store? Thank you
@HackersRealm 3 years ago
@@vadimsokolov3669 sales of different products
@nadiashahrin9573 3 years ago
x is not found in the check for categorical attributes! Now what do I do?
@HackersRealm 3 years ago
Can you explain the issue clearly? Or you can reach me via Insta or LinkedIn and share an image of the issue
@nadiashahrin9573 3 years ago
@@HackersRealm Thanks for your reply. I have already solved the issue. Thanks.
@HackersRealm 3 years ago
@@nadiashahrin9573 Welcome!!!
@sahilmhatre5041 2 years ago
WHICH ALGORITHM IS USED IN THIS VIDEO?
@HackersRealm 2 years ago
Various ML models are used; you can check the last part of the video
@khyatikapadiya1936 3 years ago
Can you please define the CV score?
@HackersRealm 3 years ago
It uses different training splits and calculates the average score of the model. It helps to reduce the bias introduced by how the data is split
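The definition above can be spelled out as a loop. This is a sketch assuming scikit-learn, synthetic data, and mean squared error as the per-fold score; the video may use a different scorer:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic regression data (an assumption).
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.3, size=100)

# Split the rows into 5 folds; each fold is the test set exactly once.
kf = KFold(n_splits=5, shuffle=True, random_state=0)

fold_errors = []
for train_idx, test_idx in kf.split(X):
    # Fit on four folds, score on the held-out fifth.
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    predictions = model.predict(X[test_idx])
    fold_errors.append(mean_squared_error(y[test_idx], predictions))

# The CV score is the average of the per-fold scores.
cv_score = float(np.mean(fold_errors))
print(fold_errors, cv_score)
```

Because every row is used for testing once, the average depends less on one lucky or unlucky split than a single train/test split does.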
@shivanshjayara6372 3 years ago
Can I use these projects in my resume?
@HackersRealm 3 years ago
Yes, you can use these projects in your resume if you're studying in college
@shivanshjayara6372 3 years ago
@@HackersRealm thanks for the response. Well, I'm not in college. I'm looking for an entry-level job in DS/ML. I'm making a career transition.
@HackersRealm 3 years ago
@@shivanshjayara6372 yeah, you can
@hariprasad1744 3 years ago
Where did you do the train test split?
@HackersRealm 3 years ago
I didn't; the cross-validation part automatically splits the data and tests on it
@hariprasad1744 3 years ago
Hackers Realm okay. And do you have any idea how we can categorize transactions in the banking sector? I am planning to do a POC
@pranav5273 3 years ago
@@HackersRealm So we have to upload test.csv??
@HackersRealm 3 years ago
@@pranav5273 No, I used only the train dataset to validate the model. You have to upload test data only during a competition
@pranav5273 3 years ago
@@HackersRealm Okay, sir, how do we do that then? Is there a video or any explanation of how to do this? I am new to this.
@chiragparmar3678 3 years ago
Bro, where did you learn data science?
@HackersRealm 3 years ago
I started off with the Andrew Ng course and by doing mini projects like this
@SasukeAditya 3 years ago
I am not able to understand some of the terminology. Can you suggest some videos that I can watch before this one?
@HackersRealm 3 years ago
On the channel, there is a playlist for ML concepts. It covers all the theory
@mrgz999 3 years ago
@@HackersRealm please share the link
@HackersRealm 3 years ago
@@mrgz999 bit.ly/mlconcepts
@SasukeAditya 3 years ago
@@HackersRealm Is there a way to obtain an accuracy percentage for every model in the model training process?
@HackersRealm 3 years ago
@@SasukeAditya yes, you can multiply the accuracy score by 100 to get a percentage
@PraveenKumar-ob7gd 2 years ago
Which techniques are used in this project?
@HackersRealm 2 years ago
I didn't get you. Which part of the project are you asking about? There are many techniques used here
@PraveenKumar-ob7gd 2 years ago
@@HackersRealm In the Big Mart sales prediction project, which techniques are used to build it?
@HackersRealm 2 years ago
@@PraveenKumar-ob7gd If you're asking about algorithms, those are linear regression, random forest, etc. You can check more details in the project README on GitHub or on the website
@CursiveStarVlogs 1 year ago
Sir, which attributes are the values in the graphs based on?
@HackersRealm 1 year ago
Which graph are you referring to? I would have mentioned the attribute name while creating the graph too
@CursiveStarVlogs 1 year ago
That is, I want a chart or a graphical representation of the sales prediction. How can we make that possible?
@HackersRealm 1 year ago
@@CursiveStarVlogs you mean like a line chart with date and sales as attributes?
@CursiveStarVlogs 1 year ago
@@HackersRealm yes
@HackersRealm 1 year ago
@@CursiveStarVlogs this is not time series data, so that would be an invalid representation here
@Amansingh-tr1cf 3 years ago
Why didn't you handle outliers?
@HackersRealm 3 years ago
I used a log transformation to reduce the effect of outliers
@Amansingh-tr1cf 3 years ago
How do we know whether we should keep outliers or discard them?
@HackersRealm 3 years ago
@@Amansingh-tr1cf Using visualization, we can easily spot a few outliers; the decision is based on our analysis
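To illustrate the log-transformation point in this thread, here is a sketch with invented sales values, one of which is an extreme outlier. It assumes pandas and NumPy and uses `np.log1p`; skewness is used as one simple way to see the effect:

```python
import numpy as np
import pandas as pd

# Invented sales figures; the last value is far larger than the rest.
sales = pd.Series([120.0, 250.0, 310.0, 480.0, 900.0, 13000.0])

# log1p compresses large values much more than small ones.
log_sales = np.log1p(sales)

# Skewness before and after: a rough measure of how lopsided the
# distribution is because of the extreme value.
skew_before = sales.skew()
skew_after = log_sales.skew()

# How many times larger the maximum is than the median, on each scale.
ratio_before = sales.max() / sales.median()
ratio_after = log_sales.max() / log_sales.median()

print(f"skew: {skew_before:.2f} -> {skew_after:.2f}")
print(f"max/median: {ratio_before:.1f} -> {ratio_after:.2f}")
```

The outlier is still present after the transform, but it sits much closer to the other values, so it pulls a fitted model less.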
@Arunav_vashista 11 months ago
TypeError: unhashable type: 'DataFrame'. Please reply at the earliest.
plt.figure(figsize=(30,6))
sns.countplot(x='Outlet_Size', data=big_mart_data)
plt.title('Item_type count')
plt.show()
@HackersRealm 11 months ago
Please check the dataframe you're using and try again. If you use the code exactly as it is, you won't face this issue!!!
@akash_a_desai 3 years ago
Deploy this on the cloud
@HackersRealm 3 years ago
Will try in the future
@HamzaKhan-vs2nd 3 years ago
Hello bro, I have a semester project (Big Mart sales prediction). Can you give me your project? I'll pay you.
@HackersRealm 3 years ago
Please reach me via Insta or LinkedIn @aswintechguy
@saamali4332 3 years ago
plz es ka code da do koi [Please, could someone give me the code for this?]
@HackersRealm 3 years ago
Can you speak in English?
@meetjamsutkar394 2 years ago
@@HackersRealm he's asking for the code
@HackersRealm 2 years ago
@@meetjamsutkar394 please check the GitHub link below
@saivikas1125 2 years ago
Bro, can I have your mail ID!!!
@HackersRealm 2 years ago
You can reach me via LinkedIn or Instagram. Link in the description