Predict Football Match Winners With Machine Learning And Python

155,996 views

Dataquest

1 day ago

In this video, we'll use machine learning to predict who will win football matches in the EPL.
We'll start by cleaning the EPL match data we scraped in the last video (Web Scraping Football Matches From The EPL With Python [part 1 of 2]). Don't worry if you missed the last video - you'll still be able to download the data.
We'll create predictors and train a machine learning model to predict the winner of each of the football matches.
Then we'll end by measuring error and making improvements.
You can find the data and code here - github.com/dataquestio/projec...
Chapters
00:00 Introduction
00:59 Reading match data into pandas dataframe
02:58 Investigating missing data
05:55 Cleaning our data for machine learning
08:05 Creating predictors for machine learning
14:00 Creating our initial machine learning model
22:34 Improving precision with rolling averages
31:07 Retraining our machine learning model
34:08 Combining home and away predictions
42:12 Recap and next steps
#Dataquest #Tutorial #DataScience #MachineLearning #WebScraping #Python

Comments: 202
@vikasparuchuri 1 year ago
Hi everyone! You can find the data and code for this tutorial here - github.com/dataquestio/project-walkthroughs/tree/master/football_matches .
@chasingwildlife6584 1 year ago
Great Video Vik. Love the work. Thanks for giving us this great resource. Now time to find the rest of the data.
@emmanuelteitelbaum 2 years ago
I like that as the founder of Dataquest, you yourself are providing the tutorial (as opposed to hiring someone). Also, thanks for offering the free access to educators and students.
@Dataquestio 2 years ago
Thanks, Emmanuel! -Vik
@ifkica1822 1 year ago
@Dataquestio Sorry, I just joined Dataquest. Can you please tell me if the free option for students is still available?
@knotty2348 1 year ago
You are a hero. Had this project in mind for years. You saved me some hundreds of hours of research and learning :) Thanks a lot!
@user-ws7ky2mk8l 1 year ago
The video content shared by this author is very good, and it provides a lot of reference directions for predicting stocks. Thank you so much.
@user-jk8dx4me8x 7 months ago
I just started learning Python and machine learning. I started learning from your tutorials, and they are making me better at data science day by day. Keep it up. You are the best online teacher.
@rodi21 1 year ago
Amazing, Vic! I'm following you! Great job and explanation!
@titrecords2294 10 months ago
Been learning ML on provided data ever since, thank you sir for teaching me in the last tutorial how to curate my own data. 🙏
@sureshmakwana8709 1 year ago
You saved my Machine Learning mini project this semester ❤️❤️
@stephenwood6139 1 year ago
This is by far the best and most practical video on football predictions I've seen online, very well explained and actually leaves you with something useful afterward. Great work!
@stephenwood6139 1 year ago
I managed to resolve this :)
@aravindgpandey 1 year ago
Very nice explanation. This is what I had been looking for for so long. Thanks very much.
@user-yw3zn7lf4s 1 year ago
Bro. Literally learnt to play with data in just 2 videos. Thanks.
@ukaszhangiel7610 1 year ago
Does this model completely ignore who the opponent is?! From what I see, the features used are:
a) general match features - time of the game, home/away
b) rolling averages for one team
As a result, the program tries to predict the outcome of the game while completely ignoring who the opponent is. It will come up with a prediction based purely on general match factors and the past performance of one team, ignoring the specific opponent's features. I.e., for an Arsenal game it will give me the same result regardless of whether Arsenal plays the first or the last team in the table. Do I get it right? If so, how can it make sense?
@InvestorLondon 1 year ago
Amazing video! You're really helping me through my ML journey!
@avibm948 9 months ago
Nice video Vic, I've learned a lot from your videos recently. My only criticism is that some viewers may feel that they can generate positive returns based on a probability higher than 50 or 60 percent. It would be better to predict the probability of winning, because the betting reward is based on probability. So assuming we predict that a team wins with 70 percent probability and the odds reward is less than 7/10, we are going to lose on average, even though our model was right. The reason the model is able to predict with a probability higher than 50 percent is that some teams are better than others, and the betting odds reflect it. One can scrape the odds as well and do the analysis, but I believe the betting companies already use AI to set the initial odds. There will be opportunities when the odds differ substantially from a good predictive model.
@goober-ll1wx 7 months ago
Yeah, it's basically a massive nothing burger. You'll still lose money, and if by some miracle you can model it well, then your bookie will back you off before you make any money!
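The point about probabilities versus payouts in this exchange can be made concrete with a little arithmetic. The sketch below is not from the video; it assumes decimal odds (a winning bet returns stake × odds) and uses hypothetical function names. Under that assumption, a bet only has positive expected value when the odds are above 1 / p, where p is the true win probability.

```python
def expected_profit(p_win: float, decimal_odds: float, stake: float = 1.0) -> float:
    """Average profit per bet: win (odds - 1) * stake with probability p_win,
    otherwise lose the whole stake."""
    return p_win * (decimal_odds - 1.0) * stake - (1.0 - p_win) * stake


def break_even_odds(p_win: float) -> float:
    """Decimal odds at which the expected profit is exactly zero."""
    return 1.0 / p_win


p = 0.70  # hypothetical win probability produced by a model

print(break_even_odds(p))        # odds needed just to break even
print(expected_profit(p, 1.30))  # odds below break-even: negative on average
print(expected_profit(p, 1.60))  # odds above break-even: positive on average
```

So a model that is right 70 percent of the time can still lose money if the bookmaker's odds are shorter than the break-even value.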
@pratiek8s 2 years ago
Very informative. Thank you sir.
@chigstardan7285 2 years ago
This video came at the right time. I was trying to figure out how to get rolling averages for a dataframe, and especially that part with the 'left' argument. Thanks so much.
@Dataquestio 2 years ago
Glad it helped! -Vik
@alemassa6632 9 months ago
Wonderful, I literally have understood nothing but.... wonderful!
@madhuacharyya6963 1 year ago
Hi, I have enjoyed watching your demonstration of predicting the EPL game results. However, the predicted results don't reflect the actual results. So my question is, how can I predict more accurate results, and how can I train the dataset. Looking forward to hearing your reply.
@zuzekavova4651 8 months ago
I hope you don't stop making these videos.
@kenneth_wu 1 year ago
Great video. Thanks for sharing. I think I am going to have a try.
@jacobdebrone 6 months ago
Interesting stuff bro. You just got yourself a subscriber.
@thiagotms1 1 year ago
This is some quality video! Thanks!
@leomerremorin7905 1 year ago
Hi! Can I turn the predicted value into a probability? Or use logistic regression? Thank you!
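One way to get probabilities instead of hard 0/1 predictions is scikit-learn's `predict_proba`, which `RandomForestClassifier` provides. The sketch below uses small synthetic data as a stand-in for the match dataframe; the column names mirror the predictors mentioned in the video, but the values and the label rule are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 200

# Synthetic stand-in for the match data (values are made up)
venue = np.arange(n) % 2
X = pd.DataFrame({
    "venue_code": venue,
    "opp_code": rng.integers(0, 20, n),
    "hour": rng.integers(12, 21, n),
    "day_code": rng.integers(0, 7, n),
})
# Toy label: home teams win, except every fifth match is flipped
y = np.where(np.arange(n) % 5 == 0, 1 - venue, venue)

rf = RandomForestClassifier(n_estimators=50, min_samples_split=10, random_state=1)
rf.fit(X.iloc[:150], y[:150])

# One row per match, one column per class, in the order given by rf.classes_
proba = rf.predict_proba(X.iloc[150:])
win_col = list(rf.classes_).index(1)
win_proba = proba[:, win_col]  # estimated probability of a win for each test match
print(win_proba[:5])
```

Logistic regression (`sklearn.linear_model.LogisticRegression`) exposes the same `predict_proba` method, so it can be swapped in without changing the rest of the code.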
@avikpal6508 2 years ago
I'm generally opposed to the idea of using an AI/ML model for the EPL or any sport, but the concept can definitely be reused in multiple business cases. Great job mate!
@harryhaz4629 1 year ago
Great video thanks. But I was wondering how do you get the model to predict the upcoming football matches. Let's say Manchester United vs Liverpool etc.
@mhch77 1 year ago
Hey Vick, Great Video! Wanted to ask how would I go about making predictions for a single match?
@samdowns4786 1 year ago
Hi, great video. I am just wondering how to implement this onto matches in the future, predicting who would win the game this weekend for example
@russelldavis5248 1 year ago
Excellent tutorial. As a C# guy, I really appreciated seeing your code for interacting with pandas.
@ibukunalade4286 5 months ago
I really love this work. I will try with 10 seasons and make my train set 70% of the dataset and my test set 30%. But I want to ask: after all is done, how do I predict specific upcoming matches? I plan on adding the upcoming games I want to predict to the test part and then predicting from there.
@ctrl-shift-run8681 1 year ago
This is a very cool project! I ran it across 7 leagues and it is interesting how the same set of predictors get very different results. In England and France, it does pretty well but in Brazil and Japan, not so much.
@Dataquestio 1 year ago
That is interesting! I wonder if there is more variance there due to transfers, less data, etc.
@StartupPickMeUps 1 year ago
This is so good! It would be good to see a video on exactly how to feed in future fixtures as I'm unclear on how this is achievable :D
@Dataquestio 1 year ago
Hi Liam - thanks for the suggestion. What you need to do is pass in future data to the predict methods, the same way we're passing in the test set now. I can look into making a video.
@StartupPickMeUps 1 year ago
@Dataquestio After asking this question, I actually gave it a go myself, but unless I add future data to my test data, I'm unsure how to do it, and the accuracy is way off for me :D
@pain-nw5lo 1 year ago
@Dataquestio Yes please! I'm also stuck on passing future data :c
@_craig_ 2 years ago
Hi Vic, excellent, excellent video. So many tips and tricks. Thank you. A few clarifications:
1) the value counts is 1500+, number of matches is 20 C 2 * 2 (home and away) * 2 (year 21 & 22)
2) it's not temporal data until rolling averages were included
3) I'm being silly here... matches played on the 1st of Jan are not in the train/test set because you didn't use >=
4)
@Dataquestio 2 years ago
Thanks, Craig! For 1, it should be 38 (total matches per team) * 20 (total teams) * 2 seasons. For 2 - I agree, there probably won't be any issues if you do cross validation without taking the temporal aspect into account. But the opponent code in particular can leak future data into the past. I generally prefer to treat any time-ordered data carefully. 3 - yes, you're right!
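The two counts in this exchange can be reconciled with plain arithmetic. The sketch below assumes, as the reply implies, that the dataframe holds one row per team per match, so every match appears twice.

```python
from math import comb

teams, seasons = 20, 2

# Each pair of teams meets twice per season (home and away)
matches_per_season = comb(teams, 2) * 2      # 190 pairs * 2 = 380 matches

# One row per team per match, so each match contributes two rows
rows_per_season = matches_per_season * 2     # 760 rows

total_rows = rows_per_season * seasons       # 1520 rows over two seasons

# Same number as 38 matches per team * 20 teams * 2 seasons
assert total_rows == 38 * teams * seasons
print(total_rows)
```

That gives 1520 rows for two complete seasons, in line with the "1500+" value count mentioned above; the exact figure in the scraped data may be lower if the current season was still in progress.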
@KabirKohli-rm7xm 1 year ago
Hi, thanks for the awesome video. I had one doubt (might be stupid). The aim of the model is to predict the winner of a match between two teams (suppose team A vs team B). But for training the model on a single match result, we are only giving the stats for the home team (A). Wouldn't it make more sense to add the stats for team B in the same row as well, and then ask it to make the prediction?
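One way to try the idea raised in this comment is to merge each row with the opponent's rolling stats for the same date. This is a sketch, not the approach used in the video; the dataframe and the `opp_gf_rolling` column name are hypothetical, and it assumes team names are spelled the same way in the `team` and `opponent` columns.

```python
import pandas as pd

# Hypothetical rows: one per team per match, with a rolling feature already computed
rolling = pd.DataFrame({
    "date": pd.to_datetime(["2022-03-01", "2022-03-01"]),
    "team": ["Arsenal", "Chelsea"],
    "opponent": ["Chelsea", "Arsenal"],
    "gf_rolling": [2.0, 1.0],
    "target": [1, 0],
})

# View the same table from the opponent's side and rename the feature
opp_stats = rolling[["date", "team", "gf_rolling"]].rename(
    columns={"team": "opponent", "gf_rolling": "opp_gf_rolling"}
)

# Each row now carries both its own form and the opponent's form
with_opp = rolling.merge(opp_stats, on=["date", "opponent"], how="left")
print(with_opp[["team", "opponent", "gf_rolling", "opp_gf_rolling"]])
```

The new `opp_*` columns can then be appended to the predictor list and the model retrained to see whether precision actually improves.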
@megwedgomaa7831 2 years ago
Amazing work!!!
@Dataquestio 2 years ago
Thank you! -Vik
@pstryq224 1 year ago
Great tutorial! Do you have any advice for future matches - what values should I add to the data in my CSV file when I want to predict the results of future matches? I mean the values that we do not know yet, such as distance, shots on target, etc. All test data in the video have these values filled in, so I wonder what to put in these "empty" columns. Thank you.
@Dataquestio 1 year ago
Hi there - distance, shots on target, etc, are only looked at for prior matches. If you're trying to predict future matches, you would use the rolling average of those columns from previous matches (this is what the video shows).
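The reply above can be sketched in code: for an upcoming fixture, the unknown match stats are replaced by the average of each team's most recent completed matches. The data, the fixture, and the three-match window below are illustrative assumptions, not the video's exact code.

```python
import pandas as pd

# Hypothetical completed matches (the real data has many more columns)
played = pd.DataFrame({
    "team": ["Arsenal"] * 4 + ["Chelsea"] * 4,
    "date": pd.to_datetime(
        ["2022-04-01", "2022-04-08", "2022-04-15", "2022-04-22"] * 2
    ),
    "gf": [2, 0, 3, 1, 1, 1, 0, 2],
    "sh": [14, 9, 17, 11, 10, 12, 8, 15],
})
cols = ["gf", "sh"]
new_cols = [f"{c}_rolling" for c in cols]

# Average of each team's last 3 completed matches
last3 = played.sort_values("date").groupby("team").tail(3)
form = last3.groupby("team")[cols].mean()
form.columns = new_cols

# Hypothetical upcoming fixture: only things known in advance go here
fixtures = pd.DataFrame({
    "date": pd.to_datetime(["2022-04-29"]),
    "team": ["Arsenal"],
    "opponent": ["Chelsea"],
    "venue_code": [1],
})

# Attach the pre-match form; the result has the same rolling columns as the training rows
future = fixtures.merge(form, left_on="team", right_index=True, how="left")
print(future)
```

Once the remaining predictors (hour, day code, opponent code) are built the same way as for the training data, `future[predictors + new_cols]` can be passed to the fitted model's `predict` method just like the test set.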
@FlisB 1 year ago
Interesting. I was running a similar model on football matches, except that I had rolling attributes of both teams as the predictors and the class was home_win, draw, away_win. A match is included only once. However I think your approach might be better.
@kiss-my-axe8810 3 months ago
What was your win %?
@francescoscalia3541 1 year ago
Hey @Dataquest, amazing content. I created the algo to predict games using your tutorial. Now I'm asking what I have to do to make the algo predict future games, since I noticed that of course it predicted past games. Could you tell me? Thanks!
@siraatmedia8348 3 months ago
What you did with the rolling averages was impressive. Is there such a thing as an ML algo that creates such features for you? I.e., it randomly multiplies/divides this by that, or computes rolling averages or other random features, to create a new feature?
@Qubitmyst 1 year ago
Inspiring, well done! Can you use the gf and ga columns directly in your predictors without using the rolling_averages function? Now imagine you get a very good algorithm for prediction. After you save the model, how do you use it to predict games next season? Can you give me a clue? For example, to predict one game in the 2022-2023 season? Thank you.
@tomi4tv126 3 months ago
You have to use rolling averages because when you try to predict the outcome of a match (before it has started) you won't know gf and ga yet. But we do know the average gf and ga of the last 3 games the team has played. The model can be used for new seasons, but the problem is data. You will have to gather data about games played after this video. That is the tricky part, but he also made a video before this one about web scraping (getting new data directly from the web). Or maybe you can find an updated dataset online (maybe on Kaggle). From my experience, the datasets you find online won't have more detailed game statistics, so it would be best to scrape the data yourself.
@cevikyi 2 years ago
Hi, thanks for the great video. Why didn't you involve "team" as a predictor in each model as you've used opponent team information? Doesn't this miss the relationship between team A vs team B and so on?
@Dataquestio 2 years ago
Hi Yigit - great question. You are welcome to try it with team and measure error. The reason I didn't use it is because using a column like that can have a tendency to overfit. Some teams have performed really well in the last few seasons, but that doesn't necessarily mean they'll perform well in the future.
@cevikyi 2 years ago
@Dataquestio Thanks for the guidance!
@paulohss2 1 year ago
Great content! May I just ask why you did the division at the end of the tutorial? It was 27 / 40. Where did the '40' figure come from?
@rishavmishra5786 1 year ago
It's 27 rows with a value of 1 and 13 with a value of 0, totaling 27 + 13 = 40. So the share of 1s in the total of 40 is 27/40.
@danielgonzalez5052 1 year ago
Hi Vikas! When doing the rolling part I'm facing an issue that says: "closed only implemented for datetimelike and offset based windows". Do you know what the problem could be? Thank you!
@chrissherman6591 3 months ago
Love the video. Once I finish the model, how do I feed in data from new games?
@johnowusukonduah2305 1 year ago
Is it positive to add the concept of time series to model the performance behavior of teams in the epl?
@tomkmb4120 10 months ago
What's a good way to split data into training and test sets if it doesn't contain something like a DateTime component?
@johanBe75 1 year ago
So many great Reviews, but yet just youtube!
@mirror1023 1 year ago
When creating the new columns using rolling_averages, we lost the first few games of the season when we dropped na rows. We also carried rolling averages into other seasons. How do we fix this?
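One possible way to address both points in this comment is to group by team and season, so that averages never cross a season boundary, and to allow shorter windows early in the season. This is a sketch under assumptions: it presumes a `season` column exists, uses toy data, and is not the code from the video.

```python
import pandas as pd

# Toy data: one team across a season boundary
matches = pd.DataFrame({
    "team": ["Arsenal"] * 6,
    "season": [2021] * 3 + [2022] * 3,
    "date": pd.to_datetime([
        "2021-04-01", "2021-04-08", "2021-04-15",
        "2021-08-14", "2021-08-21", "2021-08-28",
    ]),
    "gf": [3, 3, 3, 0, 2, 1],
})
matches = matches.sort_values(["team", "date"])

# shift(1) drops the current match from its own average;
# min_periods=1 keeps matches with only one or two prior games in that season
matches["gf_rolling"] = (
    matches.groupby(["team", "season"])["gf"]
    .transform(lambda s: s.shift(1).rolling(3, min_periods=1).mean())
)
print(matches)
```

With this setup only the first match of each season has no history (NaN); the second and third matches use whatever earlier games exist in that season. Whether averages based on one or two games help or hurt precision is something to measure rather than assume.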
@uncaged3076 1 month ago
Is there any way I can reference your work? I am trying to use the idea of rolling averages in a project.
@jamespapworth1477 3 months ago
Why do you use a RandomForestClassifier for this? Is it superior in some way for this application compared to other machine learning models, e.g. KNN, ANN, etc.?
@youtubeuser4878 1 year ago
Hi Vikas. Thanks for the tutorial. At the end, you mentioned we can use other data points however we can't use attendance because we only know that after the game is over. Isn't that the same for other data points like shots on target, distance, etc?
@Dataquestio 1 year ago
Hi there - some of the data points, like whether the game is home or away, you can use data from the current game as predictors. So if you're trying to predict if Arsenal will win on 7-10-2022, you can use data about whether the game on 7-10-2022 is at home or away. For other columns, like shots on target and attendance, we don't know the data for the current match until it ends. So we instead use an average of data from past matches (before 7-10-2022).
@chasingwildlife6584 1 year ago
Yes, data points like shots can't be known in advance. We use the old data, let's say the last three games, like in the video. The attendance of previous matches has no bearing (none that we know about) on the next match. However, the number of shots taken in the last three games can be an indicator of what it might be in the game we are predicting.
@matilda_aaaaa 1 year ago
Hi. Excellent video, and thanks for this. I want to know how I can calculate the rolling averages in SQL, as I'm not proficient in Python.
@alessandrocerri5668 4 months ago
Hi, I have a question. Everything was built without taking into consideration the matches that still have to be played, so there is no real prediction of future matches, only of those already played. Correct?
@karolkowalewski9832 5 months ago
Great video
@adrianfong4347 1 year ago
Hi Vik! I am learning so much through this video and decided to try adapting it to NBA data too :). I am running into an issue where I am merging the combined dataframe with left_on = game_date, team and right_on = game_date, opponent. However, my new merged table is blank. My theory is that despite my data having the same 3-letter abbreviations for the teams (LAL, WAS, CHI, etc.) in both the team and opponent columns, Python is saying they aren't the same and not joining the tables. They are both 'object' data types (if that matters...). Any recommendations on how I can make them identical? Thank you!
@Dataquestio 1 year ago
Hi Adrian - do you actually have data from both sides of the match? For example, if LAL played WAS, you would need a row where WAS is the team and LAL is the opponent, and a row where LAL is the team and WAS is the opponent for the same game day. If you don't have this, you would need to create those rows (by duplicating the dataframe then swapping team and opponent) before merging.
@Dataquestio 1 year ago
You would also need to swap points for/against, etc.
@FlisB 1 year ago
Did you scrape the data from basketball-reference?
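The suggestion earlier in this thread (duplicate the dataframe, swap team and opponent, and swap points for/against) can be sketched as follows. The data and column names (`game_date`, `pts`, `opp_pts`) are hypothetical; the whitespace and case cleanup is an extra precaution that addresses the "identical-looking strings don't match" theory, not something confirmed as the cause.

```python
import pandas as pd

# Hypothetical data with only one side of each game
games = pd.DataFrame({
    "game_date": pd.to_datetime(["2022-01-05", "2022-01-07"]),
    "team": ["LAL ", "CHI"],      # note the stray trailing space
    "opponent": ["WAS", "LAL"],
    "pts": [110, 98],
    "opp_pts": [102, 105],
})

# Normalize abbreviations so visually identical codes compare equal
for col in ["team", "opponent"]:
    games[col] = games[col].str.strip().str.upper()

# Build the mirrored rows: swap team/opponent and points for/against
mirrored = games.rename(columns={
    "team": "opponent", "opponent": "team",
    "pts": "opp_pts", "opp_pts": "pts",
})
both_sides = pd.concat([games, mirrored[games.columns]], ignore_index=True)

# Now every row has a counterpart to join against
merged = both_sides.merge(
    both_sides,
    left_on=["game_date", "team"],
    right_on=["game_date", "opponent"],
    suffixes=("_x", "_y"),
)
print(merged[["game_date", "team_x", "team_y", "pts_x", "pts_y"]])
```

If the merged table is still empty after this, comparing `both_sides["team"].unique()` with `both_sides["opponent"].unique()` is a quick way to spot any remaining mismatches.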
@danielgonzalez5052 1 year ago
Hi Vika, amazing tutorial! I have one question, how should we treat the ties in this model? Thank you!
@Dataquestio 1 year ago
It's up to you. You could make this a 3-class classification problem, and code loss as 0, tie as 1, win as 2. You can also do what's done in the video, and code a tie as a loss.
@kevinbarnes4474 1 year ago
This is great, using goal-scoring/concession stats more (home and away) could also help with refining accuracy.
@stephenbube965 1 year ago
I am new to this... I was wondering how one can get the predictions from the machine learning model. I am stuck at the combined precision stage and can't find a way of extracting future predictions. Any help will be highly appreciated.
@bn_ln 1 year ago
this is seriously underrated content
@Dataquestio 1 year ago
Thanks, Ben!
@Makako_Loko 2 months ago
First of all, thank you for this video. I have a doubt, how do I apply this to future matches that will happen? How do I put it in the ML?
@kevwhiteford5167 8 months ago
Is there a quick way to add and predict upcoming matches?
@Skeeyeee613 1 year ago
Thank you very much for such wonderful content. When I try running your line 65 I'm getting an error saying mapping is not defined. Any suggestion?
@johanBe75 1 year ago
These are fake tutorials with clickbait. Just look at the reviews, so many of them so great, isn't it?
@berrauniverse 6 months ago
Did this using logistic regression with binary classification and achieved a 70% precision. Used different parameters for training the model though. Also had to put the sleep time to 10 seconds when scraping to avoid 429 HTTP response.
@cgruita 1 month ago
Wow, 70% precision is very impressive! What did you use? XGBoost, LightGBM?
@sushik.8043 4 months ago
Where can I find a whole spreadsheet like this but for the NFL or NBA?
@alexjamarco 2 years ago
Hi Vikas. Very nice tutorial. I was able to code along, and it was my first ML project. It seems awesome how the computer predicts stuff like this. I have a question: we have our training and testing datasets, right? How can we ask the algorithm to predict an event that is not in the training data? For example, let's say I have a CSV of next weekend's matches. How can I ask the algorithm to try to predict the winner? Sorry if it seems a silly question, but I actually couldn't find a clearer way to ask. Thanks, and well done once again!
@Dataquestio 2 years ago
Hi Alexandre - you'd basically put the information for next weekend's matches (opponent code, venue code, rolling averages, etc) into a new testing set, and then make predictions on that set.
@kennedyogutu4099 2 years ago
Feed your data into your trained model.
@amragl 1 year ago
@Dataquestio Hi Vikas, would it be possible to explain it in a different way? I still don't understand it. Many thanks for your videos!!
@madebymate4870 1 year ago
This is a great video, but I don't understand exactly how to predict individual matches. What parameters should I pass to rf.predict(), and how, if I want the outcome of a single match?
@royalzikhali5295 8 months ago
Did you ever find the answer?
@obaidulmostafa3384 1 year ago
Which algorithm did you use to complete this project, Brother?
@acegameboy6232 1 year ago
I just finished writing this out and for the most part it works, except for this line:
combined, error = make_predictions(matches_rolling, predictors + new_cols)
Error:
ValueError: Found array with 0 sample(s) (shape=(0, 12)) while a minimum of 1 is required
This line is giving me trouble both in the version I wrote by hand and when copying and pasting your program. I've looked through the code and some forums, but nothing seems to be wrong. I think maybe it could be a version issue, in that the way to write this has changed over time and this form is old. I'm not sure what the issue is, so if someone could help me out that would be great. I'm planning to use this as an American football predictor to see if the program will be able to predict which team will win. I'm doing it primarily because of my cousin and his fondness for fantasy football. It got me a little interested in the sport, but I figured I'd create a model to make things a little fun for me.
@benjaminmwangi6872 1 year ago
Hi,
1. Kindly suggest a roadmap for me to adequately comprehend this project. I have no experience in the field nor a programming background.
2. How do I run this project in the meantime as I upscale my skills?
Awesome tutorial. Got yourself a believer.
@Dataquestio 1 year ago
I would recommend following the data scientist path at Dataquest - www.dataquest.io/path/data-scientist/ . This will help you learn all of the skills (including programming) to build this model.
@mrcaljoe1 1 year ago
37:50 what does the ** before map_values do?
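In Python, `**` in front of a dictionary in a function call unpacks it into keyword arguments, so `dict(**d)` builds a new dictionary from the key/value pairs of `d`. The sketch below assumes the code at that timestamp resembles this pattern (a `dict` subclass that returns the key itself when it is missing); the class name and team names are illustrative.

```python
class MissingDict(dict):
    # Called only when a key is not found: return the key unchanged
    __missing__ = lambda self, key: key


# Hypothetical mapping from long team names to the short names used elsewhere
map_values = {
    "Brighton and Hove Albion": "Brighton",
    "Manchester United": "Manchester Utd",
}

# ** unpacks the dictionary into key=value arguments for the constructor
mapping = MissingDict(**map_values)

print(mapping["Manchester United"])  # mapped name
print(mapping["Arsenal"])            # not in the dict, so the key comes back as-is
```

Passing such a mapping to a pandas `Series.map` call leaves unlisted team names unchanged instead of turning them into NaN, which is the reason for the `__missing__` trick.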
@bonifaceboban368 1 year ago
I got an error like this after writing the code below. Can you please explain how to resolve it?
preds = rf.predict(test[predictors])
NotFittedError: This RandomForestClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
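This error usually means `predict` ran on a model object that was never fitted, for example because the `fit` cell was skipped or the `rf` variable was re-created afterwards. A minimal sketch with toy data (the values are made up) shows the required order:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError

# Toy stand-ins for the train and test dataframes
train = pd.DataFrame({
    "venue_code": [0, 1, 0, 1, 1, 0],
    "hour": [15, 20, 12, 15, 17, 20],
    "target": [0, 1, 0, 1, 1, 0],
})
test = pd.DataFrame({"venue_code": [1, 0], "hour": [15, 12]})
predictors = ["venue_code", "hour"]

rf = RandomForestClassifier(n_estimators=10, random_state=1)

# Predicting before fitting reproduces the error from the comment
try:
    rf.predict(test[predictors])
    raised = False
except NotFittedError:
    raised = True

# Fit first, then predict
rf.fit(train[predictors], train["target"])
preds = rf.predict(test[predictors])
print(raised, preds)
```

In a notebook, re-running the cell that creates `rf` resets the model, so the `fit` cell has to be run again before predicting.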
@Captain_Roy16 2 months ago
Can we implement something like Fixture difficulty code and predict more accurately?
@user-tg7mz2qh7s 1 year ago
Big thanks for this video! Helped me a lot! Tried this method on my project with soccer data analysis and everything went fine until this function: "def make_predictions(data, predictors):". Got KeyError: "["rolling_cols"] not in index". Any advice on solving this issue? Thanks in advance!
@bigtomDW 1 year ago
" predictions + new_cols " seems to be my issue. Having predictions by itself doesn't throw the error.
@dedisupardi2815 1 year ago
Cool 👍
@nonsobismark1846 1 year ago
Great work... But are there any prediction sites where you update the predictions?
@Dataquestio 1 year ago
Thanks! There is no live site yet, but someone can make one with this code :)
@gabriel.o.michael9549 2 years ago
I have to say, you're a natural educator. If you haven't, please consider teaching a younger audience. I bet you'll be good at it.
@Dataquestio 2 years ago
Thank you, Gabriel! I really appreciate that. -Vik
@torezo9028 3 months ago
Is there a recently updated data set?
@amragl 1 year ago
Hi! I don't think I understand how you can use the rolling_average cols in the prediction dataset. You wouldn't have that information until after your match is finished, right? So how can those columns be used in the prediction dataset? Many thanks for your great videos and content! Well explained and very educational.
@Dataquestio 1 year ago
Hi there - the rolling average is computed on matches prior to the current one. We don't use any knowledge of the current match. -Vik
@amragl 1 year ago
@Dataquestio Many thanks for taking the time to respond!! You and your learning platform are awesome 😎!!!
@Kiirby1x 3 months ago
Hello, could someone explain to me how I could input future games for it to make a prediction?
@chottomtaki 1 year ago
Thanks for the very interesting training. Can you please provide one relating to credit scoring modeling for
@Dataquestio 1 year ago
Thanks for the suggestion - I'll consider it for a future video.
@hussainwali8801 1 year ago
awesome
@akanntuk9727 1 year ago
This is awesome.
@sakariyaqaase6773 1 year ago
Thanks Vic. I tried to run the rolling averages function, but it gives me this error: ValueError: closed only implemented for datetimelike and offset based windows
@martincal7115 1 year ago
I'm having the same issue. Did you find a way to fix it? Thanks
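A likely cause of this error is an older pandas version in which the `closed` argument of `rolling` was only supported for time-based windows; upgrading pandas is one fix. A version-independent workaround is to shift the values down by one row and then take an ordinary rolling mean, which also excludes the current match from its own average. The function below is a sketch of that workaround using the names from the comments, not the exact code from the video.

```python
import pandas as pd


def rolling_averages(group, cols, new_cols, window=3):
    """Average of the previous `window` matches, excluding the current one."""
    group = group.sort_values("date").copy()
    for col, new_col in zip(cols, new_cols):
        # shift(1) moves each value down one row, so the window ends at the prior match
        group[new_col] = group[col].shift(1).rolling(window).mean()
    # Rows without enough history have NaN averages and are dropped
    return group.dropna(subset=new_cols)


# Toy data for a single team
matches = pd.DataFrame({
    "date": pd.to_datetime(
        ["2022-01-01", "2022-01-08", "2022-01-15", "2022-01-22", "2022-01-29"]
    ),
    "gf": [1, 2, 3, 4, 5],
})
out = rolling_averages(matches, ["gf"], ["gf_rolling"])
print(out)
```

Here the fourth match gets the mean of the first three (2.0) and the fifth gets the mean of matches two to four (3.0); the first three rows are dropped because they lack three prior matches.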
@velsiu 4 months ago
How do I use it to predict future matches, like today's or tomorrow's?
@agdaltarek 1 year ago
Hello, my question is: how would you deal with predicting the results of newly promoted teams, especially teams that are promoted for the first time in a very long time?
@Dataquestio 1 year ago
This is a tricky one. You could build a separate model to predict how well a team will do in the first season after promotion based on lower league results.
@agdaltarek 1 year ago
@Dataquestio Yep, maybe based on previously promoted teams. I thought about that.
@tomphillips5513 2 months ago
I have seen a lot of other people ask this in the comments, but there hasn't really been a solid reply... how can you apply this to predict the results of matches that haven't occurred yet? Because this is all well and good to split the data into parts that the ML algorithm sees and does not see, but it is pretty useless when applying it to life because we already know the result of that game that occurred, even if the ML doesn't. Could someone either explain to me what I am missing, or suggest the next steps for predicting matches of which there is limited data recorded already?
@meetupadhyay9687 11 months ago
Hey, what is the train/test percentage?
@user-qk3tt4fs9n 2 years ago
Hi, thanks for the amazing video. Can you give me the link to the website you took the CSV file from?
@Dataquestio 2 years ago
Hi there - there's a previous video about scraping the data - kzbin.info/www/bejne/hKWaiH2XZ86oZtU .
@user-qk3tt4fs9n 2 years ago
@Dataquestio Thanks!!
@ericmckee8007 1 year ago
Thank you greatly, this has been extremely helpful. I ran into a KeyError issue when running make_predictions telling me that all of the rolling columns were not in index (gf_rolling,..). Do you have an idea as to why this is happening? I followed the code exactly, so I'm not sure what is causing this... If I remove "+ new_cols" when calling the function it works fine. Thanks again
@Dataquestio 1 year ago
Hi Eric - this would happen if the new columns aren't in the matches_rolling dataframe. This is the code that adds the columns - "matches_rolling = matches.groupby("team").apply(lambda x: rolling_averages(x, cols, new_cols))"
@joshuakanatt7552 2 years ago
Sir, can you help me out with this? How can I get player salary, contract period, and statistics in a single dataset for the NBA? If they are separate data sheets, it might not be easy to combine them.
@Dataquestio 2 years ago
Hi Joshua - I don't know of a single table where you can get all of that data, but you might want to look at www.basketball-reference.com/ . You could either scrape and combine multiple datasets, or find a single table with the data you want.
@joshuakanatt7552 2 years ago
@Dataquestio Thanks, got it. Really helpful content from your channel.
@DarkCode 2 months ago
I'm trying to predict who will win the NHL championship, their divisions, and the rest of the regular season. I need help with this project. I will be using machine learning, and I'm using Colab. Any takers? Any and all help would be appreciated!
@hristolakov3563 1 year ago
Why are we only looking at matches that have been played? I mean, i understand it for the learning part and the back testing, but the machine hasn't actually predicted a match, that hasn't been played, from the date of the video going forward. That would have been useful. Is it like we just have to add these upcoming matches to the matches.csv? It is what i am trying to do, but it is pretty tough for a beginner, like me. Will push harder, hopefully find a solution. Thank you for the video and the great explanations.
@hristolakov3563 1 year ago
When we merge the 'matches' with 'shooting', we basically get rid of all the future matches. I should probably keep the not-played matches in the list somehow with NaN values under shooting?
@Dataquestio 1 year ago
If you want to predict future matches, you can just feed them into the prediction methods. The reason we remove the rows where matches haven't been played is because we can only use data for training if we know the outcome. But once we train a model, you can feed that data in to get future predictions (the same way we feed in the test set).
@shaunhankey 2 years ago
How could you include draws? I've been playing around with the code and data from another source, but I seem to only be able to predict 'win' or 'lose' e.g. 1 or 0. Pls help! Tutorial was awesome!
@Dataquestio 2 years ago
Hi Shaun - you would have a multiclass classification problem. So just code win as `1`, draw as `2`, and loss as `3`, then you can use the same technique as in the video. Alternatively, you can try to frame this as a regression problem where you're predicting the point spread. So your target would be the difference between the team's score and the opponent's score (win would be greater than 0, draw is 0, loss is less than 0). This will be more accurate. Either way, you're changing the target (what the algorithm is trying to predict).
@shaunhankey 2 years ago
@Dataquestio Thank you! That's very helpful, really appreciate it 👍🏼
@dcr7417 1 year ago
@shaunhankey Hi Sean, have you tried this? How would one change this code: matches["target"] = (matches["result"] == "W").astype("int") to add draws as a target?
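One way to change the target line quoted above so that draws get their own class is to map each result letter to a code. The sketch below uses the loss = 0, draw = 1, win = 2 coding suggested elsewhere in the comments; the data is made up, and it assumes the `result` column contains only "W", "D", and "L".

```python
import pandas as pd
from sklearn.metrics import precision_score

# Toy results column
matches = pd.DataFrame({"result": ["W", "D", "L", "W", "D"]})

# Three classes instead of the binary win / not-win target
result_codes = {"L": 0, "D": 1, "W": 2}
matches["target"] = matches["result"].map(result_codes).astype("int")

# With more than two classes, precision_score needs an explicit `average` setting.
# average=None returns one precision value per class (here: loss, draw, win).
y_pred = [2, 0, 0, 2, 1]  # hypothetical model predictions
per_class = precision_score(matches["target"], y_pred, average=None)
print(matches["target"].tolist(), per_class)
```

`RandomForestClassifier` handles the three-class target without further changes; only the evaluation step needs the `average` argument.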
@user-zr4ue2iv8l
@user-zr4ue2iv8l Жыл бұрын
Is it an ANN model?
@anlgoy9386
@anlgoy9386 Жыл бұрын
My English is weak, so I'm using ChatGPT for translation. Can we combine data from different websites into a CSV file and analyze it to increase our chances of winning? For example, we could gather match data and odds from Flashscore, voting results for each match from Oddsportal, and win/loss probabilities from Tablesleague. Then we could use artificial intelligence to create a prediction program. Would you be interested in this?
@manasseholowoyeye3236
@manasseholowoyeye3236 9 ай бұрын
Did you later find a way to do this, or do you currently use any app?
@laus-thecurious4120
@laus-thecurious4120 Жыл бұрын
Where can I get this dataset other than your GitHub? I want a dataset for the Indian Super League.
@Dataquestio
@Dataquestio Жыл бұрын
This video shows how to scrape the data - kzbin.info/www/bejne/hKWaiH2XZ86oZtU . You can modify this for other leagues.
@PeterKrusz91
@PeterKrusz91 2 жыл бұрын
At line 30, at the 17:49 mark, when we run `preds = rf.predict(test[predictors])`, I get `ValueError: Found array with 0 sample(s) (shape=(0, 4)) while a minimum of 1 is required.` Is anyone running into a similar issue?
@Dataquestio
@Dataquestio 2 жыл бұрын
Hi Peter - I'm guessing your test set is empty. You might want to check your code that splits the train and test set up. -Vik
@acegameboy6232
@acegameboy6232 Жыл бұрын
@@Dataquestio What about line 58? I get `ValueError: Found array with 0 sample(s) (shape=(0, 12)) while a minimum of 1 is required`. What can I do to fix this? I typed everything in correctly, and I even redid it 5 times, and it gives the same result.
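Both errors above report an array with 0 rows, which is what an empty test set looks like to scikit-learn. A quick way to check, sketched on made-up data with the `2022-01-01` cutoff quoted later in this thread (the `cutoff` variable and the toy rows are assumptions for illustration):

```python
import pandas as pd

# Toy data (invented). In the real notebook this would be the frame
# that train and test are split from.
matches = pd.DataFrame({
    "date": ["2021-08-14", "2021-12-28", "2022-01-02", "2022-03-05"],
    "target": [1, 0, 1, 0],
})

# Make sure the dates are real datetimes, not strings.
matches["date"] = pd.to_datetime(matches["date"])

cutoff = pd.Timestamp("2022-01-01")
train = matches[matches["date"] < cutoff]
test = matches[matches["date"] > cutoff]

# If the newest date is before the cutoff, the test set will be empty.
print("date range:", matches["date"].min(), "to", matches["date"].max())
print("train rows:", train.shape[0], "test rows:", test.shape[0])

if test.empty:
    raise ValueError(f"No matches after {cutoff.date()}; choose an earlier cutoff.")
```

If the printed date range ends before the cutoff, the data simply contains no matches for the test period, and the cutoff (or the data) needs to change. The same check applies to whichever frame is split in the later step with 12 predictors.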
@doll0101
@doll0101 Жыл бұрын
Can somebody please help me plot a graph of the output (with source code)? Please!
@NguyenNamDuong-kx4gu
@NguyenNamDuong-kx4gu Жыл бұрын
Can you do it for future matches? :( I really need it.
@jamshidnoori1496
@jamshidnoori1496 2 жыл бұрын
Why do I get the error `TypeError: list indices must be integers or slices, not list` after I run `rf.fit(train[predictors], train['target'])`? Thanks.
@xsquirrel7091
@xsquirrel7091 2 жыл бұрын
Because you are using a list as a list index. In this case you have probably forgotten to put the quotation marks in train['predictors'].
@Dataquestio
@Dataquestio 2 жыл бұрын
Hi Jamshid - `train` should be a DataFrame, but it looks like you might have it stored as a list. The full code is here if you want to compare - github.com/dataquestio/project-walkthroughs/blob/master/football_matches/prediction.ipynb .
@jamshidnoori1496
@jamshidnoori1496 2 жыл бұрын
@@xsquirrel7091 Hi, thank you very much. I had already defined `predictors` as a variable holding the column names, like this: `predictors = ['venue_code','opp_code','hour','day_code']`.
@jamshidnoori1496
@jamshidnoori1496 2 жыл бұрын
@@Dataquestio Great work thanks
@jamshidnoori1496
@jamshidnoori1496 2 жыл бұрын
Yes, you are right. I had wrapped `train` and `test` in lists instead of leaving them as DataFrames: `train = [matches[matches["date"] < '2022-01-01']]` and `test = [matches[matches["date"] > '2022-01-01']]`. They should be `train = matches[matches["date"] < '2022-01-01']` and `test = matches[matches["date"] > '2022-01-01']`.
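The difference is easy to reproduce on a few made-up rows. This is only a sketch; the `wrapped` and `error_message` names are hypothetical, and the data is invented.

```python
import pandas as pd

# Toy data (invented) with the predictor columns named in this thread.
matches = pd.DataFrame({
    "date": pd.to_datetime(["2021-11-01", "2021-12-15", "2022-01-10", "2022-02-20"]),
    "venue_code": [1, 0, 1, 0],
    "opp_code": [3, 5, 2, 7],
    "hour": [15, 17, 12, 20],
    "day_code": [0, 2, 0, 6],
    "target": [1, 0, 0, 1],
})
predictors = ["venue_code", "opp_code", "hour", "day_code"]

# The extra brackets create a Python list holding one DataFrame.
wrapped = [matches[matches["date"] < "2022-01-01"]]
try:
    wrapped[predictors]  # indexing a list with a list fails
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Without the extra brackets, train is a DataFrame and column selection works.
train = matches[matches["date"] < "2022-01-01"]
print(type(train).__name__, train[predictors].shape)
```

Printing `type(train)` before calling `rf.fit` is a quick way to catch this kind of mistake.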