Baseball Prediction using Machine Learning - Predicting Runs Scored: First Model

numeristical

This is the eleventh video in a series in which we attempt to predict win probabilities for Major League Baseball (MLB) games using modern machine learning techniques. In this video, we shift our focus to predicting the number of runs scored. This requires wrangling the data into a new format that combines one team's hitting features with the opposing team's pitching features, with that team's runs scored as the target. We want to predict the full distribution of runs scored, not just a point estimate, so that we can use these models to get over/under probabilities. To that end, we employ a probabilistic regression technique called Coarsage (available in the StructureBoost package) to model these conditional distributions.
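As a rough illustration of why a full distribution is more useful than a point estimate here: an over/under line applies to the combined runs of both teams, so one way to price it is to combine the two per-team distributions. The sketch below is not taken from the video's notebooks. The probability vectors are made up, the support is cut off at 10 runs for brevity, and treating the two teams' run totals as independent is a simplifying assumption.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q (list index = runs)."""
    total = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            total[i + j] += p_i * q_j
    return total


def prob_over(total_dist, line):
    """P(total runs > line). With a half-run line such as 8.5 there is no push."""
    return sum(prob for runs, prob in enumerate(total_dist) if runs > line)


# Hypothetical predicted probabilities of scoring 0, 1, 2, ..., 10 runs.
home = [0.06, 0.10, 0.14, 0.15, 0.14, 0.12, 0.10, 0.07, 0.05, 0.04, 0.03]
away = [0.08, 0.12, 0.16, 0.16, 0.14, 0.11, 0.08, 0.06, 0.04, 0.03, 0.02]

total = convolve(home, away)      # distribution of combined runs (0 to 20)
p_over = prob_over(total, 8.5)    # hypothetical over/under line of 8.5
p_under = 1.0 - p_over

print(f"P(over 8.5) = {p_over:.3f}, P(under 8.5) = {p_under:.3f}")
```

A point estimate such as "4.2 expected runs per team" could not produce these probabilities without extra assumptions about the spread of outcomes.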
Data: www.retrosheet.org, www.oddsshark.com
Notebooks: github.com/num...
github.com/num...
Personal links:
Consulting: www.numeristical.com
Github: github.com/num...

Comments: 11
@tshock22 (1 year ago)
Another great vid. Thanks!
@numeristical (1 year ago)
You bet!
@prophecysports (5 months ago)
I built something similar and I'm getting similar results to yours... The majority of the probability mass is centered around 2 runs, so when trying to produce an actual prediction, taking the class with the highest probability yields 2 runs for most games. Not ideal... Were you able to improve this? Curious to see how others are tackling this problem. Would love an invite to the Discord!
@numeristical (5 months ago)
Here’s an invite - will be easier to engage in discussion on the discord! discord.gg/2pTX472x
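One general observation on the question above, offered as a sketch and not as the author's answer: run totals are right-skewed, so the single most likely value (the mode) tends to sit below the mean and median. If a single number is needed, the mean or median of the predicted distribution is usually a more representative summary than the argmax. The distribution below is made up for illustration.

```python
def summarize(dist):
    """Return (mode, mean, median) of a discrete distribution over 0, 1, 2, ... runs."""
    mode = max(range(len(dist)), key=lambda runs: dist[runs])
    mean = sum(runs * prob for runs, prob in enumerate(dist))
    cumulative = 0.0
    median = None
    for runs, prob in enumerate(dist):
        cumulative += prob
        if cumulative >= 0.5:   # first run count reaching half the probability mass
            median = runs
            break
    return mode, mean, median


# Hypothetical right-skewed predicted distribution for 0 to 10 runs.
dist = [0.08, 0.13, 0.17, 0.16, 0.14, 0.11, 0.08, 0.06, 0.04, 0.02, 0.01]

mode, mean, median = summarize(dist)
print(f"mode = {mode}, median = {median}, mean = {mean:.2f}")
```

Here the mode is 2 runs even though more than 60% of the probability mass lies at 3 runs or more, which is why predicting the most likely class gives "2" for most games.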
@malone1020 (1 year ago)
In 11b, running preds_test = cr1.predict_distributions(X_test) gives me: ValueError: Buffer dtype mismatch, expected 'long' but got 'long long'
@malone1020 (1 year ago)
Maybe a Linux vs. Windows thing? long for Linux, and long long for Windows? Do you use Linux?
@numeristical (1 year ago)
yeah - I don’t get this error but others have reported it. I can fix it (meant to do it with the latest release but forgot). Give me a couple days and I can upgrade the package to fix it
@numeristical (1 year ago)
@Brian Malone This should (hopefully) be fixed now. Please upgrade to structureboost 0.4.1 and let me know if that resolves the problem
@malone1020 (1 year ago)
@numeristical Upgraded to 0.4.1 but still getting the long long vs. long error. Appreciate you putting in the time to investigate and correct!
@numeristical (1 year ago)
@Brian Malone - ok, give it a shot now - structureboost 0.4.2. If you still get the error, try to email me the error message so I can diagnose a bit better. Fingers crossed!
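For readers who hit the same error: the thread does not confirm the root cause, but the commenter's guess about the operating system is plausible. The C type long is 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux and macOS, while long long is 8 bytes on all of them. Compiled code that expects a buffer of C long can therefore reject a 64-bit integer array on Windows and accept it elsewhere. The short check below uses only the standard library to show the sizes on the current machine. It does not inspect StructureBoost itself.

```python
import ctypes
import platform

# Width in bytes of the two C integer types named in the error message.
long_bytes = ctypes.sizeof(ctypes.c_long)
long_long_bytes = ctypes.sizeof(ctypes.c_longlong)

# True on typical 64-bit Linux/macOS builds, False on Windows.
same_width = long_bytes == long_long_bytes

print(f"OS: {platform.system()}")
print(f"C long: {long_bytes} bytes, C long long: {long_long_bytes} bytes")
print("long and long long match" if same_width else "long is narrower than long long")
```

If the two widths differ on your machine, the mismatch reported in the thread is at least consistent with this explanation.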