tennis-crystal-ball
Question on backtesting of predictions
Great tool. Do you have any stats on how the prediction model performs on different surfaces, rounds, etc.? Basically, if the model gives a player decimal odds of 2 (a 50% win probability), does that player win about 50% of the time on average?
Here is the current state, tested on a sample of seasons from 2005 to 2018 (through May/June):
HARD_OUTDOOR PredictionResult{rate=71.104%, predictable=96.518%, brier=0.19026, logLoss=0.56009, score=1.26950, calibration=0.98534, matches=15769}
CLAY PredictionResult{rate=69.526%, predictable=98.591%, brier=0.19893, logLoss=0.58052, score=1.19763, calibration=0.97839, matches=12914}
GRASS PredictionResult{rate=71.279%, predictable=95.711%, brier=0.19000, logLoss=0.56030, score=1.27215, calibration=0.99381, matches=3987}
HARD INDOOR / CARPET PredictionResult{rate=68.879%, predictable=98.231%, brier=0.20350, logLoss=0.59259, score=1.16233, calibration=0.99636, matches=6274}
As for your question about a 50% probability (odds of 2) and whether the outcome really is 50-50:
This is answered by the 'calibration' attribute: the closer it is to 1, the more closely the distribution of outcomes matches the predicted probabilities. If it is lower than 1, favorites are slightly underestimated; if it is over 1, favorites are overestimated.
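To make the direction of that ratio concrete, here is a minimal Python sketch. It assumes calibration is the expected number of favorite wins (the sum of predicted favorite probabilities) divided by the actual number of favorite wins. That definition is only an assumption that matches the description above, not necessarily the formula the predictor uses, and `calibration_ratio` is a hypothetical helper name.

```python
def calibration_ratio(favorite_probabilities, favorite_won):
    """Expected favorite wins divided by actual favorite wins.

    ASSUMPTION: illustrative definition only; the real 'calibration'
    attribute may be computed differently.
    """
    expected_wins = sum(favorite_probabilities)
    actual_wins = sum(1 for won in favorite_won if won)
    if actual_wins == 0:
        raise ValueError("no favorite wins in the sample")
    return expected_wins / actual_wins


# Favorites predicted at 60% each, but they actually won 4 of 5 matches:
# the ratio is below 1, i.e. favorites were underestimated.
ratio = calibration_ratio([0.6] * 5, [True, True, True, True, False])
print(round(ratio, 3))
```

Under this assumed definition, a value slightly below 1 (as in the results above) would mean favorites win a little more often than predicted.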
As can be seen, grass and hard outdoor are the most predictable, while clay and hard indoor are less so. However, the hard indoor score is skewed because there are only a small number of best-of-5 matches on hard indoor (i.e. no Grand Slams, just Davis Cup). Usually, best-of-5 gives about 8 percentage points more prediction accuracy than best-of-3.
Best-Of | Hard Outdoor | Clay | Grass | Hard Indoor / Carpet | Overall |
---|---|---|---|---|---|
3 | 68.822% | 67.813% | 68.738% | 68.671% | 68.422% |
5 | 76.597% | 76.118% | 75.918% | 76.582% | 76.336% |
TODO: Add a blog entry with more info about the TCB predictor performance and tuning.
What would also be cool is to compare your backdated odds with actual outcomes. For example, you could create bins for odds between 1.5 and 1.52 and then compare against the actual outcomes: if the odds are "fair", you'd expect a player with those odds to win between ~65.8% and ~66.7% of the time.
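The binning idea could be sketched roughly like this in Python. The `(decimal_odds, won)` record format, the 0.02 bin width, and the function names are assumptions made for illustration; this is not how the tool stores or evaluates its predictions.

```python
import math
from collections import defaultdict


def implied_probability(decimal_odds):
    """Win probability implied by 'fair' decimal odds (no bookmaker margin)."""
    return 1.0 / decimal_odds


def win_rate_by_odds_bin(records, bin_width=0.02):
    """Group (decimal_odds, won) pairs into odds bins starting at 1.0.

    Returns {bin_lower_edge: (matches, observed_win_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [wins, matches]
    for odds, won in records:
        # Round before flooring to avoid floating-point edge effects.
        index = math.floor(round((odds - 1.0) / bin_width, 9))
        counts[index][0] += 1 if won else 0
        counts[index][1] += 1
    return {
        round(1.0 + index * bin_width, 4): (matches, wins / matches)
        for index, (wins, matches) in sorted(counts.items())
    }


# Hypothetical sample: three matches priced in [1.50, 1.52), one at 2.0.
sample = [(1.50, True), (1.51, True), (1.51, False), (2.0, False)]
for lower, (matches, rate) in win_rate_by_odds_bin(sample).items():
    upper = lower + 0.02
    print(f"odds [{lower:.2f}, {upper:.2f}): {matches} matches, "
          f"observed {rate:.1%}, fair range "
          f"{implied_probability(upper):.1%} to {implied_probability(lower):.1%}")
```

With enough matches per bin, the observed win rate can be compared directly with the fair range printed next to it.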
This kind of metric is measured with Log-Loss and Brier Score; some info is above.
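For reference, the textbook definitions of both metrics for binary outcomes can be written in a few lines of Python. This is a generic sketch, assuming each prediction is the probability that a given player wins and each outcome is 1 if that player won and 0 otherwise; the predictor's own implementation may differ in details.

```python
import math


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)


def log_loss(probabilities, outcomes, eps=1e-15):
    """Mean negative log-likelihood of the outcomes under the predictions."""
    total = 0.0
    for p, o in zip(probabilities, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= math.log(p) if o else math.log(1.0 - p)
    return total / len(probabilities)


# Hypothetical predictions for four matches and whether the player won.
predicted = [0.8, 0.6, 0.7, 0.55]
won = [1, 1, 0, 1]
print(brier_score(predicted, won), log_loss(predicted, won))
```

As a baseline, always predicting 50% gives a Brier score of 0.25 and a log loss of ln 2 ≈ 0.693, so lower values on both metrics indicate predictions that beat a coin flip.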