tennis-crystal-ball Question on backtesting of predictions

Open larssl780 opened this issue 6 years ago • 5 comments

Great tool - do you have any stats on how the prediction model performs on different surfaces, rounds etc.? Basically, if the model says the odds of a player winning are 2, does that player win 50% of the time on average?

Jul 17 '18 13:07 larssl780

Here is the current state, tested on a sample of seasons from 2005 to 2018 (May/June):

HARD_OUTDOOR PredictionResult{rate=71.104%, predictable=96.518%, brier=0.19026, logLoss=0.56009, score=1.26950, calibration=0.98534, matches=15769}

CLAY PredictionResult{rate=69.526%, predictable=98.591%, brier=0.19893, logLoss=0.58052, score=1.19763, calibration=0.97839, matches=12914}

GRASS PredictionResult{rate=71.279%, predictable=95.711%, brier=0.19000, logLoss=0.56030, score=1.27215, calibration=0.99381, matches=3987}

HARD INDOOR / CARPET PredictionResult{rate=68.879%, predictable=98.231%, brier=0.20350, logLoss=0.59259, score=1.16233, calibration=0.99636, matches=6274}

As for your question about probability 50% (odds 2) and whether the outcome is really 50-50:

This is answered by the 'calibration' attribute: the closer it is to 1, the better the distribution of outcomes matches the predicted probabilities. If it is lower than 1, favorites are a little bit underestimated; if it is over 1, favorites are overestimated.
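To make the direction of that ratio concrete, here is a minimal sketch of one definition that behaves as described: the sum of predicted favorite probabilities divided by the number of matches the favorites actually won. This is an assumption for illustration only; the thread does not show how TCB computes 'calibration', and `favorite_calibration` is a hypothetical name.

```python
def favorite_calibration(predictions):
    """Ratio of expected to actual favorite wins.

    predictions: iterable of (p_favorite, favorite_won) pairs, where
    p_favorite is the predicted win probability of the favorite.
    NOTE: assumed definition, not taken from the TCB source code.
    """
    expected_wins = 0.0
    actual_wins = 0
    for p_favorite, favorite_won in predictions:
        expected_wins += p_favorite
        actual_wins += 1 if favorite_won else 0
    if actual_wins == 0:
        raise ValueError("no favorite wins in the sample")
    return expected_wins / actual_wins


# Favorites predicted at 75% that win 3 of 4 matches: ratio is exactly 1.
well_calibrated = [(0.75, True), (0.75, True), (0.75, True), (0.75, False)]
# Same predictions, but favorites win all 4: the model underestimated them.
underestimated = [(0.75, True)] * 4

print(favorite_calibration(well_calibrated))
print(favorite_calibration(underestimated))
```

With this definition a value below 1 means favorites won more often than predicted (underestimated), and a value above 1 means they won less often than predicted (overestimated).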

As can be seen, grass and hard outdoor are the most predictable, while clay and hard indoor are less so. However, the hard indoor score is skewed because there are only a few best-of-5 matches on hard indoor (i.e. no Grand Slams, just Davis Cup). Usually, best-of-5 gives about 8 percentage points more prediction accuracy than best-of-3.

Jul 18 '18 10:07 mcekovic

Best-Of	Hard Outdoor	Clay	Grass	Hard Indoor / Carpet	Overall
3	68.822%	67.813%	68.738%	68.671%	68.422%
5	76.597%	76.118%	75.918%	76.582%	76.336%

Jul 18 '18 12:07 mcekovic

TODO: Add a blog entry with more info about the TCB predictor performance and tuning.

Aug 01 '18 08:08 mcekovic

What would also be cool is to compare your backdated odds with actual outcomes. Say, for example, you created a bin for odds between 1.5 and 1.52 and then looked at the actual outcomes: you'd expect a player with those odds to win between ~65.8% and ~66.7% of the time if the odds are "fair".
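A rough sketch of that binning idea, assuming the backtest can be exported as `(decimal_odds, won)` pairs (that export format and the function name `win_rate_by_odds_bin` are hypothetical, not part of TCB). Odds are grouped into bins 0.02 wide and the observed win rate is compared with the implied probability range `1 / odds`.

```python
from collections import defaultdict


def win_rate_by_odds_bin(records, width_cents=2):
    """Group (decimal_odds, won) records into odds bins and compare
    the observed win rate with the implied probability range.

    Bins are computed on integer hundredths of the odds to avoid
    floating-point edge effects at bin boundaries.
    """
    counts = defaultdict(lambda: [0, 0])  # lower bound in cents -> [matches, wins]
    for odds, won in records:
        cents = round(odds * 100)
        lower = (cents // width_cents) * width_cents
        counts[lower][0] += 1
        counts[lower][1] += 1 if won else 0

    result = {}
    for lower, (matches, wins) in sorted(counts.items()):
        low_odds = lower / 100
        high_odds = (lower + width_cents) / 100
        result[(low_odds, high_odds)] = {
            "matches": matches,
            "observed": wins / matches,
            # Higher odds imply a lower win probability, so the range flips.
            "implied_min": 1 / high_odds,
            "implied_max": 1 / low_odds,
        }
    return result


# Tiny made-up sample, only to show the shape of the output.
sample = [(1.50, True), (1.51, True), (1.51, False), (2.00, True), (2.00, False)]
for odds_range, stats in win_rate_by_odds_bin(sample).items():
    print(odds_range, stats)
```

With real data, each bin needs enough matches for the observed rate to be meaningful; narrow bins such as 1.50 to 1.52 will often be sparsely populated.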

Aug 01 '18 16:08 larssl780

This kind of metric is measured with Log-Loss and Brier Score; some info is above.
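For reference, a minimal sketch of the standard binary definitions of both scores. Whether TCB uses exactly these formulas is an assumption, although the reported values are in the range these definitions produce; the function names are illustrative.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of the outcomes under the predictions."""
    total = 0.0
    for p, o in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total -= o * math.log(p) + (1 - o) * math.log(1 - p)
    return total / len(probs)


# A predictor that always says 50% scores 0.25 (Brier) and ln 2 (log loss),
# regardless of the outcomes.
coin_flip_probs = [0.5, 0.5]
coin_flip_outcomes = [1, 0]
print(brier_score(coin_flip_probs, coin_flip_outcomes))
print(log_loss(coin_flip_probs, coin_flip_outcomes))
```

Lower is better for both. The coin-flip baseline of 0.25 and about 0.693 gives a reference point for the figures of roughly 0.19 to 0.20 and 0.56 to 0.59 reported above.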

Aug 08 '18 08:08 mcekovic
