trueskill
trueskill copied to clipboard
TrueSkill Through Time
It would be nice if this module implemented TrueSkill Through Time (TTT), which fixes a few issues in the original TrueSkill algorithm, namely:
- Inference within a rating period depends on the random order selected within that rating period.
- Inference can only go forwards in time, so if A beat B then B beats C, and C is highly ranked, then A should also be highly ranked.
I assume you didn't implement this yourself anywhere @MarkLodato ? Have you found an implementation of it anywhere? Thank you!
Unfortunately no, I haven't found an implementation anywhere.
It's surprising, given how old the paper is. I wish I could do it myself but it's probably a bit beyond my abilities, and I doubt I'd have the time even if I felt able to tackle it.
No problem. If someone reads this and has the inclination, you'd be open to a pull request, I assume?
I'm not the original author Mark, I just found your issue and commented on it.
Oh :)
Sorry for my late response. Currently, I don't have a plan to implement TTT. But I will read the paper tonight to estimate the cost. Anyway, this project is open to your contributions. You guys can contribute with TTT.
At some point if you're going to calculate ratings going "backwards" in time, doesn't that negate the point of Elo in the first place? I mean, why not just use logistic regression at that point?
@EvanZ logistic regression assumes that there's a single, static skill parameter for every player. TTT assumes that the player's skill can change over time. This is orthogonal to the fact that in TTT, information can also flow backwards.
Having just read the paper I admit I don't see TTT is not something that "can be implemented" meaningfully without some significant qualification and work.
The qualification is that it's not a replacement for the existing code at all, but a different body of code. TrueSkill is what we're using to asses skill today based on history, and form leaderboards as does the Xbox community.
What TTT offers is a revision, of the time history. That is, as new data arrives, you could go and perform updates on all the historic ratings for a player. This presupposes a complete new data structure for ta module like this, which can provide it with the whole history of game interactions.
It's worth noting this is a potential nightmare and certainly not trivial, as TTT concerns itself specifically with chess a two player game. In our scenario for an N player game, and players mixing and matching you end up with what is an influence tree. That is in assessing the skill of player A you perforce end up reassessing the skill of all players that A has played with in their history, and here it gets fun, recursively ... that is if B played with A in the past and B needs updating, we need also to do this whole schmozzle for B.
In short order what it boils down to is you need to run a TTT update on a whole data set tat is assumed closed. That is has only internal references ... all players that are playing have their whole (assesed) histories recorded and all these is what TTT walks through. So suddenly TTT is not a python module that just takes N players skill ratings and a game result to produce an update, but it needs access to a whole history of all those players (and all the players they played with and so on) if it's going to propagate the new information backwards.
Now it would be an interesting thing to do, indeed, but non-trivial, be running in the background and presumably consume a pile of computing resource on every run so might depending on traffic (update) volume be a nightly run or such, to provide updated estimates of historic skills.
Any robust database would choose to store the original estimates (basic TrueSkill rankings seen on the way here) and store the latest update beside it, so that a comparison is possible (we could draw a graph of the players Skill evolution as it happened, and the latest full TTT improved history estimate). It would indeed be awesomely cool to see how they differed.
In case anyone's interested, I have forked the original TTT code (written in F#) with some instructions on how to run it on a modern Linux system:
https://github.com/lucasmaystre/chessanalysis
Very awesome. Without the time to study it now, do you think:
-
it generalizes to N players in M teams easily. Or is is fairly heavily build on the 1 on 1 assumption (which the paper suggests)?
-
it would port to Python easily? Probably hard to say without giving it a shot.
Hmmm, just reading about TrueSkill 2. Interesting ... a significant upgrade of sorts, with a detailed paper, but can't find any code on offer yet. Paper only published in March 2018. Of note though is that they generate the code using Infer.NET, apparently a package that lets you specify the factor graph in an abstract way, and will generate code that solves the factor graph. Sounds neat. I see a future possibly for more easily tuned Baysian skill models. Who knows?
Just got confirmation from Minka that there are no plans to release TrueSkill 2 code the way TTT code was released. To wit if the community wants to leverage learnings Microsoft Research are sharing in the TrueSkill 2 paper it seems for now it demands a reimplementation. But the pointers to Infer.NET and the way these factor graphs are created is enough to get someone with time and motivation started. On my todo list but somewhere down at that level that might be done in retirement ;-).
Hi all, I have just publicly released a Python package for estimating time-varying skills based on the entire history of matches. It can be thought of as an extension of TTT (the time dynamics are more general). You can find it here: https://github.com/lucasmaystre/kickscore
It's based on a paper I'll present later this summer at the KDD conference: https://arxiv.org/abs/1903.07746
Note that at the moment it only supports pairwise comparisons (either 1-vs-1, or many-vs-many).
Just got confirmation from Minka that there are no plans to release TrueSkill 2 code the way TTT code was released. To wit if the community wants to leverage learnings Microsoft Research are sharing in the TrueSkill 2 paper it seems for now it demands a reimplementation. But the pointers to Infer.NET and the way these factor graphs are created is enough to get someone with time and motivation started. On my todo list but somewhere down at that level that might be done in retirement ;-).
Hey do you have any idea whethter there is any implementation of TrueSkill 2 available?
Hi all,
We have implemented [1] and documented [2] the first TrueSkill Through Time packages for Python, Julia and R. I wish to express my gratitude to Heungsub Lee for having published the basic TrueSkill model in Python. I tried to keep the same interface until I was forced to change it (some changes I made could have been avoided). I remain at your disposal to collaborate in case you wish to update the original package based on the TrueSkill Through Time model that we have implemented.
python3 -m pip install trueskillthroughtime
[1, 2] https://github.com/glandfried/TrueSkillThroughTime [2] https://github.com/glandfried/TrueSkillThroughTime/releases/download/doc/landfried-learning.pdf