lila icon indicating copy to clipboard operation
lila copied to clipboard

Unwinnability Detector (adjudicating timeouts is now possible)

Open miguel-ambrona opened this issue 3 years ago • 36 comments

Deciding whether a game should be drawn after a timeout is a fundamental problem which chess servers could only solve partially.

I propose a solution: https://github.com/miguel-ambrona/D3-Chess

I have analyzed the full Lichess database of standard rated games with the above tool, identifying ALL the games that have been unfairly classified. The tool can handle about 3000 games per second (running on 1 core) and I estimate that Lichess only needs about 12 games per second.

Do you think the above numbers are good enough for considering the integration of this tool into Lichess?

Relevant links: https://github.com/ornicar/lila/issues/4640 https://chasolver.org

miguel-ambrona avatar Jun 24 '21 19:06 miguel-ambrona

I selected a random game from the "unfair" list: https://lichess.org/pHeKcmmM#126 and I actually agree with the current result, i.e. black wins.

ornicar avatar Jun 28 '21 18:06 ornicar

But there is indeed no sequence of moves that ends in a black win. (White is forced to capture the rook, and black cannot win from there.)

niklasf avatar Jun 28 '21 18:06 niklasf

An interesting position from the "unfair" list: https://lichess.org/lob5aMiS#89

White has only a king, but got adjucated a win, because black left. If punishing a player leaving with a loss (even if timeout would have been a draw) is intended behaviour of the lichess server, then it might be helpful to filter games with one of the players leaving from the list.

I really like the idea of using this tool for better timeout adjucation (in the sense of following FIDE rules instead of the simpler "not enough material" rule most chess servers use). Maybe its even possible to run this tool on every draw offer? This would allow a player to effectively "claim" a draw by Article 9.6 of the FIDE Laws of Chess. Article 9.6 says that the game ends immediately in unwinnable positions (similar to mate), but searching for a helpmate on every move does not seem computationally feasible. Therefore claiming a draw by pressing the draw offer button seems like a good compromise.

Nevsor avatar Jul 14 '21 11:07 Nevsor

Hi @Nevsor . I agree with you, Lichess may want to punish a player for leaving the game, even if their opponent cannot win. But what if the player unintentionally lost connection?

All games in the unfair list are flagged with "Time forfeit" termination. I guess we could detect whether it was a real timeout by additionally checking the PGN final comment. What do others think about this?

With respect to immediately terminating games in case of dead draws, I guess this can actually be done with the current tool: About 35 games per second were finished in Lichess in the last month, say a game is on average 150 plies long (I think this is a safe upper-bound). In that case you would need to analyze at most 35 * 150 = 5250 positions per second. That is doable with one single core if the "quick" mode is used. (Maybe searching for dead draws only after the amount of material crosses a threshold or every n moves could make this analysis much lighter.)

miguel-ambrona avatar Jul 14 '21 13:07 miguel-ambrona

There's no need to do any messing around with PGNs. Lila is well aware whether a game was lost by flagging or leaving.

benediktwerner avatar Jul 14 '21 13:07 benediktwerner

There's no need to do any messing around with PGNs. Lila is well aware whether a game was lost by flagging or leaving.

I mean to filter out my list.

miguel-ambrona avatar Jul 14 '21 13:07 miguel-ambrona

After implementing a few ideas that I had, the performance of the quick routine has significantly improved: The quick version of the algorithm is as complete as before but it now takes 3 us on average per position and a maximum of 400 us. (The last version used to take about 140 us on average and a maximum of 6 ms.)

This means the tool can now can handle more than 300K positions per second in quick mode.

miguel-ambrona avatar Jul 18 '21 20:07 miguel-ambrona

I guess we could deploy a new service for it, with a public API, that lichess would also use.

ornicar avatar Jul 27 '21 07:07 ornicar

Some thoughts on where to use the logic:

  • If there is a draw offer I would definitely want to check this. However, this is likely insufficient as I believe you are not allowed to offer a draw if you tried to offer it earlier in the game. (Of course it should be checked asynchronously so people cannot induce lag by clicking the draw button).
  • On game end in case of timeout: Again definitely should happen. This should be cheap enough that it won't impact the hardware requirements. (I guess its ok if people need to wait a ms longer sometimes to see the end of game result, I would be hesitant to first show the win and then later update it to a draw.)

I also feel that doing a full analysis every single move will likely be a bit heavy (given the comparative infrequency of the problem), so perhaps start by doing a full analysis when a draw is offered or a game ends. Essentially this will make sure no game ends with the wrong result. (At least as far as we can detect it).


The only problem I could foresee with NOT checking every move is that players may actually depend on something to be adjudicated a draw, let the clock run out and then hitting a corner case which is not picked up by the logic (though this may literally be 1 in a million games?). Though this cannot be avoided technically unless we reach 100% perfection, I think this can be simply addressed with some wording which includes a small disclaimer, e.g. "If Lichess recognizes the position is unwinnable, the game is drawn" opposed to something like "If the position is unwinnable, the game is drawn."

dennisjaheruddin avatar Jul 30 '21 12:07 dennisjaheruddin

You can offer a draw every 10 moves iirc

But I think for now the question of when to check can be left until later. We can just start with time-out games.

Although completeness is indeed something that needs to be considered. I'd say if we start doing this, we should really aim for total completeness if possible. It doesn't seem worth it to go through all this effort to adjudicate games correctly and then still miss some in the end. Although, if I understand this correctly, there have only been two games ever on Lichess that the quick version didn't judge correctly which if true I guess still seems acceptable, especially given the huge increase in performance. But if we can support the full version, I'd rather go for that.

benediktwerner avatar Jul 30 '21 14:07 benediktwerner

How about doing a full analysis on timeout and a quick analysis on every move?

That would ensure, that nearly every game is automatically ended as soon as the FIDE rules say they should end. In the rare circumstances that the quick mode misses an unwinnable position the players would have to play on until there is either a timeout, a draw by other means or a position that the analyzer can prove to be unwinnable. The important point from my point of view is that no game which reached an unwinnable position will ever be adjucated as won or lost.

Using the numbers provided by miguel-ambrona this should be easily possible from a performance point of view:

Quick mode: 5250 positions per second * 3 μs per position = 15.75 ms of single-core CPU time per second.

Full mode (assuming that every game ends in a timeout): 35 positions per second * 340 μs per position = 11.90 ms of single-core CPU time per second.

Both of those estimates are quite conservative, but they do only look at the average case and there are probably times where many more games are played than on average.

Nevsor avatar Jul 30 '21 14:07 Nevsor

@Nevsor, using the quick mode during the game and the full mode only after timeouts is a nice idea! I think it is the way to go in the long term.

In the short term, as @benediktwerner suggest, having a check only after timeouts could be good, possibly only with the quick algorithm to see how it performs.

By the way, let me clarify that the reported times do not consider the time needed for parsing the position, i.e., converting the FEN string into the data structure for a position. A small test suggests that parsing takes also around 3 us, but we should perform more complete benchmarks.

miguel-ambrona avatar Aug 01 '21 23:08 miguel-ambrona

@miguel-ambrona, presumably the code could be integrated in a way such that parsing the position is never needed? Though C++ and Scala interop seems a bit painful, so I'm wondering about how the code would perform if translated to Scala. My understanding is that it uses a lot of bitboards and thus popcount instructions and so would probably be significantly slower?

anematode avatar Sep 24 '21 18:09 anematode

@anematode, is what you suggest an alternative to the API idea? (By integrating the decision procedure inside the Lichess source?)

The latest "quick algorithm" is fairly simple and could be surely adapted to Scala. But, as you say, we may lose some performance, probably more than 3 us per position.

It could be better to write Scala bindings to the C++ library, but we would still need to convert the position between the two representations. How does Lichess represent positions?

miguel-ambrona avatar Sep 25 '21 09:09 miguel-ambrona

My Scala knowledge is not the best, but it seems that Lichess uses a Map[Pos, Piece] to represent the piece positions. Should not be too hard to convert those to bitboards: https://github.com/ornicar/scalachess/blob/master/src/main/scala/variant/Standard.scala

Nevsor avatar Sep 25 '21 10:09 Nevsor

What's the current status of this? A quick check every move and a full check on a timeout sounds reasonable.

Egroegw avatar Mar 01 '22 06:03 Egroegw

What communication mechanism should we use? stdin/stdout similar to bbpPairings should do. Could also create an HTTP API wrapper. Either of those are probably better than FFI or a Scala port.

niklasf avatar Apr 10 '22 16:04 niklasf

I could help with the implementation of either option. What would be necessary? Can you point to how bbpPairings (or other external services) are integrated into Lichess?

miguel-ambrona avatar Apr 12 '22 20:04 miguel-ambrona

https://github.com/lichess-org/lila/blob/8c21d7f869cf4cf6933f7cc640154fe09fa4a79b/modules/swiss/src/main/PairingSystem.scala#L27-L39

ornicar avatar Apr 12 '22 20:04 ornicar

Ahh ... mhh ... that doesn't quite look like I expected, since it's using temporary files and reading all output at once.

I assume this detector would do better as a long running process. The old modules/ai (deleted after we switched to fishnet) was an implementation of the UCI protocol. Maybe that's something to go by.

It set up the process like that: https://github.com/lichess-org/lila/blob/8de7364cb470593cc3b85f59408e9f41cc8892ed/modules/ai/src/main/Process.scala

And hooked it up to an actor: https://github.com/lichess-org/lila/blob/8de7364cb470593cc3b85f59408e9f41cc8892ed/modules/ai/src/main/ActorFSM.scala#L16-L19

niklasf avatar Apr 13 '22 09:04 niklasf

Would it be useful to have a modified tablebase that includes only those positions where random moves still draw against perfect play? For instance, if black runs out of time but the tablebase lookup shows that random moves by black draw against God, then the game is a draw.

I doubt that would catch every case we'd want to catch, but maybe it could reduce runtime resource consumption by doing some of the computation up front.

rpdelaney avatar May 11 '22 12:05 rpdelaney

What positions would you store in such tablebase?

miguel-ambrona avatar May 11 '22 17:05 miguel-ambrona

I suppose that I'd drop positions where white has any move that loses AND black has any move that loses. I've never gone plumbing into how tablebases are usually structured, but it would probably be simpler than existing EGTBs since all you need is a fast way to look up a position and find out which side(s) have a necessary draw.

rpdelaney avatar May 12 '22 15:05 rpdelaney

But we cannot store every single position (of interest) in a database. There are too many!

We could probably do it for all positions up to 7 pieces, but this won't be very helpful since most positions have more than 7 pieces. Also, the lookup time in such a table could be even take longer than the current quick algorithm.

miguel-ambrona avatar May 12 '22 17:05 miguel-ambrona

Yup, lookup would have to be faster than the general solution or there's no point. I suspect quite a bit more than 7 pieces would be do-able since a huge number of branches would get pruned, though.

rpdelaney avatar May 12 '22 18:05 rpdelaney

Also, tablebase lookup need not necessarily be faster than the quick solver. The critical advantage of a tablebase is that if a position is found in it, then the tb is absolutely authoritative about whether the position is a certain draw. So it seems more like an alternative approach to the slow solver.

I've been thinking about this more and it seems possible that it might actually be a complete solution for a slow solver. At some point, as more pieces are added to the board, the number of necessary draws would decrease rather than increase. So, unlike a conventional egtb, this one would have its complexity recede to a limit.

In other words, it's possible that with contemporary hardware, a tablebase could be generated and stored that has every necessarily drawn position. I don't know how to prove that besides doing it though.

rpdelaney avatar May 13 '22 13:05 rpdelaney

I think your idea is interesting, but I invite you to perform some calculations and realize how big such tablebase should be.

The critical advantage of a tablebase is that if a position is found in it, then the tb is absolutely authoritative about whether the position is a certain draw.

But how do you build the tablebase? With a solver, right? So it would be as good as the solver. The solvers that we propose (both quick and full) are sound, so they provide the same guarantees you mention.

At some point, as more pieces are added to the board, the number of necessary draws would decrease rather than increase.

I am not sure about this, the more pieces, the more combinations on parts of the board unrelated to the unwinnability motif.

In other words, it's possible that with contemporary hardware, a tablebase could be generated and stored that has every necessarily drawn position.

Assuming that the number of chess positions where White can't win is small enough so that they can be stored (if only this would be true), how could we generate them? We can't just enumerate all chess positions and filter them out, there are too many! Directly enumerating all unwinnable positions seems out of the scope too.

miguel-ambrona avatar May 13 '22 17:05 miguel-ambrona

I was thinking it would be generated similar to how EGTBs are generated, as I understand it: i.e., start at the game-end position and search backward, filtering as we go. A position that isn't a necessary draw is pruned from the search tree. Engines or other heuristics aren't used to generate EGTBs either, so I don't expect they would be needed there, or even helpful. In my head, it seems to me that having more pieces on the board means more ways to eventually lose, which means more branches getting pruned as we work backward.

But -- I think it's clear that we couldn't know for sure how big this draw-tablebase would get without actually generating it, and doing that implementation is a new, unorthodox thing that would take some significant work to even figure out how to do it (much like the approach you've taken). This is an open-source project, and nobody should be doing work that they don't find both interesting and promising, so I'll leave it there for now. You actually have a prototype, so that's what this issue should be about - I already feel too much time has been spent discussing my diversion.

rpdelaney avatar May 13 '22 18:05 rpdelaney

I’ve been working on helpmate tablebases for a few months now and make good progress despite limited time: https://github.com/kraktus/helpmate-tb

Knowing which positions are drawn in absolute is not enough because imo the FIDE rule takes the 50 (or at least 75)move rule into account, even if that is not very clear.

kraktus avatar May 13 '22 22:05 kraktus

@rpdelaney

I think it's clear that we couldn't know for sure how big this draw-tablebase would get without actually generating it.

We could try to estimate a lower-bound before actually going through the effort of implementing it.

I already feel too much time has been spent discussing my diversion.

It is worth exploring different options and I appreciate your comments, don't consider them a diversion. Your idea is great and brings a new perspective for addressing this problem, which is so hard and will only be completely solved by collaboration.

miguel-ambrona avatar May 15 '22 21:05 miguel-ambrona