lichess-bot
lichess-bot copied to clipboard
bot appears offline and doesn't accept challenges
After having been online and accepting changes for a while, my bot (https://lichess.org/@/simpleEval) doesn't accept challenges and appears offline (empty dot, instead of green dot next to the name). (The webpage mentions the bot was active 1h ago, about 3h after the last game). This has been happening always. The verbose log doesn't indicate anything that strikes me as special:
2020-09-25 18:18:07,163: <PopenProcess at 0xb5aeab50 (pid=19875)> >> info depth 18 seldepth 19 multipv 1 score cp 599 nodes 233216 nps 656946 tbhits 0 time 355 pv b7b5 a2a3 d8b6 g1f2 b6b7 f1g1 e7g8 g1h1 h8h1 g2h1 b7c8 d2f1 e8f7 f1g3 g4f3 h1f3 a5a4 f3h5 f7e7
2020-09-25 18:18:07,443: <PopenProcess at 0xb5aeab50 (pid=19875)> >> info depth 19 seldepth 23 multipv 1 score cp 599 nodes 419584 nps 660762 tbhits 0 time 635 pv b7b5 a2a3 d8b6 g1f2 b6b7 f1g1 e7g8 g1h1 h8h1 g2h1 b7c8 d2f1 e8f7 f1g3 a5a4 d1d3 f7g7 d4e5 d6e5 h1g2 a4b3
2020-09-25 18:18:07,889: <PopenProcess at 0xb5aeab50 (pid=19875)> >> info depth 20 seldepth 22 multipv 1 score cp 599 nodes 717037 nps 663923 hashfull 20 tbhits 0 time 1080 pv b7b5 a2a3 d8b6 g1f2 b6b7 f1h1 h8h1 d1h1 e7g6 h1h6 b7g7 h6h1 c6c5 d4d5 a5a4 b3b4 g4f3 d2f3 g7f7 b4c5 d6c5
2020-09-25 18:18:08,866: Ignoring illegal move h5g4 on board rn1qk2r/1p2n3/2pp1p2/p3p3/4P1b1/1P1PPN2/PBPN2B1/3R1RK1 w kq - 0 18
2020-09-25 18:18:08,871: --- https://lichess.org/pU512AqJ/black Game over
2020-09-25 18:18:08,872: <PopenProcess at 0xb5aeab50 (pid=19875)> << stop
2020-09-25 18:18:08,873: <PopenProcess at 0xb5aeab50 (pid=19875)> >> info depth 21 seldepth 22 multipv 1 score cp 599 nodes 1370551 nps 664026 hashfull 41 tbhits 0 time 2064 pv b7b5 a2a3 d8b6 g1f2 b6b7 f1h1 h8h1 d1h1 e7g6 h1h6 b7g7 h6h1 c6c5 d4d5 a5a4 b3b4 g4f3 d2f3 g7f7 b4c5 d6c5
2020-09-25 18:18:08,875: <PopenProcess at 0xb5aeab50 (pid=19875)> >> bestmove b7b5 ponder a2a3
2020-09-25 18:18:08,876: <PopenProcess at 0xb5aeab50 (pid=19875)> << quit
2020-09-25 18:18:08,915: +++ Process Free. Total Queued: 0. Total Used: 0
systemd seems to suggest all is up and running
$ sudo systemctl status lichessbot
● lichessbot.service - lichess-bot
Loaded: loaded (/etc/systemd/system/lichessbot.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-09-25 15:59:08 CEST; 6h ago
Main PID: 18564 (bash)
Tasks: 17 (limit: 4915)
CGroup: /system.slice/lichessbot.service
├─18564 /bin/bash /home/pi/chess/lichess-bot/start_bot.sh
├─18567 python3 lichess-bot.py -v -l out
├─18572 python3 lichess-bot.py -v -l out
├─18580 python3 lichess-bot.py -v -l out
├─18583 python3 lichess-bot.py -v -l out
├─18584 python3 lichess-bot.py -v -l out
├─18586 python3 lichess-bot.py -v -l out
└─18588 python3 lichess-bot.py -v -l out
Sep 25 18:18:01 raspberrypi bash[18564]: nodes: 336566
Sep 25 18:18:01 raspberrypi bash[18564]: score: {1: Score(cp=301, mate=None)}
Sep 25 18:18:04 raspberrypi bash[18564]: depth: 19
Sep 25 18:18:04 raspberrypi bash[18564]: nps: 665028
Sep 25 18:18:04 raspberrypi bash[18564]: nodes: 505422
Sep 25 18:18:04 raspberrypi bash[18564]: score: {1: Score(cp=600, mate=None)}
Sep 25 18:18:06 raspberrypi bash[18564]: depth: 21
Sep 25 18:18:06 raspberrypi bash[18564]: nps: 637030
Sep 25 18:18:06 raspberrypi bash[18564]: nodes: 1481096
Sep 25 18:18:06 raspberrypi bash[18564]: score: {1: Score(cp=600, mate=None)}
and I can ping lichess.org.
Any ideas what I might have done wrong. The code is at the current master with minimal changes to config.yml.default:
---
> token: "xxxxxxxxxxxxxxxx" # lichess OAuth2 Token
6c6
< name: "stockfish.simpleEval" # binary name of the engine to use
---
> name: "engine_name" # binary name of the engine to use
28c28
< Threads: 1 # max CPU threads the engine can use
---
> Threads: 2 # max CPU threads the engine can use
44c44
< abort_time: 60 # time to abort a game in seconds when there is no activity
---
> abort_time: 20 # time to abort a game in seconds when there is no activity
46c46
< move_overhead: 200 # increase if your bot flags games too often
---
> move_overhead: 2000 # increase if your bot flags games too often
49c49
< concurrency: 3 # number of games to play simultaneously
---
> concurrency: 1 # number of games to play simultaneously
51c51
< accept_bot: true # accepts challenges coming from other bots
---
> accept_bot: false # accepts challenges coming from other bots
66d65
< - ultraBullet
70c69
< - classical
---
> # - classical
last things in the logs now:
2020-09-26 14:14:47,591: <PopenProcess at 0xb49512f0 (pid=26287)> >> bestmove d4f3
2020-09-26 14:14:47,619: https://lichess.org:443 "POST /api/bot/game/nL8SXN2r/move/d4f3 HTTP/1.1" 200 11
2020-09-26 14:14:47,627: --- https://lichess.org/nL8SXN2r/black Game over
2020-09-26 14:14:47,629: <PopenProcess at 0xb49512f0 (pid=26287)> << quit
2020-09-26 14:14:47,670: +++ Process Free. Total Queued: 0. Total Used: 0
2020-09-26 14:56:45,348: Accept Blitz rated challenge from BOT oof231(2534)
2020-09-26 14:56:45,352: Resetting dropped connection: lichess.org
2020-09-26 14:56:50,495: https://lichess.org:443 "POST /api/challenge/Y0OJT6Sk/accept HTTP/1.1" 404 None
2020-09-26 14:56:50,497: Giving up api_post(...) after 1 tries (requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://lichess.org/api/challenge/Y0OJT6Sk/accept)
2020-09-26 14:56:50,497: Skip missing Blitz rated challenge from BOT oof231(2534)
and the bot appears offline.
I have similar issues, sometimes the bot just stops responding to anything - without any error in the console. A restart helps, but it would be good to add some internal check.
yes, restart helps. I have a cron job in place that restarts the bot if there has been no ouput for 30min or so.
Good idea, I will probably do something similar.
This problem occurs mostly when lichess is restarting itself, and apparently, the client loses connection with API without any warning + hangs on. Were there any tries to resolve this issue in the latest commits?
Perhaps some features added lately might help resolve this (if anyone is still watching this old issue).
Before the matchmaking feature was added, which lets bots issue challenges instead of waiting for them, lichess-bot passively listened for messages from lichess.org. Maybe, once lichess resets itself, it simply stops sending messages to old connections. Now, with matchmaking sending messages to lichess, this should keep the connection active, or cause an error in communication that would cause lichess-bot to establish a new connection.
Can the reporters here test the latest version of lichess-bot with matchmaking enabled?
Hi, thanks for the update. When I had this issue over a year ago, I ended up with writing my own little program which was handling both matchmaking and resetting lichess-bot (every few hours). I have a plan to update my oldish lichess-bot instance somewhere in the future, but not sure when it will actually happen - I will let you know eventually.
I worked around this issue by checking online status from the api, and if it's false and doesn't throw http exception restarts the service.
With the merge of PR #525, this should be fixed. We would appreciate it if the latest version could be tested by those who have had this problem. Feel free to reopen this issue if problems persist.
...
02:29:54 lichessbot[218390]: [08/13/22 02:29:54] INFO +++ Process Freed. Total Queued: lichess-bot.py:158
02:29:54 lichessbot[218390]: 0. Total Used: 0
04:02:14 lichessbot[218390]: [08/13/22 04:02:14] INFO Will reset connection with lichess-bot.py:276
04:02:14 lichessbot[218390]: lichess
05:02:15 lichessbot[218390]: [08/13/22 05:02:15] INFO Will reset connection with lichess-bot.py:276
05:02:15 lichessbot[218390]: lichess
06:02:15 lichessbot[218390]: [08/13/22 06:02:15] INFO Will reset connection with lichess-bot.py:276
06:02:15 lichessbot[218390]: lichess
07:02:15 lichessbot[218390]: [08/13/22 07:02:15] INFO Will reset connection with lichess-bot.py:276
07:02:15 lichessbot[218390]: lichess
08:02:16 lichessbot[218390]: [08/13/22 08:02:16] INFO Will reset connection with lichess-bot.py:276
08:02:16 lichessbot[218390]: lichess
09:02:16 lichessbot[218390]: [08/13/22 09:02:16] INFO Will reset connection with lichess-bot.py:276
09:02:16 lichessbot[218390]: lichess
10:02:17 lichessbot[218390]: [08/13/22 10:02:17] INFO Will reset connection with lichess-bot.py:276
10:02:17 lichessbot[218390]: lichess
11:02:17 lichessbot[218390]: [08/13/22 11:02:17] INFO Will reset connection with lichess-bot.py:276
11:02:17 lichessbot[218390]: lichess
12:02:18 lichessbot[218390]: [08/13/22 12:02:18] INFO Will reset connection with lichess-bot.py:276
12:02:18 lichessbot[218390]: lichess
13:02:18 lichessbot[218390]: [08/13/22 13:02:18] INFO Will reset connection with lichess-bot.py:276
13:02:18 lichessbot[218390]: lichess
14:02:19 lichessbot[218390]: [08/13/22 14:02:19] INFO Will reset connection with lichess-bot.py:276
14:02:19 lichessbot[218390]: lichess
15:02:19 lichessbot[218390]: [08/13/22 15:02:19] INFO Will reset connection with lichess-bot.py:276
15:02:19 lichessbot[218390]: lichess
16:02:20 lichessbot[218390]: [08/13/22 16:02:20] INFO Will reset connection with lichess-bot.py:276
16:02:20 lichessbot[218390]: lichess
@MarkZH don't see issue reopening button, but seems that li.reset_connection
is bust. DNS and pinging appear to be working fine, local time is in sync with ubuntu servers.
Can you run lichess-bot with verbose mode (python lichess-bot.py -v
) to get more info?
While making #525, I made a version that restarts lichess-bot (https://github.com/AttackingOrDefending/lichess-bot/tree/restart). You can try it out, although I don't know if it fixes the problem.
@theoden8 When lichess was restarting about 25 minutes ago, I opened my bot to test if it works. The check for if the bot was online /api/users/status?ids=attackingordefending
caused a Resetting dropped connection: lichess.org
which reconnected the bot to lichess and was able to receive challenges and play games. MarkZH was correct in saying that matchmaking would cause the bot to reconnect, but #525 made this run for every bot even if matchmaking is disabled. I don't know why this doesn't work for your bot but hopefully verbose mode will help. Can you run it with verbose mode?
I am, but it takes a few days usually for the bot to appear offline. My matchmaking is turned off.
Verbose output:
...
lichessbot: [08/20/22 18:59:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 19:09:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 19:19:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 19:29:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 19:39:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 19:49:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: DEBUG Starting new HTTPS connectionpool.py:1001
lichessbot: connection (1):
lichessbot: lichess.org:443
lichessbot: DEBUG https://lichess.org:443 "GET connectionpool.py:456
lichessbot: /api/users/status?ids=theoden
lichessbot: 8_uci HTTP/1.1" 200 None
lichessbot: [08/20/22 19:49:16] INFO Will reset connection with lichess-bot.py:276
lichessbot: lichess
lichessbot: [08/20/22 19:59:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 20:09:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 20:19:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 20:29:15] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 20:39:16] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: [08/20/22 20:49:16] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
lichessbot: DEBUG Starting new HTTPS connectionpool.py:1001
lichessbot: connection (1):
lichessbot: lichess.org:443
lichessbot: DEBUG https://lichess.org:443 "GET connectionpool.py:456
lichessbot: /api/users/status?ids=theoden
lichessbot: 8_uci HTTP/1.1" 200 None
lichessbot: INFO Will reset connection with lichess-bot.py:276
lichessbot: lichess
lichessbot: [08/20/22 20:59:16] DEBUG Event: {'type': lichess-bot.py:176
lichessbot: 'correspondence_ping'}
...
It's on repeat, with very little variation.
I'm experiencing something similar (not sure if exactly the same), but the bot stops accepting challenges.
From about line 4300 onwards challenges are no longer accepted.
Logs: https://gist.github.com/ianagbip1oti/fade2db9c5e296f1748c49641716192e
@theoden8 Can you test the version that restarts lichess-bot with verbose mode?
@AttackingOrDefending switched to your branch for now Note that lichess-bot disconnected exactly once in the past two weeks so will take some time.
@theoden8 Any updates on whether it is working or not?
It hasn't failed yet, I do wonder if there's a way to tell if it's an improvement or not.
If you see Will restart lichess-bot
followed by
. _/|
. // o\
. || ._) lichess-bot 1.2.0
. //__\
. )___( Play on Lichess with a bot
in the logs, then your bot restarted successfully. Otherwise, your bot just didn't disconnect.
I was able to find the following in the logs:
...
Oct 01 04:26:24 lichessbot[1236937]: DEBUG Resetting dropped connection: connectionpool.py:273
Oct 01 04:26:24 lichessbot[1236937]: lichess.org
Oct 01 04:26:24 lichessbot[1236937]: DEBUG https://lichess.org:443 "GET connectionpool.py:456
Oct 01 04:26:24 lichessbot[1236937]: /api/users/status?ids=theoden
Oct 01 04:26:24 lichessbot[1236937]: 8_uci HTTP/1.1" 200 None
Oct 01 04:26:24 lichessbot[1236937]: INFO Will restart lichess-bot lichess-bot.py:315
Oct 01 04:26:24 lichessbot[1236937]: INFO Terminated lichess-bot.py:322
Oct 01 04:26:35 lichessbot[1236937]: [10/01/22 04:26:35] INFO lichess-bot.py:1104
Oct 01 04:26:35 lichessbot[1236937]: . _/|
Oct 01 04:26:35 lichessbot[1236937]: . // o\
Oct 01 04:26:35 lichessbot[1236937]: . || ._) lichess-bot
Oct 01 04:26:35 lichessbot[1236937]: 1.2.0
Oct 01 04:26:35 lichessbot[1236937]: . //__\
Oct 01 04:26:35 lichessbot[1236937]: . )___( Play on Lichess
Oct 01 04:26:35 lichessbot[1236937]: with a bot
Oct 01 04:26:35 lichessbot[1236937]:
Oct 01 04:26:35 lichessbot[1236937]: DEBUG Starting new HTTPS connectionpool.py:1001
Oct 01 04:26:35 lichessbot[1236937]: connection (1):
Oct 01 04:26:35 lichessbot[1236937]: lichess.org:443
Oct 01 04:26:35 lichessbot[1236937]: DEBUG https://lichess.org:443 "GET connectionpool.py:456
Oct 01 04:26:35 lichessbot[1236937]: /api/account HTTP/1.1" 200
Oct 01 04:26:35 lichessbot[1236937]: None
Oct 01 04:26:35 lichessbot[1236937]: INFO Welcome theoden8_uci! lichess-bot.py:1111
Oct 01 04:26:35 lichessbot[1236937]: INFO You're now connected to lichess-bot.py:125
Oct 01 04:26:35 lichessbot[1236937]: https://lichess.org/ and
Oct 01 04:26:35 lichessbot[1236937]: awaiting challenges.
Oct 01 04:26:35 lichessbot[3647389]: DEBUG Starting new HTTPS connectionpool.py:1001
Oct 01 04:26:35 lichessbot[3647389]: connection (1):
Oct 01 04:26:35 lichessbot[3647389]: lichess.org:443
Oct 01 04:26:35 lichessbot[1236937]: DEBUG https://lichess.org:443 "GET connectionpool.py:456
Oct 01 04:26:35 lichessbot[1236937]: /api/account/playing
Oct 01 04:26:35 lichessbot[1236937]: HTTP/1.1" 200 17
Oct 01 04:26:35 lichessbot[3647389]: DEBUG https://lichess.org:443 "GET connectionpool.py:456
Oct 01 04:26:35 lichessbot[3647389]: /api/stream/event HTTP/1.1"
Oct 01 04:26:35 lichessbot[3647389]: 200 None
...
Which suggests that connection was successfully re-established. Should I provide any additional information?
If that was the only occurrence, then I don't think that we need anything else.
As far as I can read the logs, it appears so.
I assume that this behavior occurs when Lichess deploys changes... but I don't know how this client can automatically reconnect.
The idea of this change is that if the bot is offline according to lichess, we restart lichess-bot. I am not 100% sure it works, but it is really hard to test a change if the problem happens once every 2 weeks.
The bot did restart itself successfully, as per the most recent logs. It used to happen much more often before, every 2-3 days or so. This is what I had previously: https://gist.github.com/theoden8/2ff9dbe3957b105cedf421d5d960cd34
Awesome! I'm predicting a lila-ws restart sometime tomorrow, but I fully trust that your code works and would like to install it on my machine too...
@AttackingOrDefending I changed the way I filter logs, and discovered a drop on 20th, 24th, 27th, 1st, 8th, and 15th. All restarted after a dropped connection similar to log above.
That's good. So the change is working.
The changes from @AttackingOrDefending (PR #584) have been merged, so I'm closing this issue as successful.