Screencasts with multiple workers eventually fail
I noticed that screencasts eventually fail when running with this configuration, which archives a set of page URLs using four workers with screencasting enabled on port 9037:
docker run -p 9037:9037 -it --rm -v $PWD:/crawls/ webrecorder/browsertrix-crawler:latest crawl --config /crawls/crawl.yaml
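For reference, the crawl.yaml here is essentially just the seed list plus the worker and screencast settings; a minimal sketch might look like this (the seed URLs are placeholders, and the option names are assumed to match the crawler's CLI flags):

seeds:
  - https://example.com/page-1
  - https://example.com/page-2
workers: 4
screencastPort: 9037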
Things start out well, with four active screencasts. But over time the screencasts start to disappear from the page, until only one is left, and it appears stuck. I can see from the console that the four workers are still actively crawling. When I reload the screencast page at http://localhost:9037, the terminal shows the workers stopping and restarting their screencasts, but the page itself doesn't reflect any change:
{"logLevel":"info","timestamp":"2023-08-11T13:46:43.985Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":3}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:43.985Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":0}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:43.986Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":2}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:44.078Z","context":"screencast","message":"Started Screencast","details":{"workerid":1}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:44.078Z","context":"screencast","message":"Started Screencast","details":{"workerid":3}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:44.079Z","context":"screencast","message":"Started Screencast","details":{"workerid":0}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:44.079Z","context":"screencast","message":"Started Screencast","details":{"workerid":2}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:58.940Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":1}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:58.941Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":3}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:58.941Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":0}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:58.941Z","context":"screencast","message":"Stopping Screencast","details":{"workerid":2}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:59.020Z","context":"screencast","message":"Started Screencast","details":{"workerid":1}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:59.021Z","context":"screencast","message":"Started Screencast","details":{"workerid":3}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:59.021Z","context":"screencast","message":"Started Screencast","details":{"workerid":0}}
{"logLevel":"info","timestamp":"2023-08-11T13:46:59.022Z","context":"screencast","message":"Started Screencast","details":{"workerid":2}
@edsu to confirm, you're still seeing other messages indicating that the crawler is running, just no screencasts here? E.g. the console shows the crawl is progressing? I wonder if it's a memory issue -- some of those pages seem to be fairly CPU/memory intensive. By default here (unlike in Browsertrix Cloud), there are no memory constraints. Have you tried running with fewer workers?
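If it helps to experiment, something like the following should cap the container's memory and drop to two workers (this assumes Docker's --memory flag and that CLI options override values from the YAML config):

docker run -p 9037:9037 -it --rm --memory=4g -v $PWD:/crawls/ webrecorder/browsertrix-crawler:latest crawl --config /crawls/crawl.yaml --workers 2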
Fixed a few issues in 0.11.1 that could have caused this, including the screencast-close call never returning, page crashes, and browser crashes. Hopefully it won't get stuck anymore; if you have a chance to retry and it happens again, let us know here.
The screencasts are much more reliable now, thanks!
I spoke too soon. After a few hours they all disappeared :-( I can share the log if it's helpful?
This is with 0.11.1? Yes, that would be helpful! I assume reloading the page didn't help, right?
Yes, I did a docker pull browsertrix-crawler:latest today. Here's the log!
crawl-20230919145843766.log.gz
You can see near the end of the log that I tried to reload the page, which seemed to trigger messages like:
...
{"timestamp":"2023-09-19T18:31:57.155Z","logLevel":"info","context":"screencast","message":"Stopping Screencast","details":{"workerid":2}}
{"timestamp":"2023-09-19T18:31:57.155Z","logLevel":"info","context":"screencast","message":"Stopping Screencast","details":{"workerid":4}}
{"timestamp":"2023-09-19T18:31:57.155Z","logLevel":"info","context":"screencast","message":"Stopping Screencast","details":{"workerid":1}}
{"timestamp":"2023-09-19T18:31:57.155Z","logLevel":"info","context":"screencast","message":"Stopping Screencast","details":{"workerid":5}}
{"timestamp":"2023-09-19T18:31:57.155Z","logLevel":"info","context":"screencast","message":"Stopping Screencast","details":{"workerid":0}}
{"timestamp":"2023-09-19T18:31:57.233Z","logLevel":"info","context":"screencast","message":"Started Screencast","details":{"workerid":3}}
{"timestamp":"2023-09-19T18:31:57.233Z","logLevel":"info","context":"screencast","message":"Started Screencast","details":{"workerid":2}}
{"timestamp":"2023-09-19T18:31:57.233Z","logLevel":"info","context":"screencast","message":"Started Screencast","details":{"workerid":4}}
{"timestamp":"2023-09-19T18:31:57.233Z","logLevel":"info","context":"screencast","message":"Started Screencast","details":{"workerid":1}}
{"timestamp":"2023-09-19T18:31:57.234Z","logLevel":"info","context":"screencast","message":"Started Screencast","details":{"workerid":5}}
{"timestamp":"2023-09-19T18:31:57.234Z","logLevel":"info","context":"screencast","message":"Started Screencast","details":{"workerid":0}}
...
It ran for at least an hour without a problem, which was an improvement on the prior behavior. I noticed that CPU usage tapered off in htop, but I'm not sure what caused that.
This should be fixed in the 1.x releases; we haven't seen this issue in a while.
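For anyone still seeing this on an older image, pulling a current image and re-running the same crawl should pick up the fix, e.g.:

docker pull webrecorder/browsertrix-crawler:latest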