Stopped containers with no background job can exhaust the queue
This is the second time we've had this problem. Users report it as "my scrapers are stuck running".
What's actually happening is that the queue is exhausted, so runs are just retrying over and over. The queue is exhausted because there's a whole load of stopped containers with no corresponding background job. These still count as occupied slots, so even though the server isn't really running 20 jobs, it's still not taking on more scraper runs.
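To make the slot accounting concrete, here is a minimal Python sketch of the situation described above. It is not morph's actual code: `Container`, `orphaned_containers`, `free_slots`, and `MAX_SLOTS` are hypothetical names, and the limit of 20 is only taken from the "20 jobs" figure mentioned here. It assumes every container, stopped or running, takes up a slot unless orphans are explicitly ignored.

```python
from dataclasses import dataclass

# Assumption: the slot limit matches the "20 jobs" figure in this issue.
MAX_SLOTS = 20


@dataclass(frozen=True)
class Container:
    """Hypothetical stand-in for a Docker container record."""
    id: str
    running: bool


def orphaned_containers(containers, job_container_ids):
    """Return stopped containers that no background job refers to."""
    return [
        c for c in containers
        if not c.running and c.id not in job_container_ids
    ]


def free_slots(containers, job_container_ids, max_slots=MAX_SLOTS,
               ignore_orphans=False):
    """Count free slots, optionally leaving orphaned containers out."""
    counted = list(containers)
    if ignore_orphans:
        orphan_ids = {
            c.id for c in orphaned_containers(containers, job_container_ids)
        }
        counted = [c for c in counted if c.id not in orphan_ids]
    return max(0, max_slots - len(counted))


if __name__ == "__main__":
    # 18 stopped containers with no job, plus 2 genuinely running scrapers.
    containers = [Container(f"stopped-{i}", running=False) for i in range(18)]
    containers += [Container("run-a", running=True),
                   Container("run-b", running=True)]
    jobs = {"run-a", "run-b"}

    print(free_slots(containers, jobs))                       # orphans counted
    print(free_slots(containers, jobs, ignore_orphans=True))  # orphans ignored
```

With 18 orphaned containers and 2 real runs, the first call reports no free slots even though only 2 scrapers are doing work, which is the "stuck running" symptom users see.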
#1093 and #1092 were raised last time I was fixing this problem.
We're not really seeing this anymore because the regular docker prune -af is destroying all those containers. I'm going to close this, reopen if you think otherwise :)
We've switched off those prunes and are seeing a buildup of stopped containers with no jobs again, so I'm reopening this.
I'm seeing this issue arise when disk space is very low. Disk space has been pretty good (I've been managing it) for the last few days and containers don't seem to be falling off the queue.
This could also be related to https://github.com/openaustralia/morph/issues/1056#issuecomment-286611098 being fixed recently.