Fix memory issues on batch workers
- move batch report generation to the slow worker so it can use a little more memory and runs with less concurrency
- fix large memory leak in web_conn task
- move tasks with minor memory leaks to the nassl worker
- add dashboard for batch monitoring
- disable swap on all containers
resolves #1420
Thanks! Could you (or @mxsasha) test what the effect of this change is on memory usage and report generation duration, especially with large batches of domains? Thanks.
I don't have a test setup ready for this with a large number of domains, so I don't have a baseline to compare this against. I can deploy this to the dev instance so we can test it there?
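One way to get a rough baseline without a full test setup might be to wrap the report generation call and record its wall-clock time and peak Python-level allocations. The sketch below is only an illustration: `measure` and `fake_report` are hypothetical names, not code from this repository. `tracemalloc` only sees allocations made through Python's allocator, so memory used by native libraries, or by the container as a whole, will be higher.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func and return (result, seconds, peak_bytes).

    peak_bytes covers Python-level allocations only, as seen by tracemalloc.
    """
    tracemalloc.start()
    started = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        seconds = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raises.
        tracemalloc.stop()
    return result, seconds, peak_bytes


def fake_report(domain_count):
    # Hypothetical stand-in for the real report generation task.
    return [f"domain-{i}.example" for i in range(domain_count)]


if __name__ == "__main__":
    report, seconds, peak_bytes = measure(fake_report, 100_000)
    mib = peak_bytes / 1024 / 1024
    print(f"{len(report)} domains, {seconds:.3f}s, peak {mib:.1f} MiB")
```

Running the same wrapper before and after this change, with the same batch size, would give comparable numbers for duration and Python heap usage. Container-level memory would still need to be read from the monitoring dashboard.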
Note that, as documented in https://github.com/internetstandards/Internet.nl/issues/1395#issuecomment-2100317495, 3G (which it looks like you're setting in defaults.env) was nowhere near enough to generate a report. Also, misleadingly, concurrency appeared to affect CPU load but not total memory load.
Can we more clearly document what containers, workers and queues there are and what their respective default resource settings are (memory, concurrency, periodic restarts, etc.)?
For example: the Makefile also contains various concurrency settings, including for batch_slow: https://github.com/internetstandards/Internet.nl/blob/ef097b03914cd7566861f2a525be32a8c506903b/Makefile#L156
How do these relate to the concurrency settings in default.env, like WORKER_SLOW_CONCURRENCY=2?
@baknu they don't relate at all. Those are leftovers from the development environment before Docker and only apply when running in development without Docker.
@mxsasha shall I make a PR to remove outdated Makefile commands and clean it up in general or do you still use those commands in your workflow?
I'll add a table to the documentation with the relevant settings per container. Maybe it would be good to extend it with all the options you can/might override in local.env?
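To keep such a table in sync with the env file, it could be generated instead of written by hand. This is a minimal sketch under assumptions: `parse_env`, `settings_table` and the `SAMPLE` text are hypothetical, the parser only handles plain `KEY=VALUE` lines, and only `WORKER_SLOW_CONCURRENCY=2` comes from this thread.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def settings_table(settings, prefix="WORKER_"):
    """Render settings whose key starts with prefix as a Markdown table."""
    rows = ["| Setting | Default |", "| --- | --- |"]
    for key in sorted(settings):
        if key.startswith(prefix):
            rows.append(f"| `{key}` | `{settings[key]}` |")
    return "\n".join(rows)


# Sample input: the first setting is mentioned in this thread,
# the second is made up to show that the prefix filter skips it.
SAMPLE = """
# worker settings
WORKER_SLOW_CONCURRENCY=2
EXAMPLE_UNRELATED_SETTING=foo
"""

if __name__ == "__main__":
    print(settings_table(parse_env(SAMPLE)))
```

Filtering by prefix keeps the table limited to worker settings. Passing `prefix=""` would list every option in the file, which is closer to the "all options you can override" variant.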
> @mxsasha shall I make a PR to remove outdated Makefile commands and clean it up in general or do you still use those commands in your workflow?
I do not use them, so please do clean out the cruft. We can always dig it back up in git history.