Memory not released by hq-server after running hq job forget
Hi HyperQueue team,
First off, thank you for the excellent work on HyperQueue. It has been working great for me -- especially when used with a compatible version of Nextflow (hq 0.17.0, nextflow 24.10.2). HyperQueue has really improved our workflow pipeline by efficiently managing many small tasks, which is important for us since our cluster penalizes large numbers of small jobs submitted individually.
I’ve encountered a potential issue regarding memory usage on the hq server. Over time, as more jobs are submitted, the server’s memory usage keeps increasing. For example, our current hq-server process is using over 4.6 GB (RES column in htop) of RAM on the login node, which accounts for nearly 30% of the node’s total memory. The newest version, HyperQueue 0.22.0, also follows this pattern, although memory usage grows much more slowly.
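For anyone reproducing the measurement, the RES figure reported by htop can also be read straight from procfs. A small sketch (Linux-only; `/proc/self` is used as a stand-in for the hq-server PID, and the `pgrep` lookup in the comment is an assumption about your setup):

```shell
# Print the resident set size (htop's RES column) from procfs.
# Replace "self" with the hq-server PID, e.g. "$(pgrep -o hq-server)".
grep VmRSS /proc/self/status
```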
I tried running:
hq job forget --filter finished,failed,canceled all
This appears to help temporarily -- newly submitted jobs don’t cause a further increase in memory usage for a while. However, the memory already used by the hq-server process is not released after the forget operation. My current workaround is to use hq 0.22.0 and repeatedly run hq job forget, e.g. watch -n 60 'hq job forget --filter finished,failed,canceled all'.
I’m wondering if this is expected behavior or a potential area for improvement. Conceptually, if job metadata is stored in a structure like Vec&lt;JobInfo&gt;, maybe the vector is being cleared without calling shrink_to_fit() or similar? That could explain why memory usage doesn’t drop even after forgotten jobs are removed. Or is this a behavior of the Rust/system allocator, which won't return memory to the system?
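To illustrate the hypothesis, here is a quick Rust sketch (not HQ's actual data structures; the `jobs` vector is a hypothetical stand-in for the server's job table) showing that clearing a Vec keeps its heap allocation until shrink_to_fit is called:

```rust
fn main() {
    // Hypothetical stand-in for the server's per-job metadata.
    let mut jobs: Vec<u64> = (0..1_000_000).collect();

    // "Forgetting" jobs removes the elements...
    jobs.clear();
    // ...but the Vec keeps its capacity, so the heap block is not freed.
    assert!(jobs.capacity() >= 1_000_000);

    // Explicitly dropping the unused capacity returns the block to the
    // allocator (which may or may not hand it back to the OS).
    jobs.shrink_to_fit();
    println!("capacity after shrink_to_fit: {}", jobs.capacity()); // prints 0
}
```

Even after shrink_to_fit, whether the pages actually leave the process's RES depends on the allocator's release policy, which matches the two possibilities raised above.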
Would it be feasible to add an option to explicitly release unused memory after forgotten jobs are removed? I understand memory management can be complex and sometimes OS-level caching is involved, but any insight would be appreciated.
Thanks again for all your hard work on this project!
Best, Bing
Hi, thanks for the issue report. HQ uses the jemalloc allocator by default (unless you recompile HQ yourself without it), which can indeed be a bit reluctant to return memory to the OS.
That being said, we weren't calling shrink_to_fit before, and it sounds like a good idea. I've implemented it in https://github.com/It4innovations/hyperqueue/pull/865.
Thank you for your reply. I believe the additional changes and suggested recompilation make sense. I will try the allocator suggestion and see how it works.
Btw: How many workers were connected in your case? We offer forgetting jobs, but not workers. The per-worker data are not big, but they may still grow.
@Kobzol Maybe we should provide some command like "forget everything that is not running/waiting" that would also forget the workers.
@bguo068 Could you please check if this issue still persists with the latest nightly release? Thank you!
Hi @Kobzol, thank you for the updates. In this nightly release, I've noticed similar behaviors to the last stable version. I still need to test a version compiled without jemalloc. Do you have any examples of how to replace jemalloc with alternative allocators? I can recompile it and test.
cargo build --release --no-default-features
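With default features disabled, the binary falls back to Rust's default global allocator (the system one). For reference, a standalone Rust program can also pin this choice explicitly; the sketch below is a minimal illustration of the `#[global_allocator]` mechanism, not HQ's actual source:

```rust
use std::alloc::System;

// Route every heap allocation in this binary through the OS allocator
// (malloc/free) instead of a bundled allocator such as jemalloc.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    // This Vec is now allocated via the system allocator.
    let v: Vec<u32> = (0..10).collect();
    println!("sum = {}", v.iter().sum::<u32>()); // prints "sum = 45"
}
```

Third-party allocators (e.g. the jemalloc or mimalloc crates) are swapped in the same way, by pointing `#[global_allocator]` at their allocator type.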
Could you please provide us with the submit commands that you use, or at least some statistics, e.g. how many jobs/tasks do you submit, if you attach stdin to submits, if you use task array with some input data?
Hi @Kobzol, for an experiment, I'm using the following to submit jobs:
while true; do ./hq submit echo 1 --stream /dev/null; done
After submitting 36,422 jobs, the memory usage reached 70,628 KB (approximately 70 MB). If we keep the hq server running and continue submitting a large number of jobs over time, the memory usage will increase further.
I'm wondering if it would be better to cache the finished, waiting, canceled, and failed jobs to disk, while only keeping the running ones in memory.
Hmm, I guess it's kind of expected that if you submit a large number of jobs, the memory will keep increasing :) I was more wondering whether the memory usage gets reduced after you hq job forget a large number of already finished jobs.
Btw, HQ is designed more for a smaller number of jobs with a large number of tasks per job, if you want to achieve maximum efficiency.
With the nightly release or recompilation of the main branch, the "hq job forget" command still doesn't return much memory to the system but effectively prevents memory from increasing until new jobs occupy the unreleased memory.
If HQ isn't designed for a larger number of jobs, that's okay. I may continue running "hq job forget" to prevent memory accumulation. I will also look into tasks.
Thanks!
HQ is optimized for millions of tasks while having relatively few jobs. A job (in HQ terminology) is a kind of user "namespace" for tasks. So it will have a smaller memory footprint when you create a large number of tasks rather than jobs. Nevertheless, even with tasks, you still need to call hq job forget to release the memory.
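The jobs-vs-tasks distinction can be sketched with a task array, which packs many tasks into a single job. The `--array` flag and the `HQ_TASK_ID` environment variable follow HQ's documented interface, but treat the exact invocation below as a sketch rather than a tested command:

```shell
# One job containing 10,000 tasks, instead of 10,000 single-task jobs
# submitted in a loop. Each task sees its own index in $HQ_TASK_ID.
hq submit --array 1-10000 -- bash -c 'echo "task $HQ_TASK_ID"'
```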
@Kobzol Maybe we could implement something like an "auto-forget" mechanism, so that a finished job is automatically forgotten after some specified time.
btw: There are some jemalloc tuning options that may force jemalloc to return memory to the OS more actively: https://github.com/jemalloc/jemalloc/blob/dev/TUNING.md
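As a config sketch of those tuning options: jemalloc's decay times control how quickly unused pages are handed back to the OS, and can be set through an environment variable. Note that the variable name is an assumption here; plain jemalloc reads MALLOC_CONF, but a Rust binary bundling jemalloc may use a prefixed name such as _RJEM_MALLOC_CONF depending on how it was built:

```shell
# Ask jemalloc to purge dirty/muzzy pages immediately (values are in ms;
# the defaults keep freed pages cached for several seconds).
export MALLOC_CONF="dirty_decay_ms:0,muzzy_decay_ms:0"
hq server start
```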