FCS-GX bypasses memory limits set in the bsub command on the LSF platform

Open eeaunin opened this issue 1 month ago • 1 comments

Hello. This issue is a follow-up for issue #69. When running FCS-GX, the LSF logs always underreport how much memory FCS-GX uses. Below is an example of an LSF log from an FCS-GX run with a tiny FASTA file. The run completed successfully.

Successfully completed.

Resource usage summary:

    CPU time :                                   510.08 sec.
    Max Memory :                                 49 MB
    Average Memory :                             42.99 MB
    Total Requested Memory :                     512000.00 MB
    Delta Memory :                               511951.00 MB
    Max Swap :                                   -
    Max Processes :                              16
    Max Threads :                                33
    Run time :                                   518 sec.
    Turnaround time :                            520 sec.

The output (if any) is above this job summary.
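
As a quick sanity check on the numbers in this summary, the reported Delta Memory appears to be the requested memory minus the reported peak (512000 MB - 49 MB = 511951 MB). The snippet below is only a minimal sketch for pulling those three fields out of a log like the one above; the function name `parse_lsf_summary` is hypothetical and the regular expression assumes the exact layout shown here.

```python
import re

def parse_lsf_summary(log_text):
    """Extract selected memory fields (in MB) from an LSF resource usage summary.

    Assumes lines of the form "    Max Memory :    49 MB" as in the log above.
    """
    pattern = re.compile(
        r"\s*(Max Memory|Total Requested Memory|Delta Memory)\s*:\s*([\d.]+)\s*MB"
    )
    values = {}
    for line in log_text.splitlines():
        match = pattern.match(line)
        if match:
            values[match.group(1)] = float(match.group(2))
    return values
```

If the relationship holds, `values["Total Requested Memory"] - values["Max Memory"]` equals `values["Delta Memory"]`, so the delta is only as trustworthy as the reported peak.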

The log says the maximum memory use was 49 MB, but I think this is not true and FCS-GX really uses at least 470 GB of memory for each run. When submitting LSF jobs, a maximum memory limit is set in the bsub command, e.g. bsub -n1 -R"span[hosts=1]" -M5000 -R 'select[mem>5000] rusage[mem=5000]', and normally LSF terminates jobs that go over this limit and emits the message TERM_MEMLIMIT: job killed after reaching LSF memory usage limit. However, this doesn't work properly with FCS-GX. FCS-GX ignores the memory limit set by the user in the bsub command. It uses ~470 GB of the compute node's memory anyway, regardless of whether the LSF job's memory limit permits it. When this causes the compute node to run out of memory, the Linux kernel has to kill the FCS-GX process. I don't know of any other software that behaves like this on the LSF platform. Do you know what causes this and whether there is a way to fix it?
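
One way to see where the memory of a running process is going, independently of the LSF report, is to read /proc/<pid>/status on the compute node. The snippet below is a minimal sketch, not a diagnosis: it does not explain why LSF underreports, it only splits resident memory into anonymous, file-backed and shared parts using the VmRSS, RssAnon, RssFile and RssShmem fields (available on Linux 4.5 and later). The function names are hypothetical.

```python
import re

def parse_proc_status(text):
    """Return selected memory fields (in kB) from the text of /proc/<pid>/status."""
    pattern = re.compile(r"^(VmRSS|RssAnon|RssFile|RssShmem):\s+(\d+)\s+kB")
    fields = {}
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields

def read_memory_breakdown(pid):
    """Read and parse /proc/<pid>/status for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())
```

Calling `read_memory_breakdown(pid)` with the PID of the FCS-GX process while it is running would show whether most of its resident memory is anonymous, file-backed or shared, which may help narrow down why the scheduler's figure is so low.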

These are the software versions used in my recent runs:

    OS :                                         Ubuntu 22.04.4 LTS
    Singularity :                                v3.11.4
    FCS image :                                  0.5.0
    Python :                                     3.8.12
    Platform :                                   LSF

eeaunin, May 09 '24 04:05