
Job using merged read pairs and unmerged read pairs gets killed

Open desmodus1984 opened this issue 1 year ago • 5 comments

Please report

  • [ ] version of ABySS with abyss-pe version: abyss-pe (ABySS) 2.3.5
  • [ ] distribution of Linux with lsb_release -d: Red Hat Enterprise Linux Server release 7.9 (Maipo)

Assembly error

  • [ ] complete abyss-pe command line:
    abyss-pe k=96 B=600G name=PESU lib='pesu-a'
    pesu-a='/fs/scratch/PHS0338/PESU/PESU.nmrg.fwf.fq /fs/scratch/PHS0338/PESU/PESU.nmrg.rev.fq'
    se='/fs/scratch/PHS0338/PESU/PESU-merged.fq'
  • [ ] last 20 lines of the output of abyss-pe:
    abyss-stack-size 65536 abyss-bloom-dbg -k96 -q3 -b600G -j48 /fs/scratch/PHS0338/PESU/PESU.nmrg.fwf.fq /fs/scratch/PHS0338/PESU/PESU.nmrg.rev.fq /fs/scratch/PHS0338/PESU/PESU-merged.fq > PESU-1.fa
    Running with max stack size of 65536 KB: abyss-bloom-dbg -k96 -q3 -b600G -j48 /fs/scratch/PHS0338/PESU/PESU.nmrg.fwf.fq /fs/scratch/PHS0338/PESU/PESU.nmrg.rev.fq /fs/scratch/PHS0338/PESU/PESU-merged.fq
    bash: line 1: 124468 Killed abyss-stack-size 65536 abyss-bloom-dbg -k96 -q3 -b600G -j48 /fs/scratch/PHS0338/PESU/PESU.nmrg.fwf.fq /fs/scratch/PHS0338/PESU/PESU.nmrg.rev.fq /fs/scratch/PHS0338/PESU/PESU-merged.fq > PESU-1.fa
    make: *** [/users/PHS0338/jpac1984/.conda/envs/abyss/bin/abyss-pe.Makefile:555: PESU-1.fa] Error 137
    make: *** Deleting file 'PESU-1.fa'
    slurmstepd: error: Detected 1 oom-kill event(s) in StepId=12556334.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
  • [ ] number of sequenced bases: 126895447800
  • [ ] estimated genome size and ploidy: Diploid - 2.5GB
  • [ ] estimated sequencing depth of coverage: ~55X
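
As a rough cross-check of the figures above, depth of coverage can be estimated as sequenced bases divided by genome size. The sketch below is only illustrative: the function name `coverage` is hypothetical, and it assumes the "2.5GB" genome size means 2.5 billion bases.

```shell
# Depth of coverage = sequenced bases / genome size (in bases).
# The function name is hypothetical; both inputs come from the report above.
coverage() {
    # $1: number of sequenced bases, $2: genome size in bases
    awk -v bases="$1" -v genome="$2" 'BEGIN { printf "%.1f\n", bases / genome }'
}

# Assumes "2.5GB" means 2,500,000,000 bases.
coverage 126895447800 2500000000   # prints 50.8
```

Under that assumption the result is about 51X, in the same range as the ~55X quoted above.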

Build error

Consider installing ABySS using Homebrew on either Linux or macOS with brew install abyss, or using Bioconda with conda install abyss.

  • [ ] Have you tried installing ABySS using Brew or Bioconda? Yes, with conda
  • [ ] version of GCC or compiler with gcc --version: gcc (GCC) 8.4.0
  • [ ] complete ./configure command line
  • [ ] last 20 lines of the output of ./configure
  • [ ] last 20 lines of the output of make

Hi, I am trying to assemble my genome (bat) using short reads. I have trimmed the adapter sequences. I also heard that it is beneficial to merge overlapping read pairs to get "better" information, so I am interested in using them as single-end reads. I have tried ABySS before and it didn't use so much memory, but now I am assigning as much as 600 GB of RAM and the job still gets killed. Any suggestion on how much memory I would need to assign to the job?

Thanks.

desmodus1984 avatar Aug 29 '22 19:08 desmodus1984

It looks like the operating system terminated your job because it ran out of memory. How much RAM does the machine you're running on have? If it's less than 600G, that's the culprit, and lowering the B parameter to below your machine's RAM limit should fix it. If you have, say, 500G of RAM, set B a bit lower than that (e.g. 450G), since ABySS needs some extra memory beyond what the Bloom filters use (as set by the B parameter). For your genome size and coverage, though, you likely don't need that much memory. Something like B=100G should be more than sufficient, provided your machine has that much RAM.
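
One way to check this on a Linux node is to read the total RAM and compare it with the intended B value. The following is a minimal sketch, not part of ABySS: the helper names `mem_total_gib` and `fits_in_ram` are hypothetical, and it treats the "G" in B as roughly a GiB for the comparison. Under Slurm, the limit that matters may be the job's memory allocation rather than the node total, since the log above mentions the cgroup out-of-memory handler.

```shell
# Hypothetical helper: total RAM in whole GiB, read from a meminfo-style file.
mem_total_gib() {
    # $1: path to a meminfo file (defaults to /proc/meminfo on Linux)
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Hypothetical check: succeeds if a Bloom filter budget of $1 is strictly
# below $2 of available RAM (both in the same unit).
fits_in_ram() {
    [ "$1" -lt "$2" ]
}

# Example: would B=600G fit on this machine?
if [ -r /proc/meminfo ]; then
    total=$(mem_total_gib)
    if fits_in_ram 600 "$total"; then
        echo "B=600G is below the ${total} GiB of RAM on this node"
    else
        echo "B=600G does not fit in ${total} GiB of RAM; lower B"
    fi
fi
```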

vlad0x00 avatar Aug 29 '22 20:08 vlad0x00

OK. Then would it be okay to request a node with 150 GB and set the B parameter to 140 GB? Thanks.

desmodus1984 avatar Aug 29 '22 20:08 desmodus1984

Also, I wanted to try to parameterize a run using the paired de Bruijn graph mode. I sent my sample for sequencing and got the following QC info for the library. The distribution of the insert sizes is very irregular:

[image: insert size distribution from the library QC]

As you can see from the QC, the peak is at 536 bp and the mean is 604 bp, but there are also plenty of fragments of ~1000 bp. The actual insert size is the reported value minus the ~120 bp of adapter sequence, which gives a peak at about 416 bp and a mean of about 484 bp. I used kmergenie to estimate the optimal k-mer size, and it is 101.

Thanks.

desmodus1984 avatar Aug 29 '22 20:08 desmodus1984

Hi @desmodus1984,

For the memory request, I'd give yourself a bit more wiggle room. It's hard to say exactly how much memory the process will use on top of the Bloom filter for your data, but to be safe I'd request at least 25GB more than your Bloom filter size (e.g. B=125G with a 150GB request).
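
The headroom rule above can be sketched as a small shell helper. This is only an illustration: `bloom_budget` is a hypothetical name, the 25 GB default is the figure suggested in this comment, and the abyss-pe command is printed rather than run. With Slurm, the 150 GB request would typically be made with something like `#SBATCH --mem=150G`, though the details depend on the cluster.

```shell
# Hypothetical helper: derive the Bloom filter budget (B) from the job's
# memory request by subtracting a fixed headroom for the rest of the process.
bloom_budget() {
    # $1: memory requested for the job, in GB
    # $2: headroom in GB (defaults to 25, the figure suggested above)
    request_gb=$1
    headroom_gb=${2:-25}
    if [ "$request_gb" -le "$headroom_gb" ]; then
        echo "memory request must exceed the headroom" >&2
        return 1
    fi
    echo $((request_gb - headroom_gb))
}

B="$(bloom_budget 150)G"   # B=125G for a 150 GB request

# Print the command from the original report with the adjusted B value.
echo "abyss-pe k=96 B=$B name=PESU lib='pesu-a' pesu-a='/fs/scratch/PHS0338/PESU/PESU.nmrg.fwf.fq /fs/scratch/PHS0338/PESU/PESU.nmrg.rev.fq' se='/fs/scratch/PHS0338/PESU/PESU-merged.fq'"
```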

Also, just so you know, we recommend running ABySS in the Bloom filter de Bruijn graph mode (which you seem to be using, given your questions about parameters) over the paired de Bruijn graph mode.

lcoombe avatar Aug 31 '22 16:08 lcoombe

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your interest in ABySS!

github-actions[bot] avatar Sep 22 '22 02:09 github-actions[bot]