
HiFiasm out of memory

Open charlottellcc opened this issue 2 years ago • 10 comments

Hello,

I have a memory problem when trying to assemble my heterozygous plant genome. I managed to get the haplotypes with 220G of memory and 32 CPUs, but I can't produce the primary/alternate assembly. I have 115 Gb of corrected data (< Q20) as input. The data quality looks fine, and so does the k-mer distribution:

[image: k-mer distribution plot]

I tried the -f39 option to save memory but it didn't work. I did not find a solution in the issues that were listed in the FAQ.

Thanks for your help,

Ch

charlottellcc avatar Jan 27 '22 15:01 charlottellcc
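For readers following along, here is a minimal sketch of how a run like the one described above could be launched while keeping the log for a report like this one. It is not the command Charlotte used: the file names `reads.fq.gz`, `asm`, and `hifiasm.log` are hypothetical, and the helper `build_hifiasm_command` is made up for illustration. It only uses hifiasm options that appear in the hifiasm README (`-o`, `-t`, `-f`, `--primary`), and it assumes hifiasm writes its log to stderr, as the README examples suggest.

```python
import subprocess
from typing import List, Optional


def build_hifiasm_command(
    reads: str,
    prefix: str,
    threads: int,
    bloom_bits: Optional[int] = None,
    primary: bool = False,
) -> List[str]:
    """Assemble a hifiasm argument list (hypothetical helper, not part of hifiasm)."""
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads)]
    if bloom_bits is not None:
        # e.g. -f39, the Bloom filter setting tried in this thread
        cmd.append(f"-f{bloom_bits}")
    if primary:
        # ask for primary/alternate assemblies
        cmd.append("--primary")
    cmd.append(reads)
    return cmd


def run_with_log(cmd: List[str], log_path: str) -> int:
    """Run the command and save stderr to a file so the log can be shared later."""
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stderr=log).returncode


if __name__ == "__main__":
    # Hypothetical file names; adjust to your own data.
    command = build_hifiasm_command(
        "reads.fq.gz", "asm", threads=32, bloom_bits=39, primary=True
    )
    print(" ".join(command))
    # run_with_log(command, "hifiasm.log")  # uncomment on a machine with hifiasm installed
```

Saving stderr this way means the log is still available if the job is killed for exceeding memory.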

Could you please show the log of hifiasm?

chhylp123 avatar Jan 27 '22 15:01 chhylp123

Sorry I cannot find it...

chhylp123 avatar Jan 27 '22 15:01 chhylp123

Thanks a lot. I guess it is just caused by too much data, and hifiasm should work with a little bit more RAM.

chhylp123 avatar Jan 27 '22 16:01 chhylp123

Indeed, an assembly with less data works (I have already tried), but we would like to use all the data, if possible. I will ask about requesting more memory on our cluster.

Thanks for your help,

Charlotte

charlottellcc avatar Jan 27 '22 16:01 charlottellcc

Hi Charlotte,

I may be running into a similar issue. When you say "115 Gb of corrected data (< Q20)", are you referring to the PacBio reads.bam from ccs? https://ccs.how/faq/reads-bam

fengli-eGen avatar Mar 14 '22 22:03 fengli-eGen

Hi Fengli,

Indeed, I either use the CCS reads from PacBio directly, or I take the raw reads (sequencing output without correction) and correct them with the ccs script from SMRT Link.

Charlotte

charlottellcc avatar Mar 15 '22 08:03 charlottellcc

Cool! Thank you for replying, Charlotte. I wonder when you ran ccs, did you run it with min_passes>=3 and min_quality (Q) 20 (HiFi standards)? I'm trying to run CLR ccs (not raw reads) using hifiasm, since my data don't have enough coverage for HiFi standards. Thanks!

fengli-eGen avatar Mar 15 '22 14:03 fengli-eGen

Hi,

Yes, I followed the PacBio recommendations and corrected my data with 3 passes (i.e. HiFi reads > Q20). I have never used CLR reads with hifiasm, and I am not sure it works: see the issue "Can hifiasm take CLR and ccs.bam?"

Good luck,

Charlotte

charlottellcc avatar Mar 16 '22 08:03 charlottellcc
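As a side note on the Q20 threshold discussed above: a Phred quality Q maps to an error probability of 10^(-Q/10), so Q20 corresponds to 99% accuracy. The short sketch below just applies that standard formula; the function names are made up for illustration and are not part of ccs or hifiasm.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score to expected accuracy (1 - error probability)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert an expected accuracy (strictly below 1.0) back to a Phred score."""
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(f"Q{q}: accuracy {phred_to_accuracy(q):.4f}")
```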

Thanks Charlotte!

fengli-eGen avatar Mar 16 '22 20:03 fengli-eGen

I also have 160GB of data and 232GB of RAM, but it is still failing. Is there a trick to get around this problem?

Jokendo-collab avatar Oct 20 '22 15:10 Jokendo-collab

@Jokendo-collab No. You need a machine with 512GB RAM, or you can use less data, but that will make the assembly worse.

lh3 avatar Oct 20 '22 15:10 lh3
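If more RAM is not available and you accept the trade-off mentioned above, one way to "use less data" is to randomly subsample the reads before assembly. The sketch below is only an illustration, assuming an uncompressed four-line-per-record FASTQ file; the names `read_fastq`, `subsample_fastq`, and the file paths are hypothetical, and the right fraction to keep depends on your genome size and coverage, which this thread does not settle.

```python
import random
from typing import Iterable, Iterator, TextIO, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, qualities


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records from an open text handle."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline()
        sep = handle.readline()
        qual = handle.readline()
        yield (header.rstrip("\n"), seq.rstrip("\n"),
               sep.rstrip("\n"), qual.rstrip("\n"))


def subsample_fastq(
    records: Iterable[FastqRecord], keep_fraction: float, seed: int = 0
) -> Iterator[FastqRecord]:
    """Keep each read independently with probability keep_fraction."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    for record in records:
        if rng.random() < keep_fraction:
            yield record


if __name__ == "__main__":
    import os

    # Hypothetical paths and fraction; adjust to your own data.
    if os.path.exists("reads.fq"):
        with open("reads.fq") as src, open("reads.sub.fq", "w") as dst:
            for rec in subsample_fastq(read_fastq(src), keep_fraction=0.7):
                dst.write("\n".join(rec) + "\n")
```

Random subsampling keeps the read-length distribution roughly intact, but as noted above, fewer reads will generally make the assembly worse.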