Assembly much shorter than expected

Open Zazzyre opened this issue 1 year ago • 3 comments

Hi! I'm trying to reduce the number of contigs in an assembly that was initially put together in SMRT analysis software, but I'm getting a much shorter assembly than expected, judging by both related species and the initial assembly. I'm using the raw ccs reads as the input to hifiasm.

We know the genome should be around 1.2 Gbp based on 4 other closely related birds. The initial assembly had 3791 contigs and a length of 1305233108 bases, which agrees with that size.

From hifiasm I am consistently getting a length of around 3646273 bp with 145 contigs.

My run command is currently:

hifiasm -o /hifiasm_out/moplhifi.asm -l0 -t 16 --hg-size 1.2g -D 10 /1803-24278.ccs.fasta.gz

What can I do to retain the length? It's not worth the contig reduction if we lose almost the whole genome.

Zazzyre, Oct 11 '22 21:10

Could you please show the log file? Thanks a lot.

chhylp123, Oct 12 '22 21:10

Sure! Here's the log: hifiasm.txt

Zazzyre, Oct 13 '22 00:10

The k-mer plot is weird. A good HiFi dataset should have a k-mer plot like the ones in issue10 or issue49. In contrast, low-quality HiFi data often lead to a weird k-mer plot like the one in issue93. For more details, please see: https://hifiasm.readthedocs.io/en/latest/faq.html#why-does-hifiasm-stuck-or-crash. Could you please double-check whether the input HiFi reads have been processed by pbccs?

chhylp123, Oct 13 '22 00:10
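
For anyone comparing assembly sizes the way this thread does, here is a minimal sketch (not from the thread participants) that totals contig lengths from the S-lines of a GFA file and reports what fraction of an expected genome size was recovered. The function names and the file path in the usage note are hypothetical. The sketch assumes a GFA 1 layout in which each S-line carries the sequence in its third column, or `*` plus an `LN:i:` tag.

```python
import gzip


def gfa_total_length(path):
    """Return (segment_count, total_bp) summed over the S-lines of a GFA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    n_segments = 0
    total_bp = 0
    with opener(path, "rt") as fh:
        for line in fh:
            if not line.startswith("S\t"):
                continue  # skip links, headers, and other record types
            fields = line.rstrip("\n").split("\t")
            seq = fields[2]
            if seq == "*":
                # Sequence omitted: fall back to the optional LN:i:<length> tag.
                for tag in fields[3:]:
                    if tag.startswith("LN:i:"):
                        total_bp += int(tag[5:])
                        break
            else:
                total_bp += len(seq)
            n_segments += 1
    return n_segments, total_bp


def fraction_recovered(assembled_bp, expected_bp):
    """Fraction of the expected genome size present in the assembly."""
    return assembled_bp / expected_bp
```

With the numbers quoted in the question, `fraction_recovered(3646273, 1305233108)` comes out below 0.3% of the initial assembly length. Calling `gfa_total_length("contigs.gfa")` on an assembly graph (hypothetical path) gives the contig count and total length to plug in.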