hifiasm

Different N50 and assembly length with and without -l0

Open harmeet1990 opened this issue 2 years ago • 1 comments

I am assembling a fairly homozygous polyploid plant genome from ~30x CCS data, with and without the -l0 flag (which disables duplicate purging). The two runs give slightly different N50 and assembly size in the *.bp.p_ctg.gfa file. Without -l0, the total assembly size is 14.5 Gb and the N50 is 29 Mb. With -l0, the assembly size is 14.6 Gb and the N50 is 27 Mb. Which assembly should I rely on? My understanding was that the assembly without duplicate purging (-l0) should have been the more contiguous one?

harmeet1990, Feb 26 '22 20:02

For homozygous genomes, it is better to use -l0. Purging with -l3 (the default level) may collapse some repeats, which results in a slightly better N50.

chhylp123, Feb 27 '22 06:02
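For anyone who wants to reproduce the comparison, here is a minimal sketch of how total assembly size and N50 could be computed from the segment (`S`) lines of a `*.bp.p_ctg.gfa` file. It is not part of hifiasm; the function names `contig_lengths_from_gfa` and `n50` are hypothetical, and the sketch assumes each segment either carries its sequence in the third column or an `LN:i:` length tag.

```python
def contig_lengths_from_gfa(lines):
    """Collect contig lengths from GFA segment ('S') lines.

    Assumption: the length comes from an LN:i: tag when present,
    otherwise from the sequence column. Segments with neither are skipped.
    """
    lengths = []
    for line in lines:
        if not line.startswith("S\t"):
            continue  # ignore links, headers and other record types
        fields = line.rstrip("\n").split("\t")
        seq = fields[2] if len(fields) > 2 else "*"
        length = None
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                length = int(tag[len("LN:i:"):])
                break
        if length is None:
            if seq == "*":
                continue  # no sequence and no length tag
            length = len(seq)
        lengths.append(length)
    return lengths


def n50(lengths):
    """Return the length L such that contigs of length >= L
    cover at least half of the total assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input
```

Usage would look like `lengths = contig_lengths_from_gfa(open(path))` followed by `sum(lengths)` and `n50(lengths)`, run once on each primary-contig GFA. Because N50 depends on both the contig lengths and the total size, collapsing repeats can raise N50 while lowering the total, which matches the pattern reported above.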