hifiasm

Question about abnormally large HiFi-only assembly output for a heterozygous insect

Open DO-T opened this issue 2 years ago • 3 comments

Hi! We are trying to assemble a diploid insect genome in HiFi-only mode with hifiasm (0.15), but both the hap1 and hap2 assemblies are about three times the size we estimated from a k-mer survey of Illumina short reads.

We first adjusted the "-s" option, trying 0.4, 0.3, 0.35, 0.2 and 0.1; however, the outputs are still at least twice our estimate.

We then noticed the "--hom-cov" option, so we reran the assembly with "-s" values of 0.55, 0.35 and 0.1 together with "--hom-cov 52", the value we obtained from the k-mer distribution of the HiFi reads. Unfortunately, no matter how we adjust these options, the final results are still two to three times our expectation.

The genome is about 600 Mb, with a heterozygosity of about 4.2%, according to the k-mer survey of the Illumina short reads.

Among the outputs, we selected the one from -s 0.55,0.3,0.1 to try to explain why hap1 and hap2 are two to three times too large. We used MUMmer to examine the synteny between the assembly and a related species. In hap1, parts of two contigs map to the same homeologous chromosome of the relative, and hap2 shows the same pattern.

How can we remove these redundant contigs, or are there any hifiasm options you would recommend? Looking forward to your reply!
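
The size comparison described above can be checked directly from the assembly graphs. The following is a minimal sketch, assuming the hap1/hap2 contig GFA files written by hifiasm are available; the function names are illustrative and not part of hifiasm, and the 600 Mb constant is simply the k-mer estimate quoted in this post.

```python
# Hypothetical helper, not part of hifiasm: sum contig lengths in a GFA file
# and compare the total with the expected genome size.

EXPECTED_GENOME_SIZE = 600_000_000  # ~600 Mb, the k-mer estimate from the post


def gfa_total_length(lines):
    """Sum contig lengths over the S (segment) lines of a GFA stream."""
    total = 0
    for line in lines:
        if not line.startswith("S\t"):
            continue  # skip A/L and other record types
        fields = line.rstrip("\n").split("\t")
        seq = fields[2]
        if seq != "*":
            total += len(seq)
            continue
        # Sequence omitted: fall back to the LN:i:<length> tag if present.
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                total += int(tag[5:])
                break
    return total


def inflation(total_bp, expected_bp=EXPECTED_GENOME_SIZE):
    """Ratio of assembled size to expected size (1.0 means on target)."""
    return total_bp / expected_bp


def report(paths):
    """Print path, assembled size in bp, and size ratio for each GFA file."""
    for path in paths:
        with open(path) as handle:
            size = gfa_total_length(handle)
        print(f"{path}\t{size}\t{inflation(size):.2f}x")
```

For example, `report(["asm.bp.hap1.p_ctg.gfa", "asm.bp.hap2.p_ctg.gfa"])`, where the file names are assumed and depend on the `-o` prefix used. A ratio near 2 or 3 for each haplotype would match the inflation reported here.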

DO-T, Mar 21 '22 13:03

Could you please try the latest version (0.16.1)? Version 0.15 might have some issues with partially phased assemblies if the heterozygosity rate is extremely high.

chhylp123, Mar 21 '22 14:03

Sorry for the late reply. Over the past few days I have used hifiasm (0.16) with -s values of 0.55, 0.3 and 0.1 and --hom-cov 52 to assemble this insect genome, yet the results are still at least twice our estimated size.

DO-T, Mar 24 '22 07:03

I see. Then you should try purge_dups: https://github.com/dfguan/purge_dups.

chhylp123, Mar 25 '22 00:03
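
A minimal sketch of the purge_dups steps suggested above, written as a dry run that only builds and prints the commands without executing anything. It assumes the hap1 contigs have already been converted from GFA to FASTA. The file names `hap1.fa` and `hifi.fq.gz` are hypothetical, and the `map-hifi` preset assumes a recent minimap2 (the purge_dups README itself uses `-xmap-pb`). The cutoffs written by `calcuts` are estimated automatically and may need checking against the read-depth histogram for a sample this heterozygous.

```python
# Dry-run sketch: returns the purge_dups pipeline as shell command strings.
# File names are placeholders; nothing here is executed.


def purge_dups_commands(asm="hap1.fa", reads="hifi.fq.gz"):
    """Build the purge_dups steps for one assembly FASTA and one read file."""
    return [
        # 1. Map HiFi reads to the assembly to get read depth.
        f"minimap2 -x map-hifi {asm} {reads} | gzip -c - > reads.paf.gz",
        # 2. Summarise depth (writes PB.base.cov and PB.stat).
        "pbcstat reads.paf.gz",
        # 3. Estimate coverage cutoffs from the depth histogram.
        "calcuts PB.stat > cutoffs 2> calcuts.log",
        # 4. Split the assembly and align it against itself.
        f"split_fa {asm} > {asm}.split",
        f"minimap2 -x asm5 -DP {asm}.split {asm}.split"
        f" | gzip -c - > {asm}.split.self.paf.gz",
        # 5. Flag haplotigs and overlaps using depth plus self-alignment.
        "purge_dups -2 -T cutoffs -c PB.base.cov"
        f" {asm}.split.self.paf.gz > dups.bed 2> purge_dups.log",
        # 6. Write the purged and haplotig sequences.
        f"get_seqs -e dups.bed {asm}",
    ]


if __name__ == "__main__":
    for command in purge_dups_commands():
        print(command)
```

The same list can be generated for hap2 by passing a different `asm` value, so each haplotype assembly is purged separately.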