
Duplicated chrX in male parental assemblies with hifiasm 0.16.1

Open projectoriented opened this issue 1 year ago • 3 comments

Hello 👋 ,

We've encountered multiple cases in our non-trio male assemblies where both haplotypes contain contigs that map to chrX. Here's an example with HG00731 (PacBio CCS reads, 33X coverage):

hifiasm HG00731.asm HG00731.fq.gz

The FASTA output is aligned to T2Tv2 with minimap2 and then visualized with SafFire. The figure below shows chrX in both haplotypes for HG00731. The contigs labeled "complementary" are unique contigs that should, in theory, be placed together with the duplicated ones on a single haplotype. [image]

The dot plot below is for one of the duplicated contigs, which shows 100% sequence identity. [image]

The figure below shows what chrX of HG00731 should look like. [image]
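For anyone who wants to reproduce the check without a browser, here is a rough sketch of how the chrX signal could be tallied straight from minimap2's PAF output. It sums the alignment block length (PAF column 11) per contig for alignments whose target (column 6) is chrX. The function name, the file names in the usage comments, and the `asm5` preset are my assumptions, not necessarily what was used for the figures above.

```shell
#!/usr/bin/env bash
# Sketch: sum aligned bases on chrX per contig from a PAF file.
# PAF columns used: 1 = query (contig) name, 6 = target name,
# 11 = alignment block length, 12 = mapping quality.
# Assumption: the X chromosome is named "chrX" in the reference FASTA.

chrx_bp_per_contig() {
    # $1 = target name (default chrX), $2 = minimum mapping quality (default 0).
    # Reads PAF on stdin; prints "contig<TAB>aligned_bp", sorted by contig name.
    awk -F'\t' -v target="${1:-chrX}" -v min_mapq="${2:-0}" '
        $6 == target && $12 >= min_mapq { bp[$1] += $11 }
        END { for (c in bp) printf "%s\t%d\n", c, bp[c] }
    ' | LC_ALL=C sort
}

# Example usage (file names are hypothetical; minimap2 must be installed):
#   minimap2 -x asm5 -t 8 T2Tv2.fa hap1.fa > hap1.paf
#   minimap2 -x asm5 -t 8 T2Tv2.fa hap2.fa > hap2.paf
#   chrx_bp_per_contig < hap1.paf
#   chrx_bp_per_contig < hap2.paf
```

If both haplotypes report a large number of chrX bases for a male sample, that matches the duplication seen in the SafFire view.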

Do you know why this is happening? Please let me know if there is anything I can do to resolve the problem. Thanks!

projectoriented, Oct 11 '22 22:10

Do you use the dual assemblies or Hi-C phased assemblies?

chhylp123, Oct 12 '22 21:10

I'm not sure what you mean by dual assembly. This is a standalone assembly for HG00731 with HiFi reads and no additional read support from other technologies or family members. Does that answer the question? 😅

projectoriented, Oct 12 '22 23:10

Thanks. We call this mode the dual assembly. In this mode it is indeed possible for hifiasm to place chrX contigs in both haplotypes. Hifiasm can only use the similarity among contigs to cluster and phase them, and that similarity is not as reliable as Hi-C or trio data. Tuning the -s parameter might help, but the final result still depends on chance. Perhaps you could use both haplotypes at the same time?
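As a rough illustration of the -s suggestion, the sketch below prints one hifiasm command per candidate value so the resulting assemblies can be compared side by side. The candidate values, thread count, and output prefixes are assumptions chosen for illustration. The default for -s is listed as 0.55 in the help text of the versions I have seen, so check `hifiasm -h` for yours.

```shell
#!/usr/bin/env bash
# Sketch: build one hifiasm command per candidate -s value (dry run).
# Assumptions: the candidate values, thread count, and output prefixes below
# are illustrative only and do not come from the thread.

reads="HG00731.fq.gz"
threads=32

hifiasm_cmd() {
    # $1 = value passed to -s; prints the command instead of running it.
    printf 'hifiasm -o HG00731.s%s.asm -t %s -s %s %s\n' "$1" "$threads" "$1" "$reads"
}

for s in 0.45 0.55 0.65; do
    hifiasm_cmd "$s"    # dry run: print only
done
# To run the assemblies for real, pipe this script's output to bash.
```

Each run writes to its own prefix, so the chrX content of the two haplotypes can be compared across -s values. As noted above, no value is guaranteed to fix the phasing.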

chhylp123, Oct 13 '22 00:10