hifiasm
Homozygous diploid trio assembly assigning largely ambiguous nodes to only one haplotype
Hi,
I'm assembling a 3 Gb diploid mammal that we know is likely to be quite homozygous. This is confirmed by the k-mer peak, and hifiasm "correctly" identifies the peaks as [M::ha_pt_gen] peak_hom: 46; peak_het: 24.
However, I noticed that the haplotype-resolved assemblies (using parental k-mers from yak) are quite unbalanced at times. Below is the dip.p_utg.gfa graph. For example, there are 2 large nodes (yellow and green) that are assigned to both hap1 and hap2. In these nodes the reads are either split roughly evenly between maternal (m) and paternal (p), or are all ambiguous (a). The blue and red nodes, however, are present only in hap2/maternal and are almost completely missing from hap1/paternal, even though they are the single entry/exit nodes. The yak k-mers suggest these nodes are slightly more maternal than paternal, yet they are overwhelmingly ambiguous and still are not assigned to both haplotypes. Is this the expected behaviour, or should the blue and red nodes (with ambiguous reads >> maternal reads) be assigned to both haplotypes?
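For reference, this is roughly how the per-node m/p/a read counts can be tallied from the graph. It is a minimal sketch, assuming the GFA carries one tab-separated A line per read with the unitig name in the second column and an HG:A:<a|p|m> tag, as described in the hifiasm output documentation. The function names and the example lines are my own and purely illustrative.

```python
from collections import Counter, defaultdict


def tally_haplotype_tags(gfa_lines):
    """Count HG:A haplotype tags (a, p, m) per unitig from GFA 'A' lines.

    Assumes tab-separated A lines with the unitig name in column 2 and an
    optional HG:A:<tag> field somewhere after it. Other line types are skipped.
    """
    counts = defaultdict(Counter)
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[0] != "A":
            continue
        node = fields[1]
        for field in fields[2:]:
            if field.startswith("HG:A:"):
                counts[node][field[len("HG:A:"):]] += 1
                break
    return counts


def ambiguous_fraction(counter):
    """Fraction of a node's tagged reads that are labelled ambiguous ('a')."""
    total = sum(counter.values())
    return counter["a"] / total if total else 0.0


# Made-up example lines (read names and coordinates are placeholders).
example = [
    "S\tutg000001l\t*\tLN:i:30000",
    "A\tutg000001l\t0\t+\tread1\t0\t15000\tid:i:1\tHG:A:a",
    "A\tutg000001l\t9000\t-\tread2\t0\t14000\tid:i:2\tHG:A:a",
    "A\tutg000001l\t17000\t+\tread3\t0\t13000\tid:i:3\tHG:A:m",
]
per_node = tally_haplotype_tags(example)
```

On a real graph the same function can be fed an open file handle instead of the example list, and nodes can then be ranked by `ambiguous_fraction` to find those where a >> m or p.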
I'm not sure if changing --hom-cov would help since the peak is at 46, and since it is a trio, -s/-l aren't on by default.