hifiasm
Clarification of Hi-C Haplotypes
Hi,
I just wanted to clarify my understanding of the haplotypes produced in Hi-C mode.
Based on the paper and the docs, my interpretation of the hap1/hap2 output files when both HiFi and Hi-C data are used in the assembly process is:
- The contigs should be haplotigs, and typically hifiasm will correctly output all the haplotigs that form a chromosome in the same haplotype file.
- The Hi-C data can phase within chromosomes (e.g. the haplotig example above), but it can't cluster between chromosomes. To do this would require trio data.
- Therefore the haplotype files should typically consist of phased contig sequences (i.e. haplotigs) that together constitute a chromosome, but the combination of chromosomes within a haplotype file is likely to be a mix of maternal and paternal origin.
- e.g. hap1.p_ctg might contain maternal chromosome 1 haplotigs, but paternal chromosome 2 haplotigs, etc.
Is that roughly correct?
Thanks for the help,
Al
Yes, there is no inter-chromosome information in Hi-C, so hifiasm cannot do that.
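
To make the consequence concrete, here is a toy sketch in Python. It is not hifiasm code and uses no real data. It assumes, as described in this thread, that each chromosome is phased independently and nothing links the phase of one chromosome to another. The chromosome names and the helper functions are hypothetical. It simply enumerates which parental origin hap1 could end up with for each chromosome.

```python
from itertools import product

def hap1_origin_patterns(chromosomes):
    """List every per-chromosome parental origin that hap1 could receive.

    Assumption: phasing is independent per chromosome, so any
    combination of origins is possible.
    """
    origins = ("maternal", "paternal")
    return [
        dict(zip(chromosomes, combo))
        for combo in product(origins, repeat=len(chromosomes))
    ]

def is_parent_consistent(pattern):
    """True only if every chromosome in hap1 comes from the same parent."""
    return len(set(pattern.values())) == 1

# Hypothetical three-chromosome genome, for illustration only.
chromosomes = ["chr1", "chr2", "chr3"]
patterns = hap1_origin_patterns(chromosomes)
consistent = [p for p in patterns if is_parent_consistent(p)]

print(f"possible hap1 patterns: {len(patterns)}")
print(f"fully maternal or fully paternal: {len(consistent)}")
```

With n chromosomes there are 2^n possible patterns and only 2 of them are entirely from one parent, so under this assumption a mixed hap1 file is the expected outcome rather than the exception.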