hifiasm

Please clarify scaffolding with HiC

Open tallnuttrbgv opened this issue 1 year ago • 2 comments

Hi,

I have a 0.9 Gbp plant genome with HiFi and Hi-C data. hifiasm has given me an assembly of the correct size with 1000 contigs. Can you please clarify the scaffolding paragraph in the documentation (https://hifiasm.readthedocs.io/en/latest/hic-assembly.html)? It says to use SALSA (single cell data) or 3D-DNA to scaffold the contigs. However, 3D-DNA requires Juicer mapping output, which in turn requires known chromosome lengths, and I do not know these (or the number of chromosomes). Does this mean that the contigs of a non-model genome cannot be scaffolded using the Hi-C data?

I'm new to Hi-C and not sure how hifiasm uses it. I assume that it uses the Hi-C data to phase the chromosomes but, as the docs say, not to scaffold. If so, do you know whether scaffolding can be done?

Thanks.

tallnuttrbgv, Jan 12 '24 03:01

Hi, take a look at YaHS: https://github.com/c-zhou/yahs

It is quite well regarded and easy to use.

It is true that hifiasm only phases the contigs (hap1 and hap2), but you can scaffold these separately, and get both genomes out of a diploid species.
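
As a rough illustration of the "scaffold these separately" step: hifiasm writes each haplotype's contigs as a GFA file, while scaffolders generally expect FASTA input. The sketch below is a minimal, hypothetical helper (the function names and the in-memory example data are made up for illustration, not part of hifiasm or YaHS) that extracts the `S` (segment) lines from a GFA into FASTA and returns a name/length list for the contigs. As far as I understand, for a draft assembly a per-contig length table of this kind is what Juicer-style pipelines take in place of true chromosome sizes, so the real chromosome lengths do not need to be known. You would run it once per haplotype GFA.

```python
import io


def gfa_segments(lines):
    """Yield (name, sequence) for every S (segment) line of a GFA file."""
    for line in lines:
        if not line.startswith("S\t"):
            continue  # skip headers, links and any other record types
        fields = line.rstrip("\n").split("\t")
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue  # sequence not stored in the GFA, nothing to export
        yield name, seq


def gfa_to_fasta(gfa_lines, fasta_out, width=80):
    """Write GFA segments to fasta_out; return a list of (name, length)."""
    sizes = []
    for name, seq in gfa_segments(gfa_lines):
        fasta_out.write(f">{name}\n")
        # wrap the sequence at a fixed line width
        for i in range(0, len(seq), width):
            fasta_out.write(seq[i:i + width] + "\n")
        sizes.append((name, len(seq)))
    return sizes


if __name__ == "__main__":
    # Tiny made-up GFA standing in for one haplotype's contig graph.
    example = io.StringIO("S\tctg1\tACGTACGT\nS\tctg2\tGGCC\n")
    fasta = io.StringIO()
    for name, length in gfa_to_fasta(example, fasta):
        print(f"{name}\t{length}")
```

With real data you would open the haplotype GFA and an output FASTA file instead of the `io.StringIO` objects, then map the Hi-C reads to each haplotype FASTA and scaffold each one on its own.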

Ole

olekto, Jan 12 '24 08:01

Yes, I agree with @olekto

chhylp123, Jan 13 '24 10:01