hifiasm
Mis-identified homo peaks
Hi,
I am working on a plant species with an estimated genome size of 2.2 Gb and a heterozygosity rate of 0.5% (estimated by GenomeScope from the HiFi data). I ran hifiasm with HiFi data and valid Hi-C data. The k-mer histogram has three peaks: heter_peak (at 28), homo_peak (at 57), and repeat_peak (at 110?). I found that after read correction, hifiasm mis-identified the repeat peak as homo_peak and output a hap1 assembly of 316 Mb and a hap2 assembly of 2567 Mb. The log file is attached: hifiasm.log.
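To make the peak question concrete, here is a minimal sketch of how candidate peaks could be picked out of a k-mer histogram and how the homozygous peak could be guessed from them. This is not hifiasm's internal logic. The histogram is synthetic (three Gaussians placed at the coverages quoted above, with made-up heights and widths), and the helper names `synthetic_histogram`, `local_maxima`, and `guess_hom_peak` are hypothetical. The guess assumes the lowest peak really is the heterozygous one and that the homozygous peak sits near twice its coverage.

```python
import math


def synthetic_histogram(peaks, max_cov=200):
    """Build a toy histogram as a sum of Gaussians.

    peaks is a list of (center, height, sigma). Synthetic data only.
    """
    return [
        sum(h * math.exp(-((c - mu) ** 2) / (2 * sd ** 2)) for mu, h, sd in peaks)
        for c in range(max_cov + 1)
    ]


def local_maxima(hist, window=5, min_count=1.0):
    """Return coverages whose count is the unique maximum within +/- window bins."""
    found = []
    for c in range(len(hist)):
        lo, hi = max(0, c - window), min(len(hist), c + window + 1)
        neighborhood = hist[lo:hi]
        is_max = hist[c] == max(neighborhood) and neighborhood.count(hist[c]) == 1
        if hist[c] >= min_count and is_max:
            found.append(c)
    return found


def guess_hom_peak(peaks):
    """Assume peaks[0] is heterozygous; pick the peak nearest to twice its coverage."""
    het = peaks[0]
    return min(peaks[1:], key=lambda p: abs(p - 2 * het))


# Peak positions from the post; heights and widths are invented for illustration.
hist = synthetic_histogram([(28, 1000, 4), (57, 3000, 6), (110, 400, 10)])
peaks = local_maxima(hist)
hom = guess_hom_peak(peaks)
print("candidate peaks:", peaks)
print("guessed homozygous peak:", hom)
```

With a real histogram the same check (is the chosen peak close to 2x the heterozygous peak?) is a quick way to spot a repeat peak being taken for the homozygous one.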
I also took the advice from #55 and manually set --purge-cov 73. This time hifiasm output a hap1 assembly of 2628 Mb and a hap2 assembly of 1956 Mb. It seems that about 400 Mb of hap2 sequence was mis-assigned to hap1. I wonder whether this is caused by the wrong homo_peak?
Could you please set "--purge-cov 57"? "--purge-cov" should be set to the hom peak. I'm also writing a detailed manual about this.
I set --purge-cov 57. Now hap1 assembly is 2637 Mb and hap2 assembly is 2005 Mb.
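For reference, the three runs reported so far can be compared against the 2.2 Gb haploid estimate with a little arithmetic. This is only a sketch: the sizes are the ones quoted in this thread, the 2200 Mb figure is the GenomeScope estimate (itself approximate), and the `balance` helper is a hypothetical name.

```python
EST_HAPLOID_MB = 2200  # GenomeScope estimate quoted in the thread (approximate)

# (hap1 Mb, hap2 Mb) as reported in the thread
runs = {
    "default": (316, 2567),
    "--purge-cov 73": (2628, 1956),
    "--purge-cov 57": (2637, 2005),
}


def balance(hap1, hap2, haploid=EST_HAPLOID_MB):
    """Summarize how far each haplotype is from the haploid size estimate."""
    return {
        "total": hap1 + hap2,
        "hap1_excess": hap1 - haploid,
        "hap2_excess": hap2 - haploid,
        "gap": abs(hap1 - hap2),
    }


for label, (h1, h2) in runs.items():
    print(label, balance(h1, h2))
```

Under that estimate, hap1 is over by 428 Mb with --purge-cov 73 and by 437 Mb with --purge-cov 57, so changing the coverage value alone barely moves the imbalance.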
I see. Then could you please set a smaller value for '-s'? The default value of '-s' is 0.55; you can try '-s 0.4' or even '-s 0.3'. This kind of imbalance between haplotypes is usually caused by a high heterozygosity rate.
BTW, does hifiasm prefer raw Hi-C reads or valid Hi-C contact reads (filtered by HiC-Pro with pre-assembled contigs)?
We have only tried raw reads. Using trio data as ground truth, the assemblies produced with raw Hi-C reads look good enough. Valid Hi-C contact reads might also be helpful; we just haven't tried them.
One thing I forgot to mention: tuning '-s' won't affect the utg graph, so hifiasm can reuse the Hi-C bin files. However, tuning '--purge-cov' may affect the utg graph, so you need to delete 'hicbin'.
Is --purge-cov in the latest release? I'm using 0.15.5-r350 and get an unknown option error. Can't see it in the code either.
Yeah, we changed it to two separate options: --hom-cov and --purge-max. See https://hifiasm.readthedocs.io/en/latest/faq.html#how-can-i-tweak-parameters-to-improve-hi-c-integrated-assembly.
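Putting the suggestions from this thread together for a newer hifiasm version, a rerun could be assembled roughly as below. Treat it as a sketch: the file names and output prefix are placeholders, `hifiasm_hic_command` is a hypothetical helper, and the values 57 and 0.4 are simply the ones discussed above. --purge-max is left unset here. Check `hifiasm -h` for the options available in your release.

```python
import shlex


def hifiasm_hic_command(prefix, hifi_reads, hic_r1, hic_r2,
                        threads=32, hom_cov=None, sim_threshold=None):
    """Build a Hi-C integrated hifiasm command line as a list of arguments."""
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads),
           "--h1", hic_r1, "--h2", hic_r2]
    if hom_cov is not None:
        # Homozygous peak coverage, replacing the older --purge-cov usage
        cmd += ["--hom-cov", str(hom_cov)]
    if sim_threshold is not None:
        # Smaller -s values were suggested above for unbalanced haplotypes
        cmd += ["-s", str(sim_threshold)]
    cmd.append(hifi_reads)
    return cmd


# Placeholder file names; substitute your own inputs.
cmd = hifiasm_hic_command("plant.asm", "hifi.fq.gz",
                          "hic_R1.fq.gz", "hic_R2.fq.gz",
                          hom_cov=57, sim_threshold=0.4)
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` once the inputs exist. As noted earlier in the thread, changing only '-s' lets hifiasm reuse the Hi-C bin files, while changing the coverage setting may not.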