hifiasm
Both haplotype assemblies are larger than the genome survey and flow cytometry estimates
Dear Haoyu
I am assembling a species whose genome size is estimated at 2.4G by both genome survey and flow cytometry.
This is the genome survey from the PacBio data.
This is the genome survey from the next-generation sequencing data.
However, with the hifiasm default parameters, hap1 is 5.5G, hap2 is 3.4G, p.ctg is 4.7G, and p.utg is 8.7G.
The hap1 BUSCO result is:
C:97.2%[S:25.2%,D:72.0%],F:0.9%,M:1.9%,n:425
413 Complete BUSCOs (C)
107 Complete and single-copy BUSCOs (S)
306 Complete and duplicated BUSCOs (D)
4 Fragmented BUSCOs (F)
8 Missing BUSCOs (M)
425 Total BUSCO groups searched
The hap2 BUSCO result is:
C:87.0%[S:53.6%,D:33.4%],F:5.6%,M:7.4%,n:425
370 Complete BUSCOs (C)
228 Complete and single-copy BUSCOs (S)
142 Complete and duplicated BUSCOs (D)
24 Fragmented BUSCOs (F)
31 Missing BUSCOs (M)
425 Total BUSCO groups searched
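To compare the two BUSCO summaries above more directly, here is a minimal Python sketch that parses the one-line summaries and reports what share of the complete BUSCOs are duplicated. The helper names `parse_busco_summary` and `duplicated_share` are hypothetical and not part of BUSCO or hifiasm. Reading a high duplicated share as a sign of redundant or multi-copy sequence is an assumption, not something the reports themselves state.

```python
import re

# One-line summaries copied from the two BUSCO reports above.
summaries = {
    "hap1": "C:97.2%[S:25.2%,D:72.0%],F:0.9%,M:1.9%,n:425",
    "hap2": "C:87.0%[S:53.6%,D:33.4%],F:5.6%,M:7.4%,n:425",
}

# Hypothetical helper: extract the percentages and n from a summary line.
def parse_busco_summary(line):
    pattern = (
        r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
        r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {line!r}")
    stats = {key: float(value) for key, value in match.groupdict().items()}
    stats["n"] = int(stats["n"])
    return stats

# Hypothetical helper: fraction of complete BUSCOs that are duplicated.
def duplicated_share(stats):
    return stats["D"] / stats["C"]

for name, line in summaries.items():
    stats = parse_busco_summary(line)
    print(f"{name}: {duplicated_share(stats):.0%} of complete BUSCOs are duplicated")
```

With these inputs, roughly three quarters of the complete BUSCOs in hap1 are duplicated, compared with well under half in hap2.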
Here is the log.
Next, I added the '--hom-cov 31' parameter, but I am still not satisfied with the result. Here is the new log: 12723223.txt
hap1 is 4.8G, hap2 is 3.5G, p.ctg is 3.8G, and p.utg is 8.4G. I haven't run a BUSCO evaluation yet, but I don't think it will make a difference.
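To put the sizes from both runs next to the 2.4G estimate, here is a small Python sketch that computes each assembly's size as a multiple of the estimated genome size. The name `size_ratios` is hypothetical, and "G" is assumed to mean gigabases. All numbers are copied from this post.

```python
# Estimated genome size from survey and flow cytometry (assumed to be in Gb).
ESTIMATED_GB = 2.4

# Assembly sizes reported in this post, in the same unit.
runs = {
    "default": {"hap1": 5.5, "hap2": 3.4, "p.ctg": 4.7, "p.utg": 8.7},
    "--hom-cov 31": {"hap1": 4.8, "hap2": 3.5, "p.ctg": 3.8, "p.utg": 8.4},
}

# Hypothetical helper: each assembly size as a multiple of the estimate.
def size_ratios(sizes, estimate=ESTIMATED_GB):
    return {name: size / estimate for name, size in sizes.items()}

for label, sizes in runs.items():
    print(label)
    for name, ratio in size_ratios(sizes).items():
        print(f"  {name}: {sizes[name]:.1f}G = {ratio:.2f}x the estimate")
```

Even with '--hom-cov 31', hap1 remains at about twice the estimate and p.utg at about 3.5 times.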
I'm confused. Can you give me some advice? I am looking forward to your response and will greatly appreciate any guidance or advice you provide in this situation.
Best,
Zhang
Sorry for the late reply; I was quite busy last month. In general, please see: https://hifiasm.readthedocs.io/en/latest/faq.html#for-hi-c-integrated-assembly-why-the-assembly-size-of-both-haplotypes-are-much-larger-than-the-estimated-genome-size. In addition, please check the log file to make sure there is no issue with the dataset itself: https://hifiasm.readthedocs.io/en/latest/faq.html#why-does-hifiasm-stuck-or-crash.