Suggestions on Large Genome Assembly

Open haiyun-fan opened this issue 1 year ago • 6 comments

Dear authors, we have a genome of about 5G and about 134G of HiFi data. We want to get a good preliminary assembly with hifiasm using the default parameters: hifiasm -o xx -t 142 hifidata. But a week has passed and the program is still in the k-mer analysis stage. Do you have any recommended parameters for large genomes? I look forward to your reply and help!

Here is the tail of the log:

[M::ha_hist_line] rest: ***************************************************************************************************> 69072
[M::ha_analyze_count] left: none
[M::ha_analyze_count] right: none
[M::ha_pt_gen] peak_hom: 180; peak_het: -1
[M::ha_pt_gen::12436.69114.28] ==> indexed 3603072461 positions

haiyun-fan, Oct 09 '22 08:10
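As a rough sanity check on the numbers in the post above, the expected read depth is total sequenced bases divided by genome size. The sketch below assumes that "134G" and "5G" both mean gigabases (the post does not say so explicitly), and `expected_depth` is a hypothetical helper name, not part of hifiasm. For long reads the homozygous k-mer peak is usually close to the read depth, so the `peak_hom: 180` reported in the log can be compared against this estimate.

```python
def expected_depth(total_bases: float, genome_size: float) -> float:
    """Approximate read depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Assumption: both figures from the post are in bases (134 Gb of reads, 5 Gb genome).
depth = expected_depth(134e9, 5e9)
reported_peak_hom = 180  # value from the "[M::ha_pt_gen] peak_hom" log line

print(f"expected depth: about {depth:.1f}x")
print(f"reported peak_hom / expected depth: {reported_peak_hom / depth:.1f}")
```

A reported peak several times higher than the expected depth is one hint that the histogram peak was not picked from the true coverage peak.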

Could you please show the whole log file? Thanks a lot.

chhylp123, Oct 09 '22 19:10

Thanks a lot! Here is the log:

[M::ha_analyze_count] lowest: count[203] = 154269
[M::ha_analyze_count] highest: count[4095] = 807568
[M::ha_hist_line] 2: ****************************************************************************************************> 1021944427
[M::ha_hist_line] 3: ****************************************************************************************************> 354448672
[M::ha_hist_line] 4: ****************************************************************************************************> 189648304
[M::ha_hist_line] 5: ****************************************************************************************************> 126729238
[M::ha_hist_line] 6: ****************************************************************************************************> 97914893
[M::ha_hist_line] 7: ****************************************************************************************************> 83650926
[M::ha_hist_line] 8: ****************************************************************************************************> 75093363
[M::ha_hist_line] 9: ****************************************************************************************************> 69406939
[M::ha_hist_line] 10: ****************************************************************************************************> 65233416
[M::ha_hist_line] 11: ****************************************************************************************************> 61254129
[M::ha_hist_line] 12: ****************************************************************************************************> 58283811
[M::ha_hist_line] 13: ****************************************************************************************************> 55494917
[M::ha_hist_line] 14: ****************************************************************************************************> 53086730
[M::ha_hist_line] 15: ****************************************************************************************************> 51221542
[M::ha_hist_line] 16: ****************************************************************************************************> 49450933
[M::ha_hist_line] 17: ****************************************************************************************************> 47908648
[M::ha_hist_line] 18: ****************************************************************************************************> 46613079
[M::ha_hist_line] 19: ****************************************************************************************************> 45393525
[M::ha_hist_line] 20: ****************************************************************************************************> 44071534
[M::ha_hist_line] 21: ****************************************************************************************************> 42956681
[M::ha_hist_line] 22: ****************************************************************************************************> 41788201
[M::ha_hist_line] 23: ****************************************************************************************************> 40843446
[M::ha_hist_line] 24: **************************************************************************************************> 39658823
...
[M::ha_hist_line] 4079: 462
[M::ha_hist_line] 4080: 437
[M::ha_hist_line] 4081: 491
[M::ha_hist_line] 4082: 445
[M::ha_hist_line] 4083: 440
[M::ha_hist_line] 4084: 454
[M::ha_hist_line] 4085: 472
[M::ha_hist_line] 4086: 449
[M::ha_hist_line] 4087: 469
[M::ha_hist_line] 4088: 450
[M::ha_hist_line] 4089: 434
[M::ha_hist_line] 4090: 457
[M::ha_hist_line] 4091: 439
[M::ha_hist_line] 4092: 440
[M::ha_hist_line] 4093: 433
[M::ha_hist_line] 4094: 468
[M::ha_hist_line] 4095: **************************************************************************************************** 807568
[M::ha_hist_line] rest: 0
[M::ha_analyze_count] left: count[204] = 154683
[M::ha_analyze_count] right: none
[M::ha_ft_gen] peak_hom: 4095; peak_het: 204
[M::ha_ft_gen::9492.775[email protected]] ==> filtered out 808036 k-mers occurring 4094 or more times
[M::ha_opt_update_cov] updated max_n_chain to 20475
[M::ha_pt_gen::11605.11512.16] ==> counted 521111167 distinct minimizer k-mers
[M::ha_pt_gen] count[4095] = 0 (for sanity check)
[M::ha_analyze_count] lowest: count[179] = 10422
[M::ha_analyze_count] highest: count[180] = 10540
[M::ha_hist_line] 1: ****************************************************************************************************> 365777518
[M::ha_hist_line] 2: ****************************************************************************************************> 39765949
[M::ha_hist_line] 3: ****************************************************************************************************> 14175841
[M::ha_hist_line] 4: ****************************************************************************************************> 7673936
[M::ha_hist_line] 5: ****************************************************************************************************> 5184283
[M::ha_hist_line] 6: ****************************************************************************************************> 4028496
...
[M::ha_hist_line] 1588: * 54
[M::ha_hist_line] 1589: * 80
[M::ha_hist_line] 1590: * 81
[M::ha_hist_line] 1591: * 69
[M::ha_hist_line] 1592: * 81
[M::ha_hist_line] 1593: * 64
[M::ha_hist_line] 1594: * 83
[M::ha_hist_line] 1595: * 91
[M::ha_hist_line] rest: ***************************************************************************************************> 69072
[M::ha_analyze_count] left: none
[M::ha_analyze_count] right: none
[M::ha_pt_gen] peak_hom: 180; peak_het: -1
[M::ha_pt_gen::12436.69114.28] ==> indexed 3603072461 positions

haiyun-fan, Oct 10 '22 01:10

I don't know how to handle this situation. Any suggestions would be very much appreciated!

haiyun-fan, Oct 10 '22 11:10

Could you please upload the whole log file? One possibility is that the input HiFi reads are not very clean (see: https://hifiasm.readthedocs.io/en/latest/faq.html#why-does-hifiasm-stuck-or-crash). The k-mer plot output by hifiasm can help us debug this quickly.

chhylp123, Oct 10 '22 13:10

Here is the full log:

log.txt

haiyun-fan, Oct 10 '22 14:10

The k-mer plot looks weird; please see the FAQ here: https://hifiasm.readthedocs.io/en/latest/faq.html#why-does-hifiasm-stuck-or-crash.

chhylp123, Oct 10 '22 14:10
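For anyone who wants to inspect the k-mer plot from a log like the one in this thread, here is a minimal sketch that pulls the histogram counts and the reported peaks out of the log text. It assumes only the line shapes visible in the log above (`[M::ha_hist_line] N: ***> COUNT` and `peak_hom: X; peak_het: Y`); the function names `parse_histograms` and `parse_peaks` are hypothetical helpers, not part of hifiasm.

```python
import re

# "[M::ha_hist_line] 24: *****> 39658823" -> (occurrence, number of k-mers).
# The "rest:" lines are skipped because they carry no numeric occurrence.
HIST_RE = re.compile(r"\[M::ha_hist_line\]\s+(\d+):\s*\**>?\s*(\d+)")
# "[M::ha_pt_gen] peak_hom: 180; peak_het: -1"
PEAK_RE = re.compile(r"\[M::(ha_\w+)\]\s+peak_hom:\s*(-?\d+);\s*peak_het:\s*(-?\d+)")


def parse_histograms(log_text):
    """Return one {occurrence: n_kmers} dict per histogram found in the log."""
    histograms = []
    last_occ = None
    for m in HIST_RE.finditer(log_text):
        occ, n_kmers = int(m.group(1)), int(m.group(2))
        # The occurrence column restarts when a new histogram begins.
        if last_occ is None or occ <= last_occ:
            histograms.append({})
        histograms[-1][occ] = n_kmers
        last_occ = occ
    return histograms


def parse_peaks(log_text):
    """Return (stage, peak_hom, peak_het) tuples in log order."""
    return [(m.group(1), int(m.group(2)), int(m.group(3)))
            for m in PEAK_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "[M::ha_hist_line] 2: ****> 1021944427 "
        "[M::ha_hist_line] 3: ****> 354448672 "
        "[M::ha_hist_line] rest: 0 "
        "[M::ha_ft_gen] peak_hom: 4095; peak_het: 204 "
        "[M::ha_hist_line] 1: ****> 365777518 "
        "[M::ha_hist_line] 1588: * 54 "
        "[M::ha_pt_gen] peak_hom: 180; peak_het: -1"
    )
    for hist in parse_histograms(sample):
        print(sorted(hist.items()))
    print(parse_peaks(sample))
```

The regular expressions use `finditer` over the whole text rather than splitting on newlines, so the parser also works on logs that were pasted onto a single line, as in this thread. The resulting dictionaries can be passed to any plotting tool to look for a coverage peak.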