possible regression in using HiC reads in --h1 --h2

Open jbh-cas opened this issue 2 years ago • 8 comments

hifiasm_0.19.8-r603, run with --h1 --h2, stops at a Partition step after the hic.p_ctg gfa is written.

The last version that worked for us was hifiasm_0.19.5-r590, so we reran with it and the h1 and h2 gfas were successfully created. Relevant log extracts are below. (We have not run the versions in between to triangulate, but could if you would like. Also, the input HiC was not exactly the same because we tried different lanes, but additional tests showed the same stop at the Partition step with r603.)

Note that r603 reports a single [M::dedup_hits::0.000] ==> Dedup line, whereas r590 reports [M::dedup_hits::2.269] ==> Dedup followed by [M::dedup_hits::1.126] ==> Dedup.
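For anyone comparing their own logs against the extracts in this issue, here is a minimal sketch that scans a hifiasm log for the two patterns visible in the r603 extract: a Dedup step reported at 0.000 and an mc_solve line with zero edges. The function name `find_suspicious_lines` and the labels it returns are made up for illustration. The regular expressions assume only the line shapes shown in this thread, not any documented hifiasm log format.

```python
import re

# Timed stage lines as they appear in this thread, e.g.
# "[M::dedup_hits::0.000] ==> Dedup" (the "==> ..." part may be absent).
STAGE_RE = re.compile(r"^\[M::(\w+)::(\d+\.\d+)\]\s*(?:==>\s*(.*))?$")

# Edge-count lines as they appear in this thread, e.g.
# "[M::mc_solve:: # edges: 0]".
EDGES_RE = re.compile(r"^\[M::mc_solve:: # edges: (\d+)\]$")


def find_suspicious_lines(log_text):
    """Return (label, line) pairs for lines matching the r603 symptoms.

    Hypothetical helper: it flags a dedup_hits step timed at exactly 0.000
    and an mc_solve line reporting zero edges. Other lines are ignored.
    """
    findings = []
    for raw in log_text.splitlines():
        line = raw.strip()
        edges = EDGES_RE.match(line)
        if edges:
            if int(edges.group(1)) == 0:
                findings.append(("zero_edges", line))
            continue
        stage = STAGE_RE.match(line)
        if stage and stage.group(1) == "dedup_hits" and float(stage.group(2)) == 0.0:
            findings.append(("instant_dedup", line))
    return findings
```

Calling `find_suspicious_lines(open("hifiasm.log").read())` (the file name is an assumption) would list the flagged lines. An empty list only means these two patterns were not seen, not that the run succeeded.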

Thanks for any insight, Jim Henderson

version 0.19.8-r603 log extract

hifiasm 0.19.8-r603
...
Writing hifiasm.asm.hic.p_ctg.gfa to disk... 
[M::ha_opt_update_cov] updated max_n_chain to 200
[M::gen_trans_base_count_comp::504.346] ==> Qualification
[M::build_unitig_index::336.160] ==> Counting
[M::build_unitig_index::0.000] ==> Memory allocating
[M::build_unitig_index::345.438] ==> Filling pos
[M::build_unitig_index::0.000] ==> Sorting pos
[M::build_unitig_index::681.598] ==> HiC index has been built
[M::write_hc_pt_index] Index has been written.
[M::alignment_worker_pipeline::705.241] ==> Qualification
[M::dedup_hits::0.000] ==> Dedup
[M::adjust_weight_kv_u_trans_advance::0.000] 
[M::mc_solve:: # edges: 0]
[M::mb_solve_core::0.006] ==> Partition
[M::mc_solve_core_adv::0.008] ==> Partition

The hifiasm program just stops here. No error is shown.

version 0.19.5-r590 log extract

hifiasm 0.19.5-r590
...
Writing hifiasm.asm.hic.p_ctg.gfa to disk... 
[M::ha_opt_update_cov] updated max_n_chain to 200
[M::gen_trans_base_count_comp::568.114] ==> Qualification
[M::build_unitig_index::239.886] ==> Counting
[M::build_unitig_index::59.715] ==> Memory allocating
[M::build_unitig_index::183.955] ==> Filling pos
[M::build_unitig_index::1.304] ==> Sorting pos
[M::build_unitig_index::484.863] ==> HiC index has been built
[M::write_hc_pt_index] Index has been written.
[M::alignment_worker_pipeline::414.203] ==> Qualification
[M::dedup_hits::2.269] ==> Dedup
[M::dedup_hits::1.126] ==> Dedup
[M::stat] # misjoined unitigs: 28 (N50: 1516338); # corrected unitigs: 56 (N50: 938380)
[M::adjust_weight_kv_u_trans_advance::4.329] 
[M::mc_solve:: # edges: 7272466]
[M::mb_solve_core::19.670] ==> Partition
[M::mc_solve_core_adv::71.107] ==> Partition
[M::adjust_weight_kv_u_trans_advance::6.831] 
[M::mc_solve:: # edges: 7283428]
[M::mb_solve_core::22.105] ==> Partition
[M::mc_solve_core_adv::28.921] ==> Partition
[M::adjust_weight_kv_u_trans_advance::6.789] 
[M::mc_solve:: # edges: 7283434]
[M::mb_solve_core::21.063] ==> Partition
[M::mc_solve_core_adv::25.306] ==> Partition
[M::stat] # heterozygous bases: 6726685291; # homozygous bases: 300550520
[M::reduce_hamming_error_adv::7.843] # inserted edges: 83806, # fixed bubbles: 423
[M::adjust_utg_by_trio] primary contig coverage range: [34, infinity]
[M::recall_arcs] # transitive arcs::262
[M::recall_arcs] # new arcs::387894, # old arcs::248568
[M::clean_trio_untig_graph] # adjusted arcs::0
[M::adjust_utg_by_trio] primary contig coverage range: [34, infinity]
[M::recall_arcs] # transitive arcs::428
[M::recall_arcs] # new arcs::395048, # old arcs::252238
[M::clean_trio_untig_graph] # adjusted arcs::0
[M::output_trio_graph_joint] dedup_base::11654549, miss_base::0
Writing hifiasm.asm.hic.hap1.p_ctg.gfa to disk... 
Writing hifiasm.asm.hic.hap2.p_ctg.gfa to disk... 
Inconsistency threshold for low-quality regions in BED files: 70%
[M::main] Version: 0.19.5-r590
[M::main] CMD: hifiasm_0.19.5-r590 --write-ec --write-paf -t 64 --h1 input/Nfusc_a1009_L4_R1_clean.fq.gz --h2 input/Nfusc_a1009_L4_R2_clean.fq.gz input/hifiasm.asm.ec.fa
[M::main] Real time: 48887.154 sec; CPU: 2376941.041 sec; Peak RSS: 231.264 GB

jbh-cas avatar Feb 15 '24 23:02 jbh-cas

I reran hifiasm 0.19.6-r595 and 0.19.7-r598 with the same HiFi and HiC inputs as used for hifiasm 0.19.5-r590, and the h1 and h2 gfa files were created with both of these versions.

As a reminder, 0.19.8-r603 stops after two Partition steps and does not create the h1 or h2 gfa files, as shown in the log above.

I don't have r599 through r602 built. Any ideas about why the program just ends after the Partition steps?

Thanks very much.

jbh-cas avatar Feb 18 '24 23:02 jbh-cas

I have a similar issue with the current master (commit 1ac574adc78fbdaed2d2dcd49d5ea3deed7478de), but in my case the run ends with a nasty signal 11:

...
[M::ha_print_ovlp_stat] # overlaps without large indels: 540614033
[M::ha_print_ovlp_stat] # reverse overlaps: 94144072
[M::ha_opt_update_cov_min] updated max_n_chain to 225
Writing reads to disk... 
Reads has been written.
Writing ma_hit_ts to disk... 
ma_hit_ts has been written.
Writing ma_hit_ts to disk... 
ma_hit_ts has been written.
bin files have been written.
[M::purge_dups] homozygous read coverage threshold: 44
[M::purge_dups] purge duplication coverage threshold: 56
[M::ug_ext_gfa::] # tips::37
Writing raw unitig GFA to disk... 
Writing processed unitig GFA to disk... 
[M::adjust_utg_by_primary] primary contig coverage range: [37, infinity]
Writing DBA2J.hic.p_ctg.gfa to disk... 
[M::ha_opt_update_cov] updated max_n_chain to 225
[M::gen_trans_base_count_comp::939.770] ==> Qualification
[M::build_unitig_index::58.395] ==> Counting
[M::build_unitig_index::28.325] ==> Memory allocating
[M::build_unitig_index::81.649] ==> Filling pos
[M::build_unitig_index::0.268] ==> Sorting pos
[M::build_unitig_index::168.642] ==> HiC index has been built
[M::write_hc_pt_index] Index has been written.
[M::alignment_worker_pipeline::1472.315] ==> Qualification
[M::dedup_hits::15.710] ==> Dedup
[M::dedup_hits::7.796] ==> Dedup
[M::stat] # misjoined unitigs: 1 (N50: 693811); # corrected unitigs: 2 (N50: 527464)
[M::adjust_weight_kv_u_trans_advance::0.920] 
[M::mc_solve:: # edges: 2259920]
[M::mb_solve_core::2.778] ==> Partition
Command terminated by signal 11
        Command being timed: "hifiasm -o DBA2J -t 96 -l0 --h1 DTG-HIC-408_R1_001.fastq.gz,DTG-HIC-410_R1_001.fastq.gz --h2 DTG-HIC-408_R2_001.fastq.gz,DTG-HIC-410_R2_001.fastq.gz TBG-4829_m84078_231116_130625_s1.hifi_reads.default.fastq.gz MouseStrainD2_TBG_4829_1.hifi_reads.fastq.gz"
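On Linux, signal 11 is SIGSEGV, a segmentation fault. The first report in this thread stopped with no error message, so it may help to record how the process ended in both cases. Below is a minimal sketch of a wrapper that does this. The names `describe_exit` and `run_and_describe` are hypothetical and not part of hifiasm. The sketch relies on the POSIX behaviour of Python's `subprocess`, where a negative return code means the child was killed by that signal number.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd):
    """Run cmd (a list of arguments) and describe how it ended.

    Hypothetical wrapper: output is not captured, so the program's own
    log still goes to the terminal or to wherever it is redirected.
    """
    completed = subprocess.run(cmd)
    return describe_exit(completed.returncode)
```

For example, `run_and_describe(["hifiasm", "-o", "DBA2J", ...])` with the arguments from the command above would return a "killed by ..." string if the run crashes the same way. A silent stop with a nonzero status would show up as "exited with status N".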

AndreaGuarracino avatar Feb 20 '24 07:02 AndreaGuarracino

To be able to work with Hi-C data, I have to revert to the version with commit 94a284b4309837417dd9951a5f72a13d513d826e.

AndreaGuarracino avatar Mar 24 '24 13:03 AndreaGuarracino

Hi @AndreaGuarracino, would it be possible for you to share the data with me? With it I could fix this issue as soon as possible. Sorry for the late reply.

chhylp123 avatar Apr 04 '24 05:04 chhylp123

@chhylp123, I will send you something soon!

AndreaGuarracino avatar Apr 15 '24 15:04 AndreaGuarracino