IndexError: index is out of bounds

Open weir12 opened this issue 5 years ago • 3 comments

Hi Marcus. When Tombo re-squiggles a folder of fast5 files that were previously annotated with annotate_raw_with_fastqs, one fast5 file in the folder raises an unexpected error.

BaseCalled_template:::/home/weir/covid19_nanopore/raw_data/kim_rawdata/8F6N9/single_fast5/IVT/372/7b86be7e-3212-4cc9-804d-248094721897.fast5
:::
Traceback (most recent call last):
  File "/home/weir/software/tombo/tombo/tombo_helper.py", line 94, in banded_traceback
    return c_banded_traceback(*args, **kwargs)
  File "tombo/_c_dynamic_programming.pyx", line 305, in tombo._c_dynamic_programming.c_banded_traceback
NotImplementedError: Read event to sequence alignment extends beyond bandwidth

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1575, in _resquiggle_worker
    map_res, rsqgl_params, fast5_fn, all_raw_signal)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1491, in run_rsqgl_iters
    seq_samp_type=seq_samp_type)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1168, in resquiggle_read
    seq_samp_type=seq_samp_type, reg_id=map_res.align_info.ID)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1038, in find_adaptive_base_assignment
    rsqgl_params.band_bound_thresh)
  File "/home/weir/software/tombo/tombo/tombo_helper.py", line 96, in banded_traceback
    raise TomboError(unicode(e))
tombo.tombo_helper.TomboError: Read event to sequence alignment extends beyond bandwidth

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1583, in _resquiggle_worker
    map_res, save_params, fast5_fn, all_raw_signal)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1491, in run_rsqgl_iters
    seq_samp_type=seq_samp_type)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1168, in resquiggle_read
    seq_samp_type=seq_samp_type, reg_id=map_res.align_info.ID)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 1028, in find_adaptive_base_assignment
    fwd_pass, fwd_pass_move, band_event_starts, shifted_z_scores = run_fwd_pass()
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 899, in run_fwd_pass
    mapped_start_offset, rsqgl_params, events_per_base)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 676, in _get_masked_start_fwd_pass
    shifted_z_scores[seq_pos,:] = get_start_mask_z_score(seq_pos, event_pos)
  File "/home/weir/software/tombo/tombo/resquiggle.py", line 661, in get_start_mask_z_score
    event_vals, r_ref_means[seq_pos], r_ref_sds[seq_pos],
IndexError: index 423 is out of bounds for axis 0 with size 423
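
The last exception is an index equal to the array length (423 into an array of size 423), so the lookup runs one position past the end. As a minimal sketch of that failure mode, and not Tombo's actual code, the snippet below uses a hypothetical helper `masked_positions` and a plain list standing in for the reference-level array; it assumes that clipping the window to the array length is an acceptable guard, which may not be the right fix inside Tombo itself.

```python
def masked_positions(ref_means, start_pos, mask_len):
    """Return reference means for a window of sequence positions.

    Hypothetical helper (not part of Tombo): the window is clipped to
    the length of ref_means so no position reaches len(ref_means).
    """
    end_pos = min(start_pos + mask_len, len(ref_means))
    return [ref_means[seq_pos] for seq_pos in range(start_pos, end_pos)]


ref_means = [0.0] * 423  # same size as the array in the traceback

# Unclipped access at index 423 fails, because the last valid index is 422.
try:
    ref_means[423]
except IndexError as err:
    print("unclipped access failed:", err)

# The clipped window stops at the last valid index instead of raising.
window = masked_positions(ref_means, start_pos=420, mask_len=10)
print("clipped window length:", len(window))
```

A guard like this only hides the symptom; whether the window should be clipped or the alignment start moved is a question for the maintainers.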

Here is the BaseCalled_template group of the failing fast5 file. The raw file is attached as a zip: 7b86be7e-3212-4cc9-804d-248094721897.zip

GROUP "Analyses" {
      GROUP "Basecall_1D_000" {
         GROUP "BaseCalled_template" {
            DATASET "Fastq" {
               DATATYPE  H5T_STRING {
                  STRSIZE H5T_VARIABLE;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_UTF8;
                  CTYPE H5T_C_S1;
               }
               DATASPACE  SCALAR
               DATA {
               (0): "@7b86be7e-3212-4cc9-804d-248094721897 runid=55a140204257e9bc857f48ce3639d81a4a2f4dee sampleid=SARS-CoV-2-IVT read=64824 ch=19 start_time=2020-03-03T22:10:30Z
           UCAACUACAGGGCUCAGAAUAUGACUAUGUCAUAUUCACUCACUAUCACUGAAACGGCUUUUCACUCUUGUUAAUGUAAACAGAUUUAAUGUUGCUAUUACACACCAGAGCAAAAAGUGUCGGGCAUAUUACUUUAAACAAAUGUCUUUUUCUUGAUAGAGACCUUUAUGACUUGGAGUUGCAAGCAAUCACUACAAGUCUUGUAAUUCCGUAGGGGAAUGUAAACAACUUUCAAGCUUGAAAUGUAACAGGACUCUUUAAAGAUUGAGGAGGUGUAGUAAGGUAAUCUGGGUUACCCACUAUCAGCACACACAUUUCCUCCUCAGUGUGUUGACUUAAUUUCAAACUGAAGGUUUAUGUGCGUUGGGGUCAUCACUGGCAUUCCUAAAAAGGACUGACUCCACACACACCUAUAGAAGACUCAUUUCUUAUGAUGGGUUUUUAAAAAAUGAAUUAUCAAGUUAAUGGUUACCCAACUGUGAGGUUUAUCUCCCGUGUGAGAAGGUUAUCUCUAGACUACAUGUACGCGUGCAUGAUAACUCUGGCUCAUAUGUCUGUUGAGGGCUCCCUCCUCCUCCCACAUACCACACAUACCCUCAUCUCAUACUAACUGCCACAUUUUUAUACUCCUGCCCCCCAGAUCAUCAUAGCAGUUUUUUUCAUCCCUCCAUCCUC
           +
           %&'%$%$'02-.)'%-,136$22652$,/24(2+30()%'$$')),&()46945(($%'(&$)3,&#-671/386323:<7.//2594:76;:10*%+$$$%%#))0-0%$.1300.2%$$###&&%#%%#%'&%%$#&,).3-.%%'''&$'(''%&()(+(&'(&'('')$#)%(&*+$#$$"$01+##$%#&'('3-&22),*(&'%#(&+./.56-)+#%%$&&(($$$%%($%)355-'')%#%*'%08-(**144;;>;:5)((()&*%+53(((11$+,,')1$$2))(*%"$%#$$"#$'(#%#%$&%&(&$&&%,132''#2557/#$-.*)/%0/1233=>000-@<94&"$$.-&#&$&$$*$##'%'(+(*022353(%&%,'$.,(%%&$'%+')%)'&$'((.173/,)))%'%%)'(15<2..2,*%-311;78;9>D881+-'14+20+030;F,/2&,-..%//%%%)+,)'$')+++/+-04+454%&$#$$##$#%$#$$%(0-8=92%$54('&%$$%#$%(')($###$$$&#$#$&&$')&&+&%&'&'+&&'#'+(((%%#&$#&'($$#&.'$%''$%$#%$%%%%%)+/&"$$%)&%$%%$$#)+$#%)$$%'$&"##%#$##$'#$(%$%%$%$$#$#$)#(%#&%)$&
           "
               }
            }
         }
      }

weir12 • Apr 26 '20 04:04

I have downloaded the file and I am unable to reproduce this error. I used the read sequence as the reference. The mapping could be part of the issue. Could you provide the reference contig for this read?

marcus1487 • May 01 '20 15:05

Sure, here is the reference file used for mapping: covid19 reference. Thanks!

weir12 • May 02 '20 14:05

I have been able to reproduce this issue, but have yet to find a fix. Thank you for the data and reference! I will hopefully have a fix soon.

marcus1487 • May 07 '20 19:05