Medaka variant error: "IndexError: index 0 is out of bounds for axis 0 with size 0"

Open BioWilko opened this issue 1 year ago • 2 comments

Describe the bug

medaka variant crashes with exit status 20 when the following command is run:

medaka variant MPXV.reference.fasta MPVX_untreated_LSK114_280722-fast_barcode74.1.hdf MPVX_untreated_LSK114_280722-fast_barcode74.1.vcf

Logging

Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/artic-test/bin/medaka", line 11, in <module>
    sys.exit(main())
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/medaka.py", line 720, in main
    args.func(args)
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/variant.py", line 230, in variants_from_hdf
    for sample in joined_samples:
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/variant.py", line 104, in join_samples
    yield medaka.common.Sample.from_samples(queue + [to_yield])
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/common.py", line 143, in from_samples
    rel = Sample.relative_position(s1, s2)
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/common.py", line 224, in relative_position
    s1_ord, s2_ord = sorted((s1, s2), key=lambda x: (x.first_pos, -x.size))
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/common.py", line 224, in <lambda>
    s1_ord, s2_ord = sorted((s1, s2), key=lambda x: (x.first_pos, -x.size))
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/common.py", line 89, in first_pos
    return self._get_pos(0)
  File "/home/ubuntu/miniconda3/envs/artic-test/lib/python3.8/site-packages/medaka/common.py", line 69, in _get_pos
    return p['major'][index], p['minor'][index]
IndexError: index 0 is out of bounds for axis 0 with size 0

Environment (if you do not have a GPU, write No GPU):

  • Installation method: conda (mamba solver)
  • OS: Ubuntu 20.04 LTS
  • medaka version: 1.6.1
  • GPU model: Nvidia A100
  • Nvidia driver version: 510.73.05
  • CUDA version: 11.6
  • cuDNN version: 8.4.1.50

Additional context

This problem seems to have appeared only after changing the MPX reference from AY753185.1 to MT903344.1, which included translating the primer BED file to have coordinates relative to the new reference (potentially affecting primer trimming). The issue only occurs for particular barcodes on particular runs. I am happy to share data privately to help locate the issue but cannot share the data publicly.

BioWilko, Aug 02 '22 16:08

Hi @BioWilko, fancy seeing you here! I believe I have narrowed down the scope of this issue to a slicing operation in join_samples. I have crafted a dummy HDF5 file that triggers the same IndexError you reported. It would be good to understand how this could happen on a real dataset, so I have sent a request to the e-mail address on your GitHub profile; it should contain instructions for providing your HDF to us.

SamStudio8, Aug 04 '22 16:08

Small world, eh @SamStudio8? Predictably, the error refused to manifest when I tried to replicate it, but I managed it in the end and have sent over the file, thanks!

BioWilko, Aug 05 '22 09:08

Your data triggered an interesting edge case wherein the Sample chunk generator was emitting a chunk containing only a long insertion. Compared against the reference, the chunk had only one "major" non-variant position, at index 0. When the Sample was bisected on index 0, an empty Sample slice was created (with no indexable positions), and you know the rest.
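For anyone landing here later, the failure mode can be reproduced outside Medaka with plain NumPy. This is only a sketch under stated assumptions: `pos_dtype`, `first_pos`, and `chunk` are hypothetical stand-ins, not Medaka's actual Sample class or HDF layout; the only thing taken from the traceback is the `p['major'][index], p['minor'][index]` lookup.

```python
import numpy as np

# Hypothetical stand-in for a Sample's positions: a structured array with
# a 'major' field (reference coordinate) and a 'minor' field (offset
# within an insertion). This is NOT Medaka's real data model.
pos_dtype = [("major", np.int64), ("minor", np.int64)]


def first_pos(positions):
    """Mimic the lookup shown in the traceback for index 0."""
    return positions["major"][0], positions["minor"][0]


# A chunk with a single reference position (major=100, minor=0)
# followed by a five-base insertion (minor=1..5).
chunk = np.array([(100, m) for m in range(6)], dtype=pos_dtype)

# Bisecting on index 0 leaves nothing on the left-hand side.
left, right = chunk[:0], chunk[0:]

try:
    first_pos(left)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(len(left), len(right))
print(error_message)
```

The empty left-hand slice has no position 0 to look up, so the indexing raises the same kind of IndexError reported above; the right-hand slice still indexes fine.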

I have patched Medaka to explicitly avoid this case and released v1.7.1 (which you can get from the releases page, or via bioconda). Please advise if this has fixed your issue.

SamStudio8, Aug 30 '22 10:08

Sorry I took so long to get back to you, but I have finally tested this and can confirm it has fixed my issue, thanks!

BioWilko, Oct 11 '22 12:10