FASTQ files without move tables in the header cause a crash
Describe the bug
FASTQ files without move tables in the header cause a crash without a clear error message.
Logging
$ medaka_consensus -i test.fastq.gz -d test.fasta -o test_medaka_out -t 1
Attempting to automatically select model version.
WARNING: Failed to detect a model version, will use default: 'r1041_e82_400bps_sup_v5.2.0'
Checking program versions
This is medaka 2.1.0
Program Version Required Pass
bcftools 1.22 1.11 True
bgzip 1.22 1.11 True
minimap2 2.30 2.11 True
samtools 1.22 1.11 True
tabix 1.22 1.11 True
Traceback (most recent call last):
File "/home/boas/miniforge3/envs/env_medaka/lib/python3.8/site-packages/medaka/medaka.py", line 360, in check_compatible
data_has_move_tables = check_bam_for_dwells(data)
File "/home/boas/miniforge3/envs/env_medaka/lib/python3.8/site-packages/medaka/medaka.py", line 325, in check_bam_for_dwells
with pysam.AlignmentFile(bam) as bam:
File "pysam/libcalignmentfile.pyx", line 751, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 1000, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/boas/miniforge3/envs/env_medaka/bin/medaka", line 11, in <module>
sys.exit(main())
File "/home/boas/miniforge3/envs/env_medaka/lib/python3.8/site-packages/medaka/medaka.py", line 952, in main
args.func(args)
File "/home/boas/miniforge3/envs/env_medaka/lib/python3.8/site-packages/medaka/medaka.py", line 363, in check_compatible
data_has_move_tables = check_fastx_for_dwells(data)
File "/home/boas/miniforge3/envs/env_medaka/lib/python3.8/site-packages/medaka/medaka.py", line 342, in check_fastx_for_dwells
return "\tmv:" in read.comment
TypeError: argument of type 'NoneType' is not iterable
Model r1041_e82_400bps_sup_v5.2.0 is not compatible with /home/boas/analysis/test.fastq.gz
Additional context
In medaka version 2.1.0, the function check_compatible seems to have been added (https://github.com/nanoporetech/medaka/blob/master/medaka/medaka.py#L345).
If I input a FASTQ file without comments in the header, a TypeError will occur at https://github.com/nanoporetech/medaka/blob/master/medaka/medaka.py#L342 because read.comment is None.
This might be intended behavior, but it would be nice if medaka would catch this error and output a clear error message, for example specifying that move tables are missing in the FASTQ file.
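A minimal sketch of the kind of guard meant here, not medaka's actual code or fix. The name `has_move_table` is hypothetical; only the `"\tmv:" in read.comment` test is taken from the traceback above.

```python
from typing import Optional


def has_move_table(comment: Optional[str]) -> bool:
    """Hypothetical guard: a missing header comment means no move table."""
    # The comment is None when the FASTQ header has nothing after the
    # read name; testing `"\tmv:" in None` raises the TypeError from
    # the traceback, so handle that case first.
    if comment is None:
        return False
    # Same substring test as in the traceback.
    return "\tmv:" in comment


if __name__ == "__main__":
    for comment in (None, "ch:i:12", "ch:i:12\tmv:B:c,5,1,0,1"):
        print(repr(comment), "->", has_move_table(comment))
```

With a check like this the caller gets a plain `False` for comment-less headers and can then report that move tables are missing, instead of failing with a `TypeError`.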
Quick addition: if I run the same inputs using Medaka v2.0.1, this error is not encountered and the polishing runs as expected.
I also got the same error with v2.1.0 on ONT FASTQ reads that had been processed with fastplong; supplying the raw FASTQ files instead worked.
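To see whether a given FASTQ file is affected, one option is to count headers that have no comment after the read name. This is a rough sketch using only the standard library. It assumes plain four-line FASTQ records, and the function name `count_headers_without_comment` is hypothetical. Whether fastplong drops header comments is an assumption, not something confirmed in this thread.

```python
import gzip
from itertools import islice


def count_headers_without_comment(path, limit=1000):
    """Return (missing, total) for the first `limit` FASTQ records.

    Assumes four lines per record; reads gzip if the path ends in .gz.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    missing = 0
    total = 0
    with opener(path, "rt") as handle:
        # Lines 0, 4, 8, ... are the header lines of each record.
        for header in islice(handle, 0, 4 * limit, 4):
            total += 1
            # The read name ends at the first whitespace; anything
            # after it is the comment.
            parts = header.rstrip("\n").split(None, 1)
            if len(parts) < 2:
                missing += 1
    return missing, total


if __name__ == "__main__":
    import sys

    missing, total = count_headers_without_comment(sys.argv[1])
    print(f"{missing} of {total} headers have no comment")
```

If `missing` is non-zero, the file contains the kind of header that triggers the `TypeError` described above.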
Same error even if you specify the model.
This should be fixed in v2.1.1.