Segmentation fault when reference genome contains '.'
Describe the bug
Thanks for the great tool!
If the reference FASTA contains one or more '.' characters in place of a nucleotide, octopus fails with a segfault.
I can run octopus successfully on the same FASTA + BAM when each '.' is replaced by a valid base, e.g. 'a'.
If '.' characters shouldn't be supported, it would be useful to flag the problem when it occurs rather than just segfaulting. Alternatively, octopus could refuse to call or report any variants at positions with '.' in the reference.
Can provide the fasta and BAM I've used to identify this if you'd like.
$ octopus --version
octopus version 0.7.4
Target: x86_64 Linux 5.4.0-72-generic
SIMD extension: AVX2
Compiler: GNU 9.3.0
Boost: 1_74
Command line to run octopus:
$ octopus -I induced_ref_mapped.bam -R induced_ref.fa --organism-ploidy 1 --threads 4
Thanks for the bug report. Though there is no official FASTA/Q specification, I don't believe a '.' would be considered a valid base by any well-established tool. However, I agree Octopus should be validating its input better - I'll look to report a more helpful error message.
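In the meantime, a pre-flight check on the reference can catch this before Octopus is run. The following is a minimal sketch, not part of Octopus: the function name `find_invalid_bases` is hypothetical, and it assumes that only IUPAC nucleotide codes (upper or lower case) count as valid, so '.' and '-' are both reported.

```python
# IUPAC nucleotide codes in both cases. Assumption: anything else
# (including '.' and '-') is treated as invalid for a reference genome.
VALID_BASES = frozenset("ACGTUNRYKMSWBDHV" + "acgtunrykmswbdhv")


def find_invalid_bases(fasta_lines):
    """Yield (contig, 1-based position, character) for each invalid symbol.

    `fasta_lines` is any iterable of FASTA lines, e.g. an open file handle.
    """
    contig, pos = None, 0
    for line in fasta_lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # The contig name is the first whitespace-delimited header token.
            header = line[1:].split()
            contig = header[0] if header else ""
            pos = 0
            continue
        for ch in line:
            pos += 1
            if ch not in VALID_BASES:
                yield contig, pos, ch


# Small in-memory example; for a real file, pass an open handle instead.
example = [">chr1 demo", "ACGT.a", "nn.T", ">chr2", "GATTACA"]
problems = list(find_invalid_bases(example))
for contig, pos, ch in problems:
    print(f"{contig}:{pos} invalid character {ch!r}")
```

To check a real reference, something like `with open("induced_ref.fa") as fh: problems = list(find_invalid_bases(fh))` should work. The reported positions can then be fixed or masked before calling.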