`sig describe` error when subdirectory name contains "sourmash"
Hi,
I have found that sourmash sig describe returns an error if the signature file is located in a subdirectory that contains the string "sourmash". I am running sourmash version 4.8.5.
Setup
Download the example genome and create the results folders:
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_genomic.fna.gz -O genome.fna.gz
mkdir results
mkdir results/signatures
mkdir results/sourmash_signatures
Example
A signature created in results/signatures behaves as expected:
sourmash sketch dna genome.fna.gz -o results/signatures/genome.sig
sourmash sig describe results/signatures/genome.sig
== This is sourmash version 4.8.5. ==
== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==
---
signature filename: results/signatures/genome.sig
signature: ** no name **
source file: genome.fna.gz
md5: 0a8632c67e6d88f737ddb510bef90337
k=31 molecule=DNA num=0 scaled=1000 seed=42 track_abundance=0
size: 4476
sum hashes: 4476
signature license: CC0
loaded 1 signatures total, from 1 files
However, copying the signature into results/sourmash_signatures and running sig describe on the copy produces an error:
cp results/signatures/genome.sig results/sourmash_signatures/genome.sig
sourmash sig describe results/sourmash_signatures/genome.sig
== This is sourmash version 4.8.5. ==
== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==
ERROR: Error while reading signatures from 'results/sourmash_signatures/genome.sig'.
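Since the failing file is a plain cp of the working one, it may help to rule out any difference in content before blaming the path. A minimal sketch using only the Python standard library; the helper names sha256_of and same_contents are made up for illustration and are not part of sourmash:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_contents(path_a, path_b):
    """True when both files hold exactly the same bytes."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling same_contents("results/signatures/genome.sig", "results/sourmash_signatures/genome.sig") should return True if the copy is intact, which would leave the directory name as the only difference between the two runs.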
WOW. This is fascinating. I can replicate, and it is in fact all the way down in the Rust layer:
import sourmash
x = sourmash.load_signatures('results/sourmash_signatures/genome.sig', do_raise=True)
list(x)
I have a guess as to what is going on, will dig into it later.
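For anyone who wants to narrow down which directory names trigger the failure, here is a rough sketch that copies one signature file into several differently named directories and records whether loading succeeds. The names probe_directory_names and load_with_sourmash are hypothetical helpers written for this thread, not sourmash functions; the only sourmash call used is load_signatures with do_raise=True, as in the snippet above.

```python
import shutil
import tempfile
from pathlib import Path


def probe_directory_names(sig_path, dir_names, load):
    """Copy sig_path into a directory per name; map name -> load succeeded.

    `load` is any callable that takes a path string and raises on failure.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name in dir_names:
            target_dir = Path(tmp) / name
            target_dir.mkdir(parents=True)
            target = target_dir / Path(sig_path).name
            shutil.copy(sig_path, target)
            try:
                load(str(target))
                results[name] = True
            except Exception:
                results[name] = False
    return results


def load_with_sourmash(path):
    # Imported lazily so probe_directory_names works without sourmash installed.
    import sourmash
    # list() forces the generator to actually read the file.
    return list(sourmash.load_signatures(path, do_raise=True))
```

For example, probe_directory_names("results/signatures/genome.sig", ["signatures", "sourmash_signatures", "my_sourmash"], load_with_sourmash) would show at a glance whether any directory name containing "sourmash" fails or only some of them. Passing the loader in as an argument keeps the probe independent of sourmash, so the same helper can be pointed at other loading paths.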
Thanks for filing the weirdest bug I've ever seen in sourmash!! This is neat and also a bit disturbing 😆
I'm happy someone found that out, I would have been helpless to figure this one out!!