EukDetect
invalid escape sequence '\d' on get_uncomputed_taxid_per_busco
Dear Dr. Lind,
I'm generating a custom EukDetect database and I'm stuck at get_uncomputed_taxid_per_busco.py. It fails with the following message:
""" python /home/qi47rin/proj/00-git/EukDetect/build_db/get_uncomputed_taxid_per_busco.py --speciestax cache/45-create-eukdetect-db/genomes-table/species_taxid.tsv --fasta cache/45-create-eukdetect-db/genes-repeat-filtered/buscos_cdhit99_less10perc_repeats_masked.fna --collapsed_ids cache/45-create-eukdetect-db/busco-cdhit99-renamed/buscos_cdhit99_renamed_busco_seqid_sequential_correspondence.txt --taxdb cache/45-create-eukdetect-db/taxdump/taxa.sqlite > cache/45-create-eukdetect-db/busco-taxid/busco_taxid_link.txt
Activating conda environment: cache/00-conda-env/bdf327b44096dcc3f601392a860ec146_
/home/qi47rin/proj/00-git/EukDetect/build_db/get_uncomputed_taxid_per_busco.py:27: SyntaxWarning: invalid escape sequence '\d'
sp = re.split('-\dat\d-', '-'.join(seq.id.split('-')[1:]))[0]
/home/qi47rin/proj/00-git/EukDetect/build_db/get_uncomputed_taxid_per_busco.py:46: SyntaxWarning: invalid escape sequence '\d'
new = re.split('-\dat\d-', '-'.join(sp.split('-')[1:]))[0]
Traceback (most recent call last):
File "/home/qi47rin/proj/00-git/EukDetect/build_db/get_uncomputed_taxid_per_busco.py", line 79, in
I have attached the files I generated, except for the taxdump, which is too large to include. Do you know what might be happening?
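A side note on the two SyntaxWarnings at lines 27 and 46: as far as I can tell they only complain that the pattern is written as a normal string literal rather than a raw string, so they may be unrelated to the traceback itself. Below is a minimal sketch of the raw-string spelling. The function name `species_from_id` and the example header are hypothetical, invented only so that the pattern as written has something to match; they are not taken from the EukDetect code or from my files.

```python
import re

# Raw-string spelling of the pattern used on lines 27 and 46 of the script.
PATTERN = r'-\dat\d-'

# Python keeps an unrecognized escape such as '\d' verbatim, so the non-raw
# literal in the script produces this same string (plus the warning).
assert PATTERN == '-\\dat\\d-'


def species_from_id(seq_id):
    """Hypothetical helper mirroring the expression on line 27,
    applied to a plain string instead of seq.id."""
    return re.split(PATTERN, '-'.join(seq_id.split('-')[1:]))[0]


# Hypothetical header, invented for illustration only.
example = 'protein-Genus_species-1at2-S1'
print(species_from_id(example))
```

Since both spellings give the same pattern, switching to the raw string should silence the warnings without changing what the regex matches.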
Another question: in the help text within the script, when you say "Tab delimited file of species name (as encoded in busco header) and taxonomy ID", do you mean the headers in the FASTA file?
Best regards, Ailton.
Attachment: euk-db-asp3.zip