tadrep detect error

Open elozanoe opened this issue 3 months ago • 1 comments

Hi! I've been looking at this tool and I think it could be useful for what I want to do. In the file plasmids.fna I have a set of contigs that may be plasmids, and I wanted to use tadrep detect to align the contigs against the PLSDB or RefSeq databases. I have used this command:

tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75

However, I obtain this message:

ERROR: No data available in /mnt/disk2/data/tfm/prueba2/db.json

Here I paste the tadrep.log file content:

2024-03-18 19:00:24,627 - MainProcess - INFO - MAIN - version 0.9.1
2024-03-18 19:00:24,627 - MainProcess - INFO - MAIN - command line: /home/tfm/miniconda3/envs/tadrep/bin/tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75
2024-03-18 19:00:24,627 - MainProcess - INFO - CONFIG - threads=32
2024-03-18 19:00:24,627 - MainProcess - INFO - CONFIG - verbose=True
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - tmp-path=/tmp/tmph8tm2930
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - output-path=/mnt/disk2/data/tfm/prueba2
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - prefix=None
2024-03-18 19:00:24,628 - MainProcess - INFO - UTILS - genome-path=/mnt/disk2/data/tfm/prueba/plasmids.fna
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - summary_path=/mnt/disk2/data/tfm/prueba2/summary.tsv
2024-03-18 19:00:24,628 - MainProcess - INFO - CONFIG - db_path=/mnt/disk2/data/tfm/prueba2/db.json
2024-03-18 19:00:24,628 - MainProcess - DEBUG - IO - /mnt/disk2/data/tfm/prueba2/db.json NOT existing
2024-03-18 19:00:24,628 - MainProcess - DEBUG - CONFIG - No data in /mnt/disk2/data/tfm/prueba2/db.json

I think the database path is wrong. When I try to set it with --db /mnt/disk2/databases/tadrep_db/plsdb, I get "error: unrecognized arguments: --db /mnt/disk2/databases/tadrep_db/plsdb". I also tried copying the PLSDB database to the path reported in the error message (/mnt/disk2/data/tfm/prueba2/db.json), but it still fails:

cp /path/to/tadrep_db/plsdb/plsdb.json ./prueba2/db.json

tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75

This gives:

Detection and reconstruction started ...
ERROR: No cluster in database /mnt/disk2/data/tfm/prueba2/db.json

Here I paste the new tadrep.log file content:

2024-03-18 19:07:02,203 - MainProcess - INFO - MAIN - version 0.9.1
2024-03-18 19:07:02,203 - MainProcess - INFO - MAIN - command line: /home/tfm/miniconda3/envs/tadrep/bin/tadrep -v -o ./prueba2 detect --genome ./prueba/plasmids.fna --min-contig-coverage 75 --min-contig-identity 75 --min-plasmid-coverage 50 --min-plasmid-identity 75
2024-03-18 19:07:02,203 - MainProcess - INFO - CONFIG - threads=32
2024-03-18 19:07:02,203 - MainProcess - INFO - CONFIG - verbose=True
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - tmp-path=/tmp/tmp2fygzok5
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - output-path=/mnt/disk2/data/tfm/prueba2
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - prefix=None
2024-03-18 19:07:02,204 - MainProcess - INFO - UTILS - genome-path=/mnt/disk2/data/tfm/prueba/plasmids.fna
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - summary_path=/mnt/disk2/data/tfm/prueba2/summary.tsv
2024-03-18 19:07:02,204 - MainProcess - INFO - CONFIG - db_path=/mnt/disk2/data/tfm/prueba2/db.json
2024-03-18 19:07:02,204 - MainProcess - DEBUG - IO - /mnt/disk2/data/tfm/prueba2/db.json existing
2024-03-18 19:07:14,238 - MainProcess - INFO - IO - imported json: # sequences=1
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-contig-coverage=0.750
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-contig-identity=0.750
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-plasmid-coverage=0.500
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - min-plasmid-identity=0.750
2024-03-18 19:07:14,238 - MainProcess - INFO - CONFIG - gap-sequence-length=10
2024-03-18 19:07:14,239 - MainProcess - INFO - CONFIG - blast-threads=32
2024-03-18 19:07:14,239 - MainProcess - DEBUG - DETECTION - No Clusters in /mnt/disk2/data/tfm/prueba2/db.json!
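
The log above reports "imported json: # sequences=1" and then "No Clusters", so it may help to see what the copied db.json actually contains at the top level. The following is only a minimal sketch for that: it makes no assumption about the key names TaDReP expects, and the function name summarize_json and the script name inspect_db.py are made up for illustration.

```python
import json
import sys
from pathlib import Path


def summarize_json(path):
    """Return {top-level key: (type name, number of entries or None)}.

    Generic helper; it does not assume any particular TaDReP schema.
    """
    with Path(path).open() as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        # The root is not an object: report it under a placeholder key.
        size = len(data) if isinstance(data, (list, str)) else None
        return {"<root>": (type(data).__name__, size)}
    summary = {}
    for key, value in data.items():
        # Only containers and strings have a meaningful length.
        size = len(value) if isinstance(value, (dict, list, str)) else None
        summary[key] = (type(value).__name__, size)
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one line per top-level key: name, type, entry count.
    for key, (kind, size) in summarize_json(sys.argv[1]).items():
        print(f"{key}\t{kind}\t{size}")
```

Run as, for example, python inspect_db.py ./prueba2/db.json. If no cluster-related entry shows up, or it is empty, that would be consistent with the "No Clusters" message in the log.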

What am I doing wrong? Do I have to do any previous steps with the plasmids.fna file? Thanks in advance!

elozanoe, Mar 18 '24 18:03