Execution failed when --kmerfinder_db and --ncbi_assembly_metadata are provided
Dear developers,
We are trying to use the additional parameters --kmerfinder_db and --ncbi_assembly_metadata with the bacass workflow, using either release 2.4.0 or the development version. Both fail with the following error:
WARN: The following invalid input values have been detected:
* --kmerfinder_db: DATABASES/bacteria.tar.gz
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kmerfinder database and NCBI assembly metadata not provided.
Please specify the '--kmerfinderdb' and '--ncbi_assembly_metadata' parameters.
Both are required to run Kmerfinder.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Any advice?
Here is the run script:
rm -rf ~/.nextflow/assets/nf-core/bacass ;
nextflow run nf-core/bacass -r dev -c nextflow.config -profile singularity --input samplesheet.tsv --kraken2db /ibex/ai/reference/KSL/kraken2/kraken2_dbs/scripts_download/k2_nt_20230502.tar.gz --kmerfinder_db DATABASES/bacteria.tar.gz --ncbi_assembly_metadata ASSEMBLY-REPORTS/assembly_summary_refseq.txt --outdir outputs/2024-12-04_15-KSA-samples__local-downloads__dev
Here is the complete log file:
Nextflow 24.10.2 is available - Please consider updating your version to it
N E X T F L O W ~ version 24.04.4
Pulling nf-core/bacass ...
downloaded from https://github.com/nf-core/bacass.git
Launching `https://github.com/nf-core/bacass` [thirsty_bhabha] DSL2 - revision: ad892edcdb [dev]
------------------------------------------------------
,--./,-.
___ __ __ __ ___ /,-._.--~'
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/bacass 2.5.0dev
------------------------------------------------------
Input/output options
input : samplesheet.tsv
outdir : outputs/2024-12-04_15-KSA-samples__local-downloads__dev
Contamination Screening
kraken2db : /ibex/ai/reference/KSL/kraken2/kraken2_dbs/scripts_download/k2_nt_20230502.tar.gz
ncbi_assembly_metadata: ASSEMBLY-REPORTS/assembly_summary_refseq.txt
Assembly parameters
canu_mode : -nanopore
Annotation
dfast_config : /home/pampum/.nextflow/assets/nf-core/bacass/assets/test_config_dfast.py
Core Nextflow options
revision : dev
runName : thirsty_bhabha
containerEngine : singularity
launchDir : /ibex/user/pampum/2024-11-26_KSA-lib-ONT-assemblies
workDir : /ibex/user/pampum/2024-11-26_KSA-lib-ONT-assemblies/work
projectDir : /home/pampum/.nextflow/assets/nf-core/bacass
userName : pampum
profile : singularity
configFiles :
!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
* The pipeline
https://doi.org/10.5281/zenodo.2669428
* The nf-core framework
https://doi.org/10.1038/s41587-020-0439-x
* Software dependencies
https://github.com/nf-core/bacass/blob/master/CITATIONS.md
WARN: The following invalid input values have been detected:
* --kmerfinder_db: DATABASES/bacteria.tar.gz
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kmerfinder database and NCBI assembly metadata not provided.
Please specify the '--kmerfinderdb' and '--ncbi_assembly_metadata' parameters.
Both are required to run Kmerfinder.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It seems we are not able to use these custom parameters. Could you please help us fix this error?
- try using absolute paths for the kmer db and ncbi metadata input
- possibly similar to issue #187
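The first suggestion can be sketched as follows. This is a minimal sketch, assuming the two relative paths from the run script above exist under the launch directory; abs_path is a hypothetical helper, not part of bacass or Nextflow, and whether absolute paths resolve the validation warning is not confirmed here.

```shell
# Hypothetical helper: print an absolute version of a path.
# Paths that already start with "/" are returned unchanged;
# relative paths are prefixed with the current directory.
abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}

# Relative paths taken from the original run script.
KMERFINDER_DB="$(abs_path DATABASES/bacteria.tar.gz)"
NCBI_METADATA="$(abs_path ASSEMBLY-REPORTS/assembly_summary_refseq.txt)"

echo "--kmerfinder_db $KMERFINDER_DB"
echo "--ncbi_assembly_metadata $NCBI_METADATA"

# The resolved values would then replace the relative ones in the launch command, e.g.:
# nextflow run nf-core/bacass -r dev -profile singularity --input samplesheet.tsv \
#     --kmerfinder_db "$KMERFINDER_DB" --ncbi_assembly_metadata "$NCBI_METADATA" \
#     --outdir <outdir>
```

Where it is installed, realpath does the same job and also resolves symlinks.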
Dear @m-jahn
Thanks for your recommendations.
I downloaded the KmerFinder database from https://zenodo.org/records/13447056.
However, the kmerfinder job step failed to locate bacteria.tax, which is not part of the distribution:
Command executed:
kmerfinder.py \
--infile SRR10093029_1.fastp.fastq.gz SRR10093029_2.fastp.fastq.gz \
--output_folder . \
--db_path 20190108_kmerfinder_stable_dirs/bacteria.ATG \
-tax 20190108_kmerfinder_stable_dirs/bacteria.tax \
-x
mv results.txt SRR10093029_results.txt
mv data.json SRR10093029_data.json
cat <<-END_VERSIONS > versions.yml
"NFCORE_BACASS:BACASS:KMERFINDER_SUBWORKFLOW:KMERFINDER":
kmerfinder: $(echo "3.0.2")
END_VERSIONS
Command exit status:
1
Command output:
# Time used to run KMA for species identifation: 0.016 s
Cant open file: [Errno 2] No such file or directory: '20190108_kmerfinder_stable_dirs/bacteria.tax'
For reference, here is the description of this KmerFinder database (2019/01/08, 17 GB, stable dirs), which does not appear to include bacteria.tax:
Version: 20190108_stable_dirs
Website: ftp://ftp.cbs.dtu.dk/public/CGE/databases/KmerFinder/version/
Content 20190108_stable_dirs.tar.gz:
bacteria
├── bacteria.ATG.comp.b
├── bacteria.ATG.length.bp
├── bacteria.ATG.name
├── bacteria.ATG.seq.b
└── bacteria.name
Update: Same KmerFinder version, but the previous database was corrupted and resulted in untar errors. This version should fix that.
Any further suggestions? Thanks in advance.
In your work directory for this module, rename bacteria.name to bacteria.tax. Then it will work.
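A minimal sketch of this workaround, assuming the database directory is writable. ensure_tax_file is a hypothetical helper; it copies instead of renaming, so the original .name file is kept.

```shell
# Hypothetical helper: make <group>.tax available next to <group>.name
# in a KmerFinder database directory.
ensure_tax_file() {
    etf_dir="$1"
    etf_group="${2:-bacteria}"
    if [ -f "$etf_dir/$etf_group.tax" ]; then
        return 0    # .tax already present, nothing to do
    fi
    if [ ! -f "$etf_dir/$etf_group.name" ]; then
        echo "Neither $etf_group.tax nor $etf_group.name found in $etf_dir" >&2
        return 1
    fi
    cp "$etf_dir/$etf_group.name" "$etf_dir/$etf_group.tax"
}

# Example, using the directory name from the failing command above:
# ensure_tax_file 20190108_kmerfinder_stable_dirs bacteria
```

If the directory in the work folder is a symlink to the original database, the copy ends up in the original database directory as well.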
This is due to a change in the kmerfinder database structure. Strangely enough, the bacass pipeline should work with both, as this module looks for both the .name and .tax endings, but it doesn't.
More specifically, this line in modules/local/kmerfinder/main.nf looks for both file endings:
def db_tax = file("${kmerfinderdb_path}/${tax_group}.name").exists() ? "${kmerfinderdb_path}/${tax_group}.name" : "${kmerfinderdb_path}/${tax_group}.tax"
I cannot explain why it doesn't accept the .name file.
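For illustration, the selection that line is meant to perform can be written as a shell check that runs in the same directory as the kmerfinder.py command. This is only a sketch of the intended logic, not the pipeline's code, and pick_tax_file is a hypothetical name.

```shell
# Hypothetical helper: print the taxonomy file to pass to "kmerfinder.py -tax",
# preferring <group>.name and falling back to <group>.tax.
pick_tax_file() {
    ptf_dir="$1"
    ptf_group="$2"
    if [ -e "$ptf_dir/$ptf_group.name" ]; then
        printf '%s\n' "$ptf_dir/$ptf_group.name"
    else
        printf '%s\n' "$ptf_dir/$ptf_group.tax"
    fi
}

# Example, using the directory name from the failing command above:
# pick_tax_file 20190108_kmerfinder_stable_dirs bacteria
```

With the 20190108 database layout listed above, this check would select bacteria.name, provided it is run from the directory that contains the database folder.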
Many thanks @m-jahn for your recommendations.
I noticed that the taxonomy file is not available at ftp://ftp.cbs.dtu.dk/public/CGE/databases/KmerFinder/version/.
However, I installed the required database with the KmerFinder installer (bash ~/kmerfinder/src/kmerfinder_db/INSTALL.sh $PWD bacteria latest), which created the bacteria taxonomy file bacteria.tax. Thanks again for your suggestions!