add crabs/dbimport from readsimulator pipeline
PR checklist
Closes #5532
- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the module conventions in the contribution docs?
- [ ] If necessary, include test data in your PR.
- [ ] Remove all TODO statements.
- [x] Emit the `versions.yml` file.
- [ ] Follow the naming conventions.
- [ ] Follow the parameters requirements.
- [ ] Follow the input/output options guidelines.
- [x] Add a resource `label`
- [ ] Use BioConda and BioContainers if possible to fulfil software requirements.
- Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
  - For modules:
    - [x] `nf-core modules test <MODULE> --profile docker`
    - [ ] `nf-core modules test <MODULE> --profile singularity`
    - [ ] `nf-core modules test <MODULE> --profile conda`
  - For subworkflows:
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile docker`
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile singularity`
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile conda`
CRABS changed much of its functionality in the update to version 1.0.0. This needs to be taken care of!
Unfortunately, I get `java.lang.OutOfMemoryError: Required array size too large` when I try to use the downloadtaxonomy module (see PR #7423).
I asked here whether there is a way to download only a fraction of the data: https://github.com/gjeunen/reference_database_creator/issues/83
We can then either do that, or downsample the data ourselves, load it into test-datasets and continue from there. In any case, the downloadtaxonomy module is needed to run crabs properly.
Downsampled test data is now available in the test-datasets repository. Unfortunately, I run into the following error:
│ Command executed: │
│ │
│ if [ "false" == "true" ]; then │
│ gzip -c -d genome.fasta > genome.fasta │
│ fi │
│ │
│ crabs --import \ │
│ --input genome.fasta \ │
│ --output test.crabsdb.fa \ │
│ --acc2tax nucl_gb.accession2taxid \ │
│ --names names.dmp \ │
│ --nodes nodes.dmp \ │
│ --import-format embl --ranks 'superkingdom;phylum;class;order;family;genus;species' \ │
│ │
│ rm genome.fasta │
│ │
│ cat <<-END_VERSIONS > versions.yml │
│ "CRABS_DBIMPORT": │
│ crabs: $(crabs --help | grep 'CRABS |' | sed 's/.*CRABS | \(v[0-9.]*\).*/\1/') │
│ END_VERSIONS │
│ │
│ Command exit status: │
│ 1 │
│ │
│ Command output: │
│ | Read data to memory | 0% -:--:-- 0:00:00 │
│ │
│ Command error: │
│ /usr/local/lib/python3.12/site-packages/function/crabs_functions.py:775: SyntaxWarning: invalid escape sequence '\.' │
│ for item in ['_sp\.','_SP\.','_indet.', '_sp.', '_SP.']: │
│ /usr/local/lib/python3.12/site-packages/function/crabs_functions.py:775: SyntaxWarning: invalid escape sequence '\.' │
│ for item in ['_sp\.','_SP\.','_indet.', '_sp.', '_SP.']: │
│ Matplotlib created a temporary cache directory at /tmp/matplotlib-am3lbtwt because the default path (/.config/matplotlib) is not a writable directory; it is │
│ highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support │
│ multiprocessing. │
│ │
│ /// CRABS | v1.0.7 │
│ │
│ | Function | Import sequence data into CRABS format │
│ | Read data to memory | 0% -:--:-- 0:00:00 │
│ Traceback (most recent call last): │
│ File "/usr/local/bin/crabs", line 847, in <module> │
│ crabs() │
│ File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1157, in __call__ │
│ return self.main(*args, **kwargs) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/rich_click/rich_command.py", line 152, in main │
│ rv = self.invoke(ctx) │
│ ^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1434, in invoke │
│ return ctx.invoke(self.callback, **ctx.params) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/click/core.py", line 783, in invoke │
│ return __callback(*args, **kwargs) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/bin/crabs", line 561, in crabs │
│ seq_input_dict, initial_seq_number = input_to_memory(task, progress_bar, input_) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/function/crabs_functions.py", line 393, in embl_to_memory │
│ seq_name = line.split('|')[1] │
│ ~~~~~~~~~~~~~~~^^^ │
│ IndexError: list index out of range
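The failing expression from the traceback, `line.split('|')[1]`, can be reproduced in isolation. This is only a minimal sketch: it assumes the test `genome.fasta` has headers without a `|` separator, which I have not verified, and both headers below are made up for illustration. The helper name `second_pipe_field` is hypothetical and not part of CRABS.

```python
def second_pipe_field(header: str) -> str:
    """Return the text between the first and second '|' of a header line.

    Mirrors the expression shown in the traceback: line.split('|')[1].
    """
    return header.split('|')[1]


# Hypothetical header that contains '|' separators.
piped_header = ">ENA|AB000001|AB000001.1 example record"
# Hypothetical plain FASTA header without any '|' separator.
plain_header = ">AB000001.1 example record"

print(second_pipe_field(piped_header))

try:
    second_pipe_field(plain_header)
except IndexError as err:
    # split('|') returned a single-element list, so index 1 does not exist.
    print(f"IndexError: {err}")
```

If that is what happens here, then either the value passed to `--import-format` or the header layout of the test file would have to change; which of the two is appropriate is for the CRABS documentation to settle.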
It seems the versions.yml differs for the conda package: there, null is returned as the version, which is obviously wrong. When I test it locally, however, it works fine and returns the version as expected.
nf-core modules test crabs/dbimport --profile conda [12:58PM]
,--./,-.
___ __ __ __ ___ /,-._.--~\
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/tools version 3.2.0 - https://nf-co.re
INFO Generating nf-test snapshot
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── nf-test output ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 🚀 nf-test 0.9.2 │
│ https://www.nf-test.com │
│ (c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr │
│ │
│ Load .nf-test/plugins/nft-bam/0.5.0/nft-bam-0.5.0.jar │
│ Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar │
│ Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar │
│ Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar │
│ Warning: every snapshot that fails during this test run is re-record. │
│ │
│ Test Process CRABS_DBIMPORT │
│ │
│ Test [c8d9c787] 'sarscov2 - fasta' PASSED (50.227s) │
│ Test [97fec540] 'sarscov2 - fasta - stub' PASSED (35.526s) │
│ │
│ │
│ SUCCESS: Executed 2 tests in 85.766s │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
INFO Generating nf-test snapshot again to check stability
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── nf-test output ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 🚀 nf-test 0.9.2 │
│ https://www.nf-test.com │
│ (c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr │
│ │
│ Load .nf-test/plugins/nft-bam/0.5.0/nft-bam-0.5.0.jar │
│ Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar │
│ Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar │
│ Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar │
│ │
│ Test Process CRABS_DBIMPORT │
│ │
│ Test [c8d9c787] 'sarscov2 - fasta' PASSED (37.762s) │
│ Test [97fec540] 'sarscov2 - fasta - stub' PASSED (36.477s) │
│ │
│ │
│ SUCCESS: Executed 2 tests in 78.64s │
│ │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
INFO All tests passed!
It works locally with exactly the snapshot that is part of this PR. I do not understand why accessing the version does not work in the CI.
It works on my Mac as well as on our Linux workstation...
@fellen31 The problem seems to persist. Locally, I get the version as shown in my terminal output, but in the CI the version is reported as null.
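One way a null could end up in versions.yml, sketched in Python rather than shell: the module greps the `crabs --help` output for the `CRABS |` banner and extracts the `v…` token with sed. If no banner line reaches that pipeline, nothing is extracted and the YAML value stays empty, which YAML parsers read as null. Whether the banner is really missing in the CI conda environment is a guess on my side; the function names below are hypothetical and only the pattern is taken from the module's sed expression.

```python
import re
from typing import Optional

# Same pattern as the sed expression: 'CRABS | ' followed by 'v' and digits/dots.
BANNER_PATTERN = re.compile(r"CRABS \| (v[0-9.]*)")


def parse_crabs_version(help_text: str) -> Optional[str]:
    """Return the version token from the banner, or None if no banner is found."""
    match = BANNER_PATTERN.search(help_text)
    return match.group(1) if match else None


def versions_line(help_text: str) -> str:
    """Build the 'crabs:' entry the way the heredoc would fill it in."""
    version = parse_crabs_version(help_text)
    # An empty value after the colon is read back as null by YAML parsers.
    return f"crabs: {version or ''}".rstrip()


print(versions_line("/// CRABS | v1.0.7"))  # banner as it appears in the log above
print(versions_line(""))                    # assumed case: banner not captured
```

Comparing the raw `crabs --help` output from the CI conda environment against the banner shown in the log above would confirm or rule this out.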
This is what I get on Codespaces:
(nf-core) gitpod /workspaces/modules (crabs-2) $ nf-core modules test crabs/import --profile conda
,--./,-.
___ __ __ __ ___ /,-._.--~\
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/tools version 3.2.0 - https://nf-co.re
INFO Generating nf-test snapshot
╭──────────────────────────────────────────────────────────────────────────────────────────────────── nf-test output ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 🚀 nf-test 0.9.2 │
│ https://www.nf-test.com │
│ (c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr │
│ │
│ Load .nf-test/plugins/nft-bam/0.5.0/nft-bam-0.5.0.jar │
│ Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar │
│ Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar │
│ Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar │
│ │
│ Test Process CRABS_IMPORT │
│ │
│ Test [8b835859] 'sarscov2 - fasta' PASSED (67.027s) │
│ Test [697024d6] 'sarscov2 - fasta - stub' │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────────────────────────────────────── nf-test error ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Picked up JAVA_TOOL_OPTIONS: │
│ │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
INFO Generating nf-test snapshot again to check stability
╭──────────────────────────────────────────────────────────────────────────────────────────────────── nf-test output ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ │
│ 🚀 nf-test 0.9.2 │
│ https://www.nf-test.com │
│ (c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr │
│ │
│ Load .nf-test/plugins/nft-bam/0.5.0/nft-bam-0.5.0.jar │
│ Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar │
│ Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar │
│ Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar │
│ │
│ Test Process CRABS_IMPORT │
│ │
│ Test [8b835859] 'sarscov2 - fasta' PASSED (51.696s) │
│ Test [697024d6] 'sarscov2 - fasta - stub' PASSED (53.447s) │
│ │
│ │
│ SUCCESS: Executed 2 tests in 110.153s │
│ │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────────────────────────────────────────────────────────────────────────── nf-test error ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Picked up JAVA_TOOL_OPTIONS: │
│ │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
CRITICAL Ran, but found errors:
- nf-test failed
- nf-test failed
Results from Codespaces:
(nf-core) gitpod /workspaces/modules (crabs-2) $ nf-test test modules/nf-core/crabs/import/tests/main.nf.test --update-snapshot --profile conda
Picked up JAVA_TOOL_OPTIONS:
🚀 nf-test 0.9.2
https://www.nf-test.com
(c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr
Load .nf-test/plugins/nft-bam/0.5.0/nft-bam-0.5.0.jar
Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar
Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar
Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar
Warning: every snapshot that fails during this test run is re-record.
Test Process CRABS_IMPORT
Test [8b835859] 'sarscov2 - fasta' PASSED (45.875s)
Test [697024d6] 'sarscov2 - fasta - stub' PASSED (42.505s)
SUCCESS: Executed 2 tests in 89.699s
(nf-core) gitpod /workspaces/modules (crabs-2) $ nf-test test modules/nf-core/crabs/import/tests/main.nf.test --profile conda
Picked up JAVA_TOOL_OPTIONS:
🚀 nf-test 0.9.2
https://www.nf-test.com
(c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr
Load .nf-test/plugins/nft-bam/0.5.0/nft-bam-0.5.0.jar
Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar
Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar
Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar
Test Process CRABS_IMPORT
Test [8b835859] 'sarscov2 - fasta' PASSED (46.555s)
Test [697024d6] 'sarscov2 - fasta - stub' PASSED (41.081s)
SUCCESS: Executed 2 tests in 88.936s
@gjeunen Do you maybe have an idea why accessing the version via conda fails in our CI? 😢