add crabs/insilicopcr module from readsimulator
PR checklist
Closes #5533
- [ ] This comment contains a description of changes (with reason).
- [x] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the module conventions in the contribution docs
- [ ] If necessary, include test data in your PR.
- [x] Remove all TODO statements.
- [x] Emit the `versions.yml` file.
- [ ] Follow the naming conventions.
- [ ] Follow the parameters requirements.
- [ ] Follow the input/output options guidelines.
- [x] Add a resource `label`
- [x] Use BioConda and BioContainers if possible to fulfil software requirements.
- Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
  - For modules:
    - [x] `nf-core modules test <MODULE> --profile docker`
    - [ ] `nf-core modules test <MODULE> --profile singularity`
    - [ ] `nf-core modules test <MODULE> --profile conda`
  - For subworkflows:
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile docker`
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile singularity`
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile conda`
CRABS changed much of its functionality with the update to version 1.0.0, so the module needs to be adapted accordingly!
│ Command executed: │
│ │
│ crabs --in-silico-pcr \ │
│ --input genome.fasta \ │
│ --output test.crabs.fa \ │
│ --threads 2 \ │
│ --forward "GTCGGTAAAACTCGTGCCAGC" --reverse "CATAGTGGGGTATCTAATCCCAGTTTG" │
│ │
│ cat <<-END_VERSIONS > versions.yml │
│ "CRABS_INSILICOPCR": │
│ crabs: $(crabs --help | grep 'CRABS |' | sed 's/.*CRABS | \(v[0-9.]*\).*/\1/') │
│ END_VERSIONS │
│ │
│ Command exit status: │
│ 1 │
│ │
│ Command output: │
│ | Import data | 0% -:--:-- 0:00:00 │
│ │
│ Command error: │
│ │
│ /// CRABS | v1.0.7 │
│ │
│ | Function | Extract amplicons through in silico PCR │
│ | Import data | 0% -:--:-- 0:00:00 │
│ Traceback (most recent call last): │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/bin/crabs", line 847, in <module> │
│ crabs() │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/lib/python3.12/site-packages/click/core.py", line 1161, in __call__ │
│ return self.main(*args, **kwargs) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/lib/python3.12/site-packages/rich_click/rich_command.py", line 152, in main │
│ rv = self.invoke(ctx) │
│ ^^^^^^^^^^^^^^^^ │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/lib/python3.12/site-packages/click/core.py", line 1443, in invoke │
│ return ctx.invoke(self.callback, **ctx.params) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/lib/python3.12/site-packages/click/core.py", line 788, in invoke │
│ return __callback(*args, **kwargs) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/bin/crabs", line 612, in crabs │
│ temp_input_path, fasta_dict = crabs_to_fasta(console, columns, input_) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/Users/famke/02-nf-core/modules/.nf-test/tests/3b1c74a8acf3d61abf2918c42e21b00d/work/conda/env-a7419bd3b965a905-a602bdab460f91ded5175e0760caa6e7/lib/python3.12/site-packages/function/crabs_functions.py", line 1066, in crabs_to_fasta │
│ fasta_string = f'>{lineparts[0]}\n{lineparts[1]}\n' │
│ ~~~~~~~~~^^^ │
│ IndexError: list index out of range
We now run into an error here similar to the one with dbimport. I guess the test data needs to be adjusted.
When we switch the input to genome-ena.fasta:
│ Command executed: │
│ │
│ crabs --in-silico-pcr \ │
│ --input genome-ena.fasta \ │
│ --output test.crabs.fa \ │
│ --threads 2 \ │
│ --forward "GTCGGTAAAACTCGTGCCAGC" --reverse "CATAGTGGGGTATCTAATCCCAGTTTG" │
│ │
│ cat <<-END_VERSIONS > versions.yml │
│ "CRABS_INSILICOPCR": │
│ crabs: $(crabs --help | grep 'CRABS |' | sed 's/.*CRABS | \(v[0-9.]*\).*/\1/') │
│ END_VERSIONS │
│ │
│ Command exit status: │
│ 1 │
│ │
│ Command output: │
│ | Import data | 0% -:--:-- 0:00:00 │
│ │
│ Command error: │
│ /usr/local/lib/python3.12/site-packages/function/crabs_functions.py:775: SyntaxWarning: invalid escape sequence '\.' │
│ for item in ['_sp\.','_SP\.','_indet.', '_sp.', '_SP.']: │
│ /usr/local/lib/python3.12/site-packages/function/crabs_functions.py:775: SyntaxWarning: invalid escape sequence '\.' │
│ for item in ['_sp\.','_SP\.','_indet.', '_sp.', '_SP.']: │
│ Matplotlib created a temporary cache directory at /tmp/matplotlib-dtwy8p2_ because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the │
│ import of Matplotlib and to better support multiprocessing. │
│ │
│ /// CRABS | v1.0.7 │
│ │
│ | Function | Extract amplicons through in silico PCR │
│ | Import data | 0% -:--:-- 0:00:00 │
│ Traceback (most recent call last): │
│ File "/usr/local/bin/crabs", line 847, in <module> │
│ crabs() │
│ File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1157, in __call__ │
│ return self.main(*args, **kwargs) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/rich_click/rich_command.py", line 152, in main │
│ rv = self.invoke(ctx) │
│ ^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/click/core.py", line 1434, in invoke │
│ return ctx.invoke(self.callback, **ctx.params) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/click/core.py", line 783, in invoke │
│ return __callback(*args, **kwargs) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/bin/crabs", line 612, in crabs │
│ temp_input_path, fasta_dict = crabs_to_fasta(console, columns, input_) │
│ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ │
│ File "/usr/local/lib/python3.12/site-packages/function/crabs_functions.py", line 1066, in crabs_to_fasta │
│ fasta_string = f'>{lineparts[0]}\n{lineparts[1]}\n' │
│ ~~~~~~~~~^^^ │
│ IndexError: list index out of range
This step probably needs to be implemented as a module first: https://github.com/gjeunen/reference_database_creator?tab=readme-ov-file#52-module-2-import-downloaded-data-into-crabs-format
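Both tracebacks above end in the same statement, `fasta_string = f'>{lineparts[0]}\n{lineparts[1]}\n'`, which fits the idea that a plain FASTA file is not what this step expects. A minimal Python sketch of that failure mode follows. It assumes, without having checked the CRABS source, that each input line is split into delimited fields before being indexed. The tab separator, the helper names `to_fasta_record` and `has_two_fields`, and the sample records are hypothetical; only the f-string is taken from the traceback.

```python
def to_fasta_record(line: str, sep: str = "\t") -> str:
    """Apply the statement from the traceback to a single input line."""
    # Assumption: fields are separated by `sep` (a tab here); not confirmed from CRABS.
    lineparts = line.rstrip("\n").split(sep)
    # This is the statement shown in the traceback.
    return f">{lineparts[0]}\n{lineparts[1]}\n"


def has_two_fields(line: str, sep: str = "\t") -> bool:
    """Hypothetical pre-check: does the line contain at least two fields?"""
    return len(line.rstrip("\n").split(sep)) >= 2


# Hypothetical delimited record: an identifier followed by a sequence.
delimited_line = "seq1\tGTCGGTAAAACTCGTGCCAGC\n"
# A plain FASTA header line contains no separator, so it yields one field.
fasta_header = ">seq1 example\n"

print(to_fasta_record(delimited_line), end="")

try:
    to_fasta_record(fasta_header)
except IndexError as err:
    # Same exception type and message as in the logs above.
    print(f"IndexError: {err}")
```

If this reading is right, a check like `has_two_fields` on the first line of the test file would show quickly whether the test data is a raw FASTA or an already imported file.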
Waiting for the other modules to be created first (see issue #7496). I would leave this open and see if someone picks up the issue during the Hackathon.