New module: kraken2/build
PR checklist
Closes #2953
- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the module conventions in the contribution docs
- [ ] If necessary, include test data in your PR.
- [ ] Remove all TODO statements.
- [ ] Emit the `versions.yml` file.
- [ ] Follow the naming conventions.
- [ ] Follow the parameters requirements.
- [ ] Follow the input/output options guidelines.
- [ ] Add a resource `label`
- [ ] Use BioConda and BioContainers if possible to fulfil software requirements.
- Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
  - For modules:
    - [ ] `nf-core modules test <MODULE> --profile docker`
    - [ ] `nf-core modules test <MODULE> --profile singularity`
    - [ ] `nf-core modules test <MODULE> --profile conda`
  - For subworkflows:
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile docker`
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile singularity`
    - [ ] `nf-core subworkflows test <SUBWORKFLOW> --profile conda`
The following error is related to expected file names for .dmp taxonomy files:
│ Command error: │
│ Creating sequence ID to taxonomy ID map (step 1)... │
│ lookup_accession_numbers: expected TAB not found in taxonomy/prot.accession2taxid │
│ Found 0/13 targets, searched through 1 accession IDs, search complete. │
│ lookup_accession_numbers: 13/13 accession numbers remain unmapped, see unmapped.txt in DB directory │
│ Sequence ID to taxonomy ID map complete. [0.009s] │
│ Estimating required capacity (step 2)... │
│ Estimated hash table requirement: 157988 bytes │
│ Capacity estimation complete. [0.006s] │
│ Building database files (step 3)... │
│ build_db: error opening taxonomy//nodes.dmp: No such file or directory
The expected file names are names.dmp and nodes.dmp, so I will first try renaming these files in the add module. PR to fix the taxonomy file names: #5214
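As a rough illustration of the renaming idea (a sketch only, not the actual change in #5214): the helper below copies two taxonomy dump files into a `taxonomy/` directory under the expected names, names.dmp and nodes.dmp. The function name `stage_taxonomy`, the input file names, and the temporary demo directory are all hypothetical.

```shell
# Hypothetical helper: copy taxonomy dumps into <db_dir>/taxonomy under the
# file names the build step looks for (names.dmp and nodes.dmp).
stage_taxonomy() {
    names_src="$1"   # path to the names dump, whatever it is called
    nodes_src="$2"   # path to the nodes dump, whatever it is called
    db_dir="$3"      # database directory handed to the build step

    mkdir -p "$db_dir/taxonomy" || return 1
    cp "$names_src" "$db_dir/taxonomy/names.dmp" || return 1
    cp "$nodes_src" "$db_dir/taxonomy/nodes.dmp" || return 1
}

# Demo with placeholder files (names and contents are made up).
demo_dir=$(mktemp -d)
printf 'placeholder names\n' > "$demo_dir/custom_names.dmp"
printf 'placeholder nodes\n' > "$demo_dir/custom_nodes.dmp"
stage_taxonomy "$demo_dir/custom_names.dmp" "$demo_dir/custom_nodes.dmp" "$demo_dir/db"
ls "$demo_dir/db/taxonomy"
```

Copying rather than moving keeps the staged inputs untouched, which may matter when the inputs are symlinks into a work directory.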
Seems to be starting to work now @alxndrdiaz !
It seems to work now, although the assertions still need to be improved.
One last (?) problem: two files (opts.k2d and unmapped.txt) seem to change between test runs. The following assertion fails if these files are included:
assertAll(
    { assert process.success },
    { assert process.out.db.get(0).get(1) ==~ ".*/test" },
    { assert snapshot(
        path("${process.out.db[0][1]}/hash.k2d"),
        path("${process.out.db[0][1]}/taxo.k2d"),
        path("${process.out.db[0][1]}/opts.k2d"),
        path("${process.out.db[0][1]}/unmapped.txt")
        ).match()
    }
)
In this case, opts.k2d and unmapped.txt had different md5 checksums between test runs:
│   1 [                                                     1 [                                                  │
│   2 "hash.k2d:md5,e9984a5e98f87c0488cb5e7618d5bbe0",      2 "hash.k2d:md5,e9984a5e98f87c0488cb5e7618d5bbe0",   │
│   3 "taxo.k2d:md5,29d65b1796e09191fd7bdcaa24130459",      3 "taxo.k2d:md5,29d65b1796e09191fd7bdcaa24130459",   │
│ ! 4 "opts.k2d:md5,de7a6df4eb9f322f053724a3d05ad8aa",    ! 4 "opts.k2d:md5,bbef3355da216a020ddc1b36db249910",   │
│ ! 5 "unmapped.txt:md5,f6c3f052cfd71c5cd7133f7f58ddcb52" ! 5 "unmapped.txt:md5,1c04243f50ce0e7769ad7dce51285c7d" │
│   6 ]
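To spot unstable files without reading the snapshot diff by hand, something like the following could be run against the output directories of two separate builds. This is a sketch only: the function name `unstable_files` and the demo directories are hypothetical, and it compares file contents with `cmp` instead of computing md5 sums, which should flag the same files. Anything it reports is a candidate for leaving out of the checksum snapshot.

```shell
# Hypothetical helper: print the name of every file that exists in both
# directories but whose contents differ.
unstable_files() {
    dir_a="$1"
    dir_b="$2"
    for file_a in "$dir_a"/*; do
        [ -f "$file_a" ] || continue
        name=$(basename "$file_a")
        [ -f "$dir_b/$name" ] || continue
        # cmp -s prints nothing and exits non-zero when the files differ
        cmp -s "$file_a" "$dir_b/$name" || printf '%s\n' "$name"
    done
}

# Demo with placeholder contents (not real kraken2 output).
run_a=$(mktemp -d)
run_b=$(mktemp -d)
printf 'same\n' > "$run_a/hash.k2d"
printf 'same\n' > "$run_b/hash.k2d"
printf 'first run\n' > "$run_a/opts.k2d"
printf 'second run\n' > "$run_b/opts.k2d"
unstable_files "$run_a" "$run_b"   # prints: opts.k2d
```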