John Sundh

Results 11 comments of John Sundh

@cjfields, yes, exactly: this is with a training set from the BOLD database (clustered at 99% identity). We are also testing vsearch and SINTAX, as well as the sk-learn classifier...

I've been developing a workflow for metagenomic projects located here: https://github.com/NBISweden/nbis-meta. It supports paired- and single-end data and includes preprocessing, read-based classification, assembly and binning as well as functional and...

Thanks @johanneskoester! Yes, the `opj` code is an old leftover that I've been putting off removing for a long time, but I completely agree. Will get to work on that....

The database is up now at https://scilifelab.figshare.com/articles/dataset/COI_reference_sequences_from_BOLD_DB/20514192

@erikrikarddaniel Remind me, how are taxonomic ranks handled in ampliseq? I see the finished `assignTaxonomy.fna` file has _e.g._ `Bacteria;Bacteria;Firmicutes;Bacilli;Staphylococcales;Staphylococcaceae;Staphylococcus` as the header, so it's using taxlevels `"Domain", "Kingdom", "Phylum",...
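To make the question concrete, here is a minimal Python sketch that pairs each `;`-separated field of that header with a rank name. Only the three rank names quoted above are used. The rest of the list is cut off in the comment, so the remaining fields get positional placeholder names (`rank_4`, `rank_5`, and so on), which are purely illustrative. The function name `label_taxonomy` is hypothetical and not part of ampliseq.

```python
# Rank names quoted in the comment above. The remainder of the list is
# truncated there, so later fields get placeholder names (an assumption).
KNOWN_LEVELS = ["Domain", "Kingdom", "Phylum"]


def label_taxonomy(header, levels=KNOWN_LEVELS):
    """Pair each ';'-separated field of a taxonomy header with a rank name."""
    fields = [f.strip() for f in header.split(";") if f.strip()]
    labeled = []
    for i, value in enumerate(fields):
        # Use a known rank name where one exists, else a positional placeholder.
        name = levels[i] if i < len(levels) else f"rank_{i + 1}"
        labeled.append((name, value))
    return labeled


header = (
    "Bacteria;Bacteria;Firmicutes;Bacilli;"
    "Staphylococcales;Staphylococcaceae;Staphylococcus"
)

for name, value in label_taxonomy(header):
    print(f"{name}\t{value}")
```

Laid out this way, it is easy to see that the first two fields of the example header are both `Bacteria`, which is what prompts the question about which ranks the header is meant to carry.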

@erikrikarddaniel @jtangrot I'm a little unsure how to use the `fmtscript` part of the database entries in the workflow. Does it have to be a shell script, or can...

I wasn't able to use a Python script because Python is not included in the ubuntu:20.04 image used for the `FORMAT_TAXONOMY` process. I suggest adding a `container` keyword to the...

It's probably not too difficult to make it work for Singularity as well (it already works for conda); it's more that my Groovy/Nextflow skills are still limited. But having a...

@d4straub How lightweight does the container have to be? Of course it has to contain bash, so the slimmest Python containers are out. But I see Nextflow also relies on...

It works with `quay.io/biocontainers/python:3.8.3` for Docker. However, I'm not confident I can get all the checks to pass, see https://github.com/NBISweden/ampliseq/pull/2.