John Sundh comments

Results: 11 comments of John Sundh

Fewer assignments when reference has more taxonomic labels

@cjfields, yes, exactly: this is with a training set from the BOLD database (clustered at 99% identity). We are also testing vsearch and SINTAX, as well as the sk-learn classifier...

Reviews

I've been developing a workflow for metagenomic projects located here: https://github.com/NBISweden/nbis-meta. It supports paired- and single-end data and includes preprocessing, read-based classification, assembly and binning as well as functional and...

Reviews

Thanks @johanneskoester! Yes, the `opj` code is an old leftover that I've been putting off removing for a long time, but I completely agree. Will get to work on that....

Add GBIF adapted CO1 database

The database is up now at https://scilifelab.figshare.com/articles/dataset/COI_reference_sequences_from_BOLD_DB/20514192

Add GBIF adapted CO1 database

@erikrikarddaniel How are the taxonomic ranks handled in ampliseq again? I see the finished `assignTaxonomy.fna` file has, _e.g._, `Bacteria;Bacteria;Firmicutes;Bacilli;Staphylococcales;Staphylococcaceae;Staphylococcus` as the header, so it's using taxlevels `"Domain", "Kingdom", "Phylum",...

Add GBIF adapted CO1 database

@erikrikarddaniel @jtangrot I'm a little unsure how to use the `fmtscript` part of the database entries in the workflow. Does it have to be a shell script, or can...

Add GBIF adapted CO1 database

I wasn't able to use a Python script because Python is not included in the `ubuntu:20.04` image used for the `FORMAT_TAXONOMY` process. I suggest adding a `container` keyword to the...

Add GBIF adapted CO1 database

It's probably not too difficult to make it work for Singularity as well (it already works for conda); it's more that my Groovy/Nextflow skills are still limited. But having a...

Add GBIF adapted CO1 database

@d4straub How lightweight does the container have to be? Of course it has to contain bash, so the slimmest Python containers are out. But I see Nextflow also relies on...

Add GBIF adapted CO1 database

It works with `quay.io/biocontainers/python:3.8.3` for Docker. However, I don't feel confident about getting all the checks to pass; see https://github.com/NBISweden/ampliseq/pull/2.