Benjamin Callahan comments

Results: 423 comments of Benjamin Callahan

Error: Memory allocation failed.

What version of the UNITE release are you using? To be compatible with `assignTaxonomy` you should be using the "General FASTA" releases, probably "All eukaryotes". See the dada2 training data...

Using dada2 to process single sample fasta files

> such that each is now a derep class object. Is it then possible to filter and trim (using fastqFilter?). If so, should this be done to each derep class...

MergePairs Error

First, I'd just leave out the explicit `derepFastq` calls; they aren't needed. I.e.: `errF`...

Output of Full Sequences from Assign Taxonomy

There is not. I would recommend using a different tool to get best hits in a reference database to move forward in this fashion. Note that there has been a...

High loss of CCS reads in filtering step

> pretty much stick with 99.9% consensus CCS call

That's what we start with before running the DADA2 workflow. We usually also include a loose `maxEE` filter and a length window...

High loss of CCS reads in filtering step

The learned error rates look really high for HiFi sequencing. What chemistry/instrument version are you using? What is the environment/sample-type you are sequencing?

High loss of CCS reads in filtering step

Huh... I don't know, that should be handled pretty well by `learnErrors` in my experience. A couple more things to try: `plotComplexity` on a couple of sample fastqs after primer...

A 20bps GAP in merged reads' length distribution.

> Is this expected?

Yes, the V3V4 region in bacteria has a bimodal length distribution whose two modes differ by about 20 nt.

p.freq differs when using method="frequency" and method="auto/combined/minimum/either/both"

There is a poorly documented, but intentional, behavior in decontam: when using the "combined" method (or other methods that use both frequency and prevalence), the negative controls are excluded...

Result vary as per sample size

> So, my question is whether I should run this analysis on all samples or only on the subset?

I'm assuming the analysis you are referring to is decontam. In...