dada2
Different results from different 'minoverlap' parameter
Hello!
I was analysing 16S NGS data for a study that will correlate culture results with 16S NGS results. In the first analysis, I kept the 'minoverlap' default and we got around 1100 ASVs, but didn't find some pathogens previously isolated in culture, mainly Acinetobacter and Burkholderia-Caballeronia-Paraburkholderia. So I set 'minoverlap = 10', got around 1500 ASVs, and Acinetobacter and Burkholderia-Caballeronia-Paraburkholderia were finally found in the microbiome data. I would really like to understand these differences.
Thanks in advance!
The minOverlap parameter defines the minimum amount the forward and reverse reads must overlap with one another for them to be merged. So, if the reads are too short to overlap sufficiently, they will be dropped by this step.
A couple of key things to remember here: It is the length of the reads after truncLen has been enforced that affects merging. The amplicon length that matters is the length of the sequenced amplicon, including primers if they are sequenced. And there is biological length variation even in 16S fragment length, the V3V4 locus being a notable example with two modes about 20 nts different in length from one another.
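To make the arithmetic concrete, here is a minimal sketch (plain Python, not dada2 code) of how the overlap follows from the truncated read lengths and the amplicon length. The function names and all the numbers are hypothetical and chosen only for illustration. It assumes the amplicon is longer than either read, and it ignores mismatches in the overlap region, which can also cause merging to fail.

```python
def overlap_length(trunc_fwd: int, trunc_rev: int, amplicon_len: int) -> int:
    """Bases shared by the forward and reverse reads after truncation.

    A negative value means the reads leave a gap and do not meet at all.
    """
    return trunc_fwd + trunc_rev - amplicon_len


def can_merge(trunc_fwd: int, trunc_rev: int, amplicon_len: int,
              min_overlap: int = 12) -> bool:
    """True if the overlap reaches the required minimum (length check only)."""
    return overlap_length(trunc_fwd, trunc_rev, amplicon_len) >= min_overlap


# Hypothetical truncation lengths and amplicon lengths, for illustration only.
TRUNC_FWD, TRUNC_REV = 240, 211

for amplicon_len in (425, 440, 460):
    ov = overlap_length(TRUNC_FWD, TRUNC_REV, amplicon_len)
    print(
        f"amplicon {amplicon_len}: overlap {ov}, "
        f"merges at 12: {can_merge(TRUNC_FWD, TRUNC_REV, amplicon_len, 12)}, "
        f"merges at 10: {can_merge(TRUNC_FWD, TRUNC_REV, amplicon_len, 10)}"
    )
```

In this made-up scenario the 440 nt amplicon overlaps by only 11 bases, so it is dropped with a minimum of 12 but kept with a minimum of 10, while the 460 nt amplicon cannot merge under either setting. A pattern like this would be consistent with taxa appearing only after the minimum overlap is lowered, although the real amplicon and truncation lengths in the study above are not known here.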
As a rough rule of thumb, your truncation lengths should add up to at least 20 + the length of the sequenced amplicon in the longest organism being targeted.
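The rule of thumb can be checked with a couple of lines of arithmetic. This is only a sketch in plain Python, not part of dada2; the helper names are hypothetical, the example lengths are invented, and treating the rule as a lower bound on the summed truncation lengths is my reading of it.

```python
def min_trunc_sum(longest_amplicon_len: int, margin: int = 20) -> int:
    """Smallest combined truncation length suggested by the rule of thumb."""
    return longest_amplicon_len + margin


def meets_rule_of_thumb(trunc_fwd: int, trunc_rev: int,
                        longest_amplicon_len: int, margin: int = 20) -> bool:
    """True if the two truncation lengths together reach the suggested minimum."""
    return trunc_fwd + trunc_rev >= min_trunc_sum(longest_amplicon_len, margin)


# Hypothetical longest sequenced amplicon (primers included) of 460 nt.
LONGEST_AMPLICON = 460

print(min_trunc_sum(LONGEST_AMPLICON))                  # suggested minimum sum
print(meets_rule_of_thumb(240, 211, LONGEST_AMPLICON))  # 451 in total
print(meets_rule_of_thumb(250, 230, LONGEST_AMPLICON))  # 480 in total
```

With these invented numbers, truncating to 240 and 211 falls short of the suggested 480, so the longest amplicons would be at risk of being lost at the merging step, while 250 and 230 just meet it.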