Different sequencing runs show significantly different alpha diversity

Open Leran10 opened this issue 2 years ago • 5 comments

Hi,

I have 40 samples split across four 16S sequencing runs. After merging the four runs and running an alpha diversity analysis, I found that the runs differ significantly from each other.

All of these conditions should be very similar. The only difference is that each batch may have a different sequencing depth: the total number of reads we got varies from batch to batch.

So I wonder whether this is normal, or evidence that something went wrong during wet lab processing?

Thanks! Leran

Leran10 avatar May 25 '22 20:05 Leran10

What alpha diversity metric are you analyzing here?

benjjneb avatar May 28 '22 00:05 benjjneb

Hi, we used observed richness and Shannon diversity:

image

Leran10 avatar Jun 02 '22 14:06 Leran10

The number of observed taxa is highly dependent on sequencing depth. The Shannon index is much less so, but not entirely immune to it. Could you recreate these plots from a set of samples subsampled to a constant sequencing depth? (This goes by the term "rarefy" in microbiome analysis.) That would help in understanding how systematic the differences are between the "W"s.
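As a rough illustration of what subsampling to a constant depth means, here is a minimal Python sketch. It is not the implementation of any particular R function. The per-sample count dictionaries, the ASV names, and the depth of 1000 are made-up toy values, and the helper names `rarefy`, `observed_richness`, and `shannon` are hypothetical.

```python
import math
import random


def rarefy(counts, depth, rng):
    """Subsample a {taxon: read_count} dict to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads, which is how
    shallow samples end up being dropped from a rarefied data set.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand to one label per read, then draw `depth` reads without replacement.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    sub = {}
    for taxon in rng.sample(reads, depth):
        sub[taxon] = sub.get(taxon, 0) + 1
    return sub


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Shannon index with natural log: -sum(p_i * ln(p_i))."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


# Toy samples (assumed values, not data from this thread).
rng = random.Random(0)
deep = {"asv1": 5000, "asv2": 3000, "asv3": 1500, "asv4": 400, "asv5": 100}
shallow = {"asv1": 50, "asv2": 30}

rarefied_deep = rarefy(deep, 1000, rng)        # kept, now exactly 1000 reads
rarefied_shallow = rarefy(shallow, 1000, rng)  # None: too few reads, dropped
```

After rarefying, every retained sample has the same number of reads, so richness and Shannon values are compared at equal depth. Rare taxa can disappear in the subsample, which is one way taxa get removed along with the shallow samples.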

That said, there are not many samples in each "W". Is it possible there is a real difference between the samples in each run?

benjjneb avatar Jun 02 '22 23:06 benjjneb

Thanks! We used Rarefy() to subsample them to a depth of 7000; 8 samples and 97 OTUs were removed. The updated plot is below:

image

The p-values of the pairwise Wilcoxon tests are no longer significant, but the median observed ASVs of W2 and W4 are still higher than those of W1 and W3.

So it seems that with this method we still cannot completely remove the differences between sequencing batches.

And all the runs are from a normal cohort without any treatment.
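For reference, a pairwise comparison of this kind can be sketched in Python as below. This assumes the tests are unpaired Wilcoxon rank-sum (Mann-Whitney U) tests between runs. The exact p-value is obtained by enumerating every relabelling, which is only practical for small groups. The richness values are made-up toy numbers, not the data behind the plot, the helper names are hypothetical, and no multiple-testing correction is applied.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of (xi, yj) pairs with xi > yj; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value by enumerating all splits of the pooled values."""
    pooled = list(x) + list(y)
    n_x = len(x)
    center = n_x * len(y) / 2.0  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - center)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_x):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(gx, gy) - center) >= observed - 1e-12:
            hits += 1
    return hits / total


# Toy observed-ASV counts per run (assumed values for illustration only).
richness_by_run = {
    "W1": [110, 120, 115],
    "W2": [150, 160, 155],
    "W3": [112, 118, 121],
    "W4": [149, 158, 162],
}

pairwise_p = {
    (a, b): exact_two_sided_p(richness_by_run[a], richness_by_run[b])
    for a, b in combinations(sorted(richness_by_run), 2)
}
```

With only three samples per group, the smallest two-sided p-value this test can produce is 2/20 = 0.1, even when the groups do not overlap at all. Small groups therefore leave little power, and a non-significant result does not show that the medians are equal.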

Leran10 avatar Jun 20 '22 21:06 Leran10

First, I would not use observed ASVs/OTUs as a metric. Looking at the Shannon results, I think it is reasonable to include run as a confounding factor in your subsequent statistical analyses of this data.
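One possible way to account for run in a later comparison is a permutation test that shuffles group labels only within each run, so that run-to-run shifts cannot masquerade as a group effect. The Python sketch below assumes a hypothetical downstream analysis comparing two groups, "A" and "B", on Shannon diversity. The values, group labels, and function names are invented for illustration. A linear model with run as a covariate would be another common choice.

```python
import random
from collections import defaultdict


def mean_diff(values, labels):
    """Mean of group A minus mean of group B."""
    a = [v for v, g in zip(values, labels) if g == "A"]
    b = [v for v, g in zip(values, labels) if g == "B"]
    return sum(a) / len(a) - sum(b) / len(b)


def stratified_permutation_p(values, labels, runs, n_perm=2000, seed=0):
    """Two-sided permutation p-value, permuting labels within each run only."""
    rng = random.Random(seed)
    by_run = defaultdict(list)
    for i, run in enumerate(runs):
        by_run[run].append(i)
    observed = abs(mean_diff(values, labels))
    hits = 0
    for _ in range(n_perm):
        permuted = list(labels)
        for idx in by_run.values():
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)  # labels never cross run boundaries
            for i, g in zip(idx, shuffled):
                permuted[i] = g
        if abs(mean_diff(values, permuted)) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy Shannon values (assumed): W2 is shifted upward relative to W1,
# and within each run group A sits above group B.
shannon_values = [3.0, 3.1, 2.0, 2.1, 4.0, 4.1, 3.0, 3.1]
groups = ["A", "A", "B", "B", "A", "A", "B", "B"]
runs = ["W1", "W1", "W1", "W1", "W2", "W2", "W2", "W2"]

p_value = stratified_permutation_p(shannon_values, groups, runs)
```

Because the labels are shuffled within runs, each permuted data set keeps the same number of A and B samples per run as the original, and the offset between W1 and W2 is present in every permutation rather than being attributed to the groups.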

benjjneb avatar Jun 20 '22 22:06 benjjneb