F1000_workflow
DESeq2 variance stabilizing transformation
In DESeq2, after applying getVarianceStabilizedData, every count of 0 is transformed to a negative value. These zeros make up about 2/3 of the values in the matrix. I am worried that this will interfere with the downstream hierarchical FDR analysis.
- Is it "normal" to get negative normalized values for counts = 0?
- Can I proceed with hierarchical multiple testing with structSSI using these transformed counts?
- Could I just go with log transformed counts?
- Should I apply another transformation?
Thanks Susan for publishing this helpful guide.
In some analyses, having zeros turn into nonzero values can be problematic. For example, it would cause problems for any procedure that expects some sort of zero-inflation. For the hierarchical testing procedure, however, you really only need to be concerned about whether your original tree-wide p-values are valid. Since the treePValues function performs t or F tests, you will be okay if you have enough samples and the transformed data aren’t too skewed -- under those conditions the central limit theorem kicks in for the averages. Alternatively, you could use a nonparametric test. So, briefly,
- Yes, this is expected, even in the original RNA-seq analysis for which the variance stabilization was designed.
- As long as the individual tests for each ASV are valid, the tree testing procedure will behave as expected.
- For your third and fourth questions: the best transformation is the one that gives your individual tests the most power. You can get some intuition for this by looking at the histograms of the transformed counts -- the less skewed, the better.
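
If it helps, the histogram check can be backed up with a number. The following is only a rough sketch in plain Python (not R, and not DESeq2's actual variance stabilizing transformation): it computes the moment-based sample skewness of one taxon's counts under a few candidate transformations. The counts are made up for illustration, and `log1p` and `asinh` are stand-ins I am assuming here as examples of "other transformations"; the function name `sample_skewness` is hypothetical as well.

```python
import math
import statistics


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant vector: nothing to be skewed
    return m3 / m2 ** 1.5


# Hypothetical counts for a single taxon across 12 samples,
# with 2/3 zeros to mimic the situation described in the question.
counts = [0, 0, 0, 0, 0, 0, 0, 0, 3, 12, 45, 230]

# Candidate transformations (assumed examples, not the DESeq2 VST itself).
transforms = {
    "raw": lambda x: float(x),
    "log1p": lambda x: math.log1p(x),
    "asinh": lambda x: math.asinh(x),
}

skewness = {
    name: sample_skewness([f(c) for c in counts])
    for name, f in transforms.items()
}

# Print from least to most skewed.
for name, g1 in sorted(skewness.items(), key=lambda kv: abs(kv[1])):
    print(f"{name:>6}: skewness = {g1:+.2f}")
```

On real data you would run something like this per ASV on the output of each transformation you are considering, and look at the distribution of skewness values rather than a single taxon. A smaller absolute skewness is only a heuristic for "closer to the regime where t or F tests are reliable", not a guarantee of validity or power.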