Deviation contrast and excluding samples via design matrix
Hi Hugo,
Thank you for your material! It helped me understand how to handle design matrix and contrast vectors.
I have two specific topics in which I would ask for your help, and it would also be nice if you could implement these topics in your tutorial: 1) deviation contrast 2) excluding samples from contrasting
1) deviation contrast
I am trying to create a deviation contrast that is the difference between the mean of a condition to the grand mean. I wonder whether this procedure is correct?
design(dds) <- ~ 0 + condition
dds <- DESeq(dds)
mod_mat <- model.matrix(design(dds), colData(dds))
grand_mean <- colMeans(mod_mat)
condition1 <- colMeans(mod_mat[which(dds$condition == "condition1"), ])
res3 <- results(dds, contrast = condition1 - grand_mean)
2) excluding samples from contrasting
If I want to exclude a sample from my contrast analysis but want to keep it to infer dispersion, is it ok to set all values to zero in the design matrix? But in this case, do I also need to consider that I have less sample in the contrast when I perform, e.g., a deviation contrast? More specifically, I shouldn't calculate the grand mean by colMeans, because the divisor should be the # of all samples minus the # of samples that were excluded.
By modifying the code above:
design(dds) <- ~ 0 + condition
dds <- DESeq(dds)
mod_mat <- model.matrix(design(dds), colData(dds))
I want to exclude the condition3 from the contrast:
excl_n <- sum(mod_mat[,str_detect(colnames(mod_mat), condition3)]) # number of excluded samples
mod_mat[,colnames(mod_mat) %in% condition3] <- 0
grand_mean <- apply(mod_mat, 2, function(x) sum(x) / (length(x) - excl_n))
condition1 <- colMeans(mod_mat[which(dds$condition == "condition1"), ])
res3 <- results(dds, contrast = condition1 - grand_mean)
Thank you!
Torda
Hi @vtorda
Sorry for taking some time to reply to this. This reply is just based on looking at your code visually, without having tested it myself. But the short answer is that I think what you're doing is fine in both cases.
1) Yes, I think setting the contrast that way would answer your question of interest.
2) That seems like it's one way to do it, but I think perhaps you're over-complicating (unless I'm missing something). You could perhaps do it this way:
grand_mean <- colMeans(mod_mat[which(dds$condition != "condition3"), ])
So, strictly speaking it's not the grand mean (it's the mean of all the terms except condition3), but it seems to be what you want, I think.
I'll keep this issue open, to remind me to include examples of this in the document, when I have some time. Thanks for the questions! :smiley:
Hi Hugo,
Thank you very much for your reply! You helped me a lot, not just with your answers but with your thorough explanations in your documents! Regarding the second point, I think you are right; your code is much more straightforward and does the same what I want: calculate the mean of all samples except one and contrast the mean of one of the samples and the mean of all samples except that one.
Cheers, Torda