Can I use surrogate variables calculated from all samples when splitting the samples by tissue?

Open DORcas-Zheng opened this issue 11 months ago • 0 comments

Hi, I am investigating age-related gene expression changes across different tissue regions with RNA-seq data using DESeq2. Part of my coldata looks like this (I have 100 samples in total):

> coldata
  tissue      age     sex
1    PFC   3month  female
2    Amy   3month    male
3    PFC   6month  female
4    Amy   6month    male
5    PFC  20month  female
6    Amy  20month    male

To mitigate technical variation, I ran svaseq with n.sv=2 across all 100 samples. For the subsequent differential expression analysis, I set the design to design = ~ SV1 + SV2 + sex + tissue + age. I'm also considering splitting my data by tissue and analyzing each tissue separately. However, I'm not sure whether it is appropriate to use the surrogate variables computed from the entire dataset as adjustment covariates in the per-tissue models. Any insights would be greatly appreciated. Thank you.
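
For concreteness, here is a minimal sketch of the setup described above, assuming the sva and DESeq2 packages are installed. The real counts are not shown in this post, so the sketch uses simulated counts from makeExampleDESeqDataSet and made-up sample annotations. The subsetting step at the end only illustrates the approach I am asking about and is not meant to claim that it is statistically valid.

```r
library(DESeq2)
library(sva)

# Simulated stand-in for the real data (assumption): 1000 genes x 100 samples.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 100)

# Hypothetical sample annotations mimicking the coldata layout above.
dds$tissue <- factor(rep(c("PFC", "Amy"), length.out = 100))
dds$age    <- factor(rep(c("3month", "6month", "20month"), length.out = 100),
                     levels = c("3month", "6month", "20month"))
dds$sex    <- factor(rep(c("female", "female", "male", "male"), length.out = 100))

# Normalized counts for svaseq, dropping very low-count genes.
dds <- estimateSizeFactors(dds)
dat <- counts(dds, normalized = TRUE)
dat <- dat[rowMeans(dat) > 1, ]

# Full model holds the variables of interest; the null model is intercept-only.
cd   <- as.data.frame(colData(dds))
mod  <- model.matrix(~ sex + tissue + age, data = cd)
mod0 <- model.matrix(~ 1, data = cd)

# Estimate two surrogate variables across all 100 samples.
svseq <- svaseq(dat, mod, mod0, n.sv = 2)

# Attach the surrogate variables and fit the full model.
dds$SV1 <- svseq$sv[, 1]
dds$SV2 <- svseq$sv[, 2]
design(dds) <- ~ SV1 + SV2 + sex + tissue + age
dds <- DESeq(dds)

# The approach in question: subset to one tissue and reuse the
# surrogate variables that were estimated on the whole dataset.
dds_pfc <- dds[, dds$tissue == "PFC"]
dds_pfc$tissue <- droplevels(dds_pfc$tissue)
design(dds_pfc) <- ~ SV1 + SV2 + sex + age
dds_pfc <- DESeq(dds_pfc)
```

The alternative would be to rerun svaseq separately on each tissue subset with a per-tissue model (~ sex + age) and use those tissue-specific surrogate variables instead.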

DORcas-Zheng · Mar 05 '24 06:03