seurat icon indicating copy to clipboard operation
seurat copied to clipboard

Following up: Inconsistent behavior of 'merge' for RNA and SCT assays #8687

Open anaccsilva opened this issue 10 months ago • 1 comments

Hello, @viktorfeketa clearly discussed the issues I am having in issue #8687, but I was not able to completely understand the solution. It does seem to be a recurrent issue for people that was used to Seurat V4 running SCT, but it seems a little bit more complex to use SCT and Seurat V5.

I have 12 individual samples that I previously run SCT normalization. I merged these objects - does it matter RNA, SCT assay? Should I split and run SCT again before integrating?

Here is the code I used (but I am getting strange clusters that I can't really identify):

OM.list <- c(OM39, OM44, OM46, OM19, OM22,OM28,OM32,OM36,OM37,OM45,OM48,OM51)
OM.merge <- merge( x = OM39, y= list(OM44, OM46, OM19, OM22,OM28,OM32,OM36,OM37,OM45,OM48,OM51))


OM.merge[["RNA"]] <- split(OM.merge[["RNA"]], f = OM.merge$disease)
OM.all <- SCTransform(OM.merge)
OM.all <- RunPCA(OM.all)
OM.all <- RunUMAP(OM.all, dims = 1:30)
DimPlot(OM.all, reduction = "umap", group.by = c("disease", "om"))

OM.all <- IntegrateLayers(object = OM.all, method = HarmonyIntegration, normalization.method = "SCT", verbose = F)
OM.all <- FindNeighbors(OM.all, reduction = "harmony", dims = 1:30)
OM.all <- FindClusters(OM.all, resolution = c(0, 0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0, 1.2, 1.5), verbose = FALSE)

I really appreciate if someone can show the workflow for integrating multiple samples using SCT and starting from a list of individual Seurat objects. Thank you, Ana

anaccsilva avatar Apr 15 '24 21:04 anaccsilva

Hello,

Your code looks reasonable to me, except what is the disease variable? Running SCTransform on an object with layers split by disease will create a separate model for each value of disease, which might not be what you want, especially if you have samples from different sequencing runs that have the same value of "disease".

Instead, you could split by "sample" or whatever variable corresponds to sequencing runs, which will also be used for integration. OM.merge[["RNA"]] <- split(OM.merge[["RNA"]], f = OM.merge$sample)

The updated workflow we show in the vignette (and in your code above) is equivalent to runningSCTransform and RunPCA separately on a list of SeuratObjects. It's easier to merge them initially, split them by layer, and then run preprocessing steps on 1 multi-layered object so that you don't have to loop over a list of seurat objects.

Hope that is helpful.

mhkowalski avatar Apr 19 '24 16:04 mhkowalski