seurat-wrappers

DimProj from RunFastMNN: non-conformable arguments

Open dagarfield opened this issue 4 years ago • 4 comments

I've been using RunFastMNN to align partially overlapping datasets. It works great for this, but I run into a problem, not in the downstream analyses themselves, but in downstream presentation steps such as heat maps and the other exploratory plots discussed here.

```
> seuratObj.mnn <- RunFastMNN(object.list = by.patient.list)
> ProjectDim(seuratObj.mnn, reduction  = "mnn", dims.print = 1:5)
Error in data.use %*% cell.embeddings : non-conformable arguments
> ProjectDim(seuratObj.mnn, reduction  = "umap", dims.print = 1:2)
Error in data.use %*% cell.embeddings : non-conformable arguments
# And to be thorough
> DimHeatmap(object = seuratObj.mnn, reduction = "mnn", dims = 1, balanced = TRUE)
Error in Loadings(object = object, projected = projected, ...)[, dim,  : 
  subscript out of bounds
```

But integrated objects following this approach seem to work fine:

```
> ProjectDim(otherSeuratObj, reduction  = "umap", dims.print = 1:2)
UMAP_ 1 
Positive:  TAGLN, JUN, DCN, DNAJB1, FOS, LUM, JUNB, IGFBP5, MYL9, EGR1 
	   GADD45B, ACTA2, CYR61, CRYAB, TPM2, ATF3, HSPA6, MEG3, GEM, ADIRF 
Negative:  TMSB4X, CD74, B2M, SRGN, HLA-DRA, HLA-DRB1, IFI27, HLA-C, HLA-B, RBP1 
	   TM4SF1, GSTA1, HLA-A, HLA-DPA1, TMSB10, HLA-DPB1, CXCR4, CLU, HLA-DQB1, ACKR1 
UMAP_ 2 
Positive:  B2M, TM4SF1, HLA-C, HLA-B, HLA-A, SRGN, CD74, SPARCL1, CLU, GADD45B 
	   IGFBP7, ANXA1, CCL5, HSPA1A, HLA-DRB1, CXCR4, SOCS3, JUN, ACKR1, UBC 
Negative:  RBP1, GSTA1, SERPINE2, STAR, AMH, TNNI3, FHL2, MAGED2, IQCG, DCN 
	   SOX4, RPL3, LUM, RPS25, RPL7, RPS18, GATM, ARID5B, RPL41, RPS8 
An object of class Seurat 
41602 features across 53746 samples within 3 assays 
Active assay: RNA (20004 features)
 2 other assays present: SCT, integrated
 3 dimensional reductions calculated: pca, umap, tsne
```

Any guesses where to look? It is, of course, possible to go directly to fastMNN and construct the appropriate dimensional reduction object. But it would be nice to use RunFastMNN, and I feel like I'm probably missing something obvious about the dimensionality of what's stored in the output object of RunFastMNN.
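One way to look into the dimensionality question is to compare the shapes of the matrices that the failing calls appear to use. The sketch below is not part of the original report. It assumes a Seurat v3/v4-style object with a reduction named "mnn", and `check_projection_inputs` is a hypothetical helper name. Newer Seurat releases prefer `layer` over `slot`, so the `GetAssayData` call may need adjusting there.

```r
# Sketch only: the Seurat package is needed when the function is called,
# not when it is defined. check_projection_inputs is a hypothetical helper.
check_projection_inputs <- function(object, reduction = "mnn") {
  # Assay that the reduction is associated with
  assay <- Seurat::DefaultAssay(object[[reduction]])
  # Matrices that the error messages above point at
  scaled <- Seurat::GetAssayData(object, assay = assay, slot = "scale.data")
  embeddings <- Seurat::Embeddings(object, reduction = reduction)
  loadings <- Seurat::Loadings(object, reduction = reduction)
  list(
    scale_data_dim = dim(scaled),          # features x cells
    embeddings_dim = dim(embeddings),      # cells x dimensions
    loadings_dim   = dim(loadings),        # features x dimensions
    # %*% needs the cell counts to agree
    conformable    = ncol(scaled) == nrow(embeddings)
  )
}
```

Calling `check_projection_inputs(seuratObj.mnn)` would return the three shapes; an empty `scale_data_dim` or `loadings_dim` would be consistent with the two errors shown above.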

Thanks

dagarfield, Feb 04 '20 08:02

@dagarfield Did you manage to solve the problem? I am facing the same issue as well over here... I guess if no plausible solution is available, then constructing an appropriate reduced dimension object using FastMNN would be the only option.

nicodemus88, May 21 '20 09:05

In the end, I went to FastMNN itself (as you suggest) and constructed the object directly rather than through the Seurat wrapper. It was a bit annoying, but worked well enough in the end, and the FastMNN documentation is pretty good.

dagarfield, May 21 '20 09:05

@dagarfield Could you please share the steps you used to construct the proper object? I tried to do so, but I still could not project my MNN dimensions. Here is how I ran the MNN correction and then converted the result to a Seurat object:

```
### Packages used below
library(Seurat)                # as.SingleCellExperiment, as.Seurat
library(SingleCellExperiment)  # rowData, reducedDim
library(batchelor)             # multiBatchNorm, fastMNN
library(scran)                 # modelGeneVar, combineVar

so <- readRDS(file = paste0(output, "/PBMC/SO_merge.Rds"))

### Create SingleCellExperiment object
sce <- as.SingleCellExperiment(so)
rowData(sce) <- NULL
reducedDim(sce) <- NULL
reducedDim(sce, type = "UMAP") <- NULL

### Correct by sample ID
s11 <- sce[ , grepl("S11", sce$orig.ident)]
s12 <- sce[ , grepl("S12", sce$orig.ident)]
s13 <- sce[ , grepl("S13", sce$orig.ident)]
s14 <- sce[ , grepl("S14", sce$orig.ident)]
s15 <- sce[ , grepl("S15", sce$orig.ident)]
s16 <- sce[ , grepl("S16", sce$orig.ident)]
s18 <- sce[ , grepl("S18", sce$orig.ident)]
s19 <- sce[ , grepl("S19", sce$orig.ident)]
s20 <- sce[ , grepl("S20", sce$orig.ident)]
s21 <- sce[ , grepl("S21", sce$orig.ident)]
s22 <- sce[ , grepl("S22", sce$orig.ident)]
s23 <- sce[ , grepl("S23", sce$orig.ident)]
s24 <- sce[ , grepl("S24", sce$orig.ident)]
s25 <- sce[ , grepl("S25", sce$orig.ident)]
s26 <- sce[ , grepl("S26", sce$orig.ident)]
s27 <- sce[ , grepl("S27", sce$orig.ident)]
s28 <- sce[ , grepl("S28", sce$orig.ident)]

all.sce <- list(S11 = s11, S12 = s12, S13 = s13, S14 = s14, S15 = s15, S16 = s16,
                S18 = s18, S19 = s19, S20 = s20, S21 = s21, S22 = s22, S23 = s23,
                S24 = s24, S25 = s25, S26 = s26, S27 = s27, S28 = s28)

### Subset all batches to common universe of genes
universe <- Reduce(intersect, lapply(all.sce, rownames))
all.sce <- lapply(all.sce, "[", i = universe,)

### Adjust scaling to equalize sequencing coverage
normed.sce <- do.call(multiBatchNorm, all.sce)

### Find highly variable genes
all.var <- lapply(all.sce, modelGeneVar)
combined.var <- do.call(combineVar, all.var)
hvg.list <- rownames(combined.var)[combined.var$bio > 0]

### Correct batch effect
set.seed(920101)
mnn.sce <- do.call(fastMNN, c(normed.sce, list(subset.row = hvg.list)))

### Save computed MNN into SCE object, then convert to Seurat object
reducedDim(sce, "MNN") <- reducedDim(mnn.sce, "corrected")
so.fastmnn <- as.Seurat(sce)
```

Could you point out where I went wrong? Thank you very much!

nicodemus88, May 22 '20 14:05

Brief update... I managed to solve the issue, although I'm not sure if this is the proper way.

The problem with ProjectDim is that it takes the data to be projected from the scale.data slot. However, the merged, MNN-corrected Seurat object has neither scaled data nor variable features, as mentioned in #15.
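The failure described above can be reproduced with toy matrices in base R. This is only an illustration, under the assumption (suggested by the `data.use %*% cell.embeddings` error message earlier in the thread) that the projection multiplies a features-by-cells scaled matrix by a cells-by-dimensions embedding matrix. All names and sizes here are made up.

```r
set.seed(1)
n_cells <- 6
n_genes <- 4
n_dims  <- 2

# Toy cell embeddings: cells x dimensions
cell_embeddings <- matrix(
  rnorm(n_cells * n_dims), nrow = n_cells,
  dimnames = list(paste0("cell", 1:n_cells), paste0("mnn_", 1:n_dims))
)

# An empty scaled-data matrix (0 x 0), standing in for an unpopulated slot
empty_scaled <- matrix(numeric(0), nrow = 0, ncol = 0)

# The multiplication fails because 0 columns cannot match 6 rows
res <- tryCatch(
  empty_scaled %*% cell_embeddings,
  error = function(e) conditionMessage(e)
)
print(res)

# With a features x cells matrix over the same cells, the product is defined
scaled <- matrix(
  rnorm(n_genes * n_cells), nrow = n_genes,
  dimnames = list(paste0("gene", 1:n_genes), rownames(cell_embeddings))
)
projected <- scaled %*% cell_embeddings
print(dim(projected))  # features x dimensions
```

The product has one row per feature and one column per dimension, which is the shape a set of projected loadings would need.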

Therefore, I saved the list of highly variable genes used for MNN into the variable features slot of the Seurat object, then scaled the data. After that, I was able to project the loadings. My code is below.

```
### Continue from above
so.fastmnn <- as.Seurat(sce)

### Keep highly variable genes list into Seurat object
so.fastmnn@assays$RNA@var.features <- hvg.list

### Scale data & project loadings
so.fastmnn <- ScaleData(so.fastmnn)

ProjectDim(so.fastmnn, reduction = "mnn", dims.print = 1:2, nfeatures.print = 5)
```

My results are below:

```
mnn_ 1 
Positive:  NKG7, GNLY, GZMB, FGFBP2, CST7 
Negative:  RPL32, RPL13, RPS8, RPS12, RPL39 
mnn_ 2 
Positive:  COTL1, TRBV5-1, NSMCE1, HLA-DRB5, SAT1 
Negative:  CD7, NKG7, CCL5, FGFBP2, GZMB 
An object of class Seurat 
15572 features across 93495 samples within 1 assay 
Active assay: RNA (15572 features, 13326 variable features)
 1 dimensional reduction calculated: mnn
```

These steps seem logical to me, but I hope someone can clarify whether what I did is indeed correct.
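For comparison, the same steps can be written with Seurat's accessor functions rather than direct slot access. This is a hedged sketch, not a confirmed answer from the Seurat developers: `scale_hvgs_then_project` is a hypothetical helper name, the reduction is assumed to be called "mnn", and the explicit `features = hvgs` argument is an assumption made here so that scaling covers exactly the genes used for the correction.

```r
# Sketch only: the Seurat package is needed when the function is called.
# scale_hvgs_then_project is a hypothetical helper, not a Seurat function.
scale_hvgs_then_project <- function(object, hvgs, reduction = "mnn") {
  # Keep only genes that exist in the object's default assay
  hvgs <- intersect(hvgs, rownames(object))
  stopifnot(length(hvgs) > 0)

  # Record the variable features through the accessor
  Seurat::VariableFeatures(object) <- hvgs

  # Populate scale.data for those genes
  object <- Seurat::ScaleData(object, features = hvgs, verbose = FALSE)

  # Project the reduction onto the scaled genes and return the updated object
  Seurat::ProjectDim(object, reduction = reduction, verbose = FALSE)
}
```

With the objects from the code above, this would be called as `so.fastmnn <- scale_hvgs_then_project(so.fastmnn, hvg.list)`.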

@dagarfield, did you do something similar? Could you share how you solved the issue?

Seurat developers, do my steps seem logical?

Thank you very much!

nicodemus88, May 23 '20 14:05