seurat scVIIntegration and FastMNNIntegration failed after

In Seurat 5, everything runs well with LogNormalize, but I got errors after switching to SCT normalization with FastMNNIntegration and scVIIntegration. The code and the errors are below:

combined <- readRDS("LCNEC_filter.rds")
combined[["RNA"]] <- split(combined[["RNA"]], f = combined$patient_ID)
combined <- SCTransform(combined)
combined <- RunPCA(combined)

combined <- IntegrateLayers(
  object = combined,
  method = FastMNNIntegration,
  new.reduction = "integrated.mnn",
  normalization.method = "SCT",
  verbose = FALSE
)

Error in checkBatchConsistency(batches, cells.in.columns = TRUE) : number of rows is not the same across batches (see batch "normalization.method")

combined <- IntegrateLayers(
  object = combined,
  method = scVIIntegration,
  new.reduction = "integrated.scvi",
  normalization.method = "SCT",
  conda_env = "miniconda3/envs/Seurat",
  verbose = FALSE
)

Error in UseMethod(generic = "JoinLayers", object = object) : no applicable method for 'JoinLayers' applied to an object of class "c('SCTAssay', 'Assay', 'KeyMixin')"

CCA, rPCA and Harmony work well with SCT normalization. Does this mean FastMNN and scVI can't be run with SCT normalization?

Thanks!

Mar 15 '24 23:03 sabrina0701

I came across your issue while searching for answers to my own related problems with SCTransform and the new v5 layers. I think the issue is that SCTransform internally calls sctransform::vst, which has a parameter min_cells = 5 by default. This means that when SCTransform is run separately for each sample, it only uses genes that are expressed in at least 5 cells in that sample. So the set of genes used is going to differ between samples, which I think may be causing the "number of rows is not the same across batches (see batch "normalization.method")" error that you're seeing. SCTransform somehow returns values for all genes in the counts and data slots of the SCT assay, but the scale.data slot only has values for genes that are expressed in at least 5 cells in every single sample! I stumbled upon this after trying to make heatmaps of some top DE genes, only to find they weren't in scale.data. This boggles my mind, because genes that aren't expressed in any cells in one group but are expressed in many cells in another group are exactly the genes you want to find!
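To make the gene-filtering argument concrete, here is a minimal base R sketch on made-up counts. It does not call sctransform; the helper genes_passing and the two toy samples are hypothetical, and the threshold of 5 cells simply mirrors the min_cells default mentioned above.

```r
# Toy counts: 4 genes x 6 cells for two hypothetical samples.
genes <- paste0("gene", 1:4)

sample_a <- matrix(c(
  1, 2, 0, 3, 1, 1,   # gene1: detected in 5 cells
  0, 0, 0, 0, 0, 1,   # gene2: detected in 1 cell
  2, 2, 2, 2, 2, 2,   # gene3: detected in 6 cells
  1, 1, 1, 1, 1, 0    # gene4: detected in 5 cells
), nrow = 4, byrow = TRUE, dimnames = list(genes, paste0("a", 1:6)))

sample_b <- matrix(c(
  1, 1, 1, 1, 1, 1,   # gene1: detected in 6 cells
  3, 3, 3, 3, 3, 0,   # gene2: detected in 5 cells
  1, 0, 1, 1, 1, 1,   # gene3: detected in 5 cells
  0, 0, 0, 0, 0, 0    # gene4: detected in 0 cells
), nrow = 4, byrow = TRUE, dimnames = list(genes, paste0("b", 1:6)))

# Hypothetical helper: keep genes detected in at least `min_cells` cells.
genes_passing <- function(counts, min_cells = 5) {
  rownames(counts)[rowSums(counts > 0) >= min_cells]
}

kept_a <- genes_passing(sample_a)
kept_b <- genes_passing(sample_b)
shared <- intersect(kept_a, kept_b)

print(kept_a)
print(kept_b)
print(shared)
```

In this toy case gene4 passes the filter in one sample and is absent from the other, so it drops out of the shared set even though it separates the two samples.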

The second error for scVIIntegration, no applicable method for 'JoinLayers' applied to an object of class "c('SCTAssay', 'Assay', 'KeyMixin')", might be due to the fact that the SCT assay is no longer split into layers by sample; instead it's back to counts/data/scale.data layers.
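The shape of that error message is what R prints when an S3 generic has no method for any class of the object it receives. A small base R sketch of the mechanism follows; join_layers and its single method are hypothetical stand-ins, and the assumption that only a v5-style assay class has a method is made for illustration, not taken from the SeuratObject source.

```r
# Toy generic that dispatches the same way an S3 generic such as JoinLayers does.
# `join_layers` and its method are hypothetical, not SeuratObject code.
join_layers <- function(object, ...) UseMethod("join_layers")

# Assumption for this sketch: only a v5-style assay class has a method.
join_layers.Assay5 <- function(object, ...) "joined"

v5_assay  <- structure(list(), class = "Assay5")
sct_assay <- structure(list(), class = c("SCTAssay", "Assay", "KeyMixin"))

# Dispatch succeeds for the class that has a method.
ok <- join_layers(v5_assay)

# Dispatch fails for the SCT-like class vector; capture the message.
msg <- tryCatch(join_layers(sct_assay), error = function(e) conditionMessage(e))

print(ok)
print(msg)
```

If this is what happens in the real call, the fix would have to come from the integration wrapper handling SCT assays separately, not from the user's arguments.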

I don't know the inner workings of SCTransform, vst, or FastMNNIntegration/scVIIntegration well enough to suggest any fixes. I just hope this points the people who do to the right places!

Mar 21 '24 17:03 jdrnevich

I'm getting the same error. Have you managed to figure out a workaround?

May 10 '24 12:05 artsvendsen

Also getting the same error with SCT normalization, when running:

integObj %>% IntegrateLayers(
  object = .,
  normalization.method = "SCT",
  method = FastMNNIntegration,
  # orig.reduction = "pca",
  new.reduction = "iFMNN"
)

Error in checkBatchConsistency(batches, cells.in.columns = TRUE) : number of rows is not the same across batches (see batch "normalization.method")

It looks like there is an issue with how fastMNN retrieves the S4 objects: only 1 object is retrieved (I was using 2 samples in my test). This might be because the layers differ between the SCT assay and the RNA assay, and that is not accounted for correctly. In the RNA assay I see 5 layers for 2 samples; in the SCT assay I see 3 layers.

When running trace, the batches object only has 1 object in it, and the error is thrown by checkBatchConsistency. In the meantime I'll just run the RNA assay with log normalization and scaling.

trace('fastMNN', edit = TRUE)

batches <- .unpackLists(...)
checkBatchConsistency(batches, cells.in.columns = TRUE)

Or could it be that SCT normalization can't be run with FastMNN integration? Thank you!

May 28 '24 16:05 NBIX-Brandon-Sos

I also encountered the same problem. Did you solve it?

Jul 20 '24 12:07 Mincana-Huang

I haven't looked into it further. I just run the RNA assay with log normalization and scaling for FastMNN right now.

Jul 22 '24 15:07 NBIX-Brandon-Sos

I have the same problem using the SCT assay for fastMNN integration.

Sep 11 '24 12:09 kathbosc

I encountered the same error when trying to run SCT + IntegrateLayers + FastMNNIntegration using R 4.4.

Reproduce the error

Run tutorial: Integrative analysis in Seurat v5.

library(Seurat)
library(SeuratData)
library(SeuratWrappers)
options(future.globals.maxSize = 1e9)
# Layers in the Seurat v5 object
obj <- LoadData("pbmcsca")
obj <- subset(obj, nFeature_RNA > 1000)
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$Method)
# Perform streamlined (one-line) integrative analysis
options(future.globals.maxSize = 3e+09)
obj <- SCTransform(obj)
obj <- RunPCA(obj, npcs = 30, verbose = FALSE)

Different ways of specifying IntegrateLayers produce different errors

# Case 1
obj <- IntegrateLayers(
  object = obj,
  method = FastMNNIntegration,
  normalization.method = "SCT",
  new.reduction = "integrated.mnn",
  verbose = FALSE
)
# Case 2
obj <- IntegrateLayers(
  object = obj,
  method = FastMNNIntegration,
  new.reduction = "integrated.mnn",
  verbose = FALSE
)

Error messages, which come from out <- do.call(what = batchelor::fastMNN, args = c(objects.sce, list(...))) in FastMNNIntegration:

# Case 1
Error in checkBatchConsistency(batches, cells.in.columns = TRUE) : 
  number of rows is not the same across batches (see batch "normalization.method")
# Case 2
Error in .check_valid_batch(batches[[1]], batch = batch) : 
  'batch' must be specified if '...' has only one object

Debug

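This is likely because the input (args) to batchelor::fastMNN was not set up properly. Since do.call spreads the elements of args into separate arguments, the effective call in Case 1 was fastMNN(sce_all, normalization.method = "SCT") and in Case 2 it was fastMNN(sce_all), where sce_all is the single element of objects.sce.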

objects.sce is converted from ONE SINGLE data layer storing cells from all batches (as opposed to several layers, each storing one batch). Specifically:

IntegrateLayers passes obj[["SCT"]] to FastMNNIntegration, which then subsets/converts obj[["SCT"]]$data into objects.sce
obj[["SCT"]] stores all batches in the same layer, so objects.sce is a list with ONE sce element

list(normalization.method = "SCT") comes from list(...)

Be careful: list(...) passes all additional arguments on to fastMNN, where they are treated as objects representing different batches (see below)
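The argument forwarding described here can be reproduced in base R without batchelor. In the sketch below, toy_fastmnn is a hypothetical stand-in that only shares the (..., batch = NULL) shape of the fastMNN signature, and a plain matrix stands in for the single SingleCellExperiment element.

```r
# Hypothetical stand-in for fastMNN(..., batch = NULL): any argument that does
# not match a named formal is captured by `...` and counted as a batch.
toy_fastmnn <- function(..., batch = NULL) {
  batches <- list(...)
  list(n_batches = length(batches), batch_names = names(batches))
}

# One combined object, as when the SCT assay holds all cells in one layer.
objects_sce <- list(matrix(0, nrow = 3, ncol = 4))

# Case 1: the extra argument is appended and becomes a second "batch".
case1 <- do.call(
  what = toy_fastmnn,
  args = c(objects_sce, list(normalization.method = "SCT"))
)

# Case 2: no extra argument, so only one object reaches `...`.
case2 <- do.call(what = toy_fastmnn, args = objects_sce)

print(case1)
print(case2)
```

Case 1 ends up with two "batches", the second one named normalization.method, which matches the batch name quoted in the first error message. Case 2 ends up with a single object and no batch vector, which matches the second error.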

However, the help page for fastMNN(..., batch = NULL, many more arguments) says:

...: "... Alternatively, one or more SingleCellExperiment objects can be supplied containing a log-expression matrix in the assay.type assay. ... If multiple objects are supplied, each object is assumed to contain all and only cells from a single batch. If a single object is supplied, it is assumed to contain cells from all batches, so batch should also be specified."
batch: "A vector or factor specifying the batch of origin for all cells when only a single object is supplied in .... This is ignored if multiple objects are present."

Therefore, the input should be specified in one of these two ways:

fastMNN(sce1, sce2, sce3, sce4, sce5, batch = NULL)
fastMNN(sce_all, batch = sce_all_group)
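How the two input forms relate can be sketched in base R. Plain matrices stand in for SingleCellExperiment objects here, and mat_all, batch and per_batch are made-up names for illustration; nothing below calls batchelor.

```r
# Toy log-expression matrix: 3 genes x 6 cells, plus a per-cell batch label.
mat_all <- matrix(
  seq_len(18), nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6))
)
batch <- factor(c("A", "A", "B", "B", "B", "C"))

# The single-object form uses `mat_all` and `batch` as they are.
# The multi-object form needs one object per batch, all with the same rows:
per_batch <- lapply(
  split(seq_len(ncol(mat_all)), batch),
  function(idx) mat_all[, idx, drop = FALSE]
)

# Every per-batch object keeps all genes; only the cells are partitioned.
n_rows <- vapply(per_batch, nrow, integer(1))
n_cols <- vapply(per_batch, ncol, integer(1))

print(n_rows)
print(n_cols)
```

Either form carries the same information. What fails in the cases above is a third situation: a single combined object with no batch vector.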

Solution

This piece of code runs through without error; however, it would be good to have someone familiar with SCT confirm its validity :) Specifically, I didn't read into SCTransform in detail, and I'm not sure whether the data layer it generates (later converted to the "logcounts" assay in objects.sce) contains the proper log-expression matrix that fastMNN expects.

obj <- IntegrateLayers(
  object = obj,
  method = FastMNNIntegration,
  new.reduction = "integrated.mnn",
  batch = obj$Method, # Add this line
  # DO NOT add extra arguments that don't belong to `IntegrateLayers` / `FastMNNIntegration` / `fastMNN`
  verbose = FALSE
)

Oct 02 '24 22:10 shwong-tw