RunFastMNN() with SCT assay

Open AmelZulji opened this issue 2 years ago • 10 comments

Hi folks,

I am getting the following error when running RunFastMNN() on samples normalized with SCT:

Computing 2000 integration features
Error in SummarizedExperiment::SummarizedExperiment(assays = assays) : 
  the rownames and colnames of the supplied assay(s) must be NULL or identical to those of
  the SummarizedExperiment object (or derivative) to construct

It seems to be dataset-specific, and I don't have a reproducible example. However, I noticed that SCTransform() produces a warning on one of the samples:

Warning message:
In variance_prior(ql_disp, df, covariate = gene_means, abundance_trend = ql_disp_trend) :
  Variance prior estimate did not properly converge

If I remove the dubious sample, RunFastMNN() works fine.

The problem was discussed here, but I don't know how to work around it. Would you have any suggestions?

Thank you, Amel

AmelZulji avatar Nov 24 '21 13:11 AmelZulji

Can you post the full code here?

saketkc avatar Dec 03 '21 17:12 saketkc

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.

no-response[bot] avatar Dec 17 '21 18:12 no-response[bot]

I apologize for the late response, @saketkc

here is the code I use:

library(tidyverse)
library(SeuratWrappers)
library(Seurat)
library(filesstrings)

file_paths <- 
  Sys.glob("cr_dir/*/outs/filtered_feature_bc_matrix.h5")

raw_seu_obj_list <- list()
for (i in file_paths) {
  # extract sample name from full path
  sample_name <- after_nth(string = i, pattern = "/", n = 1) %>% before_first(pattern = "/")
  
  tmp_count <- Read10X_h5(filename = i)
  tmp_seu <- CreateSeuratObject(counts = tmp_count)
  
  tmp_seu[["sample_id"]] <- paste(sample_name)
  raw_seu_obj_list[[sample_name]] <- tmp_seu
}

on_merged <-
  merge(x = raw_seu_obj_list[[1]],
        y = raw_seu_obj_list[2:length(raw_seu_obj_list)],
        add.cell.ids = names(raw_seu_obj_list))

on_merged <- SCTransform(object = on_merged, 
                         conserve.memory = TRUE, 
                         method="glmGamPoi")

on_merged <- RunFastMNN(object.list = SplitObject(on_merged, split.by = "sample_id"), assay = "SCT")

this is the Error it throws:

Computing 2000 integration features
Error in SummarizedExperiment::SummarizedExperiment(assays = assays) : 
  the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the
  SummarizedExperiment object (or derivative) to construct

I greatly appreciate your help!

Regards, Amel

AmelZulji avatar Feb 17 '22 09:02 AmelZulji

Seems like the problem is caused by as.SingleCellExperiment() which is called within RunFastMNN() on the object which is subset based on the features.

This is throwing the same error:

pbmc_small_sub <- subset(pbmc_small, features = VariableFeatures(pbmc_small))
as.SingleCellExperiment(pbmc_small_sub)

Error in SummarizedExperiment::SummarizedExperiment(assays = assays) : 
  the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the
  SummarizedExperiment object (or derivative) to construct

Adding checkDimnames = F to the SummarizedExperiment() call inside as.SingleCellExperiment() seems to resolve the problem. However, I'm wondering why subsetting by feature causes this problem in the first place.

sume <- SummarizedExperiment::SummarizedExperiment(assays = assays, checkDimnames = F)

Your help is much appreciated!

Regards, Amel

AmelZulji avatar Mar 01 '22 16:03 AmelZulji

I think this is the problematic line from as.SingleCellExperiment():

    if (isTRUE(x = all.equal(target = dim(x = assays[["counts"]]), 
      current = dim(x = scaledata_a)))) {
      assays[["scaledata"]] <- scaledata_a

which compares only the dimensions and does not check the row order of the scale.data matrix. Matching the row order solves the problem:

library(Seurat)
pbmc_small_sub <- subset(pbmc_small, features = VariableFeatures(pbmc_small))

counts_sub <- GetAssayData(object = pbmc_small_sub, assay = "RNA", slot = "counts")
data_sub <- GetAssayData(object = pbmc_small_sub, assay = "RNA", slot = "data")
scaled_sub <- GetAssayData(object = pbmc_small_sub, assay = "RNA", slot = "scale.data")

indx <- match(rownames(counts_sub),rownames(scaled_sub))
scaled_sub <- scaled_sub[indx,]

test <- list(counts = counts_sub, data = data_sub, scaled = scaled_sub)
SummarizedExperiment::SummarizedExperiment(assays = test)

class: SummarizedExperiment 
dim: 20 80 
metadata(0):
assays(3): counts data scaled
rownames(20): PPBP IGLL5 ... RP11-290F20.3 S100A9
rowData names(0):
colnames(80): ATGCCAGAACGACT CATGGCCTGTGCAT ... GGAACACTTCAGAC CTTGATTGATCTTC
colData names(0):
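
The same point can be illustrated without Seurat. The following is a minimal sketch in base R, assuming two toy matrices (hypothetical data, not taken from pbmc_small) that stand in for the counts and scale.data slots. It shows that a dimension check passes even when the rows are ordered differently, and that match() restores the order:

```r
# Toy matrices standing in for the counts and scale.data slots (hypothetical data).
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("c1", "c2")))
scaled <- counts[c("g3", "g1", "g2"), ] * 0.5  # same genes, different row order

# A dimension-only check passes even though the row order differs.
same_dim <- isTRUE(all.equal(dim(counts), dim(scaled)))
same_order <- identical(rownames(counts), rownames(scaled))

# Reorder scaled so that its rows follow the order used by counts.
indx <- match(rownames(counts), rownames(scaled))
scaled_fixed <- scaled[indx, , drop = FALSE]
fixed_order <- identical(rownames(counts), rownames(scaled_fixed))
```

Here same_dim is TRUE while same_order is FALSE, which is the situation the dimension check lets through; after reordering, fixed_order is TRUE.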

Kind regards, Amel

AmelZulji avatar Mar 02 '22 16:03 AmelZulji

Hi @AmelZulji, what version of Seurat are you currently using? This works fine for me with Seurat v4.1.0 and SummarizedExperiment 1.14.1:

pbmc_small_sub <- subset(pbmc_small, features = VariableFeatures(pbmc_small))
as.SingleCellExperiment(pbmc_small_sub)

saketkc avatar Mar 02 '22 18:03 saketkc

Thanks for the response, @saketkc. I am using Seurat v4.1.0 and SummarizedExperiment 1.24.0.
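
Since the behaviour appears to depend on package versions, it may help to report them together. A minimal sketch, assuming the three packages named in this thread are the relevant ones; a package that is not installed is reported as NA rather than raising an error:

```r
# Packages discussed in this thread; adjust the list as needed.
pkgs <- c("Seurat", "SeuratWrappers", "SummarizedExperiment")

# Collect the installed version of each package, or NA if it is missing.
versions <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(versions)
```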

AmelZulji avatar Mar 03 '22 09:03 AmelZulji

Thanks, I can confirm that I now receive an error with SummarizedExperiment 1.24.0. We will have a fix soon.

saketkc avatar Mar 11 '22 13:03 saketkc

Hi @saketkc,

Any update on this?

Regards, Amel

AmelZulji avatar Jul 27 '22 13:07 AmelZulji

I would like to second @AmelZulji. I'm getting exactly the same issue with SummarizedExperiment 1.24.0, Seurat 4.1.1, and SeuratWrappers 0.3.0. I can confirm that downgrading to SummarizedExperiment 1.20.0 solves the issue. @AmelZulji's analysis of the problem is correct. The order of rows in $SCT@scale.data does not match the order in $SCT@counts and $SCT@data, and this raises an error in SummarizedExperiment 1.24.0.

This issue seems to be related: https://github.com/satijalab/seurat-wrappers/issues/126

annikagable avatar Sep 30 '22 08:09 annikagable

I have encountered the same issue. Is there a solution on the way for this? @saketkc

Dazcam avatar Oct 13 '22 12:10 Dazcam

Having the same issue after regressing out cell cycle score.

reliscu avatar Oct 25 '22 02:10 reliscu

Hello, I recently ran into this problem when trying to run fastMNN after SCTransform.

I checked the source code of fastMNN and think @AmelZulji's answer is correct.

The row order of the SCT scale.data differs from that of the raw counts.

Thus, we only need to reorder scale.data as @AmelZulji describes.

Here is my solution: run trace('RunFastMNN', edit = T) to edit the source code of RunFastMNN.

Then change this block (lines 25-28):

objects.sce <- lapply(X = object.list, FUN = function(x, f) {
  return(as.SingleCellExperiment(x = subset(x = x, features = f)))
}, f = features)

to:

objects.sce <- lapply(X = object.list, FUN = function(x, f) {
  if (DefaultAssay(x) == 'SCT') {
    x = subset(x = x, features = f)
    indx <- match(rownames(x@assays$SCT@counts), rownames(x@assays$SCT@scale.data))
    x@assays$SCT@scale.data <- x@assays$SCT@scale.data[indx, ]
  } else {
    x = subset(x = x, features = f)
  }
  return(as.SingleCellExperiment(x))
}, f = features)

RunFastMNN() will then work with SCTransform output.

yuanlizhanshi avatar Dec 17 '22 17:12 yuanlizhanshi

Thank you @yuanlizhanshi - your edit to the function resolved this exact issue. Appreciate it!

sanjeevRJMU1 avatar Mar 22 '23 20:03 sanjeevRJMU1

Hi all. I am getting the following error when I replace the code:

objects.sce <- lapply(X = object.list, FUN = function(x, f) { return(as.SingleCellExperiment(x = subset(x = x, features = f))) }, f = features)

with:

objects.sce <- lapply(X = object.list, FUN = function(x,f) { if (DefaultAssay(x) == 'SCT') { x = subset(x = x, features = f) indx <- match(rownames(x@assays$SCT@counts),rownames(x@assays$SCT@scale.data)) x@assays$SCT@scale.data <- x@assays$SCT@scale.data[indx,] }else{ x = subset(x = x, features = f) } return(as.SingleCellExperiment(x)) }, f = features)

Can anyone tell me why? Thanks.

Error in parse(file) : 
  C:...\rstudio-scratch-20b827c91a5d.R:25:128: unexpected symbol
24: }
25: objects.sce <- lapply(X = object.list, FUN = function(x,f) { if (DefaultAssay(x) == 'SCT') { x = subset(x = x, features = f) indx
                                                                                                                                    ^

callmekelvinn avatar Mar 31 '23 02:03 callmekelvinn

You're missing a semicolon (;) or a line break between consecutive statements, for example between the subset() call and the indx assignment.

Here's the corrected code:

objects.sce <- lapply(X = object.list, FUN = function(x, f) {
  if (DefaultAssay(x) == 'SCT') {
    x = subset(x = x, features = f)
    indx <- match(rownames(x@assays$SCT@counts), rownames(x@assays$SCT@scale.data))
    x@assays$SCT@scale.data <- x@assays$SCT@scale.data[indx, ]
  } else {
    x = subset(x = x, features = f)
  }
  return(as.SingleCellExperiment(x))
}, f = features)
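
The "unexpected symbol" error can be reproduced in isolation. A minimal sketch in base R, using placeholder names (subset_result and indx are only illustrative) and parse(text = ...) so that nothing is evaluated:

```r
# Two statements on one line with no separator: the parser rejects this.
one_line <- "x = subset_result indx <- 1"
failed <- inherits(try(parse(text = one_line), silent = TRUE), "try-error")

# Separating the statements with a semicolon or a newline parses cleanly.
with_semicolon <- parse(text = "x = subset_result; indx <- 1")
with_newline <- parse(text = "x = subset_result\nindx <- 1")

# Each corrected version contains two separate statements.
n_statements <- c(length(with_semicolon), length(with_newline))
```

This is why pasting a multi-line function body onto a single line fails unless each statement is separated by a semicolon.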

Vitarum avatar Jul 22 '23 17:07 Vitarum