seurat-object icon indicating copy to clipboard operation
seurat-object copied to clipboard

`subset(features = ...)` yields `Error in .subscript.2ary(x, i, j, drop = TRUE) : subscript out of bounds` after SCTransform

Open mschilli87 opened this issue 1 month ago • 6 comments

R
R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
library(Seurat)
Loading required package: SeuratObject
Loading required package: sp

Attaching package: ‘SeuratObject’

The following object is masked from ‘package:base’:

    intersect

library(SeuratData)
── Installed datasets ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── SeuratData v0.2.2.9001 ──
✔ pbmc3k 3.1.4

─────────────────────────────────────────────────────────────────────────────────────── Key ───────────────────────────────────────────────────────────────────────────────────────
✔ Dataset loaded successfully
❯ Dataset built with a newer version of Seurat than installed
❓ Unknown version of Seurat installed
#InstallData("pbmc3k")

sobj <- LoadData("pbmc3k")
Validating object structure
Updating object slots
Ensuring keys are in the proper structure
Warning: Assay RNA changing from Assay to Assay
Ensuring keys are in the proper structure
Ensuring feature names don't have underscores or pipes
Updating slots in RNA
Validating object structure for Assay ‘RNA’
Object representation is consistent with the most current Seurat version
Warning: Assay RNA changing from Assay to Assay5
subset(sobj, features=rownames(sobj))
An object of class Seurat
13714 features across 2700 samples within 1 assay
Active assay: RNA (13714 features, 0 variable features)
 2 layers present: counts, data
subset(sobj, features=head(rownames(sobj)))
An object of class Seurat
6 features across 2700 samples within 1 assay
Active assay: RNA (6 features, 0 variable features)
 2 layers present: counts, data
sobj <- SCTransform(sobj, vars.to.regress = "nCount_RNA", verbose = FALSE)
subset(sobj, features=rownames(sobj))
An object of class Seurat
25144 features across 2700 samples within 2 assays
Active assay: SCT (12572 features, 3000 variable features)
 3 layers present: counts, data, scale.data
 1 other assay present: RNA
> subset(sobj, features=head(rownames(sobj)))
subset(sobj, features=head(rownames(sobj)))
Error in .subscript.2ary(x, i, j, drop = TRUE) : subscript out of bounds
sobj <- SCTransform(sobj, vars.to.regress = "nCount_RNA", verbose = FALSE, return.only.var.genes = FALSE)
subset(sobj, features=rownames(sobj))
An object of class Seurat
25144 features across 2700 samples within 2 assays
Active assay: SCT (12572 features, 3000 variable features)
 3 layers present: counts, data, scale.data
 1 other assay present: RNA
subset(sobj, features=head(rownames(sobj)))
Error in .subscript.2ary(x, i, j, drop = TRUE) : subscript out of bounds

~So for some reason, SCT is dropping features that are still returned as rownames even thought SCT is the active assay. On top, subset fails with a cryptic error message instead of handling this sitatuion gracefully (e.g. by returning a Seurat object with all requested features that are present in the SCT alongside a warning why (and which) features had to be dropped and what to do to fix it). Moreover,~ I ran into this when trying to extract normalized counts for genes that I found differentially expressed in a downstream analysis, so those are not just some weird all-zero count edge cases or such.

sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0

locale:
 [1] LC_CTYPE=C.UTF-8           LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Europe/Madrid
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] pbmc3k.SeuratData_3.1.4 SeuratData_0.2.2.9001   Seurat_5.0.3
[4] SeuratObject_5.0.1      sp_2.1-3

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3          jsonlite_1.8.8
  [3] magrittr_2.0.3              spatstat.utils_3.0-4
  [5] zlibbioc_1.48.0             vctrs_0.6.5
  [7] ROCR_1.0-11                 DelayedMatrixStats_1.24.0
  [9] spatstat.explore_3.2-6      RCurl_1.98-1.14
 [11] S4Arrays_1.2.0              htmltools_0.5.7
 [13] SparseArray_1.2.4           sctransform_0.4.1
 [15] parallelly_1.37.1           KernSmooth_2.23-22
 [17] htmlwidgets_1.6.4           ica_1.0-3
 [19] plyr_1.8.9                  plotly_4.10.4
 [21] zoo_1.8-12                  igraph_2.0.3
 [23] mime_0.12                   lifecycle_1.0.4
 [25] pkgconfig_2.0.3             Matrix_1.6-5
 [27] R6_2.5.1                    fastmap_1.1.1
 [29] GenomeInfoDbData_1.2.11     MatrixGenerics_1.14.0
 [31] fitdistrplus_1.1-11         future_1.33.1
 [33] shiny_1.8.0                 digest_0.6.34
 [35] colorspace_2.1-0            patchwork_1.2.0
 [37] S4Vectors_0.40.2            tensor_1.5
 [39] RSpectra_0.16-1             irlba_2.3.5.1
 [41] GenomicRanges_1.54.1        progressr_0.14.0
 [43] fansi_1.0.6                 spatstat.sparse_3.0-3
 [45] httr_1.4.7                  polyclip_1.10-6
 [47] abind_1.4-5                 compiler_4.3.3
 [49] fastDummies_1.7.3           MASS_7.3-60.0.1
 [51] DelayedArray_0.28.0         rappdirs_0.3.3
 [53] tools_4.3.3                 lmtest_0.9-40
 [55] httpuv_1.6.14               future.apply_1.11.1
 [57] goftest_1.2-3               glmGamPoi_1.14.3
 [59] glue_1.7.0                  nlme_3.1-164
 [61] promises_1.2.1              grid_4.3.3
 [63] Rtsne_0.17                  cluster_2.1.6
 [65] reshape2_1.4.4              generics_0.1.3
 [67] gtable_0.3.4                spatstat.data_3.0-4
 [69] tidyr_1.3.1                 data.table_1.15.2
 [71] XVector_0.42.0              utf8_1.2.4
 [73] BiocGenerics_0.48.1         spatstat.geom_3.2-9
 [75] RcppAnnoy_0.0.22            ggrepel_0.9.5
 [77] RANN_2.6.1                  pillar_1.9.0
 [79] stringr_1.5.1               spam_2.10-0
 [81] RcppHNSW_0.6.0              later_1.3.2
 [83] splines_4.3.3               dplyr_1.1.4
 [85] lattice_0.22-5              survival_3.5-8
 [87] deldir_2.0-4                tidyselect_1.2.0
 [89] miniUI_0.1.1.1              pbapply_1.7-2
 [91] gridExtra_2.3               IRanges_2.36.0
 [93] SummarizedExperiment_1.32.0 scattermore_1.2
 [95] stats4_4.3.3                Biobase_2.62.0
 [97] matrixStats_1.2.0           stringi_1.8.3
 [99] lazyeval_0.2.2              codetools_0.2-19
[101] tibble_3.2.1                cli_3.6.2
[103] uwot_0.1.16                 xtable_1.8-4
[105] reticulate_1.35.0           munsell_0.5.0
[107] Rcpp_1.0.12                 GenomeInfoDb_1.38.6
[109] globals_0.16.2              spatstat.random_3.2-3
[111] png_0.1-8                   parallel_4.3.3
[113] ellipsis_0.3.2              ggplot2_3.5.0
[115] dotCall64_1.1-1             sparseMatrixStats_1.14.0
[117] bitops_1.0-7                listenv_0.9.1
[119] viridisLite_0.4.2           scales_1.3.0
[121] ggridges_0.5.6              leiden_0.4.3.1
[123] purrr_1.0.2                 crayon_1.5.2
[125] rlang_1.1.3                 cowplot_1.1.3

mschilli87 avatar May 23 '24 13:05 mschilli87