DietSeurat does not remove data and scale layers
While trying to merge data, I wanted to use DietSeurat in order to only keep the "counts" layer. However, this appears to be impossible at the moment:
DietSeurat(pbmc_small,layers = "counts")->data
data
An object of class Seurat
230 features across 80 samples within 1 assay
Active assay: RNA (230 features, 20 variable features)
3 layers present: counts, data, scale.data
Sessioninfo:
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.4.3 DoubletFinder_2.0.3 SCINA_1.2.0 gplots_3.1.3 MASS_7.3-58.1 sleepwalk_0.3.2 lubridate_1.9.3
[8] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.3 purrr_1.0.2 readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
[15] ggplot2_3.4.4 tidyverse_2.0.0 Seurat_5.0.0 SeuratObject_5.0.0 sp_2.1-1
loaded via a namespace (and not attached):
[1] Rtsne_0.16 colorspace_2.1-0 deldir_1.0-9 ellipsis_0.3.2 ggridges_0.5.4 RcppHNSW_0.5.0
[7] rstudioapi_0.15.0 spatstat.data_3.0-3 leiden_0.4.3 listenv_0.9.0 ggrepel_0.9.4 RSpectra_0.16-1
[13] fansi_1.0.5 R.methodsS3_1.8.2 codetools_0.2-18 splines_4.2.2 polyclip_1.10-6 spam_2.10-0
[19] jsonlite_1.8.7 ica_1.0-3 cluster_2.1.4 R.oo_1.25.0 png_0.1-8 uwot_0.1.16
[25] shiny_1.7.5.1 sctransform_0.4.1 spatstat.sparse_3.0-3 compiler_4.2.2 httr_1.4.7 Matrix_1.6-3
[31] fastmap_1.1.1 lazyeval_0.2.2 cli_3.6.1 later_1.3.1 htmltools_0.5.7 tools_4.2.2
[37] igraph_1.5.1 dotCall64_1.1-0 gtable_0.3.4 glue_1.6.2 RANN_2.6.1 reshape2_1.4.4
[43] Rcpp_1.0.11 scattermore_1.2 cellranger_1.1.0 vctrs_0.6.4 spatstat.explore_3.2-5 nlme_3.1-160
[49] progressr_0.14.0 lmtest_0.9-40 spatstat.random_3.2-1 globals_0.16.2 timechange_0.2.0 mime_0.12
[55] miniUI_0.1.1.1 lifecycle_1.0.4 irlba_2.3.5.1 gtools_3.9.4 goftest_1.2-3 future_1.33.0
[61] jrc_0.6.0 zoo_1.8-12 scales_1.2.1 hms_1.1.3 promises_1.2.1 spatstat.utils_3.0-4
[67] parallel_4.2.2 RColorBrewer_1.1-3 reticulate_1.34.0 pbapply_1.7-2 gridExtra_2.3 stringi_1.8.1
[73] fastDummies_1.7.3 caTools_1.18.2 bitops_1.0-7 rlang_1.1.2 pkgconfig_2.0.3 matrixStats_1.1.0
[79] lattice_0.20-45 ROCR_1.0-11 tensor_1.5 patchwork_1.1.3 htmlwidgets_1.6.2 cowplot_1.1.1
[85] tidyselect_1.2.0 parallelly_1.36.0 RcppAnnoy_0.0.21 plyr_1.8.9 magrittr_2.0.3 R6_2.5.1
[91] generics_0.1.3 pillar_1.9.0 withr_2.5.2 fitdistrplus_1.1-11 survival_3.4-0 abind_1.4-5
[97] future.apply_1.11.0 KernSmooth_2.23-20 utf8_1.2.4 spatstat.geom_3.2-7 plotly_4.10.3 tzdb_0.4.0
[103] grid_4.2.2 data.table_1.14.8 digest_0.6.33 xtable_1.8-4 httpuv_1.6.12 R.utils_2.12.2
[109] munsell_0.5.0 viridisLite_0.4.2
@ddiez I am still running into this same issue. I'm not a GitHub wizard by any stretch of the imagination, but from a look at #8197 it seems to be held up pending review. Can you advise?
seuObj <- DietSeurat(pbmc_small, layers = "counts")
seuObj
An object of class Seurat
230 features across 80 samples within 1 assay
Active assay: RNA (230 features, 20 variable features)
3 layers present: counts, data, scale.data
Session Info:
`sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] gtsummary_1.7.2 gt_0.10.0 msigdbr_7.5.1 ggnetwork_0.5.12 cluster_2.1.6 SingleR_2.4.0
[7] scDblFinder_1.16.0 ggridges_0.5.5 gghalves_0.1.4 ggforce_0.4.1 viridis_0.6.4 viridisLite_0.4.2
[13] patchwork_1.1.3 scran_1.30.0 scuttle_1.12.0 SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 Biobase_2.62.0
[19] GenomicRanges_1.54.1 GenomeInfoDb_1.38.5 IRanges_2.36.0 S4Vectors_0.40.2 BiocGenerics_0.48.1 MatrixGenerics_1.14.0
[25] matrixStats_1.2.0 Seurat_5.0.1 SeuratObject_5.0.1 sp_2.1-2 lubridate_1.9.3 forcats_1.0.0
[31] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
[37] ggplot2_3.4.4 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] GSEABase_1.64.0 progress_1.2.3 urlchecker_1.0.1 goftest_1.2-3 DT_0.31 Biostrings_2.70.1
[7] HDF5Array_1.30.0 vctrs_0.6.5 spatstat.random_3.2-2 digest_0.6.33 png_0.1-8 ggrepel_0.9.4
[13] deldir_2.0-2 parallelly_1.36.0 MASS_7.3-60 reshape2_1.4.4 httpuv_1.6.13 qvalue_2.34.0
[19] withr_2.5.2 xfun_0.41 ggfun_0.1.3 ellipsis_0.3.2 survival_3.5-7 memoise_2.0.1
[25] ggbeeswarm_0.7.2 profvis_0.3.8 tidytree_0.4.6 zoo_1.8-12 pbapply_1.7-2 prettyunits_1.2.0
[31] KEGGREST_1.42.0 promises_1.2.1 httr_1.4.7 restfulr_0.0.15 globals_0.16.2 fitdistrplus_1.1-11
[37] rhdf5filters_1.14.1 ps_1.7.5 rhdf5_2.46.1 rstudioapi_0.15.0 miniUI_0.1.1.1 generics_0.1.3
[43] processx_3.8.3 babelgene_22.9 curl_5.2.0 zlibbioc_1.48.0 ScaledMatrix_1.10.0 polyclip_1.10-6
[49] GenomeInfoDbData_1.2.11 SparseArray_1.2.3 desc_1.4.3 xtable_1.8-4 S4Arrays_1.2.0 BiocFileCache_2.10.1
[55] hms_1.1.3 irlba_2.3.5.1 colorspace_2.1-0 filelock_1.0.3 ROCR_1.0-11 reticulate_1.34.0
[61] spatstat.data_3.0-3 shinyWidgets_0.8.0 magrittr_2.0.3 lmtest_0.9-40 later_1.3.2 ggtree_3.10.0
[67] lattice_0.22-5 spatstat.geom_3.2-7 future.apply_1.11.1 scattermore_1.2 XML_3.99-0.16 cowplot_1.1.2
[73] RcppAnnoy_0.0.21 pillar_1.9.0 nlme_3.1-164 compiler_4.3.2 beachmat_2.18.0 RSpectra_0.16-1
[79] stringi_1.8.3 shinycssloaders_1.0.0 tensor_1.5 devtools_2.4.5 GenomicAlignments_1.38.0 plyr_1.8.9
[85] crayon_1.5.2 abind_1.4-5 BiocIO_1.12.0 scater_1.30.1 gridGraphics_0.5-1 locfit_1.5-9.8
[91] bit_4.0.5 codetools_0.2-19 BiocSingular_1.18.0 bslib_0.6.1 plotly_4.10.3 mime_0.12
[97] splines_4.3.2 Rcpp_1.0.11 fastDummies_1.7.3 dbplyr_2.4.0 sparseMatrixStats_1.14.0 shinyFiles_0.9.3
[103] knitr_1.45 blob_1.2.4 utf8_1.2.4 fs_1.6.3 listenv_0.9.0 DelayedMatrixStats_1.24.0
[109] pkgbuild_1.4.3 GSVA_1.50.0 ggplotify_0.1.2 Matrix_1.6-4 callr_3.7.3 statmod_1.5.0
[115] tzdb_0.4.0 tweenr_2.0.2 pkgconfig_2.0.3 tools_4.3.2 cachem_1.0.8 RSQLite_2.3.4
[121] DBI_1.2.0 fastmap_1.1.1 scales_1.3.0 grid_4.3.2 usethis_2.2.2 ica_1.0-3
[127] shinydashboard_0.7.2 Rsamtools_2.18.0 sass_0.4.8 dotCall64_1.1-1 graph_1.80.0 RANN_2.6.1
[133] cerebroApp_1.3.1 farver_2.1.1 yaml_2.3.8 rtracklayer_1.62.0 cli_3.6.2 leiden_0.4.3.1
[139] lifecycle_1.0.4 uwot_0.1.16 bluster_1.12.0 sessioninfo_1.2.2 BiocParallel_1.36.0 annotate_1.80.0
[145] timechange_0.2.0 gtable_0.3.4 rjson_0.2.21 progressr_0.14.0 parallel_4.3.2 ape_5.7-1
[151] limma_3.58.1 jsonlite_1.8.8 colourpicker_1.3.0 edgeR_4.0.4 RcppHNSW_0.5.0 bitops_1.0-7
[157] bit64_4.0.5 xgboost_1.7.6.1 Rtsne_0.17 yulab.utils_0.1.2 spatstat.utils_3.0-4 BiocNeighbors_1.20.1
[163] jquerylib_0.1.4 metapod_1.10.1 dqrng_0.3.2 shinyjs_2.1.0 lazyeval_0.2.2 shiny_1.8.0
[169] htmltools_0.5.7 sctransform_0.4.1 rappdirs_0.3.3 glue_1.6.2 spam_2.10-0 broom.helpers_1.14.0
[175] XVector_0.42.0 RCurl_1.98-1.13 treeio_1.26.0 gridExtra_2.3 igraph_1.6.0 R6_2.5.1
[181] pkgload_1.3.3 Rhdf5lib_1.24.1 aplot_0.2.2 DelayedArray_0.28.0 tidyselect_1.2.0 vipor_0.4.7
[187] xml2_1.3.6 AnnotationDbi_1.64.1 future_1.33.1 rsvd_1.0.5 munsell_0.5.0 KernSmooth_2.23-22
[193] data.table_1.14.10 htmlwidgets_1.6.4 RColorBrewer_1.1-3 biomaRt_2.58.0 rlang_1.1.2 spatstat.sparse_3.0-3
[199] spatstat.explore_3.2-5 remotes_2.4.2.1 fansi_1.0.6 beeswarm_0.4.0`
@NBrittonPhD I think this is normal. It is not possible to commit into the repo branches without being a project member (which I am not). I imagine they will review and merge if appropriate when they have time. There are many other issues opened and DietSeurat is relatively low priority for the regular user, I would say. So, we will have to wait :-)
I encountered the same issue. My team's workaround for removing, say, the scale.data layer is:
pbmc[["RNA"]]$scale.data <- NULL
I hope the Seurat team fixes this in time. DietSeurat() is easy to remember and teach others.
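For readers who want to try the workaround above, here is a minimal sketch that applies the same NULL assignment and then lists the remaining layers. It assumes Seurat and SeuratObject v5 are installed and uses the `pbmc_small` example dataset from earlier in this thread; the variable name `obj` is only for illustration, and behaviour may differ between object versions, as later replies report for the data layer.

```r
library(Seurat)  # assumption: Seurat / SeuratObject v5, as in the reports above

# Work on a copy of the example dataset used earlier in this thread
obj <- pbmc_small

# Same workaround as above: assign NULL to the layer to drop
obj[["RNA"]]$scale.data <- NULL

# List the layers that remain, to confirm whether scale.data is gone
remaining <- Layers(obj)
print(remaining)
```

Checking `Layers()` afterwards matters because, as the rest of this thread shows, a removal request can silently leave a layer in place.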
The DietSeurat function was a favorite of mine (https://x.com/vangalenlab/status/1288592376353583109?s=20), but with R version 4.3.1, Seurat_5.0.1 and SeuratObject_5.0.1, it does not work as expected.
I tried these commands:
aml_small1 <- DietSeurat(aml)
aml_small2 <- DietSeurat(aml, layers = "counts")
Since the DietSeurat documentation says layers = NULL and "layers - A vector or named list of layers to keep", it is unexpected that both these commands leave all three layers intact in the "small" object.
> Layers(aml_small1)
[1] "counts" "data" "scale.data"
> Layers(aml_small2)
[1] "counts" "data" "scale.data"
I also tried to provide a list to the layers argument, but that yielded an error (Error in .PropagateList(): ! None of the values of 'x' match with 'names').
I also tried @adairama's solution, which works for a Seurat object that is version 4.9.9.9083 (tested with Version(aml)), but I'm unable to remove the data layer for the bpdcn object, which is version 5.0.1:
> bpdcn[["RNA"]]$data <- NULL
Warning message:
Resetting the data matrix to the raw counts
> Layers(bpdcn)
[1] "counts" "data"
Hello, I also wanted to reduce a Seurat object to only the counts layer and a single dimensional reduction out of the many it contained (CCA and RPCA integrations) for export, and encountered the same problem as everyone else: DietSeurat() does not remove the data and scale.data layers.
For anyone interested, here is the simple code I used to produce my diet object anyway:
# First save the idents you want to keep in a data frame. Here I only keep 2, but you can create as many columns as you want:
idents.df = data.frame("orig.ident" = integrated$orig.ident, "rpca_clusters" = integrated$rpca_clusters)
# Then create a new Seurat object with the counts layer and idents of your object:
export = CreateSeuratObject(counts = LayerData(integrated, assay = "RNA", layer = "counts"), project = "Export", meta.data = idents.df)
# Transfer UMAP and/or other reductions. Two methods are shown; the second one, with [[""]], is better
export@reductions$pca = integrated@reductions$pca
export[["umap"]] = integrated[["umap.rpca"]]
# Additional code to change the key (the axis label prefix on your UMAP plot)
Key(export[["umap"]]) = "UMAP_"
# Transfer variable features if needed
VariableFeatures(export) = VariableFeatures(integrated)
# The object is ready; visualize it to verify everything is in order.
# Set whichever identity you want to plot (the last assignment is the one in effect):
Idents(export) = "orig.ident"
Idents(export) = "rpca_clusters"
DimPlot(export, reduction = "umap", label = TRUE, pt.size = 1, raster = TRUE)
This works !
I was also having problems using DietSeurat() to extract the "counts" layer for further subclustering work (subsetting a minor cell type and reclustering with batch correction) on my Seurat object. It looks like those who ran the SCTransform workflow may not have a problem with DietSeurat(), because their "RNA" assay only has a "counts" layer. But for those who ran the standard Seurat workflow (NormalizeData() > FindVariableFeatures() > ScaleData()), this method will work!
Happy that the code I created is helping :)
I didn't include it in my original message but some people might want to transfer also variable features, I will edit to add the line.
You also helped me, thank you so much!
Thanks, however, I hope that DietSeurat() will be updated for v5.
Hi, for anyone interested, I have gathered my code from above, plus a few additions, into a package you can install; see one of my repositories, RightSeuratTools, for more info. It should help as a placeholder until Seurat's dev team is able to update DietSeurat().
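As a rough illustration of how the export recipe posted earlier in this thread could be wrapped into a single function, here is a minimal sketch. The helper `make_counts_only()` and its arguments are hypothetical names invented for this example; it is not the code from RightSeuratTools, nor from Seurat itself. It assumes Seurat v5 and is demonstrated on the `pbmc_small` example dataset, using its `groups` metadata column and its `pca` and `tsne` reductions.

```r
library(Seurat)  # assumption: Seurat / SeuratObject v5

# Hypothetical helper (not part of Seurat): rebuild an object that keeps only
# the counts layer, chosen metadata columns, and chosen reductions.
make_counts_only <- function(object, assay = "RNA", meta_cols = NULL, reductions = NULL) {
  # Pull the raw counts matrix from the source object
  counts <- LayerData(object, assay = assay, layer = "counts")

  # Keep all metadata, or only the requested columns
  meta <- object[[]]
  if (!is.null(meta_cols)) {
    meta <- meta[, meta_cols, drop = FALSE]
  }

  # A fresh object only ever receives the counts layer
  slim <- CreateSeuratObject(counts = counts, assay = assay, meta.data = meta)

  # Copy over the requested reductions (none if reductions is NULL)
  for (red in reductions) {
    slim[[red]] <- object[[red]]
  }

  # Carry over the variable features
  VariableFeatures(slim) <- VariableFeatures(object)
  slim
}

slim <- make_counts_only(
  pbmc_small,
  meta_cols = c("orig.ident", "groups"),
  reductions = c("pca", "tsne")
)
print(Layers(slim))
```

Building a new object instead of deleting layers from the old one avoids relying on layer removal, which is the part reported as unreliable in this thread.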
I had the same issue. Using removeLayersByPattern from the Seurat.utils package worked for me. It removes all scale.data layers.
library(Seurat)
library(Seurat.utils)
seurat.obj <- removeLayersByPattern(seurat.obj, pattern = "scale.data", perl = TRUE)