FindTransferAnchors gives an error when using a Harmony-integrated scRNA dataset as reference and spatial data as query
Hi
I'm trying to deconvolve my sequencing-based spatial data using my scRNA data, which is an integrated dataset of two independent scRNA datasets. I'm following the tutorial at https://satijalab.org/seurat/articles/spatial_vignette#integration-with-single-cell-data for my analysis. However, when I run FindTransferAnchors, I get the error "Error: Given reference assay (SCT) has 2 reference sct models. Please provide a reference assay with a single reference sct model.". I've seen issue #7291 and the issues referenced there about using the "integrated" assay when running FindTransferAnchors; however, no integrated assay is created when following the Seurat V5 tutorial https://satijalab.org/seurat/articles/integration_introduction#perform-integration-with-sctransform-normalized-datasets for performing integration. Below is a reproducible example for my issue, along with my session info. Please let me know if I'm doing anything wrong. @saketkc
Thanks!
Load Data
library(Seurat)
library(SeuratData)
library(dplyr)  # provides %>% used further down
ifnb <- LoadData("ifnb")
split datasets and process without integration
ifnb[["RNA"]] <- split(ifnb[["RNA"]], f = ifnb$stim)
ifnb <- SCTransform(ifnb)
ifnb <- RunPCA(ifnb)
ifnb <- RunUMAP(ifnb, dims = 1:30)
DimPlot(ifnb, reduction = "umap", group.by = c("stim", "seurat_annotations"))
integrate datasets
ifnb <- IntegrateLayers(object = ifnb, method = HarmonyIntegration, normalization.method = "SCT", verbose = FALSE, new.reduction = "integrated.harmony")
ifnb <- FindNeighbors(ifnb, reduction = "integrated.harmony", dims = 1:30)
ifnb <- FindClusters(ifnb, resolution = 0.6)
ifnb <- RunUMAP(ifnb, dims = 1:30, reduction = "integrated.harmony", reduction.name = "umap.harmony", return.model = TRUE)
DimPlot(ifnb, reduction = "umap.harmony", group.by = c("stim", "seurat_annotations"))
Spatial Data
brain <- LoadData("stxBrain", type = "anterior1")
brain <- SCTransform(brain, assay = "Spatial", verbose = FALSE)
brain <- RunPCA(brain, assay = "SCT", verbose = FALSE)
brain <- FindNeighbors(brain, reduction = "pca", dims = 1:30)
brain <- FindClusters(brain, verbose = FALSE)
brain <- RunUMAP(brain, reduction = "pca", dims = 1:30)
cortex <- subset(brain, idents = c(1, 2, 3, 4, 6, 7))
cortex <- subset(cortex, anterior1_imagerow > 400 | anterior1_imagecol < 150, invert = TRUE)
cortex <- subset(cortex, anterior1_imagerow > 275 & anterior1_imagecol > 370, invert = TRUE)
cortex <- subset(cortex, anterior1_imagerow > 250 & anterior1_imagecol > 440, invert = TRUE)
cortex <- SCTransform(cortex, assay = "Spatial", verbose = FALSE) %>% RunPCA(verbose = FALSE)
anchors <- FindTransferAnchors(reference = ifnb, query = cortex, normalization.method = "SCT")
predictions.assay <- TransferData(anchorset = anchors, refdata = ifnb$seurat_annotations, prediction.assay = TRUE, weight.reduction = cortex[["pca"]], dims = 1:30)
cortex[["predictions"]] <- predictions.assay
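The error message refers to the number of SCT models stored in the reference assay. A minimal sketch for checking that number before calling FindTransferAnchors is shown here. `count_sct_models` is a hypothetical helper name, not a Seurat function; it assumes the assay is an `SCTAssay` whose models live in the `SCTModel.list` slot.

```r
library(Seurat)

# Hypothetical helper (not part of Seurat): count the SCT models stored in an assay.
# Returns 0 when the assay is not an SCTAssay.
count_sct_models <- function(object, assay = "SCT") {
  sct_assay <- object[[assay]]
  if (!inherits(sct_assay, "SCTAssay")) {
    return(0L)
  }
  length(slot(sct_assay, "SCTModel.list"))
}

# Usage with the objects from the example above:
# count_sct_models(ifnb)    # the error message implies 2 after splitting by stim
# count_sct_models(cortex)
```

A reference that reports more than one model here would be expected to trigger the same error.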
sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] tools grid stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] stxBrain.SeuratData_0.1.1 ggdendro_0.1.23 scales_1.3.0 ifnb.SeuratData_3.1.0
[5] SeuratData_0.2.2.9001 RColorBrewer_1.1-3 lubridate_1.9.3 forcats_1.0.0
[9] stringr_1.5.1 purrr_1.0.2 readr_2.1.5 tidyr_1.3.0
[13] tibble_3.2.1 tidyverse_2.0.0 SeuratDisk_0.0.0.9021 dplyr_1.1.4
[17] ComplexHeatmap_2.18.0 clusterProfiler_4.10.0 org.Hs.eg.db_3.18.0 AnnotationDbi_1.64.1
[21] IRanges_2.36.0 S4Vectors_0.40.2 Biobase_2.62.0 BiocGenerics_0.48.1
[25] harmony_1.2.0 Rcpp_1.0.12 readxl_1.4.3 reshape2_1.4.4
[29] ggplot2_3.4.4 splitstackshape_1.4.8 patchwork_1.2.0 sctransform_0.4.1
[33] Seurat_5.0.1 SeuratObject_5.0.1 sp_2.1-2
loaded via a namespace (and not attached):
[1] fs_1.6.3 matrixStats_1.2.0 spatstat.sparse_3.0-3 bitops_1.0-7
[5] enrichplot_1.22.0 HDO.db_0.99.1 httr_1.4.7 doParallel_1.0.17
[9] utf8_1.2.4 R6_2.5.1 lazyeval_0.2.2 uwot_0.1.16
[13] GetoptLong_1.0.5 withr_3.0.0 gridExtra_2.3 progressr_0.14.0
[17] textshaping_0.3.7 cli_3.6.2 spatstat.explore_3.2-5 fastDummies_1.7.3
[21] scatterpie_0.2.1 labeling_0.4.3 spatstat.data_3.0-4 ggridges_0.5.5
[25] pbapply_1.7-2 systemfonts_1.0.5 yulab.utils_0.1.3 gson_0.1.0
[29] DOSE_3.28.2 parallelly_1.36.0 limma_3.58.1 rstudioapi_0.15.0
[33] RSQLite_2.3.4 generics_0.1.3 gridGraphics_0.5-1 shape_1.4.6
[37] ica_1.0-3 spatstat.random_3.2-2 GO.db_3.18.0 Matrix_1.6-5
[41] fansi_1.0.6 abind_1.4-5 lifecycle_1.0.4 yaml_2.3.8
[45] SummarizedExperiment_1.32.0 SparseArray_1.2.3 glmGamPoi_1.14.0 qvalue_2.34.0
[49] BiocFileCache_2.10.1 Rtsne_0.17 blob_1.2.4 promises_1.2.1
[53] crayon_1.5.2 miniUI_0.1.1.1 lattice_0.22-5 cowplot_1.1.2
[57] KEGGREST_1.42.0 pillar_1.9.0 knitr_1.45 GenomicRanges_1.54.1
[61] fgsea_1.28.0 rjson_0.2.21 future.apply_1.11.1 codetools_0.2-19
[65] fastmatch_1.1-4 leiden_0.4.3.1 glue_1.7.0 ggfun_0.1.4
[69] remotes_2.4.2.1 data.table_1.14.10 vctrs_0.6.5 png_0.1-8
[73] treeio_1.26.0 spam_2.10-0 cellranger_1.1.0 gtable_0.3.4
[77] cachem_1.0.8 xfun_0.41 S4Arrays_1.2.0 mime_0.12
[81] tidygraph_1.3.0 survival_3.5-7 iterators_1.0.14 statmod_1.5.0
[85] ellipsis_0.3.2 fitdistrplus_1.1-11 ROCR_1.0-11 nlme_3.1-164
[89] ggtree_3.10.0 bit64_4.0.5 filelock_1.0.3 RcppAnnoy_0.0.21
[93] GenomeInfoDb_1.38.5 irlba_2.3.5.1 KernSmooth_2.23-22 colorspace_2.1-0
[97] DBI_1.2.1 processx_3.8.3 tidyselect_1.2.0 bit_4.0.5
[101] compiler_4.3.2 curl_5.2.0 hdf5r_1.3.9 DelayedArray_0.28.0
[105] desc_1.4.3 plotly_4.10.4 shadowtext_0.1.3 lmtest_0.9-40
[109] callr_3.7.3 rappdirs_0.3.3 digest_0.6.34 goftest_1.2-3
[113] presto_1.0.0 spatstat.utils_3.0-4 rmarkdown_2.25 RhpcBLASctl_0.23-42
[117] XVector_0.42.0 htmltools_0.5.7 pkgconfig_2.0.3 sparseMatrixStats_1.14.0
[121] MatrixGenerics_1.14.0 dbplyr_2.4.0 fastmap_1.1.1 rlang_1.1.3
[125] GlobalOptions_0.1.2 htmlwidgets_1.6.4 DelayedMatrixStats_1.24.0 shiny_1.8.0
[129] farver_2.1.1 zoo_1.8-12 jsonlite_1.8.8 BiocParallel_1.36.0
[133] GOSemSim_2.28.1 RCurl_1.98-1.14 magrittr_2.0.3 GenomeInfoDbData_1.2.11
[137] ggplotify_0.1.2 dotCall64_1.1-1 munsell_0.5.0 ape_5.7-1
[141] viridis_0.6.4 reticulate_1.34.0 stringi_1.8.3 ggraph_2.1.0
[145] zlibbioc_1.48.0 MASS_7.3-60.0.1 pkgbuild_1.4.3 plyr_1.8.9
[149] parallel_4.3.2 listenv_0.9.1 ggrepel_0.9.5 deldir_2.0-2
[153] Biostrings_2.70.1 graphlayouts_1.1.0 splines_4.3.2 tensor_1.5
[157] hms_1.1.3 circlize_0.4.15 ps_1.7.6 igraph_1.6.0
[161] spatstat.geom_3.2-7 RcppHNSW_0.5.0 pkgload_1.3.4 evaluate_0.23
[165] tzdb_0.4.0 foreach_1.5.2 tweenr_2.0.2 httpuv_1.6.13
[169] RANN_2.6.1 polyclip_1.10-6 future_1.33.1 clue_0.3-65
[173] scattermore_1.2 ggforce_0.4.1 xtable_1.8-4 RSpectra_0.16-1
[177] tidytree_0.4.6 later_1.3.2 ragg_1.2.7 viridisLite_0.4.2
[181] aplot_0.2.2 memoise_2.0.1 cluster_2.1.6 timechange_0.3.0
[185] globals_0.16.2
This seems to be currently not supported with V5. cc @dcollins15
Unfortunately, Saket is right: it's not possible to use a Harmony-integrated dataset as the reference for FindTransferAnchors, since the integration works on a dimensional reduction of the original counts matrix. This is true of v5 integration generally, so that the behavior of IntegrateLayers is consistent across algorithms.
If you want to use an integrated dataset as a reference, my recommendation would be to fall back to the v3 implementation.
Thank you for your response @saketkc and @dcollins15. For now, I aggregated the two samples using cellranger, then performed clustering using Seurat and batch correction using Harmony, which gave me a result similar to what I was getting before with the integration algorithm. I was able to use this as a reference to deconvolute my spatial data. Thanks!!
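For readers who want to try the same workaround, here is a rough sketch of what it might look like in R. It is not the poster's actual code. `build_single_model_reference`, `merged_scrna`, and the `"sample"` column are hypothetical names, and the sketch assumes the samples are already combined in one object whose RNA assay has not been split into layers, so that SCTransform fits a single model.

```r
library(Seurat)
library(harmony)

# Hypothetical helper: normalize an unsplit object (assumed to yield one SCT model),
# then batch-correct the PCA embedding with Harmony and cluster on it.
build_single_model_reference <- function(object, batch_var, dims = 1:30) {
  object <- SCTransform(object, verbose = FALSE)
  object <- RunPCA(object, verbose = FALSE)
  # RunHarmony stores the corrected embedding as the "harmony" reduction by default
  object <- RunHarmony(object, group.by.vars = batch_var)
  object <- FindNeighbors(object, reduction = "harmony", dims = dims)
  object <- FindClusters(object, resolution = 0.6)
  object <- RunUMAP(object, reduction = "harmony", dims = dims)
  object
}

# Usage (object and column names are placeholders):
# reference <- build_single_model_reference(merged_scrna, batch_var = "sample")
# anchors <- FindTransferAnchors(reference = reference, query = cortex,
#                                normalization.method = "SCT")
```

In this sketch the Harmony embedding drives clustering and annotation of the reference. Whether it influences anchor finding depends on the arguments passed to FindTransferAnchors, which is worth checking against the function's documentation.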