seurat
seurat copied to clipboard
Detecting Pseudobulk DACRs with FindMarkers using DESeq2
Currently analysing a 10X Multiome dataset. When I run FindMarkers() on an ATAC assay with custom peaks using DESeq2 I get an error back. Appreciate if someone here could teach me how to understand and troubleshoot this.
Code used:
# https://github.com/satijalab/seurat/discussions/7763
# add 1 so that DESeq2 will run
so@assays$customPeakList$counts <- as.matrix(so@assays$customPeakList$counts+1)
# FindMarkers()
markers <- FindMarkers(so, ident.1 = "Control", ident.2 = "Knockout",
group.by = "sampleSource",
assay = "customPeakList",
test.use = "DESeq2")
Resulting error message:
converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
newsplit: out of vertex space
In addition: There were 17 warnings (use warnings() to see them)
sessionInfo:
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /usr/lib64/libblas.so.3.4.2
LAPACK: /usr/lib64/liblapack.so.3.4.2
locale:
[1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C LC_TIME=en_AU.UTF-8
[4] LC_COLLATE=en_AU.UTF-8 LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8
[7] LC_PAPER=en_AU.UTF-8 LC_NAME=en_AU.UTF-8 LC_ADDRESS=en_AU.UTF-8
[10] LC_TELEPHONE=en_AU.UTF-8 LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=en_AU.UTF-8
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DESeq2_1.38.3 SummarizedExperiment_1.28.0 MatrixGenerics_1.10.0
[4] matrixStats_1.1.0 SeuratDisk_0.0.0.9020 scCustomize_1.1.1
[7] cowplot_1.1.3 ggpubr_0.4.0 xlsx_0.6.5
[10] RColorBrewer_1.1-3 viridisLite_0.4.2 circlize_0.4.15
[13] ComplexHeatmap_2.14.0 patchwork_1.1.3.9000 TFBSTools_1.34.0
[16] JASPAR2020_0.99.10 BSgenome.Mmusculus.UCSC.mm10_1.4.3 BSgenome_1.64.0
[19] rtracklayer_1.56.1 Biostrings_2.66.0 XVector_0.38.0
[22] EnsDb.Mmusculus.v79_2.99.0 ensembldb_2.20.2 AnnotationFilter_1.20.0
[25] GenomicFeatures_1.48.4 AnnotationDbi_1.58.0 Biobase_2.58.0
[28] GenomicRanges_1.50.2 GenomeInfoDb_1.34.9 IRanges_2.32.0
[31] S4Vectors_0.36.2 BiocGenerics_0.44.0 forcats_0.5.1
[34] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[37] readr_2.1.2 tidyr_1.3.0 tibble_3.2.1
[40] ggplot2_3.4.4 tidyverse_1.3.1 Signac_1.12.0
[43] Seurat_5.0.0 SeuratObject_5.0.1 sp_2.1-1
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 ggprism_1.0.4 scattermore_1.2 R.methodsS3_1.8.2
[5] bit64_4.0.5 knitr_1.45 irlba_2.3.5.1 DelayedArray_0.24.0
[9] R.utils_2.12.2 data.table_1.14.8 KEGGREST_1.36.0 RCurl_1.98-1.13
[13] doParallel_1.0.17 generics_0.1.3 RSQLite_2.2.14 RANN_2.6.1
[17] future_1.33.0 bit_4.0.4 tzdb_0.3.0 spatstat.data_3.0-3
[21] xml2_1.3.3 lubridate_1.8.0 httpuv_1.6.12 DirichletMultinomial_1.38.0
[25] xfun_0.41 rJava_1.0-6 hms_1.1.1 evaluate_0.23
[29] promises_1.2.1 fansi_1.0.6 restfulr_0.0.15 progress_1.2.2
[33] caTools_1.18.2 dbplyr_2.4.0 readxl_1.4.0 geneplotter_1.74.0
[37] igraph_1.5.1 DBI_1.1.3 htmlwidgets_1.6.3 spatstat.geom_3.2-7
[41] paletteer_1.5.0 ellipsis_0.3.2 RSpectra_0.16-1 backports_1.4.1
[45] annotate_1.74.0 biomaRt_2.52.0 deldir_1.0-9 vctrs_0.6.5
[49] ROCR_1.0-11 abind_1.4-5 cachem_1.0.6 withr_2.5.0
[53] progressr_0.14.0 sctransform_0.4.1 GenomicAlignments_1.34.1 prettyunits_1.2.0
[57] goftest_1.2-3 cluster_2.1.3 dotCall64_1.1-0 lazyeval_0.2.2
[61] seqLogo_1.62.0 crayon_1.5.2 hdf5r_1.3.5 spatstat.explore_3.2-5
[65] pkgconfig_2.0.3 vipor_0.4.5 nlme_3.1-157 ProtGenerics_1.28.0
[69] rlang_1.1.3 globals_0.16.2 lifecycle_1.0.4 miniUI_0.1.1.1
[73] filelock_1.0.2 fastDummies_1.7.3 BiocFileCache_2.6.1 modelr_0.1.8
[77] ggrastr_1.0.1 cellranger_1.1.0 polyclip_1.10-6 RcppHNSW_0.5.0
[81] lmtest_0.9-40 Matrix_1.6-3 carData_3.0-5 zoo_1.8-12
[85] beeswarm_0.4.0 reprex_2.0.1 ggridges_0.5.4 GlobalOptions_0.1.2
[89] png_0.1-8 rjson_0.2.21 bitops_1.0-7 R.oo_1.25.0
[93] KernSmooth_2.23-20 spam_2.10-0 blob_1.2.3 shape_1.4.6
[97] parallelly_1.36.0 spatstat.random_3.2-1 rstatix_0.7.0 ggsignif_0.6.3
[101] CNEr_1.32.0 scales_1.3.0 memoise_2.0.1 magrittr_2.0.3
[105] plyr_1.8.9 ica_1.0-3 zlibbioc_1.44.0 compiler_4.2.0
[109] BiocIO_1.6.0 clue_0.3-64 fitdistrplus_1.1-11 snakecase_0.11.0
[113] Rsamtools_2.14.0 cli_3.6.2 listenv_0.9.0 pbapply_1.7-2
[117] MASS_7.3-57 tidyselect_1.2.0 stringi_1.8.1 yaml_2.3.7
[121] locfit_1.5-9.7 ggrepel_0.9.4 fastmatch_1.1-3 tools_4.2.0
[125] future.apply_1.11.0 parallel_4.2.0 rstudioapi_0.15.0 TFMPvalue_0.0.8
[129] foreach_1.5.2 janitor_2.2.0 gridExtra_2.3 Rtsne_0.16
[133] digest_0.6.33 shiny_1.8.0 pracma_2.4.2 Rcpp_1.0.12
[137] car_3.0-13 broom_0.8.0 later_1.3.1 RcppAnnoy_0.0.21
[141] httr_1.4.7 colorspace_2.1-0 rvest_1.0.2 XML_3.99-0.15
[145] fs_1.5.2 tensor_1.5 reticulate_1.35.0 splines_4.2.0
[149] uwot_0.1.16 RcppRoll_0.3.0 rematch2_2.1.2 spatstat.utils_3.0-4
[153] xlsxjars_0.6.1 plotly_4.10.3 xtable_1.8-4 jsonlite_1.8.7
[157] poweRlaw_0.70.6 R6_2.5.1 pillar_1.9.0 htmltools_0.5.7
[161] mime_0.12 glue_1.7.0 fastmap_1.1.1 BiocParallel_1.32.6
[165] codetools_0.2-18 utf8_1.2.4 lattice_0.20-45 spatstat.sparse_3.0-3
[169] ggbeeswarm_0.6.0 curl_5.1.0 leiden_0.4.3.1 gtools_3.9.5
[173] GO.db_3.15.0 survival_3.5-7 rmarkdown_2.25 munsell_0.5.0
[177] GetoptLong_1.0.5 GenomeInfoDbData_1.2.9 iterators_1.0.14 haven_2.5.0
[181] reshape2_1.4.4 gtable_0.3.4
You do NOT need to do this:
so@assays$customPeakList$counts <- as.matrix(so@assays$customPeakList$counts+1
You misunderstood the discussion at https://github.com/satijalab/seurat/discussions/7763 , this is only necassary if you want to run DESeq2 WITHOUT Seurat.
I removed the step
so@assays$customPeakList$counts <- as.matrix(so@assays$customPeakList$counts+1
I now get this error message: Does this mean my data is too sparse?
Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc, :
every gene contains at least one zero, cannot compute log geometric means
Hi, this error occurs if every single feature contains at least one zero, so it does indicate sparsity. Can you describe more about your data here? How did you pseudobulk here? If you look at some entries of so@assays$customPeakList$counts
, does it seem very sparse across the board or are there a couple of particular outlier samples causing the sparsity?
Thanks for using Seurat!
It appears that this issue has gone stale. In an effort to keep our Issues board from getting more unruly than it already is, we’re going to begin closing out issues that haven’t had any recent activity.
If this issue is still relevant we strongly encourage you to reopen or repost it.