
Issue with dotplot due to "duplicate row.names" in seurat object even though they are unique

Open HOEFSLE opened this issue 10 months ago • 1 comments

Hi, I'm running into an issue displaying a DotPlot with my data. According to the error message, there are duplicate row names in my data. However, when I check, my row names (barcodes as well as features) appear to be unique. Oddly, I can run VlnPlot on the same object and it behaves as expected.

Thanks and best regards

> any(duplicated(rownames(seu_obj.integrated)))
[1] FALSE
> any(duplicated(stringr::str_split_fixed(Cells(seu_obj.integrated),'_',1)))
[1] FALSE
> DotPlot(seu_obj.integrated,features= df$ensembl_gene_id, group.by = "cell_ontology",split.by = "cohort")
Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names':

> VlnPlot(seu_obj.integrated,features=df$ensembl_gene_id,group.by ="cell_ontology",split.plot = TRUE,split.by = "cohort")


sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.utf8 LC_CTYPE=German_Germany.utf8
[3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C
[5] LC_TIME=German_Germany.utf8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggthemes_5.0.0 ggplot2_3.4.4 dplyr_1.1.3 Seurat_5.0.1
[5] SeuratObject_5.0.1 sp_2.1-0

loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 jsonlite_1.8.7 magrittr_2.0.3
[4] ggbeeswarm_0.7.2 spatstat.utils_3.0-3 farver_2.1.1
[7] zlibbioc_1.46.0 vctrs_0.6.3 ROCR_1.0-11
[10] memoise_2.0.1 spatstat.explore_3.2-3 RCurl_1.98-1.12
[13] progress_1.2.2 htmltools_0.5.7 curl_5.1.0
[16] sctransform_0.4.1 parallelly_1.36.0 KernSmooth_2.23-21
[19] htmlwidgets_1.6.3 ica_1.0-3 plyr_1.8.9
[22] plotly_4.10.3 zoo_1.8-12 cachem_1.0.8
[25] igraph_1.5.1 mime_0.12 lifecycle_1.0.4
[28] pkgconfig_2.0.3 Matrix_1.6-4 R6_2.5.1
[31] fastmap_1.1.1 GenomeInfoDbData_1.2.10 fitdistrplus_1.1-11
[34] future_1.33.0 shiny_1.8.0 digest_0.6.33
[37] colorspace_2.1-0 patchwork_1.1.3 AnnotationDbi_1.62.2
[40] S4Vectors_0.38.1 tensor_1.5 RSpectra_0.16-1
[43] irlba_2.3.5.1 RSQLite_2.3.3 labeling_0.4.3
[46] filelock_1.0.2 progressr_0.14.0 fansi_1.0.5
[49] spatstat.sparse_3.0-2 httr_1.4.7 polyclip_1.10-6
[52] abind_1.4-5 compiler_4.3.1 withr_2.5.2
[55] bit64_4.0.5 DBI_1.1.3 fastDummies_1.7.3
[58] biomaRt_2.56.1 MASS_7.3-60 rappdirs_0.3.3
[61] tools_4.3.1 vipor_0.4.5 lmtest_0.9-40
[64] beeswarm_0.4.0 httpuv_1.6.11 future.apply_1.11.0
[67] goftest_1.2-3 glue_1.6.2 nlme_3.1-162
[70] promises_1.2.1 grid_4.3.1 Rtsne_0.16
[73] cluster_2.1.4 reshape2_1.4.4 generics_0.1.3
[76] gtable_0.3.4 spatstat.data_3.0-3 tidyr_1.3.0
[79] hms_1.1.3 data.table_1.14.8 xml2_1.3.5
[82] XVector_0.40.0 utf8_1.2.4 BiocGenerics_0.46.0
[85] spatstat.geom_3.2-5 RcppAnnoy_0.0.21 ggrepel_0.9.4
[88] RANN_2.6.1 pillar_1.9.0 stringr_1.5.1
[91] spam_2.10-0 RcppHNSW_0.5.0 later_1.3.1
[94] splines_4.3.1 BiocFileCache_2.8.0 lattice_0.21-8
[97] survival_3.5-5 bit_4.0.5 deldir_1.0-9
[100] httpgd_1.3.1 tidyselect_1.2.0 Biostrings_2.68.1
[103] miniUI_0.1.1.1 pbapply_1.7-2 gridExtra_2.3
[106] IRanges_2.34.1 scattermore_1.2 stats4_4.3.1
[109] Biobase_2.60.0 matrixStats_1.0.0 stringi_1.7.12
[112] lazyeval_0.2.2 codetools_0.2-19 tibble_3.2.1
[115] cli_3.6.1 uwot_0.1.16 systemfonts_1.0.5
[118] xtable_1.8-4 reticulate_1.34.0 munsell_0.5.0
[121] GenomeInfoDb_1.36.4 Rcpp_1.0.11 globals_0.16.2
[124] spatstat.random_3.1-6 dbplyr_2.3.0 png_0.1-8
[127] ggrastr_1.0.2 XML_3.99-0.16 parallel_4.3.1
[130] ellipsis_0.3.2 assertthat_0.2.1 blob_1.2.4
[133] prettyunits_1.2.0 dotCall64_1.1-1 bitops_1.0-7
[136] listenv_0.9.0 viridisLite_0.4.2 scales_1.3.0
[139] ggridges_0.5.4 crayon_1.5.2 leiden_0.4.3.1
[142] purrr_1.0.2 rlang_1.1.1 cowplot_1.1.1
[145] KEGGREST_1.40.1

HOEFSLE avatar Apr 14 '24 19:04 HOEFSLE

Hi, can you double-check if colnames(seu_obj.integrated) has any duplicates, as well as whether there are duplicates in df$ensembl_gene_id?
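
A minimal sketch of that check in base R, in case it helps. The helper name `find_duplicates` and the gene IDs are made up for illustration; they are not part of Seurat or of the original post.

```r
# Hypothetical helper: return the values that occur more than once in a vector.
find_duplicates <- function(x) {
  unique(x[duplicated(x)])
}

# Made-up Ensembl-style IDs standing in for df$ensembl_gene_id.
features <- c("ENSG00000000001", "ENSG00000000002", "ENSG00000000001")

# Any value listed here appears at least twice in the input.
dups <- find_duplicates(features)
print(dups)
```

On the real data the same helper could presumably be applied as `find_duplicates(colnames(seu_obj.integrated))` and `find_duplicates(df$ensembl_gene_id)`; an empty result means no duplicates were found in that vector.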

igrabski avatar Apr 19 '24 17:04 igrabski

Hi, I am having the same problem, and it happens only when I try to run DotPlot.

DotPlot(
  combined,
  cols = c("#00AAD4", "#FF2A2A"),
  col.min = -2.5,
  col.max = 2.5,
  features = rev(goi),
  group.by = "wsnn_res.0.4"
) + coord_flip() -> p1

Warning message:
"non-unique values when setting 'row.names': "
Error in .rowNamesDF<-(x, value = value): duplicate 'row.names' are not allowed

But I do not have any duplicated gene names:

rownames(combined) %>% duplicated() %>% table()
.
FALSE
23614

colnames(combined) %>% duplicated() %>% table()
.
FALSE
23614

Any suggestions as to what might be causing the problem?

raunakkar avatar May 15 '24 10:05 raunakkar

Hi, are there any duplicates in the set of features rev(goi)?

igrabski avatar May 17 '24 18:05 igrabski

Hi @igrabski, no, there were no duplicates in any of the vectors for which I tried to generate the DotPlots.

raunakkar avatar May 18 '24 12:05 raunakkar

I ran into this issue when the Idents(seurat_obj) contained NA. I identified this issue with Idents(seurat_obj) %>% anyNA(). Fixing the idents or filtering these cells solved this issue for me. For example, here is how I filtered my seurat_obj to remove cell_type (my Idents) containing NA:

seurat_obj <- subset(seurat_obj, cell_type %in% na.omit(seurat_obj$cell_type))
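
An equivalent way to express the same filter, sketched on a toy vector so it runs without Seurat. The barcodes and labels below are invented; `cell_type` is assumed to behave like `seurat_obj$cell_type`, that is, one label per cell, named by barcode.

```r
# Toy stand-in for seurat_obj$cell_type: one label per cell, named by barcode.
cell_type <- c(cell_1 = "B", cell_2 = NA, cell_3 = "T", cell_4 = NA)

# Count the missing labels before plotting.
n_missing <- sum(is.na(cell_type))

# Barcodes of the cells that do have a label.
cells_to_keep <- names(cell_type)[!is.na(cell_type)]

print(n_missing)
print(cells_to_keep)

# With a real object (assumes Seurat is loaded; not run here):
# cells_to_keep <- colnames(seurat_obj)[!is.na(seurat_obj$cell_type)]
# seurat_obj <- subset(seurat_obj, cells = cells_to_keep)
```

Selecting by barcode through the `cells` argument of `subset()` keeps the NA test explicit, which may be easier to audit than a metadata expression.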

mjmccoy avatar May 24 '24 22:05 mjmccoy

Hi, I am having the same issue. I have been trying to re-order the cell identities, and this is what I wrote:

> Idents(nucSamples) <- "LowRes"
> nucSamples$LowResOrdered <- factor(nucSamples$LowRes, levels = c(
+     "Epi.1",
+     "EpiCiliated.7",
+     "Endo.2",
+     "Mac.4",
+     "T_NKT.11",
+     "SmMusc.5",
+     "Fibro.0",
+     "Astrocytes.3",
+     "Oligodendrocytes.10",
+     "Chromaffin.8",
+     "Neuron.9",
+     "Cycling.6"
+ ))
> Idents(nucSamples) <- "LowResOrdered"
> 
> DotPlot(nucSamples, features = "EPCAM", group.by = "LowResOrdered")


Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names':  

And you can see that there are no duplicated or NA rows:

> Idents(nucSamples) %>% anyNA()
[1] FALSE
> rownames(nucSamples) %>% duplicated() %>% table()
.
FALSE 
29658 
> colnames(nucSamples) %>% duplicated() %>% table()
.
FALSE 
18034 

ejscience avatar May 27 '24 01:05 ejscience

@mjmccoy Great suggestion! It worked for me. Thanks.

raunakkar avatar May 27 '24 08:05 raunakkar

Great, I'm glad this worked for you, @raunakkar!

@ejscience, are you able to share your object? If you can email it to igrabski [at] nygenome [dot] org, I am happy to take a look.

igrabski avatar May 28 '24 21:05 igrabski

Thanks for sharing your object -- the error turns out to be that one specified factor level had a typo, which resulted in NAs in nucSamples$LowResOrdered and consequently in Idents(nucSamples). In the future, we will look into having a more informative error message for cases like these!
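
For anyone hitting the same thing, here is a small base R sketch of how a misspelled level turns into NA and one way to spot it. The labels are shortened from the example above and the typo ("Endo2" for "Endo.2") is deliberate; the actual typo in the shared object is not stated in this thread.

```r
# Labels as they appear in the metadata column.
low_res <- c("Epi.1", "Endo.2", "Mac.4", "Epi.1")

# Requested ordering with a deliberate typo: "Endo2" instead of "Endo.2".
ordered <- factor(low_res, levels = c("Epi.1", "Endo2", "Mac.4"))

# Values that match no level are silently converted to NA.
has_na <- anyNA(ordered)

# Labels present in the data but missing from the requested levels.
unmatched <- setdiff(unique(low_res), levels(ordered))

print(has_na)
print(unmatched)
```

Running `setdiff()` between the original labels and the requested levels before calling `factor()` points directly at the misspelled entry.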

igrabski avatar Jun 04 '24 13:06 igrabski