seurat-disk
Error converting Tabula Sapiens h5ad files to h5seurat
Has anyone else run into these issues when converting with SeuratDisk? My main error is "Error in if (!x[[i]]$dims) { : argument is of length zero" (see below). Convert seemed to work OK, except that it warns about an unknown file type, h5ad. I'd approach the Tabula Sapiens people, but they would likely point me your way first, since they used scanpy (h5ad) exports.
Convert("TS_Kidney.h5ad", dest = "h5seurat")
Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as data
Adding raw/X as counts
Adding meta.features from raw/var
Adding X_pca as cell embeddings for pca
Adding X_scvi as cell embeddings for scvi
Adding X_scvi_umap as cell embeddings for scvi_umap
Adding X_umap as cell embeddings for umap
Adding miscellaneous information for umap
Adding _scvi to miscellaneous data
Adding _training_mode to miscellaneous data
Adding cell_ontology_class_colors to miscellaneous data
Adding dendrogram_cell_type_tissue to miscellaneous data
Adding dendrogram_computational_compartment_assignment to miscellaneous data
Adding dendrogram_consensus_prediction to miscellaneous data
Adding dendrogram_tissue_cell_type to miscellaneous data
Adding donor_colors to miscellaneous data
Adding donor_method_colors to miscellaneous data
Adding hvg to miscellaneous data
Adding method_colors to miscellaneous data
Adding organ_tissue_colors to miscellaneous data
Adding sex_colors to miscellaneous data
Adding tissue_colors to miscellaneous data
Adding layer decontXcounts as data in assay decontXcounts
Adding layer decontXcounts as counts in assay decontXcounts
Adding layer raw_counts as data in assay raw_counts
Adding layer raw_counts as counts in assay raw_counts
Warning message:
In fun(libname, pkgname) :
rgeos: versions of GEOS runtime 3.10.3-CAPI-1.16.1 and GEOS at installation 3.7.2-CAPI-1.11.2differ
kidney <- LoadH5Seurat("TS_Kidney.h5seurat")
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Initializing decontXcounts with data
Adding counts for decontXcounts
Initializing raw_counts with data
Adding counts for raw_counts
Adding reduction pca
Adding cell embeddings for pca
Adding miscellaneous information for pca
Adding reduction scvi
Adding cell embeddings for scvi
Adding miscellaneous information for scvi
Adding reduction scvi_umap
Adding cell embeddings for scvi_umap
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from scvi_umap_ to scviumap_
Warning: All keys should be one or more alphanumeric characters followed by an underscore '_', setting key to scviumap_
Adding miscellaneous information for scvi_umap
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Error in if (!x[[i]]$dims) { : argument is of length zero
sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.6 (Ootpa)
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.15.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_3.3.6 anndata_0.7.5.3 SeuratDisk_0.0.0.9020 sp_1.5-0 SeuratObject_4.1.0
[6] Seurat_4.1.1
loaded via a namespace (and not attached):
[1] plyr_1.8.7 igraph_1.3.4 lazyeval_0.2.2 splines_4.1.3
[5] listenv_0.8.0 scattermore_0.8 GenomeInfoDb_1.30.1 digest_0.6.29
[9] htmltools_0.5.3 fansi_1.0.3 magrittr_2.0.3 tensor_1.5
[13] cluster_2.1.3 ROCR_1.0-11 globals_0.15.1 matrixStats_0.62.0
[17] spatstat.sparse_2.1-1 colorspace_2.0-3 rappdirs_0.3.3 ggrepel_0.9.1
[21] xfun_0.31 dplyr_1.0.9 crayon_1.5.1 RCurl_1.98-1.7
[25] jsonlite_1.8.0 progressr_0.10.1 spatstat.data_2.2-0 survival_3.3-1
[29] zoo_1.8-10 glue_1.6.2 polyclip_1.10-0 gtable_0.3.0
[33] zlibbioc_1.40.0 XVector_0.34.0 leiden_0.4.2 DelayedArray_0.20.0
[37] future.apply_1.9.0 BiocGenerics_0.40.0 abind_1.4-5 scales_1.2.0
[41] spatstat.random_2.2-0 miniUI_0.1.1.1 Rcpp_1.0.9 viridisLite_0.4.0
[45] xtable_1.8-4 reticulate_1.25 spatstat.core_2.4-4 bit_4.0.4
[49] stats4_4.1.3 htmlwidgets_1.5.4 httr_1.4.3 RColorBrewer_1.1-3
[53] ellipsis_0.3.2 ica_1.0-3 farver_2.1.1 pkgconfig_2.0.3
[57] uwot_0.1.11 deldir_1.0-6 utf8_1.2.2 here_1.0.1
[61] labeling_0.4.2 tidyselect_1.1.2 rlang_1.0.4 reshape2_1.4.4
[65] later_1.3.0 munsell_0.5.0 tools_4.1.3 cli_3.3.0
[69] generics_0.1.3 ggridges_0.5.3 evaluate_0.15 stringr_1.4.0
[73] fastmap_1.1.0 yaml_2.3.5 goftest_1.2-3 knitr_1.39
[77] bit64_4.0.5 fitdistrplus_1.1-8 purrr_0.3.4 RANN_2.6.1
[81] sparseMatrixStats_1.6.0 pbapply_1.5-0 future_1.26.1 nlme_3.1-158
[85] mime_0.12 hdf5r_1.3.5 compiler_4.1.3 rstudioapi_0.13
[89] plotly_4.10.0 png_0.1-7 spatstat.utils_2.3-1 tibble_3.1.7
[93] glmGamPoi_1.6.0 stringi_1.7.8 RSpectra_0.16-1 rgeos_0.5-9
[97] lattice_0.20-45 Matrix_1.4-1 vctrs_0.4.1 pillar_1.8.0
[101] lifecycle_1.0.1 spatstat.geom_2.4-0 lmtest_0.9-40 RcppAnnoy_0.0.19
[105] data.table_1.14.2 cowplot_1.1.1 bitops_1.0-7 irlba_2.3.5
[109] httpuv_1.6.5 patchwork_1.1.1 GenomicRanges_1.46.1 R6_2.5.1
[113] promises_1.2.0.1 KernSmooth_2.23-20 gridExtra_2.3 IRanges_2.28.0
[117] parallelly_1.32.0 codetools_0.2-18 MASS_7.3-58 assertthat_0.2.1
[121] SummarizedExperiment_1.24.0 rprojroot_2.0.3 withr_2.5.0 sctransform_0.3.3
[125] S4Vectors_0.32.4 GenomeInfoDbData_1.2.7 mgcv_1.8-40 parallel_4.1.3
[129] grid_4.1.3 rpart_4.1.16 tidyr_1.2.0 DelayedMatrixStats_1.16.0
[133] rmarkdown_2.14 MatrixGenerics_1.6.0 Rtsne_0.16 Biobase_2.54.0
[137] shiny_1.7.2
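For anyone puzzling over the message itself: in R, `if` raises "argument is of length zero" whenever its condition has length zero, which is what `!x[[i]]$dims` produces when `$dims` is `NULL`. Here is a minimal base-R sketch of that mechanism. `entry` is a made-up stand-in, not SeuratDisk's actual internal object, so this only illustrates the failure mode and does not pinpoint which part of the file triggers it.

```r
# Hypothetical stand-in for one element of the list being iterated over.
# It has no `dims` element, so `entry$dims` evaluates to NULL.
entry <- list(name = "some_dataset")

# `!NULL` is logical(0); `if` cannot branch on a zero-length condition.
res <- tryCatch(
  {
    if (!entry$dims) "no dims" else "has dims"
  },
  error = function(e) conditionMessage(e)
)
print(res)

# A defensive check that never errors on a missing element:
# isTRUE() returns FALSE for NULL and for zero-length values.
has_dims <- isTRUE(entry$dims)
print(has_dims)
```

So the error suggests that some entry reached during loading lacks the `dims` value the loader expects, which fits with the workarounds in this thread that skip or patch parts of the file.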
Same error here trying to convert the bone marrow dataset:
Convert("TS_bone_Marrow.h5ad", dest = "h5seurat", overwrite = TRUE)
TS_bone_marrow <- LoadH5Seurat("TS_Bone_Marrow.h5seurat")
Error in if (!x[[i]]$dims) { : argument is of length zero
anyone??
This worked for me, FYI:
library(Seurat)
library(SeuratObject)
library(SeuratDisk)
library(anndata)
library(ggplot2)
library(matrixStats)
h5ad_file <- read_h5ad("tabula_sapiens_whatever.h5ad")
# Counts are in the "layers" slot -- use these for raw RNA-seq counts.
raw_counts <- h5ad_file$layers$get("raw_counts")
# Filter out all-zero columns so there are no error messages about zero counts
# (before the transpose below, columns are genes).
raw_counts <- raw_counts[, which(colSums(raw_counts) > 0.0)]
# Can also add variance filters here if desired. I didn't need to.
h5ad_df <- as.data.frame(raw_counts)
# Need to transpose to be in the correct format of genes = rows, barcodes = columns.
h5ad_df <- t(h5ad_df)
# Pull out observations for metadata injection.
h5ad_obs <- as.data.frame(h5ad_file$obs)
tsfile <- CreateSeuratObject(h5ad_df)
tsfile <- AddMetaData(tsfile, h5ad_obs)
# Filtering out only the 10X counts -- they include both smartseq and 10x in the data.
Idents(object = tsfile) <- "method"
tsfile <- subset(x = tsfile, idents = "10X")
# Now you can change your idents to whatever.
# Idents(object = tsfile) <- "cell_ontology_class"
# All set from here on in, you have a Seurat object with metadata.
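A possible variation on the filtering and transpose steps above, in case the layer comes back as a sparse matrix: the sketch below keeps everything sparse with the Matrix package instead of going through `as.data.frame`. The toy `raw_counts` matrix and its cell and gene names are invented for illustration, and whether `read_h5ad` returns a sparse matrix for a given file is an assumption.

```r
library(Matrix)

# Toy cells x genes matrix standing in for the "raw_counts" layer
# (values and names are made up; gene g3 has no counts at all).
raw_counts <- Matrix(
  c(1, 0, 2,   # g1 across cell1..cell3
    0, 3, 0,   # g2
    0, 0, 0),  # g3
  nrow = 3, sparse = TRUE,
  dimnames = list(c("cell1", "cell2", "cell3"), c("g1", "g2", "g3"))
)

# Keep only genes (columns) with a non-zero total.
keep <- Matrix::colSums(raw_counts) > 0
filtered <- raw_counts[, keep, drop = FALSE]

# Transpose to genes x cells without densifying.
counts <- Matrix::t(filtered)
print(dim(counts))
```

`CreateSeuratObject` accepts a sparse genes x cells matrix like `counts` directly, which may save a lot of memory on the larger Tabula Sapiens tissues.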
Great,
I also figured it out.
I did simply this:
BM_rna <- LoadH5Seurat("TS_Bone_Marrow.h5seurat", assays = "RNA")
Great -- that approach didn't work for me though. As long as something works...
Approach by @moschmi worked for me.
Didn't work at first. Restart RStudio and it should work.
I'm getting another error when trying to load the lung dataset.
> Convert(source=tss.data.file, dest=tmp.file.name)
Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as data
Adding raw/X as counts
Adding meta.features from raw/var
Adding X_pca as cell embeddings for pca
Adding X_scvi as cell embeddings for scvi
Adding X_scvi_umap as cell embeddings for scvi_umap
Adding X_umap as cell embeddings for umap
Adding miscellaneous information for umap
Adding _scvi to miscellaneous data
Adding _training_mode to miscellaneous data
Adding cell_ontology_class_colors to miscellaneous data
Adding dendrogram_cell_type_tissue to miscellaneous data
Adding dendrogram_computational_compartment_assignment to miscellaneous data
Adding dendrogram_consensus_prediction to miscellaneous data
Adding dendrogram_tissue_cell_type to miscellaneous data
Adding donor_colors to miscellaneous data
Adding donor_method_colors to miscellaneous data
Adding hvg to miscellaneous data
Adding method_colors to miscellaneous data
Adding organ_tissue_colors to miscellaneous data
Adding sex_colors to miscellaneous data
Adding tissue_colors to miscellaneous data
Adding layer decontXcounts as data in assay decontXcounts
Adding layer decontXcounts as counts in assay decontXcounts
Adding layer raw_counts as data in assay raw_counts
Adding layer raw_counts as counts in assay raw_counts
> tss.srt.obj <- LoadH5Seurat(file=tmp.file.name, assays='RNA', meta.data=FALSE, misc = FALSE)
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Error in match.arg(arg = layer, choices = Layers(object = object, search = FALSE)) :
'arg' should be one of “counts”, “data”, “scale.data”
Has anyone else come across this error?
@VicenteFR - I don't see where your code references the .h5ad dataset. Can you share that?
I got the same issue, have you figured it out?
Here is my solution.
mkdir -p /data/cqs/references/Tabula_Sapiens
cd /data/cqs/references/Tabula_Sapiens
wget -O TabulaSapiens.h5ad.zip https://figshare.com/ndownloader/files/40067134
unzip TabulaSapiens.h5ad.zip
setwd('/data/cqs/references/Tabula_Sapiens')
library(Seurat)
library(SeuratData)
library(SeuratDisk)
library(hdf5r)
h5ad_file <- 'TabulaSapiens.h5ad'
Convert(h5ad_file, dest = 'h5seurat')
h5seurat_file <- 'TabulaSapiens.h5seurat'
#https://github.com/mojaveazure/seurat-disk/issues/109
f <- H5File$new(h5seurat_file, "r+")
groups <- f$ls(recursive = TRUE)
# Expose each "categories" dataset under the sibling name "levels".
for (name in groups$name[grepl("categories$", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(names[-length(names)], "levels")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
}
# Expose each "codes" dataset under the sibling name "values".
for (name in groups$name[grepl("codes$", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(names[-length(names)], "values")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
  grp <- f[[new_name]]
  # Add 1 to the stored codes so they work as 1-based indices in R.
  grp$write(args = list(1:grp$dims), value = grp$read() + 1)
}
f$close_all()
obj <- LoadH5Seurat(h5seurat_file, assays = "RNA")
saveRDS(obj,'TabulaSapiens.rds')
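As I read the loops above, they expose each `categories` dataset as `levels` and each `codes` dataset as `values`, adding 1 to the codes along the way. The +1 matters because the codes written by the Python side start at 0, while R indexes from 1. Here is a small base-R sketch of that mapping, with invented category names and codes (nothing here is read from the real file):

```r
# Made-up example of one categorical column as stored in the file:
# 0-based integer codes plus the table of category labels.
categories <- c("B cell", "T cell", "NK cell")
codes <- c(0L, 2L, 1L, 0L)

# Shift to 1-based indices, as the patch does with `grp$read() + 1`.
values <- codes + 1L

# Rebuild the column as an R factor from values + levels.
cell_type <- factor(categories[values], levels = categories)
print(cell_type)
```

Without the shift, a code of 0 would index nothing in R and every other code would point at the wrong label, so the +1 is the part of the patch to double-check if labels look off after loading.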