seurat-disk
Error in Initializing RNA with data
Hi.
This is probably not directly related to a SeuratDisk function, but when I tried to read a large .h5ad file converted to .h5seurat by Convert(), loading crashed in LoadH5Seurat(). The dataset I intend to load is huge (>1M cells). Is it possible that R, and consequently Seurat, is limited to a matrix size that cannot accommodate such files? Do you know of any workaround for this error?
## My code
library(Seurat)
library(SeuratDisk)
library(hdf5r)
file=paste0(pathToDir,"allpools.scanpy.init.h5ad")
fileSeurat=paste0(pathToDir,"allpools.scanpy.init.h5seurat")
Convert(file, dest = fileSeurat, overwrite = TRUE)
test <- LoadH5Seurat(fileSeurat)
This is the error message I obtained:
Registered S3 method overwritten by 'cli':
method from
print.boxx spatstat
Registered S3 method overwritten by 'SeuratDisk':
method from
as.sparse.H5Group Seurat
Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as data
Adding X as counts
Adding meta.features from var
Validating h5Seurat file
Initializing RNA with data
Error in if ((lp <- length(p)) < 1 || p[1] != 0 || any((dp <- p[-1] - :
missing value where TRUE/FALSE needed
Calls: LoadH5Seurat ... as.matrix.H5Group -> as.sparse -> as.sparse.H5Group -> sparseMatrix
In addition: Warning message:
In sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
NAs introduced by coercion to integer range
Execution halted
Thanks in advance for your help.
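The "NAs introduced by coercion to integer range" warning in the log is consistent with R's 32-bit integer limit: a matrix with more than about 2.1 billion stored values would have indptr entries that R cannot hold as integers. The following is a minimal sketch of that failure mode, assuming (the thread does not confirm this) that the file's indptr values exceed .Machine$integer.max. The indptr vector here is made up for illustration.

```r
# Hypothetical column pointers: the last value exceeds the 32-bit integer range.
indptr <- c(0, 1500000000, 3000000000)

# R integers are 32-bit; this is the largest value they can hold.
max_int <- .Machine$integer.max

# Coercion turns out-of-range values into NA (normally with a warning).
p <- suppressWarnings(as.integer(indptr))
print(p)

# A monotonicity check like the one in the traceback then evaluates to NA,
# which if() cannot treat as TRUE or FALSE.
dp_negative <- any(diff(p) < 0)
print(dp_negative)
```

If this is the cause, the error comes from the integer coercion rather than from the overall size of the matrix in memory.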
Was this ever resolved? I am seeing the same error.
Oddly, when I create the matrix by hand, it works:
hfile <- Connect("~/Downloads/allexpsctl.h5seurat")
x<-hfile[["assays/RNA/data"]]
sp<-sparseMatrix(i=x[["indices"]][]+1,p=x[["indptr"]][],x=x[["data"]][] )
sp[1:3,1:3]
3 x 3 sparse Matrix of class "dgCMatrix"
[1,] 2.662056 0.668771 .
[2,] 3.856432 . .
[3,] 2.741403 . .
I get a sparse matrix, but LoadH5Seurat() fails on the same file:
obj <- LoadH5Seurat("~/Downloads/allexpsctl.h5seurat",assays="RNA")
Validating h5Seurat file
Initializing RNA with data
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
all(dims >= dims.min) is not TRUE
BTW: My h5seurat file was created by converting an h5ad file.
Convert("~/Downloads/allexpsctl.h5ad",dest="h5seurat",overwrite=TRUE,verbose=TRUE)
This may be part of the problem. When I added the "dims" argument to the sparseMatrix() call (which I had previously left off), the function worked. However, if you look at what the h5 file thinks the dims are, you see they are transposed. As it turns out, the sparse matrix data in the h5 file is transposed, but the dimensions in h5attr(x = x, which = "dims") are not, which leads to shape problems.
> hfile <- Connect("~/Downloads/allexpsctl.h5seurat")
Validating h5Seurat file
> x<-hfile[["assays/RNA/data"]]
> sp<-sparseMatrix(i=x[["indices"]][]+1,p=x[["indptr"]][],x=x[["data"]][] )
> dim(sp)
[1] 43890 25069
> h5attr(x = x, which = "dims")
[1] 25069 43890
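One plausible explanation for the mismatch, assuming the h5ad file stores X in row-compressed (CSR) form while sparseMatrix(i = ..., p = ...) reads the arrays as column-compressed (CSC): the same indices/indptr/data arrays then describe the transpose, so the stored dims no longer fit. The toy arrays below are invented for illustration and use only the Matrix package.

```r
library(Matrix)

# Toy 2 x 3 matrix stored row-compressed (CSR):
#   [1 0 2]
#   [0 3 0]
indices  <- c(0, 2, 1)  # zero-based column index of each stored value
indptr   <- c(0, 2, 3)  # row pointers: row 1 holds 2 values, row 2 holds 1
vals     <- c(1, 2, 3)
csr_dims <- c(2, 3)

# Reading the arrays as row-compressed recovers the original matrix.
as_rows <- sparseMatrix(j = indices + 1, p = indptr, x = vals, dims = csr_dims)

# Reading the same arrays as column-compressed yields the transpose,
# so the dims have to be reversed to match.
as_cols <- sparseMatrix(i = indices + 1, p = indptr, x = vals,
                        dims = rev(csr_dims))

# Keeping the original dims with the column-compressed reading fails,
# because a row index exceeds the declared number of rows.
mismatch <- try(
  sparseMatrix(i = indices + 1, p = indptr, x = vals, dims = csr_dims),
  silent = TRUE
)

print(dim(as_rows))
print(dim(as_cols))
print(inherits(mismatch, "try-error"))
```

In this sketch the fix is either to reverse the dims or to transpose the stored arrays, which mirrors the two observations in this thread.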
I managed to load the dataset mentioned by @danielruss after transposing the data.
Convert("./matrices/allexpsctl.h5ad", "H5Seurat", overwrite = TRUE)
obj_HDF5 <- Connect("./matrices/allexpsctl.h5seurat", mode = "r+")
Transpose(obj_HDF5[["assays/RNA/counts"]], overwrite = TRUE)
Transpose(obj_HDF5[["assays/RNA/data"]], overwrite = TRUE)
obj_HDF5$link_delete("assays/RNA/counts")
obj_HDF5$link_delete("assays/RNA/data")
obj_HDF5$link_move_from(obj_HDF5, "assays/RNA/t_counts", "assays/RNA/counts")
obj_HDF5$link_move_from(obj_HDF5, "assays/RNA/t_data", "assays/RNA/data")
old_dims <- hdf5r::h5attr(obj_HDF5[["assays/RNA/data"]], "dims")
new_dims <- rev(old_dims)
hdf5r::h5attr(obj_HDF5[["assays/RNA/counts"]], "dims") <- new_dims
hdf5r::h5attr(obj_HDF5[["assays/RNA/data"]], "dims") <- new_dims
obj_HDF5$close()
How long did it take to perform the transpose? I am facing a similar problem working with the Allen mouse brain atlas dataset.