seurat-disk

Error in Initializing RNA with data

Open paupuigdevall opened this issue 4 years ago • 4 comments

Hi.

This is probably not directly related to a SeuratDisk function, but when I tried to read a large .h5ad file converted to .h5seurat by Convert(), loading crashed in LoadH5Seurat(). The dataset I intend to load is huge (>1M cells). Is it possible that R, and consequently Seurat, is limited to a matrix size that does not allow importing such files? Do you know of any workaround for this error?

## My code
library(Seurat)
library(SeuratDisk)
library(hdf5r)

file=paste0(pathToDir,"allpools.scanpy.init.h5ad")
fileSeurat=paste0(pathToDir,"allpools.scanpy.init.h5seurat")
Convert(file, dest = fileSeurat, overwrite = TRUE)
test <- LoadH5Seurat(fileSeurat)

This is the error message I obtained:

Registered S3 method overwritten by 'cli':
  method     from
  print.boxx spatstat
Registered S3 method overwritten by 'SeuratDisk':
  method            from
  as.sparse.H5Group Seurat
Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as data
Adding X as counts
Adding meta.features from var
Validating h5Seurat file
Initializing RNA with data
Error in if ((lp <- length(p)) < 1 || p[1] != 0 || any((dp <- p[-1] - :
  missing value where TRUE/FALSE needed
Calls: LoadH5Seurat ... as.matrix.H5Group -> as.sparse -> as.sparse.H5Group -> sparseMatrix
In addition: Warning message:
In sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
  NAs introduced by coercion to integer range
Execution halted
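One possible reading of the warning and error above, sketched in base R only: R integers are 32-bit, so a column-pointer value larger than 2^31 - 1 turns into NA when coerced, and the NA then breaks the `if` check inside sparseMatrix(). The value `big_pointer` below is hypothetical and not taken from the dataset in this issue; whether this is what happened here is an assumption.

```r
# R integers are 32-bit, so the largest representable value is 2^31 - 1.
int_max <- .Machine$integer.max

# Hypothetical pointer value just past that limit, held as a double.
big_pointer <- 2^31

# Coercion yields NA (R warns "NAs introduced by coercion to integer range").
coerced <- suppressWarnings(as.integer(big_pointer))

# An NA in an `if` condition raises an error
# ("missing value where TRUE/FALSE needed" in an English locale).
outcome <- tryCatch(
  if (coerced != 0) "nonzero" else "zero",
  error = function(e) "error"
)

print(int_max)
print(coerced)
print(outcome)
```

If that is the cause, the matrix has more nonzero entries than a single in-memory sparse matrix with integer pointers can index, so loading a subset of cells or splitting the file may be needed.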

Thanks in advance for your help.

paupuigdevall, Nov 09 '20 21:11

Was this ever resolved? I have the same problem.
Oddly, when I create the matrix by hand, it works:

hfile <- Connect("~/Downloads/allexpsctl.h5seurat")
x<-hfile[["assays/RNA/data"]]
sp<-sparseMatrix(i=x[["indices"]][]+1,p=x[["indptr"]][],x=x[["data"]][]  )
sp[1:3,1:3]
3 x 3 sparse Matrix of class "dgCMatrix"
                        
[1,] 2.662056 0.668771 .
[2,] 3.856432 .        .
[3,] 2.741403 .        .

I get a sparse matrix, but LoadH5Seurat() fails:

obj <- LoadH5Seurat("~/Downloads/allexpsctl.h5seurat",assays="RNA")
Validating h5Seurat file
Initializing RNA with data
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][],  : 
  all(dims >= dims.min) is not TRUE

BTW: My h5seurat file was created by converting an h5ad file:

Convert("~/Downloads/allexpsctl.h5ad",dest="h5seurat",overwrite=TRUE,verbose=TRUE)

danielruss, Mar 24 '21 22:03

This may be part of the problem. When I add the "dims" argument to the sparseMatrix function (which I previously left off), the function worked. However, if you compare against the dims recorded in the h5 file, you see they are transposed. As it turns out, the sparse matrix data in the h5 file is transposed, but the dimensions in h5attr(x = x, which = "dims") are not transposed, which leads to shape problems.

> hfile <- Connect("~/Downloads/allexpsctl.h5seurat")
Validating h5Seurat file
> x<-hfile[["assays/RNA/data"]]
> sp<-sparseMatrix(i=x[["indices"]][]+1,p=x[["indptr"]][],x=x[["data"]][]  )
> dim(sp)
[1] 43890 25069
> h5attr(x = x, which = "dims")
[1] 25069 43890
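The mismatch above can be reproduced in miniature with the Matrix package alone. This is a sketch under an assumption: that the source .h5ad stored X row-compressed (CSR) and that those arrays are being read as column-compressed (CSC). The toy 2 x 3 matrix and its hand-written arrays are illustrative and not taken from the file in this thread.

```r
library(Matrix)

# Toy matrix: 2 rows, 3 columns.
dense <- matrix(c(1, 0, 2,
                  0, 3, 0), nrow = 2, byrow = TRUE)

# Its row-compressed (CSR) arrays, written out by hand, 0-based.
indptr  <- c(0, 2, 3)   # row pointers
indices <- c(0, 2, 1)   # column index of each stored value
values  <- c(1, 2, 3)

# Reading the same arrays as column-compressed gives the transpose,
# so the dims have to be reversed to match.
as_csc <- sparseMatrix(i = indices + 1, p = indptr, x = values,
                       dims = rev(dim(dense)))

# Passing the untransposed dims fails, because row index 3
# does not fit in a matrix with 2 rows.
swapped <- tryCatch(
  sparseMatrix(i = indices + 1, p = indptr, x = values, dims = dim(dense)),
  error = function(e) e
)

print(as_csc)
print(inherits(swapped, "error"))
```

The exact error text depends on the installed Matrix version, so the sketch only checks that an error is raised.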

danielruss, Mar 24 '21 22:03

I managed to load the dataset mentioned by @danielruss after transposing the data.

Convert("./matrices/allexpsctl.h5ad", "H5Seurat", overwrite = TRUE)

obj_HDF5 <- Connect("./matrices/allexpsctl.h5seurat", mode = "r+")

Transpose(obj_HDF5[["assays/RNA/counts"]], overwrite = TRUE)
Transpose(obj_HDF5[["assays/RNA/data"]], overwrite = TRUE)
obj_HDF5$link_delete("assays/RNA/counts")
obj_HDF5$link_delete("assays/RNA/data")
obj_HDF5$link_move_from(obj_HDF5, "assays/RNA/t_counts", "assays/RNA/counts")
obj_HDF5$link_move_from(obj_HDF5, "assays/RNA/t_data", "assays/RNA/data")
old_dims <- hdf5r::h5attr(obj_HDF5[["assays/RNA/data"]], "dims")
new_dims <- rev(old_dims)
hdf5r::h5attr(obj_HDF5[["assays/RNA/counts"]], "dims") <- new_dims
hdf5r::h5attr(obj_HDF5[["assays/RNA/data"]], "dims") <- new_dims
obj_HDF5$close()

pormr, Mar 01 '22 13:03

How long did the transpose take? I am facing a similar problem with the Allen mouse brain atlas dataset.

JesusGF1, Mar 22 '23 11:03