matrix with no cells
Hi Ben,
I think that this is a marginal problem.
Anyway, we can end up with matrices in which all cells are filtered out; for example, when every cell has fewer than 100 UMIs.
I use BPCells import_matrix_market() to read the Matrix Market file, then calculate the column sums and drop the columns with low 'counts' (csums > 100 is the keep condition below). Later I use write_matrix_dir() to store the matrix on disk. When I restore the matrix in R using open_matrix_dir(), R reports that the second dimension is 1 rather than 0, and the dimnames are NULL.
> library(BPCells)
> mat <- import_matrix_market('Keyhole.umi_counts.mtx')
> str(mat)
Formal class 'MatrixDir' [package "BPCells"] with 7 slots
..@ dir : chr "/tmp/xxx/RtmpqwNpT9/matrix_market23d73452b52716"
..@ compressed : logi TRUE
..@ buffer_size: int 8192
..@ type : chr "uint32_t"
..@ dim : int [1:2] 70038 1178
..@ transpose : logi FALSE
..@ dimnames :List of 2
.. ..$ : NULL
.. ..$ : NULL
> csums <- colSums(mat)
> mat2 <- mat[,csums>100]
> str(mat2)
Formal class 'MatrixSubset' [package "BPCells"] with 7 slots
..@ matrix :Formal class 'MatrixDir' [package "BPCells"] with 7 slots
.. .. ..@ dir : chr "/tmp/xxx/RtmpqwNpT9/matrix_market23d73452b52716"
.. .. ..@ compressed : logi TRUE
.. .. ..@ buffer_size: int 8192
.. .. ..@ type : chr "uint32_t"
.. .. ..@ dim : int [1:2] 70038 1178
.. .. ..@ transpose : logi FALSE
.. .. ..@ dimnames :List of 2
.. .. .. ..$ : NULL
.. .. .. ..$ : NULL
..@ row_selection: int(0)
..@ col_selection: int(0)
..@ zero_dims : logi [1:2] FALSE TRUE
..@ dim : int [1:2] 70038 0
..@ transpose : logi FALSE
..@ dimnames :List of 2
.. ..$ : NULL
.. ..$ : NULL
> write_matrix_dir(mat2, 'foo_dir',overwrite=TRUE)
70038 x 1 IterableMatrix object with class MatrixDir
Row names: unknown names
Col names: unknown names
Data type: uint32_t
Storage order: column major
> mat3 <- open_matrix_dir('foo_dir')
> str(mat3)
Formal class 'MatrixDir' [package "BPCells"] with 7 slots
..@ dir : chr "/net/xxx/79/5dc9"| __truncated__
..@ compressed : logi TRUE
..@ buffer_size: int 8192
..@ type : chr "uint32_t"
..@ dim : int [1:2] 70038 1
..@ transpose : logi FALSE
..@ dimnames :List of 2
.. ..$ : NULL
.. ..$ : NULL
> dim(mat3)
[1] 70038 1
The value of 1 creates a problem with a Bioconductor package, which expects a value of 0.
I will try working around this by checking for this condition after the open_matrix_dir() call.
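A minimal sketch of that check, assuming the number of kept columns is recorded before write_matrix_dir() is called. is_empty_after_roundtrip() is a hypothetical helper, not a BPCells function, and the commented BPCells lines assume that subsetting the reopened matrix with an all-FALSE logical vector gives a zero-column matrix, as it did for mat2 above.

```r
# Hypothetical helper, not part of BPCells.
# n_kept:       number of columns that passed the filter, recorded before writing
# reopened_dim: dim() of the matrix returned by open_matrix_dir()
# Returns TRUE when no columns were kept but the reopened matrix reports one.
is_empty_after_roundtrip <- function(n_kept, reopened_dim) {
  n_kept == 0L && reopened_dim[2] == 1L
}

# Dimensions taken from the session above: mat2 had 0 columns, mat3 reports 1.
is_empty_after_roundtrip(0L, c(70038L, 1L))

# Intended use with BPCells (assumption: all-FALSE logical subsetting restores
# a zero-column matrix, as it did for mat2):
# n_kept <- ncol(mat2)
# mat3 <- open_matrix_dir('foo_dir')
# if (is_empty_after_roundtrip(n_kept, dim(mat3))) {
#   mat3 <- mat3[, rep(FALSE, ncol(mat3))]
# }
```

Recording the count before writing avoids guessing from the reopened matrix alone, since a genuine one-column result would also report a second dimension of 1.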
I appreciate your consideration and thoughts.
Ever grateful, Brent
I'll look into this! I'm busy until next Tuesday, but I'll update you around then. Thanks, Brent @brgew