
Indexing for arbitrary elements of dimensions

Open Artur-man opened this issue 1 year ago • 7 comments

User story

Hey there,

Is it possible to access arbitrary elements along a dimension (as Rarr does with its index argument) instead of using slice? Is this already implemented, or is it not available at the moment?

zarr.array <- pizzarr::zarr_open(store = "data/mat.zarr")
mat <- array(1:350, c(10, 5, 7))
zarr.array$create_dataset("assay", data = mat, shape = dim(mat))
zarr.array$get_item("assay")$get_item(list(slice(1,6,2), slice(1, 2), slice(1, 1)))$data
, , 1

     [,1] [,2]
[1,]    1   11
[2,]    3   13
[3,]    5   15

It is possible to access a single element.

zarr.array$get_item("assay")$get_item(c(1, 2, 1))$data
, , 1

     [,1]
[1,]   72

But with multiple elements, it doesn't work.

zarr.array$get_item("assay")$get_item(c(1:2, 2, 1))$data
Error in check_selection_length(selection, shape) : TooManyIndicesError
zarr.array$get_item("assay")$get_item(list(c(1,6), 2, 1))$data
Error in if (is.na(stop)) { : the condition has length > 1

Artur-man, Aug 26 '24 11:08

$get_item(c(1:2, 2, 1))

Indexing with numeric vectors is difficult because c() flattens its arguments into a single vector, whereas list() keeps them separate:

> c(1:2, 2, 1)
[1] 1 2 2 1
> list(1:2, 2, 1)
[[1]]
[1] 1 2

[[2]]
[1] 2

[[3]]
[1] 1

Perhaps you could do something fancy with rlang (https://rlang.r-lib.org/reference/topic-defuse.html) to intercept the arguments before they are flattened.
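As a rough illustration of the "intercept prior to flattening" idea, here is a minimal sketch that uses base R's substitute() instead of rlang (it plays a similar role to rlang's defusing functions). The function name split_selection is hypothetical and is not part of pizzarr; it only shows how the arguments of c() could be recovered one by one.

```r
# Hypothetical helper (not a pizzarr function): capture the selection
# expression unevaluated, then evaluate each argument of c() on its own
# so that c(1:2, 2, 1) is seen as three per-dimension selections.
split_selection <- function(sel) {
  expr <- substitute(sel)   # the unevaluated expression, e.g. c(1:2, 2, 1)
  caller <- parent.frame()  # environment in which to evaluate the pieces
  if (is.call(expr) && identical(expr[[1]], as.name("c"))) {
    # Drop the function name `c` and evaluate each argument separately
    lapply(as.list(expr)[-1], function(e) eval(e, envir = caller))
  } else {
    # Anything else is treated as a single selection
    list(sel)
  }
}

parts <- split_selection(c(1:2, 2, 1))
str(parts)
```

This only works when the caller writes c(...) literally at the call site; a vector built beforehand is already flat and cannot be split back into dimensions.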

The vector vs. list issue aside, there is this outstanding need to support integer indexing: https://github.com/keller-mark/pizzarr/issues/43

However, at the moment you could work around this by turning lists of integers into lists of slices:

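library(pizzarr)

# Convert an integer vector into a slice:
#   length 1 -> the single index i
#   length 2 -> the range i[1]..i[2]
#   length 3 -> the range i[1]..i[2] with step i[3]
to_slice <- function(i) {
  if (length(i) == 1) {
    return(slice(i, i))
  }
  if (length(i) == 2) {
    return(slice(i[1], i[2]))
  }
  if (length(i) == 3) {
    return(slice(i[1], i[2], i[3]))
  }
  stop("Expected an indexing vector of length 1, 2, or 3")
}
# x: a list with one integer vector per dimension; z: a Zarr array
selection <- z$get_item(lapply(x, to_slice))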
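Note that with this workaround a two-element vector such as c(1, 6) is read as the range 1 to 6, not as the two elements 1 and 6. For truly arbitrary indices, one possible stopgap is to read the covering range with slices and then subset the result in memory. The sketch below demonstrates the idea on a plain R array; the helper name cover is hypothetical, and the pizzarr call appears only as a comment because it is an assumption based on the slice usage earlier in this thread.

```r
# Hypothetical helper (not part of pizzarr): for one dimension, compute the
# covering range and the positions of the requested indices inside it.
cover <- function(idx) {
  idx <- as.integer(idx)
  list(start = min(idx), stop = max(idx), rel = idx - min(idx) + 1L)
}

mat <- array(1:350, c(10, 5, 7))
wanted <- list(c(1, 6), 2, 1)  # elements 1 and 6 of dim 1; index 2; index 1
cv <- lapply(wanted, cover)

# Stand-in for the Zarr read. With pizzarr this would presumably be
#   z$get_item(lapply(cv, function(d) slice(d$start, d$stop)))$data
block <- mat[cv[[1]]$start:cv[[1]]$stop,
             cv[[2]]$start:cv[[2]]$stop,
             cv[[3]]$start:cv[[3]]$stop, drop = FALSE]

# Pick the requested elements out of the in-memory block.
result <- block[cv[[1]]$rel, cv[[2]]$rel, cv[[3]]$rel, drop = FALSE]
```

The trade-off is that the whole covering range is read, so widely spaced indices pull in far more data than they need.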

keller-mark, Aug 26 '24 13:08

We also have this bracket indexing function which may be relevant: https://github.com/keller-mark/pizzarr/blob/f84355d2708c22dc6e703f3cdd83d218221b352a/R/zarr-array.R#L1213

z[2, 5]

Example in test here: https://github.com/keller-mark/pizzarr/blob/main/tests/testthat/test-s3.R#L47

keller-mark, Aug 26 '24 13:08

This is the part that has to be updated and implemented further, right? I will attempt it if you haven't planned to yet.

https://github.com/keller-mark/pizzarr/blob/f84355d2708c22dc6e703f3cdd83d218221b352a/R/indexing.R#L88-L100

Artur-man, Aug 28 '24 15:08

I like the fact that this repo is functionally an R replica of the original zarr-python implementation. I was able to implement the IntArrayDimIndexer and OrthogonalIndexer classes so that get_item accepts orthogonal selections. There are still a few bugs I need to take care of, but otherwise the DelayedArray assumption of random index access is satisfied.

Here is more info on our DelayedArray extension: https://github.com/BIMSBbioinfo/ZarrArray

Here are some examples:

# write
zarr.array <- pizzarr::zarr_open(store = "data/mat_example.zarr", mode = "w")
mat_test <- matrix(1:100, nrow = 10)
zarr.array$create_dataset("assay", data = mat_test, shape = dim(mat_test), chunks = c(2,2))
# read
zarr.array <- pizzarr::zarr_open(store = "data/mat_example.zarr", mode = "r")
a <- zarr.array$get_item("assay")
a[c(1,6,7),c(2,8,9)]$data
     [,1] [,2] [,3]
[1,]   11   71   81
[2,]   16   76   86
[3,]   17   77   87
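For reference, the orthogonal selection above matches what base R matrix indexing returns for the same in-memory matrix: every requested row is crossed with every requested column. The short sketch below uses base R only (no pizzarr calls) and contrasts this with point-wise (coordinate) selection, which picks one element per (row, column) pair.

```r
mat_test <- matrix(1:100, nrow = 10)
rows <- c(1, 6, 7)
cols <- c(2, 8, 9)

# Orthogonal (outer) selection: 3 rows x 3 columns = 9 elements
orth <- mat_test[rows, cols]

# Coordinate (point-wise) selection: only the pairs (1,2), (6,8), (7,9)
pts <- mat_test[cbind(rows, cols)]

print(orth)
print(pts)
```

Here orth reproduces the 3 x 3 matrix shown above, while pts contains just the three values 11, 76 and 87.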

Would you guys like a PR for this once everything is tidy?

Artur-man, Aug 30 '24 00:08

@keller-mark has the final say, but I'd be happy to get the contribution!

dblodgett-usgs, Aug 30 '24 02:08

I agree with @dblodgett-usgs, the contribution is welcome! Compatibility with DelayedArray would be great!

keller-mark, Aug 30 '24 12:08

Awesome guys, thanks for the quick response, I will let you know!

Artur-man, Aug 30 '24 14:08