[v3] support for ragged arrays

Open jhamman opened this issue 1 year ago • 5 comments

Zarr-Python 2 supported ragged arrays. This functionality has not made it into Zarr-Python 3 yet (see also #2617).

An example demonstrating this functionality using Zarr-Python 2:

>>> import numcodecs
>>> import numpy as np
>>> import zarr
>>> z = zarr.empty(4, dtype=object, object_codec=numcodecs.VLenArray(int))
>>> z
<zarr.core.Array (4,) object>
>>> z.filters
[VLenArray(dtype='<i8')]
>>> z[0] = np.array([1, 3, 5])
>>> z[1] = np.array([4])
>>> z[2] = np.array([7, 9, 14])
>>> z[:]
array([array([1, 3, 5]), array([4]), array([ 7,  9, 14]),
       array([], dtype=int64)], dtype=object)

This issue tracks the development of ragged arrays support in Zarr-Python 3.

jhamman avatar Jan 02 '25 03:01 jhamman

Just hopping into this discussion, but this does limit the ability of Hyperspy to support zarr 3.0.0. Our use case is for ragged arrays, which should be supported and don't have the same security issues as directly JSON-encoding a Python object.

We could just unwrap the ragged arrays and store them alongside a second array with information on how to recreate the ragged array. Is that the best way to handle this, or is there a better way to encode variable-length objects?
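A minimal sketch of that workaround, assuming 1-D rows of a single dtype. It uses NumPy only and is not a Zarr API: the helper names `flatten_ragged` and `unflatten_ragged` are hypothetical. The two resulting arrays are plain rectangular arrays, so each could presumably be written as an ordinary Zarr array.

```python
import numpy as np


def flatten_ragged(rows, dtype=np.int64):
    """Pack 1-D rows of varying length into (values, offsets).

    Hypothetical helper: row i is values[offsets[i]:offsets[i + 1]].
    """
    rows = [np.asarray(r, dtype=dtype) for r in rows]
    lengths = np.array([r.size for r in rows], dtype=np.int64)
    # offsets has one more entry than there are rows and starts at 0
    offsets = np.concatenate(([0], np.cumsum(lengths))).astype(np.int64)
    values = np.concatenate(rows) if rows else np.empty(0, dtype=dtype)
    return values, offsets


def unflatten_ragged(values, offsets):
    """Rebuild an object array of 1-D rows from (values, offsets)."""
    out = np.empty(len(offsets) - 1, dtype=object)
    for i in range(len(out)):
        out[i] = values[offsets[i]:offsets[i + 1]]
    return out


# Same data as the Zarr-Python 2 example at the top of this issue
rows = [[1, 3, 5], [4], [7, 9, 14], []]
values, offsets = flatten_ragged(rows)
restored = unflatten_ragged(values, offsets)
```

Storing offsets rather than per-row lengths means any single row can be read back with two offset lookups and one contiguous slice of `values`, without scanning the preceding rows.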

CSSFrancis avatar Jan 10 '25 15:01 CSSFrancis

I think ragged arrays are definitely in scope for 3.x; we just haven't had time to implement them.

d-v-b avatar Jan 10 '25 16:01 d-v-b

@d-v-b Thanks for the response! There is the VLenBytesCodec, which seems like it could handle most of the encoding as long as the underlying array is 1-dimensional? The underlying source says that this might change in the future and is not explicitly supported in v3. Is that still correct?

CSSFrancis avatar Jan 10 '25 16:01 CSSFrancis

Any updates on this?

sehoffmann avatar Nov 17 '25 15:11 sehoffmann

In https://github.com/rabernat/zarr-python/pull/1 we are developing an experimental prototype that allows any Arrow data type to be stored in Zarr. This would enable ragged arrays using Arrow list types.

rabernat avatar Nov 17 '25 15:11 rabernat