
Unable to save object arrays

Open imilas opened this issue 4 years ago • 5 comments

Issue

I am unable to save variable length arrays to disk. I also cannot save JSON or Pickle object arrays to disk. Simple examples are given below.

Sorry if I missed something obvious in the documentation. I cannot find an example of object arrays being saved to disk, so this could be entirely a syntax issue. Thanks for your time.

Example 1:

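import numcodecs
import numpy as np
import zarr

# In-memory object array holding variable-length integer arrays
z = zarr.empty(4, dtype=object, object_codec=numcodecs.VLenArray(int))
print(z)
print(z.filters)
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])
zarr.save("test1.zarr", z)  # raises ValueError: missing object_codec for object array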

Example 2:

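import numcodecs
import zarr

# In-memory object array holding arbitrary JSON-serializable values
z = zarr.empty(5, dtype=object, object_codec=numcodecs.JSON())
print(z)
print(z.filters)
z[0] = 42
z[1] = 'foo'
z[2] = ['bar', 'baz', 'qux']
z[3] = {'a': 1, 'b': 2.2}
zarr.save("test2.zarr", z)  # raises ValueError: missing object_codec for object array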

Problem description

Saving object arrays (created as in the tutorial) with zarr.save fails with "ValueError: missing object_codec for object array".

The error occurs in the following line:

~/miniconda3/lib/python3.8/site-packages/zarr/storage.py in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec)
    386             if not filters:
    387                 # there are no filters so we can be sure there is no object codec
--> 388                 raise ValueError('missing object_codec for object array')
    389             else:
    390                 # one of the filters may be an object codec, issue a warning rather

As far as I can tell, the filters and codecs are defined in both cases. Is it possible to save the arrays defined above to disk?
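One reading of the traceback, offered as a sketch rather than as zarr's actual source: execution can only reach line 388 when the call that creates the on-disk array received no filters, and presumably no object_codec either. That would suggest the codec configured on the in-memory array z is not forwarded when zarr.save creates the new array. The function below is a simplified, hypothetical stand-in for that check; the name check_object_codec, the has_object_dtype flag, the return values, and the warning text are all assumptions made for illustration.

```python
import warnings


def check_object_codec(has_object_dtype, filters=None, object_codec=None):
    """Hypothetical, simplified stand-in for the check seen in the traceback."""
    if not has_object_dtype or object_codec is not None:
        # Either no object dtype is involved, or a codec was supplied.
        return "ok"
    if not filters:
        # No filters at all, so no object codec can be hiding among them.
        raise ValueError("missing object_codec for object array")
    # One of the filters may be an object codec, so only warn.
    # The warning text here is an assumption, not zarr's wording.
    warnings.warn("missing object_codec for object array")
    return "warned"


# A codec passed explicitly satisfies the check.
status_with_codec = check_object_codec(True, object_codec="some-codec")

# No codec and no filters reproduces the error message from the report.
try:
    check_object_codec(True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Under this reading, the arrays in both examples are fine in memory; the failure happens when a second array is created in the destination store without the codec.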

Version and installation information

zarr.version: 2.4.0
numcodecs.version: 0.7.2
Python version: 3.8.5
OS: Linux
Both pip and conda installations were tested.

imilas • Jan 18 '21 00:01

Any solution to this?

rocherroche • Dec 09 '21 03:12

Hi @rocherroche. There was recently a fix (#813 in 2.9.4). What version are you using, and are you seeing the identical error?

joshmoore • Dec 09 '21 09:12

I can reproduce the above with 2.12.0, and also with the Pickle codec. As noted in another issue, things work if the array is created as part of the specified store. If it is not, and the array is just assigned or copied into the destination using zarr.copy, the above error (or a warning, in the case of zarr.copy) occurs.

Are there any known/recommended workarounds?

Thank you!

TomasPuverle • Aug 04 '22 04:08

Hi, I wanted to check back in to see if anyone has any suggestions/workarounds for this problem. I am seeing this happening with other encoder classes, too. Thank you.

TomasPuverle • Sep 08 '22 05:09

"As noted in another issue, things work if the array is created as part of the specified store"

Do you mean https://github.com/zarr-developers/zarr-python/issues/1090#issuecomment-1190314533? If so, I guess that makes sense. Is that usage not currently possible for you? Can you share your code?

joshmoore • Sep 08 '22 06:09