zarr-python
Unable to save object arrays
Issue
I am unable to save variable length arrays to disk. I also cannot save JSON or Pickle object arrays to disk. Simple examples are given below.
Sorry if I missed something obvious in the documentation. I cannot find an example of object arrays being saved to disk, so this could be entirely a syntax issue. Thanks for your time.
Example 1:
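import numcodecs
import numpy as np
import zarr

# In-memory variable-length integer array, as in the tutorial
z = zarr.empty(4, dtype=object, object_codec=numcodecs.VLenArray(int))
print(z)
print(z.filters)
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])
# Fails with: ValueError: missing object_codec for object array
zarr.save("test1.zarr", z)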
Example 2:
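import numcodecs
import zarr

# In-memory object array encoded with the JSON codec
z = zarr.empty(5, dtype=object, object_codec=numcodecs.JSON())
print(z)
print(z.filters)
z[0] = 42
z[1] = 'foo'
z[2] = ['bar', 'baz', 'qux']
z[3] = {'a': 1, 'b': 2.2}
# Fails with: ValueError: missing object_codec for object array
zarr.save("test2.zarr", z)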
Problem description
Saving object arrays (as defined in the tutorial) results in the "ValueError: missing object_codec for object array" error.
The error occurs in the following line:
~/miniconda3/lib/python3.8/site-packages/zarr/storage.py in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec)
    386     if not filters:
    387         # there are no filters so we can be sure there is no object codec
--> 388         raise ValueError('missing object_codec for object array')
    389     else:
    390         # one of the filters may be an object codec, issue a warning rather
As far as I can tell, the filters and codecs are defined in both cases. Is it possible to save the arrays defined above to disk?
Version and installation information
zarr.__version__: 2.4.0
numcodecs.__version__: 0.7.2
Python version: Python 3.8.5
OS: Linux
Both pip and conda installations were tested.
Any solution to this?
Hi @rocherroche. There was recently a fix (#813 in 2.9.4). What version are you using, and are you seeing the identical error?
I can reproduce the above with 2.12.0, and also with the Pickle codec. As noted in another issue, things work if the array is created as part of the specified store. If it is not, and the array is just assigned or copied into the destination using zarr.copy, the above error (or, in the case of zarr.copy, a warning) occurs.
Are there any known/recommended workarounds?
Thank you!
Hi, I wanted to check back in to see if anyone has any suggestions or workarounds for this problem. I am seeing this happen with other encoder classes, too. Thank you.
"As noted in another issue, things work if the array is created as part of the specified store"
Do you mean https://github.com/zarr-developers/zarr-python/issues/1090#issuecomment-1190314533? If so, I guess that makes sense. Is that usage not currently possible for you? Can you share your code?