zarr.save() fails with object arrays
Zarr version
2.8.1
Numcodecs version
0.9.1
Python Version
3.9
Operating System
Linux
Installation
conda
Description
If I make a NumPy object array and save it with zarr.save(), the save fails because no object_codec is defined for the array when it is created in the zarr store. I am not sure whether this should be treated as a documentation bug (i.e. don't save object arrays with zarr.save()) or an implementation bug. To trigger the bug, just run:
import numpy as np
import zarr
jnk = np.empty((9,), dtype=object)
x = [()] * 9
x[3] = (1, 2)
jnk[:] = x
zarr.save('jnk.zarr', it=jnk)
And it will fail with a ValueError: missing object_codec for object array
Ultimately, I think this will be an easy fix, but I don't feel comfortable enough with this code base to dive in and suggest a permanent fix.
Jamie
Steps to reproduce
see above
Additional output
No response
Hi @JamiePringle. Was there an example you followed to create an object array with dtype=object but without object_codec defined? If so, that sounds like it would be a doc bug.
Otherwise, https://zarr.readthedocs.io/en/stable/tutorial.html#object-arrays shows that an object array additionally needs one of a few types of codecs to perform the mapping, and the error is saying that this field is missing.
There is no example; I was using the documentation for the save() convenience function. I can save the object array by hand in the usual way by specifying the object_codec. However, the save() convenience function does not take the object_codec kwarg, likely because it would conflict with the rest of the syntax, where the arrays to be saved are specified as kwargs. The following does not work:
import numcodecs
import numpy as np
import zarr
jnk = np.empty((9,), dtype=object)
x = [()] * 9
x[3] = (1, 2)
jnk[:] = x
# fails: save() does not accept object_codec as an option
zarr.save('jnk.zarr', it=jnk, object_codec=numcodecs.VLenArray(int))
I think the easiest thing to do would be to specify in https://zarr.readthedocs.io/en/stable/api/convenience.html#zarr.convenience.save that this function does not work with object arrays.
Is there any way to specify that a store should use a certain codec for all object arrays saved within it? Since the store is passed into save(), that could also fix this issue.
Jamie
@joshmoore and @MSanKeys963 I am an Outreachy applicant. I'd like to try to contribute to this issue.