Jeremy Maitin-Shepard
Let me try to explain how I understand your example in the context of using the existing zarr format (without irregular grid support): Suppose we pick a chunk size of...
Definitely agreed that it should be dropped for now. It could be added back as a Python-specific python-object data type for use with a pickle codec, but not all uses...
Note: This is the representation used by TensorStore: https://google.github.io/tensorstore/schema.html#json-ChunkLayout.inner_order
Here is one example: Suppose we are storing volumetric data indexed by x y z. It is natural to order the dimensions [x, y, z], or sometimes [z, y, x]...
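As a rough illustration of what a dimension order means for storage, here is a minimal pure-Python sketch. It assumes the order is a permutation of dimension indices with the last entry varying fastest, as in the TensorStore `inner_order` representation mentioned above; `linear_offset` is a hypothetical helper, not part of zarr or TensorStore.

```python
def linear_offset(index, shape, order):
    """Offset of `index` in a flat buffer when dimensions are laid out
    in `order`, where order[-1] is the fastest-varying dimension."""
    offset = 0
    for dim in order:
        offset = offset * shape[dim] + index[dim]
    return offset


shape = (2, 3, 4)   # sizes along x, y, z (illustrative values)
index = (1, 0, 0)   # one step along x

# Order [x, y, z]: z varies fastest, so a step in x skips 3 * 4 elements.
offset_xyz = linear_offset(index, shape, [0, 1, 2])

# Order [z, y, x]: x varies fastest, so a step in x is adjacent in memory.
offset_zyx = linear_offset(index, shape, [2, 1, 0])
```

The same logical index lands at a different position depending on the order, which is why the order has to be recorded somewhere in the metadata.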
A better use case for this feature came up this evening: t5x (https://github.com/google-research/t5x) uses tensorstore to store machine learning model checkpoints. A user had modified the model to transpose the...
This issue has been discussed extensively, both in #122 and in the Zarr community meeting. This is not planned to be part of the initial zarr...
One difficulty with the current zarr-python approach is that it means the "JSON" metadata is not actually spec-compliant JSON and cannot be parsed by the JavaScript `JSON.parse` function or by...
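A small sketch of the problem using only Python's standard `json` module (the attribute names are made up for illustration): the default encoder emits bare `NaN` and `Infinity` tokens, which are not valid JSON, and a strict encoder refuses them.

```python
import json

# Hypothetical user attributes containing non-finite floats.
attrs = {"scale": float("nan"), "max": float("inf")}

# The default encoder writes bare NaN / Infinity tokens, which the JSON
# specification does not allow.
lenient_text = json.dumps(attrs)

# With allow_nan=False the encoder raises instead of emitting them.
try:
    json.dumps(attrs, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
```

Here `lenient_text` is `{"scale": NaN, "max": Infinity}`, which JavaScript's `JSON.parse` rejects.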
For `fill_value` the data type is already known so there isn't an issue there. zarr-python for v2 already uses a different encoding for `fill_value`: infinity is encoded as `"Infinity"`...
The difference between `fill_value` and user-defined attributes is that for `fill_value`, the data type is specified elsewhere in the metadata and can be used to decode whatever representation is used....
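To make the distinction concrete, here is a minimal sketch of a string-based encoding for non-finite `fill_value`s, decoded with the help of the known data type. The helper names are hypothetical and the string spellings follow the `"Infinity"` convention mentioned above; this is not zarr-python's actual code.

```python
import json
import math

# Assumed string spellings for non-finite floats.
_SPECIAL = {"NaN": math.nan, "Infinity": math.inf, "-Infinity": -math.inf}


def encode_fill_value(value):
    """Replace non-finite floats with strings so the result is valid JSON."""
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
    return value


def decode_fill_value(encoded, dtype):
    """Interpret special strings as floats only when the dtype is a float type."""
    if dtype.startswith("float") and isinstance(encoded, str):
        return _SPECIAL[encoded]
    return encoded


# Round trip through strictly valid JSON.
text = json.dumps({"fill_value": encode_fill_value(math.inf)}, allow_nan=False)
decoded = decode_fill_value(json.loads(text)["fill_value"], "float64")
```

For user-defined attributes there is no data type to consult, so a string `"Infinity"` cannot be told apart from an attribute that really is that string.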
Not sure how many other implementations even support it? A fixed-length sequence of UTF-32-encoded code points seems unlikely to be particularly useful as a data type.