Ryan Abernathey
> Instead we could require explicitly supplying `chunks` via the `encoding` parameter in the `to_zarr()` call. This could also break existing workflows though. For example, pangeo-forge is using the encoding.chunks...
@d-v-b - resolving the typing errors here is beyond my ability. Would appreciate your help. 🙏
In https://github.com/rabernat/zarr-python/pull/1 we are developing an experimental prototype allowing any Arrow datatype to be stored in Zarr. This would enable ragged arrays using Arrow list types.
> Any pointers regarding where to start / modules involved to implement this? I would like to have a try. The starting point would be to look at the code...
I will just comment that this is a really hard problem that kind of plagues all of Python. Once you introduce async functions, it sort of infects your entire stack....
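To make the "infection" concrete, here is a minimal sketch using only the standard library. The names `fetch_chunk`, `read_array`, and `read_array_sync` are hypothetical stand-ins, not Zarr APIs; the point is only that one async function at the bottom forces every caller above it to be async or to start an event loop itself.

```python
import asyncio


async def fetch_chunk(key: str) -> bytes:
    # Hypothetical stand-in for an async store read.
    await asyncio.sleep(0)
    return key.encode()


async def read_array(keys: list[str]) -> list[bytes]:
    # Anything that awaits fetch_chunk must itself be declared async.
    return await asyncio.gather(*(fetch_chunk(k) for k in keys))


def read_array_sync(keys: list[str]) -> list[bytes]:
    # A synchronous entry point has to start an event loop explicitly.
    # asyncio.run raises RuntimeError if a loop is already running in
    # this thread, which is why wrappers like this are fragile in
    # notebooks and other async-hosted environments.
    return asyncio.run(read_array(keys))
```

Calling `read_array_sync(["a", "b"])` works from plain synchronous code, but the same call made from inside a running event loop fails, so the sync wrapper does not fully hide the async layer.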
I can see the value of a "pickled python object" dtype extension, provided it came with the necessary safety warnings.
Yeah I have to admit that I'm also 👎 on the idea of compactifying the json. For reference, the typical size of our zarr stores is 1 GB - 1...
Good points. Perhaps we could have an _option_ for this. Similar to xarray's option machinery. Like `zarr.set_options(compact_json=True)`.
Thanks a lot for the useful benchmark @b8raoult! It's worth pointing out that the example here is highly artificial and exacerbates known issues in Zarr Python 3. Specifically: - It...
These are absolutely real issues we should fix! I just wanted to share the counter-example so people don't freak out too much when they see this issue. 😁