Jeremy Maitin-Shepard
> If we were to implement Vlen then I believe that in effect we are extending the V2 spec. That means that the implementation has to be interoperable with all...
I'm not sure what you mean as far as blosc not conforming to the standard --- none of the codecs are specified in the zarr v2 spec. I'm a bit...
> > ... vlenarray does not use pickling: > > Are you sure? The documentation says it is relevant and I assume that it is used to serialize the elements...
Here is the format used by vlen-array, for reference: JSON metadata: `{"id": "vlen-array", "dtype": <dtype>}` The `<dtype>` might be e.g. "
I mentioned this in the meeting but wanted to record this comment here: instead of just defining awkward arrays on top of regular zarr arrays, which imposes a lot of...
You raise some very interesting points regarding Apache Arrow: - Apache Arrow defines a complex data model with a lot of nice features, including nested structures and variable length lists,...
@martindurant What you are describing I would characterize as support for an "irregular chunk grid", in contrast to the "regular chunk grid" that zarr currently supports --- I agree it...
@martindurant Thanks for the clarification --- that implementation strategy didn't occur to me. I see that the strategy you propose of irregular chunking solves some of the problems with a...
> > the metadata file has a size of O(number of chunks) > > This doesn't appear to be a problem. The size of this data will always be much...
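To make the O(number of chunks) metadata point concrete, here is a minimal sketch of how an irregular chunk grid along one dimension could be indexed. It assumes the metadata stores one chunk size per chunk; the names `chunk_offsets` and `locate` and the example sizes are hypothetical and are not taken from zarr or from any proposal in this thread.

```python
from bisect import bisect_right
from itertools import accumulate


def chunk_offsets(chunk_sizes):
    """Return the start offset of every chunk, plus the total length as the last entry.

    This list has one entry per chunk (plus one), which is the
    O(number of chunks) metadata discussed above.
    """
    return [0, *accumulate(chunk_sizes)]


def locate(offsets, index):
    """Map a global index along one dimension to (chunk number, index within that chunk)."""
    if not 0 <= index < offsets[-1]:
        raise IndexError(index)
    # Binary search: the containing chunk is the last one whose start offset is <= index.
    chunk = bisect_right(offsets, index) - 1
    return chunk, index - offsets[chunk]


sizes = [3, 5, 2]               # hypothetical irregular chunk sizes
offsets = chunk_offsets(sizes)  # [0, 3, 8, 10]
print(locate(offsets, 7))       # (1, 4)
```

With a regular chunk grid the same lookup is a single integer division, so the offsets list and the binary search are the extra cost that irregular chunking adds per dimension.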
Available memory isn't the only practical limit on chunk size. The optimal chunk size depends on the access patterns as well --- for example I have a lot of use...