numcodecs Codec creation: getting the shape of a chunk?

Is there any way for a codec (as applied to encoding/decoding within zarr) to be reliably provided the shape of the chunk it is decoding?

My use case here is to write codecs that apply dynamic scaling and quantization (based on planes of an array with three or more dimensions, normalizing by local min/max within a chunk) and/or two-dimensional linear prediction (essentially extending numcodecs.Delta).

When calling Codec.encode() this is not a problem; the buffer supplied is a full array-like unless an earlier filter stage has done something. However, on decoding the codec is only reliably supplied a byte-stream without shape information. The out parameter to decode() seems to be inconsistently supplied.

Obviously, Zarr knows what shape of chunk it is seeking to fill. Without that information, I'll have to encode the array shape information in the output datastream. That's unnecessary redundancy, and more importantly it is aesthetically displeasing.
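For concreteness, the workaround described above (storing the shape in the encoded stream) might look roughly like the sketch below. The class name, codec_id, and header layout are made up for illustration and are not part of numcodecs. It quantizes each trailing 2-D plane to uint8 using that plane's min/max and prefixes the output with the chunk shape so that decode() can rebuild the array from bytes alone.

```python
import struct

import numpy as np

try:
    from numcodecs.abc import Codec  # real base class, if numcodecs is installed
except ImportError:  # fallback so the sketch still runs standalone
    Codec = object


class ShapePrefixedPlaneScale(Codec):
    """Hypothetical codec: per-plane min/max quantization with a shape header."""

    codec_id = "shape_prefixed_plane_scale"  # made-up identifier

    def encode(self, buf):
        arr = np.asarray(buf, dtype="<f8")
        if arr.ndim < 2:
            raise ValueError("expected an array with at least two dimensions")
        shape = arr.shape
        # Flatten to (n_planes, plane_size); a plane is the last two axes.
        planes = arr.reshape(-1, shape[-2] * shape[-1])
        lo = planes.min(axis=1, keepdims=True)
        hi = planes.max(axis=1, keepdims=True)
        span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero
        q = np.rint((planes - lo) / span * 255).astype(np.uint8)
        # Header: one byte for ndim, then each dimension as a uint64.
        header = struct.pack("<B", arr.ndim) + struct.pack(f"<{arr.ndim}Q", *shape)
        return header + lo.tobytes() + hi.tobytes() + q.tobytes()

    def decode(self, buf, out=None):
        raw = bytes(buf)
        ndim = raw[0]
        shape = struct.unpack_from(f"<{ndim}Q", raw, 1)
        offset = 1 + 8 * ndim
        n_planes = int(np.prod(shape[:-2]))
        lo = np.frombuffer(raw, "<f8", n_planes, offset).reshape(-1, 1)
        offset += 8 * n_planes
        hi = np.frombuffer(raw, "<f8", n_planes, offset).reshape(-1, 1)
        offset += 8 * n_planes
        q = np.frombuffer(raw, np.uint8, offset=offset).reshape(n_planes, -1)
        span = np.where(hi > lo, hi - lo, 1.0)
        dec = (q / 255.0 * span + lo).reshape(shape)
        if out is not None:
            np.copyto(out, dec)  # assumes out is an ndarray of matching shape
            return out
        return dec
```

The header costs 1 + 8 * ndim bytes per chunk, which is exactly the redundancy that would disappear if the chunk shape were passed to decode().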

Oct 13 '23 14:10 csubich

I have previously pushed for the concept of a "context", which zarr would pass to both the codec's encode/decode methods and to the storage layer, specifying where in the array we are, the shape, key, ... and other useful pieces of information that are available at call time. Currently, the context (zarr.context.Context) only has meta_array: NDArrayLike; I see no reason not to populate it further.
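As a purely illustrative sketch of what a more populated context could look like: every field other than meta_array is hypothetical, as are the names ChunkContext and decode_with_context. None of this is an existing zarr or numcodecs API.

```python
from typing import Any, Tuple, TypedDict

import numpy as np


class ChunkContext(TypedDict, total=False):
    meta_array: Any                # the one field mentioned above as existing today
    chunk_shape: Tuple[int, ...]   # hypothetical: shape of the chunk being filled
    chunk_coords: Tuple[int, ...]  # hypothetical: position in the chunk grid
    key: str                       # hypothetical: storage key of the chunk


def decode_with_context(buf, context: ChunkContext):
    """Hypothetical decode step that takes the chunk shape from the context."""
    flat = np.frombuffer(buf, dtype="<f8")
    shape = context.get("chunk_shape")
    # Without a shape the decoder can only return a flat array.
    return flat.reshape(shape) if shape is not None else flat
```

With something like this, a codec would not need to write the shape into its own output; it would read context["chunk_shape"] at decode time.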

Oct 18 '23 15:10 martindurant

Context of the chunk within the larger super-array would also be interesting, since it could allow some special-case encoders that apply data transforms along the way.

Oct 18 '23 15:10 csubich