zarr-specs
r* data type should be parametrized via configuration
Instances of the r* data type are parametrized by a length, so no zarr metadata will contain {..., "data_type": "r*"} but rather {..., "data_type": "r8"} or similar. As a result of this design, the r* data type does not have a fixed name, unlike all the other data types defined in the spec. An alternative specification of the data type could easily result in a fixed name:
{
  "name": "r*",
  "configuration": {
    "length": <length in bits>
  }
}
The current design prevents implementations from creating finite mappings between string names and data types. I imagine the example of the r* data type could also be confusing to people creating data type extensions.
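To make the difference concrete, here is a minimal Python sketch of how an implementation might resolve a data type to an item size under each design. Everything in it is hypothetical: the `FIXED_SIZES` table is an illustrative subset rather than the spec's full list, and `itemsize_current` / `itemsize_proposed` are names made up for this example, not part of any Zarr library.

```python
import re

# Illustrative subset of fixed-name data types mapped to item size in bytes.
# This table is an assumption for the example, not copied from the spec.
FIXED_SIZES = {
    "bool": 1,
    "int8": 1, "int16": 2, "int32": 4, "int64": 8,
    "uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8,
    "float32": 4, "float64": 8,
}

# Under the current design the raw type has no fixed name, so it needs a pattern.
_RAW_NAME = re.compile(r"r([0-9]+)")


def _bits_to_bytes(bits):
    """Validate a raw length in bits and convert it to bytes."""
    if not isinstance(bits, int) or bits <= 0 or bits % 8 != 0:
        raise ValueError(f"raw length must be a positive multiple of 8, got {bits!r}")
    return bits // 8


def itemsize_current(data_type):
    """Current design: a dict lookup, plus a special-case parse for r<bits>."""
    if data_type in FIXED_SIZES:
        return FIXED_SIZES[data_type]
    match = _RAW_NAME.fullmatch(data_type)
    if match is None:
        raise ValueError(f"unknown data type: {data_type!r}")
    return _bits_to_bytes(int(match.group(1)))


def itemsize_proposed(data_type):
    """Proposed design: every type, including "r*", is found by a fixed name."""
    if isinstance(data_type, str):
        if data_type not in FIXED_SIZES:
            raise ValueError(f"unknown data type: {data_type!r}")
        return FIXED_SIZES[data_type]
    if data_type.get("name") != "r*":
        raise ValueError(f"unknown data type: {data_type!r}")
    return _bits_to_bytes(data_type["configuration"]["length"])
```

With the current naming, `itemsize_current("r24")` has to fall through the table and into the regex. With the proposed form, `itemsize_proposed({"name": "r*", "configuration": {"length": 24}})` dispatches on the fixed name `"r*"` and reads the length from the configuration, so the set of names stays finite.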
The spec says:
Note
We are explicitly looking for more feedback and prototypes of code using the r*, raw bits, for various endianness and whether the spec could be made clearer.
Does this mean the design of the r* data type is provisional?
I agree that the language here is also not clear, e.g.,
In addition to these base types, an implementation should also handle the raw/opaque pass-through type designated by the lower-case letter r followed by the number of bits, multiple of 8.
This passage doesn't capitalize the "should".
I'm not sure, though, whether I would go so far as to call it "provisional" (I guess we'd need to define that term for ourselves first). My inclination would be to start a deprecation process for the r* data types if the dynamic nature of the naming causes issues.