RFC-2: clarification about the intent of non OME-NGFF keys
This question came up as a follow-up to the ongoing OME 2024 challenge, which aims to generate Zarr datasets largely based on the structure defined in RFC 2, with additional metadata requirements.
Looking specifically at the attributes of one of the generated datasets - https://deploy-preview-36--ome-ngff-validator.netlify.app/?source=https://uk1s3.embassy.ebi.ac.uk/idr/share/ome2024-ngff-challenge/idr0044/4007801.zarr, I noted that the _creator field has been stored under the ome key.
My immediate expectation would be to keep such metadata as top-level keys. Incidentally, this raised the issue that this behavior is not specified in the current RFC-2 and could use some clarification.
As per https://ngff.openmicroscopy.org/rfc/2/index.html#changes-to-the-ome-zarr-metadata, the current proposal is
- OME-Zarr metadata will be stored under a dedicated ome key in the Zarr array or group attributes.
For non OME-Zarr metadata, I can imagine three behaviors:
- no recommendation, i.e. preserve the current status quo, allow additional keys to be defined anywhere and only specify the semantics of some of the keys under ome
- strict, i.e. only keys defined in the specification must be stored under the ome namespace and other keys must be stored at a different level
- lenient, i.e. only keys defined in the specification should be stored under the ome namespace and other keys should be stored at a different level
For the last two options, it would be valuable for the specification to include a clear list of all the official keys that are expected to be found under ome.
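To make the difference between the three behaviors concrete, here is a minimal sketch of how a validator could treat unknown keys under ome. The `SPEC_KEYS` set and the `check_ome_keys` function are hypothetical names introduced for illustration, and the key list is an assumption, not the official list asked for above.

```python
# Hypothetical set of spec-defined keys, for illustration only.
# The authoritative list would have to come from the specification.
SPEC_KEYS = {"version", "multiscales", "omero", "labels", "image-label", "plate", "well"}


def check_ome_keys(ome, policy="none"):
    """Return warnings about unknown keys under `ome`; raise if policy is strict."""
    if policy not in ("none", "strict", "lenient"):
        raise ValueError(f"unknown policy: {policy!r}")
    extra = sorted(set(ome) - SPEC_KEYS)
    if not extra or policy == "none":
        # Status quo: additional keys are allowed anywhere.
        return []
    if policy == "strict":
        # "must": unknown keys under `ome` are an error.
        raise ValueError(f"keys not defined by the specification under 'ome': {extra}")
    # "should": unknown keys under `ome` only produce warnings.
    return [f"key {k!r} should be stored outside the 'ome' namespace" for k in extra]
```

With the dataset mentioned above, `_creator` under ome would pass silently with no recommendation, produce a warning under the lenient behavior, and be rejected under the strict one.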
RFC-2 doesn't change the behavior of additional keys. That means there is nothing that forbids them.
Since we rely on versioning in the OME-Zarr metadata, this doesn't seem to be a problem. If we move to a non-versioned future-proof model, I think we should be strict and only allow keys that are defined by the spec to allow future RFCs to add new keys without worrying about conflicts due to custom keys. That is the behavior in the core Zarr 3 metadata. But that is a discussion for another RFC.
Thanks Norman. Thinking of this in terms of JSON schema, this means the ome key should behave the same as other keys defined in this specification and allow any additional properties by default - https://json-schema.org/understanding-json-schema/reference/object#additionalproperties.
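As a rough illustration of that default, the sketch below contrasts an open schema for ome with one that sets `additionalProperties` to false. The two schemas and the `unexpected_keys` helper are hypothetical and cover only this one keyword; a real check would use a full JSON Schema validator.

```python
# Hypothetical, heavily simplified schemas for the `ome` object.
OPEN_OME = {"type": "object", "properties": {"version": {}, "multiscales": {}}}
CLOSED_OME = {**OPEN_OME, "additionalProperties": False}


def unexpected_keys(schema, instance):
    """List keys of `instance` that `additionalProperties: false` would reject.

    Only the boolean form of the keyword is handled here. When the keyword is
    absent, JSON Schema allows any additional property, so nothing is reported.
    """
    if schema.get("additionalProperties", True) is not False:
        return []
    return sorted(set(instance) - set(schema.get("properties", {})))


ome = {"multiscales": [], "_creator": {"name": "some-tool"}}
print(unexpected_keys(OPEN_OME, ome))    # nothing is rejected by default
print(unexpected_keys(CLOSED_OME, ome))  # `_creator` is reported
```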
No objection to keeping the flexibility from my side. I was primarily looking into an implementation transforming the metadata from 0.4 to RFC-2 and realised that any such code will need to make a decision about these extra keys. Happy to close this issue as resolved or leave it open to gather additional feedback (maybe with a deadline so that it's not hanging forever).
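The decision such a converter has to make can be sketched as follows. This is a hypothetical helper, not an existing tool: `to_rfc2_attrs`, `SPEC_KEYS` and the `extras` parameter are names made up for this example, the key list is an assumption, and version handling is deliberately left out.

```python
# Hypothetical set of spec-defined keys, for illustration only.
SPEC_KEYS = {"multiscales", "omero", "labels", "image-label", "plate", "well"}


def to_rfc2_attrs(attrs_04, extras="top-level"):
    """Move spec-defined keys under `ome` and place the other keys per `extras`."""
    ome = {k: v for k, v in attrs_04.items() if k in SPEC_KEYS}
    other = {k: v for k, v in attrs_04.items() if k not in SPEC_KEYS}
    if extras == "top-level":
        # Keep custom keys next to `ome`; `ome` is written last so that a
        # pre-existing custom key with the same name cannot shadow it.
        return {**other, "ome": ome}
    if extras == "under-ome":
        # Carry custom keys into the `ome` namespace alongside the spec keys.
        return {"ome": {**ome, **other}}
    raise ValueError(f"unknown extras option: {extras!r}")
```

For example, `to_rfc2_attrs({"multiscales": [], "_creator": {}})` keeps `_creator` at the top level, while passing `extras="under-ome"` reproduces the layout seen in the challenge dataset.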
@normanrz, thoughts on the next steps here?
I would close this for now and revisit as part of our plan to build extension mechanisms in OME-Zarr. We might need to get strict about additional metadata at that point.
Ok. Closing since we necessarily want the back and forth here (instead of on the RFC), but I've linked this from a project so that we re-review it. Thanks, both.