flexible version handling with tag
Since using tag seems to be the way to go (#268), I was thinking about ways to be more flexible with tag versioning inside ASDF schema definitions.
Currently, validation requires an exact match of the tag+version string, such as tag: http://stsci.edu/schemas/asdf/core/software-1.0.0. As a result, minor changes and version bumps in core schemas like ndarray force changes in higher-level schemas (and possibly further changes in the schemas that depend on those).
Since tag should allow more flexibility regarding implementations in the asdf API, what do you think about supporting a subset of versions with tag by pinning Major/Minor/Patch versions?
For example:
tag: http://stsci.edu/schemas/asdf/core/software # allow every version
tag: http://stsci.edu/schemas/asdf/core/software-1 # fix major version
tag: http://stsci.edu/schemas/asdf/core/software-1.* # same as above
tag: http://stsci.edu/schemas/asdf/core/software-2.1 # fix major and minor version
tag: http://stsci.edu/schemas/asdf/core/software-2.1.* # same as above
Resolving the actual tags on read/write would still be handled on the API side (which would also throw any related errors).
Since this would be optional and would essentially include the old behavior, the impact of any odd side effects (like version numbers changing on a simple read/write cycle?) could be decided per schema/extension.
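To make the pinning idea concrete, here is a minimal sketch of how such a check could work. The helpers `split_tag` and `tag_matches_pin` are hypothetical names for illustration and are not part of the asdf API. The sketch assumes the version is the part after the last hyphen and compares version components rather than raw strings, so that a pin on 2.1 does not accidentally accept 2.10.

```python
BASE = "http://stsci.edu/schemas/asdf/core/software"


def split_tag(tag):
    """Split '<base>-<version>' into (base, [components]).

    Hypothetical helper: a tag without a trailing version yields an
    empty component list. Trailing '*' components are dropped, so
    'software-1.*' and 'software-1' are treated the same.
    """
    base, sep, suffix = tag.rpartition("-")
    parts = suffix.split(".")
    if sep and all(p.isdigit() or p == "*" for p in parts):
        while parts and parts[-1] == "*":
            parts.pop()
        return base, parts
    return tag, []


def tag_matches_pin(pin, tag):
    """Return True if the concrete tag satisfies the (possibly partial) pin."""
    pin_base, pin_version = split_tag(pin)
    tag_base, tag_version = split_tag(tag)
    return pin_base == tag_base and tag_version[: len(pin_version)] == pin_version


if __name__ == "__main__":
    for pin in (BASE, BASE + "-1", BASE + "-1.*", BASE + "-2.1", BASE + "-2.1.*"):
        print(pin, tag_matches_pin(pin, BASE + "-2.1.7"))
```

A fully specified pin such as `software-1.0.0` still only matches that exact tag, which is the current behavior.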
I like this idea a lot! We actually have a situation where extensions across multiple packages depend on one another, and updating a tag lower down in the dependency tree causes us a lot of tedious work updating the schemas that depend on it.
We're moving away from strict requirements on the structure of the URI, so I'd suggest that we allow the * wildcard anywhere in the tag: value (even multiple locations). With that scheme, I think your examples would be expressed like this:
tag: http://stsci.edu/schemas/asdf/core/software-* # allow every version
tag: http://stsci.edu/schemas/asdf/core/software-1.* # fix major version
tag: http://stsci.edu/schemas/asdf/core/software-2.1.* # fix major and minor version
but we could also do this:
tag: http://stsci.edu/schemas/asdf/transform/* # allow any transform tag
tag: http://stsci.edu/schemas/asdf/transform/*-2.* # allow any 2.x transform tag
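As a rough illustration of the wildcard semantics, the sketch below turns a pattern into a regular expression by escaping everything and then letting each `*` match any run of characters. The function name `wildcard_match` and the `affine` tag are assumptions made up for this example. This is not necessarily how asdf would implement it.

```python
import re


def wildcard_match(pattern, tag):
    """Return True if tag matches pattern, where '*' matches any characters.

    Hypothetical helper: all other characters in the pattern, including
    '.', are matched literally.
    """
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, tag) is not None


if __name__ == "__main__":
    software = "http://stsci.edu/schemas/asdf/core/software-1.3.0"
    affine = "http://stsci.edu/schemas/asdf/transform/affine-2.0.0"  # made-up tag
    print(wildcard_match("http://stsci.edu/schemas/asdf/core/software-1.*", software))
    print(wildcard_match("http://stsci.edu/schemas/asdf/transform/*-2.*", affine))
```

Because the `.` after the major version is matched literally, `software-1.*` does not match a hypothetical `software-10.0.0`.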
I guess we'd express a >= version like this:
# equivalent to software >= 1.3 (within the 1.x series)
allOf:
  - tag: http://stsci.edu/schemas/asdf/core/software-1.*
  - not:
      anyOf:
        - tag: http://stsci.edu/schemas/asdf/core/software-1.0.*
        - tag: http://stsci.edu/schemas/asdf/core/software-1.1.*
        - tag: http://stsci.edu/schemas/asdf/core/software-1.2.*
A little clumsy, but a necessary sacrifice I think if we want to support free-form URIs.
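To check how such a combination behaves, here is a small sketch that evaluates `allOf`, `anyOf`, and `not` over wildcard tag patterns. The `evaluate` function is a toy written for this example, not a real schema validator, and `wildcard_match` is a hypothetical helper. Note that for a true ">= 1.3" the 1.0.* pattern has to be excluded along with 1.1.* and 1.2.*, and the result is limited to the 1.x series.

```python
import re

PREFIX = "http://stsci.edu/schemas/asdf/core/software-"


def wildcard_match(pattern, tag):
    """Hypothetical helper: '*' matches any characters, the rest is literal."""
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, tag) is not None


def evaluate(schema, tag):
    """Evaluate a tiny subset of schema combiners against a tag string."""
    if "tag" in schema:
        return wildcard_match(schema["tag"], tag)
    if "allOf" in schema:
        return all(evaluate(sub, tag) for sub in schema["allOf"])
    if "anyOf" in schema:
        return any(evaluate(sub, tag) for sub in schema["anyOf"])
    if "not" in schema:
        return not evaluate(schema["not"], tag)
    raise ValueError("unsupported schema keyword")


# software >= 1.3, limited to the 1.x series
AT_LEAST_1_3 = {
    "allOf": [
        {"tag": PREFIX + "1.*"},
        {
            "not": {
                "anyOf": [
                    {"tag": PREFIX + "1.0.*"},
                    {"tag": PREFIX + "1.1.*"},
                    {"tag": PREFIX + "1.2.*"},
                ]
            }
        },
    ]
}

if __name__ == "__main__":
    for version in ("1.0.0", "1.2.5", "1.3.0", "1.10.0", "2.0.0"):
        print(version, evaluate(AT_LEAST_1_3, PREFIX + version))
```

The list of excluded minor versions grows with every higher lower bound, which is the clumsiness in question.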
Glad to hear it. I could really see this simplifying some things.
From my understanding, whatever goes into the tag: value should actually be independent of URI structure requirements, since it just gets parsed against the "real" URI according to whatever syntax gets implemented.
Looking at the other JSON schema features, it seems that regular expressions are the pattern matching tool of choice. For example:
https://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.6.3.3
https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.9.3.2.2
and some general guidance on how to use them here:
https://json-schema.org/draft/2019-09/json-schema-core.html#regex
We could add an additional property that validates the tag against a regular expression, something like tagPattern or patternTag. It's a little annoying to escape the . in URIs:
tagPattern: '^http://stsci\.edu/schemas/asdf/core/software-1\..*$'
but the flexibility makes this feature useful beyond just pinning version numbers.
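A quick sketch of what a regex-based check might look like, and why the escaping matters. The function `validate_tag_pattern` is hypothetical. It uses Python's `re` module as a stand-in, whereas JSON Schema recommends the ECMA 262 dialect. For a simple pattern like this one the two should behave the same, but that is an assumption.

```python
import re

ESCAPED = r"^http://stsci\.edu/schemas/asdf/core/software-1\..*$"
UNESCAPED = r"^http://stsci.edu/schemas/asdf/core/software-1..*$"


def validate_tag_pattern(tag_pattern, tag):
    """Hypothetical check: does the tag match the regular expression?

    re.search is used because JSON Schema patterns are not implicitly
    anchored; the patterns above anchor themselves with ^ and $.
    """
    return re.search(tag_pattern, tag) is not None


# re.escape can take care of the tedious escaping of the literal part.
LITERAL_PREFIX = "http://stsci.edu/schemas/asdf/core/software-1."
GENERATED = "^" + re.escape(LITERAL_PREFIX) + ".*$"

if __name__ == "__main__":
    v12 = "http://stsci.edu/schemas/asdf/core/software-12.0.0"
    # With an unescaped '.', the pattern also accepts a 12.x tag.
    print(validate_tag_pattern(ESCAPED, v12), validate_tag_pattern(UNESCAPED, v12))
```

Forgetting the escape silently loosens the match, which is one argument for keeping a simpler wildcard syntax alongside the regex option.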
Using regex would certainly be more powerful, especially when trying to parse the string part of the URI.
Personally I would prefer keeping it simple wherever possible. In this case +1 for creating tagPattern or similar that supports regex syntax for advanced use-cases.
When using tag: for validation (like in oneOf) only validating the version number could easily keep the current readability and behavior. Allowing this
tag: http://stsci.edu/schemas/asdf/core/software # allow every version
tag: http://stsci.edu/schemas/asdf/core/software-1 # fix major version
tag: http://stsci.edu/schemas/asdf/core/software-1.2 # fix minor version
tag: http://stsci.edu/schemas/asdf/core/software-1.2.3 # fix patch version
should be possible with a simple change to https://github.com/asdf-format/asdf/pull/838. I will see if I can add it soon.
This should keep it readable, completely backwards compatible and simple when using tag:.
What I mean about relaxing the URI structure is that I want to allow tags like this:
http://somewhere.org/tags/2018-02/basic-recipe
http://somewhere.org/tags/2018-02/basic-ingredient
or anything else that users want to do. Splitting on hyphen and matching the prefix won't be appropriate in every case.
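To illustrate the problem, here is a short sketch of what a naive hyphen split does to these tags, compared with a wildcard placed wherever the version-like part happens to be. Both `naive_split` and `wildcard_match` are hypothetical helpers written for this example.

```python
import re

RECIPE = "http://somewhere.org/tags/2018-02/basic-recipe"
INGREDIENT = "http://somewhere.org/tags/2018-02/basic-ingredient"


def naive_split(tag):
    """Split on the last hyphen, assuming a '<name>-<version>' layout."""
    base, _, version = tag.rpartition("-")
    return base, version


def wildcard_match(pattern, tag):
    """'*' matches any characters; everything else is matched literally."""
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, tag) is not None


if __name__ == "__main__":
    # The split treats "recipe" and "ingredient" as versions of one tag.
    print(naive_split(RECIPE))
    print(naive_split(INGREDIENT))
    # A wildcard can sit in the date segment instead.
    print(wildcard_match("http://somewhere.org/tags/*/basic-recipe", RECIPE))
```

With the split, both tags end up with the same base, so a prefix match on that base could not tell a recipe from an ingredient.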
I have a proposed implementation here: https://github.com/asdf-format/asdf/pull/858. That PR is branched off of another open PR, so here's a link to the relevant commit: https://github.com/asdf-format/asdf/pull/858/commits/1a33ab1c53faffa93d49d560de852c1769822a55
@CagtayFabry does that work for you? It allows patterns like the following:
http://stsci.edu/schemas/asdf/core/software-*
http://stsci.edu/schemas/asdf/core/software-1.*
http://stsci.edu/schemas/asdf/core/software-1.0.*
http://stsci.edu/schemas/asdf/core/software-1.0.0
but would not automatically match a pattern like http://stsci.edu/schemas/asdf/core/software-1 to all 1.x versions.
See my comment on https://github.com/asdf-format/asdf/pull/858 (which apparently I was typing at the same time ;) )
Besides, I don't mind using http://stsci.edu/schemas/asdf/core/software-1.* over http://stsci.edu/schemas/asdf/core/software-1 at all.
If tag and version validation are separated, it should also be very easy to implement various syntax versions later on.