code-gov-web
policy schema - versioning
We need an approach to tagging versions of the schema for agencies to conform to, given that agency data calls can take several months and the schema may change in the interim. Tagging commits is useful for the technical team but less so for agencies.
Considerations:
- strategy at the file level for snapshotting releases while keeping the main file (and its commit history) intact
- ux - presenting historical versions of the schema
- need to add a "schema version" dimension to the schema itself
We should consider the current state the first snapshot - agencies are already building their inventories against it.
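To make the "schema version" idea concrete, here is a minimal sketch of an inventory file declaring which snapshot it was built against. The field name `schemaVersion`, the version string `1.0.0`, and the other fields are assumptions for illustration only, not names taken from the current schema.

```python
import json

# Hypothetical snapshot identifiers; "1.0.0" stands in for the current state of the schema.
KNOWN_SCHEMA_VERSIONS = {"1.0.0"}


def read_schema_version(inventory_json: str) -> str:
    """Return the schema version an inventory declares, or raise ValueError."""
    inventory = json.loads(inventory_json)
    version = inventory.get("schemaVersion")  # hypothetical field name
    if version is None:
        raise ValueError("inventory does not declare a schemaVersion")
    if version not in KNOWN_SCHEMA_VERSIONS:
        raise ValueError(f"unknown schema version: {version}")
    return version


# Illustrative inventory; all field names and values are made up for this sketch.
example = json.dumps({"schemaVersion": "1.0.0", "agency": "EXAMPLE", "projects": []})
print(read_schema_version(example))  # prints 1.0.0
```

With a field like this, an agency that started its data call months ago could keep submitting against the snapshot it began with, and we could tell which one that was.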
@mattbailey0: Is there any reason why we wouldn't version the schema in sync with the API (e.g. the API is currently at v0.1; when we hit v1.0 we'll be stable)?
Is versioning for the API tied primarily to the data model or to the codebase? It sounds like the latter - i.e. you might move to v0.2 because you refactored something on the backend, but the fields and validation wouldn't necessarily change.
The versioning for the schema would be for changes to the specification itself, which seems like something different (?)
Major API versions usually communicate changes that downstream users of the API should be aware of. For example, if you have an API at v1, located at https://myapi.com/1.0, but are about to introduce some breaking changes, it would make sense to continue to maintain v1, but then deploy a new version v2 to https://myapi.com/2.0. Then, you'd communicate what the changes are to the community (and let them know when v1 will be phased out).
Breaking changes could be changes to how existing API endpoints accept parameters or respond to those parameters. Certain changes to the schema could obviously introduce such breaking changes. If changes are simply bug fixes or don't affect anyone's downstream code, then it doesn't make much sense to come out with a new version of the API. These kinds of code changes can be reflected in the subversion (which we record in the package.json) - which most people won't see.
This is why I'm suggesting that it might be easier for us to maintain versioning for the API and schema in sync. Otherwise, people who use the API will have to know that there's a difference between versioning in the API and versioning in the schema. They'll need to know that v1 of the API supports v7 of the schema - and that v2 of the API will support v20 of the schema.
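As a rough sketch of how a single shared version number could move for both the API and the schema, the function below bumps the major part for breaking changes and the smaller parts otherwise. The three-part `major.minor.patch` format and the change labels `breaking`, `feature`, and `fix` are assumptions in the style of semantic versioning, not something we have agreed on.

```python
def next_version(current: str, change: str) -> str:
    """Return the next shared API/schema version for a given kind of change.

    `current` is assumed to be "major.minor.patch"; `change` is one of the
    hypothetical labels "breaking", "feature", or "fix".
    """
    major, minor, patch = (int(part) for part in current.split("."))
    if change == "breaking":
        # Downstream users must adapt, so this gets a new major version.
        return f"{major + 1}.0.0"
    if change == "feature":
        # Additive change that does not break existing clients.
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        # Bug fix that most people won't see.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(next_version("0.1.0", "fix"))       # prints 0.1.1
print(next_version("0.1.1", "breaking"))  # prints 1.0.0
```

Under this scheme, a backend refactor and a renamed schema field would both go through the same number, so users only ever need to track one version.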
Automated Schema Validation and Verification is Required
You will need some form of crawler to automatically inspect the content of all the metadata files for quality control, and to identify where an agency's file is incomplete, incorrect, or differs from the canonical schema.
This is not a task that a human should be doing. There is no way this will be sustainable if it's not automated.
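A minimal sketch of the checking step such a crawler might run on each metadata file once it has been fetched. The required fields in `CANONICAL_FIELDS` are hypothetical placeholders for the canonical schema, and fetching the files is left out entirely.

```python
import json

# Hypothetical stand-in for the canonical schema: required field -> expected Python type.
CANONICAL_FIELDS = {"schemaVersion": str, "agency": str, "projects": list}


def validate_inventory(raw: str) -> list[str]:
    """Return a list of problems found in one metadata file (empty if none)."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top-level value must be an object"]

    problems = []
    for field, expected in CANONICAL_FIELDS.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    for field in doc:
        if field not in CANONICAL_FIELDS:
            problems.append(f"unexpected field: {field}")
    return problems


def report(inventories: dict[str, str]) -> dict[str, list[str]]:
    """Validate many files at once, keyed by where each one came from."""
    return {source: validate_inventory(raw) for source, raw in inventories.items()}
```

Calling `report` with a mapping of source names to file contents gives a per-agency list of problems, which could be published or sent back to the agency automatically.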
Building on @michael-balint's point, I agree, and wonder if we could take a page out of the GitHub API book: https://developer.github.com/v3/
They periodically add new functionality and bug fixes, but because they don't break existing functionality, they have been at API version 3 for a while (since June 2011).