
policy schema - versioning

Open mattbailey0 opened this issue 8 years ago • 5 comments

We need an approach to tagging versions of the schema for agencies to conform to, given that agency data calls can take several months and the schema may change in the interim. Tagging commits is useful for the technical team but less so for agencies.

Considerations:

  • strategy at the file level for snapshotting releases while keeping the main file (and its commit history) intact
  • ux - presenting historical versions of the schema
  • need to add a "schema version" dimension to the schema itself

We should consider the current state the first snapshot - agencies are already building their inventories against it.
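To make the "schema version" consideration concrete, here is a minimal sketch of how an inventory could declare the snapshot it was built against. The field name `schemaVersion` and the snapshot label `1.0.0` are assumptions for illustration only; nothing in this thread fixes either one.

```python
import json

# Hypothetical snapshot labels; the thread does not define real version names.
KNOWN_SCHEMA_VERSIONS = {"1.0.0"}


def declared_schema_version(inventory_text, field="schemaVersion"):
    """Return the schema version an inventory declares, or None if absent."""
    inventory = json.loads(inventory_text)
    return inventory.get(field)


def targets_known_snapshot(inventory_text):
    """True if the inventory names a snapshot we have published."""
    return declared_schema_version(inventory_text) in KNOWN_SCHEMA_VERSIONS
```

An inventory built during a months-long data call would keep pointing at the snapshot it started from, even if the main schema file has moved on since.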

mattbailey0 avatar Oct 18 '16 15:10 mattbailey0

@mattbailey0: Is there any reason why we wouldn't version the schema in sync with the API (ex: API is currently at v0.1, when we hit v1.0 we'll be stable)?

michael-balint avatar Oct 18 '16 15:10 michael-balint

Is versioning for the API tied primarily to the data model or to the codebase? It sounds like the latter - i.e. you might move to v0.2 because you refactored something on the backend, but the fields and validation wouldn't necessarily change.

The versioning for the schema would be for changes to the specification itself, which seems like something different (?)

mattbailey0 avatar Oct 18 '16 16:10 mattbailey0

Major API versions usually communicate changes that downstream users of the API should be aware of. For example, if you have an API at v1, located at https://myapi.com/1.0, but are about to introduce some breaking changes, it would make sense to continue to maintain v1, but then deploy a new version v2 to https://myapi.com/2.0. Then, you'd communicate what the changes are to the community (and let them know when v1 will be phased out).

Breaking changes could be changes to how existing API endpoints accept parameters or respond to those parameters. Certain changes to the schema could obviously introduce such breaking changes. If changes are simply bug fixes or don't affect anyone's downstream code, then it doesn't make much sense to come out with a new version of the API. These kinds of code changes can be reflected in the minor or patch version (which we record in package.json), which most people won't see.

This is why I'm suggesting that it might be easier for us to maintain versioning for the API and schema in sync. Otherwise, people who use the API will have to know that there's a difference between versioning in the API and versioning in the schema. They'll need to know that v1 of the API supports v7 of the schema - and that v2 of the API will support v20 of the schema.
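As a rough sketch of what keeping the two in sync could mean in practice: if the API and the schema share one version number, a client only needs to compare major versions to know whether something breaking has changed. The helper names below are hypothetical and are not part of the code.gov codebase.

```python
def parse_version(text):
    """Turn 'v1.4.2' or '1.4.2' into a tuple of ints, e.g. (1, 4, 2)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def same_major(version_a, version_b):
    """Assumption: only a major-version bump signals a breaking change."""
    return parse_version(version_a)[0] == parse_version(version_b)[0]
```

Under this convention, an inventory written for 1.0 would still be valid against 1.4.2, while 2.0 would mean it needs to be revisited.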

michael-balint avatar Oct 18 '16 16:10 michael-balint

Automated Schema Validation and Verification is Required

You will need some form of crawler to automatically inspect the content of all the metadata files to do quality control, and to identify where the schema is incomplete, incorrect, or differs from the canonical schema.

This is not a task that a human should be doing. There is no way this will be sustainable if it's not automated.
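A minimal sketch of the per-file check such a crawler might run is below. The required field names are placeholders, not the actual policy schema, and fetching the files from each agency is left out entirely.

```python
import json

# Hypothetical required fields; substitute the real schema's requirements.
REQUIRED_FIELDS = ("schemaVersion", "agency", "projects")


def find_problems(inventory_text):
    """Return a list of human-readable problems found in one metadata file."""
    try:
        inventory = json.loads(inventory_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(inventory, dict):
        return ["top level must be a JSON object"]
    return [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if name not in inventory
    ]
```

An empty list means the file passed these basic checks; a fuller version would also validate field types and allowed values against the canonical schema.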

rafael5 avatar Oct 18 '16 20:10 rafael5

Building on @michael-balint's point, I agree and wonder if we should / could take a page out of the GitHub API book: https://developer.github.com/v3/

They periodically add new functionality and bug fixes, but because they don't break existing functionality, they have been at API version 3 for a while (since June 2011).

IanLee1521 avatar Oct 19 '16 01:10 IanLee1521