Use YAML to maintain the input schema

Open bdarcus opened this issue 5 years ago • 11 comments

I find working with JSON awkward, and would like to suggest we move to a YAML representation instead.

Digging around a bit, it seems people are using YAML to maintain and validate JSON schemas.

  • https://github.com/coveooss/json-schema-for-humans/issues/33
  • https://github.com/Grokzen/pykwalify

Here's ours in YAML:

https://gist.github.com/bdarcus/4afa2ba2a0a57d8f9af7ccf8af7c5320

We could have a GitHub Action that keeps a JSON variant in sync, but do new development on the YAML.

Thoughts?

cc @dhimmel @bwiernik @denismaier

bdarcus avatar Jul 12 '20 16:07 bdarcus

Yes, please!

bwiernik avatar Jul 12 '20 16:07 bwiernik

Seems like a good idea to me to switch the source to YAML, and then auto-generate the JSON if that's needed for downstream applications.

I don't envision any issues, except that some YAML values get parsed into types that JSON does not support. For example, I think key: 2020-07-12 in YAML would get parsed as a date. If you tried to dump that object as JSON in Python, it would raise an error, since JSON doesn't support date objects. The automated export to JSON might therefore be sufficient protection against YAML-only constructs being added to the schema.
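A minimal sketch of that failure mode, using only the standard library. The dict below is an assumption: it is built by hand to mimic what a YAML 1.1 loader such as PyYAML's `yaml.safe_load` returns for `key: 2020-07-12`, and `to_json` is a hypothetical helper name.

```python
import datetime
import json

# Assumption: this mimics the result of loading "key: 2020-07-12" with a
# YAML 1.1 loader such as PyYAML's yaml.safe_load, which yields a date object.
parsed = {"key": datetime.date(2020, 7, 12)}


def to_json(data):
    """Serialize to JSON and return (text, error_message)."""
    try:
        return json.dumps(data, indent=2), None
    except TypeError as exc:
        # json.dumps has no built-in encoding for datetime.date objects.
        return None, str(exc)


text, error = to_json(parsed)

# Quoting the value in the YAML source (key: "2020-07-12") keeps it a plain
# string, which serializes without trouble.
quoted_text, quoted_error = to_json({"key": "2020-07-12"})
```

Under these assumptions the unquoted date fails to serialize while the quoted form goes through, so an automated export step would surface the problem at conversion time.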

dhimmel avatar Jul 12 '20 17:07 dhimmel

> I don't envision any issues, except that some YAML values get parsed into types that JSON does not support. For example, I think key: 2020-07-12 in YAML would get parsed as a date. If you tried to dump that object as JSON in Python, it would raise an error, since JSON doesn't support date objects. The automated export to JSON might therefore be sufficient protection against YAML-only constructs being added to the schema.

I hadn't thought about that.

So to test, we'd simply want to convert back and forth, and validate against a test schema at each step?

I did just check, and your example converts to this JSON (so a string):

{
  "key": "2020-07-12"
}

bdarcus avatar Jul 12 '20 17:07 bdarcus

> So to test, we'd simply want to convert back and forth

If you can read the YAML and export to JSON in Python, it should be okay. But just be aware that you can do things in YAML that create types that are not supported in JSON.

dhimmel avatar Jul 12 '20 17:07 dhimmel

So for the GitHub Action, we'd just keep the JSON file and update it from the YAML file whenever that changes?

bdarcus avatar Jul 12 '20 17:07 bdarcus

One option for CI is to generate the JSON file from the YAML file and fail if the JSON file has any changes according to git. This option requires the user to export an up-to-date JSON file.
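A rough sketch of such a check in Python. It compares the regenerated JSON with the committed file in memory instead of asking git for a diff, which is a swapped-in simplification. The function names and the two command-line paths are hypothetical, and PyYAML is assumed to be installed in the CI environment.

```python
import json
import sys
from pathlib import Path


def render_json(schema: dict) -> str:
    """Render the schema with fixed formatting so comparisons are stable."""
    return json.dumps(schema, indent=2, ensure_ascii=False) + "\n"


def is_in_sync(schema: dict, json_path: Path) -> bool:
    """Return True if the committed JSON file matches the rendered schema."""
    if not json_path.exists():
        return False
    return json_path.read_text(encoding="utf-8") == render_json(schema)


def main(yaml_path: str, json_path: str) -> int:
    import yaml  # PyYAML, a third-party package; imported only when run

    with open(yaml_path, encoding="utf-8") as handle:
        schema = yaml.safe_load(handle)
    if is_in_sync(schema, Path(json_path)):
        return 0
    print(f"{json_path} is out of date; regenerate it from {yaml_path}")
    return 1


# Only run the check when both paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

A CI step could call this script with the YAML and JSON paths and treat a non-zero exit code as a failed build. A YAML-only value such as an unquoted date would also make `render_json` raise, which fails the step as well.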

If you want CI to commit the changes to the JSON file, things get more tricky. If you were to just do this for branches it's easier, but it's challenging to find an approach that works for pull requests and pushes.

One final option is to not keep the JSON files on the master branch, but instead have them deployed to a separate branch of outputs.

dhimmel avatar Jul 12 '20 18:07 dhimmel

> One final option is to not keep the JSON files on the master branch, but instead have them deployed to a separate branch of outputs.

This would work sort of like gh-pages then?

Perhaps we could bundle both the json generation and publishing, and document generation and publishing, together then?

bdarcus avatar Jul 12 '20 18:07 bdarcus

> This would work sort of like gh-pages then?

Yes, this is a common usage pattern for gh-pages.

> Perhaps we could bundle both the json generation and publishing, and document generation and publishing, together then?

Yes, but isn't document generation far off? But conceptually the generated JSON and docs are both the same: they are both outputs.

dhimmel avatar Jul 12 '20 18:07 dhimmel

> Yes, but isn't document generation far off?

I don't think so. We just need to add the remaining annotations to the properties; just copying and pasting mostly.

To be clear, though, I'm only talking about the input schema, not the spec and such on the documentation repo.

bdarcus avatar Jul 12 '20 18:07 bdarcus

One wrinkle: how would we deal with versioning in that situation?

Just use the CI script to keep the "latest" up to date, and manually generate the versions?

In any case, I just added the gh-pages branch, and the start of this.

bdarcus avatar Jul 12 '20 22:07 bdarcus

This Action even allows cross-site Pages deployment.

https://github.com/marketplace/actions/deploy-to-github-pages

@rmzelle - if I were to do this, where would I push the HTML file to on the website repo?

bdarcus avatar Jul 13 '20 00:07 bdarcus