Use YAML to maintain the input schema
I find working with JSON awkward, and would like to suggest we move to a YAML representation instead.
Digging around a bit, it seems people are using YAML to maintain and validate JSON schemas.
- https://github.com/coveooss/json-schema-for-humans/issues/33
- https://github.com/Grokzen/pykwalify
Here's ours in YAML:
https://gist.github.com/bdarcus/4afa2ba2a0a57d8f9af7ccf8af7c5320
We could have a GitHub Action that would keep a JSON variant in sync, while doing new development on the YAML.
Thoughts?
cc @dhimmel @bwiernik @denismaier
Yes, please!
Seems like a good idea to me to switch the source to YAML, and then auto-generate the JSON if that's needed for downstream applications.
I don't envision any issues besides that some YAML fields get parsed in ways that are not supported by JSON. For example, I think `key: 2020-07-12` in YAML would get parsed as a date. If you tried in Python to dump the object as JSON, it would give an error, since JSON doesn't support date objects. The automated export to JSON therefore might be sufficient protection against YAML-only constructs being added to the schema.
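To make that concrete, here's a minimal sketch, assuming PyYAML (`yaml.safe_load`) and the standard library `json` module:

```python
# Minimal sketch of the YAML-date hazard described above (assumes PyYAML).
import json

import yaml

# PyYAML resolves ISO-formatted scalars like 2020-07-12 to datetime.date.
data = yaml.safe_load("key: 2020-07-12")
print(type(data["key"]))  # <class 'datetime.date'>

# The JSON export then fails loudly, which flags the YAML-only construct.
try:
    json.dumps(data)
except TypeError as err:
    print(err)  # Object of type date is not JSON serializable
```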
> I don't envision any issues besides that some YAML fields get parsed in ways that are not supported by JSON. For example, I think `key: 2020-07-12` in YAML would get parsed as a date. If you tried in Python to dump the object as JSON, it would give an error, since JSON doesn't support date objects. The automated export to JSON therefore might be sufficient protection against YAML-only constructs being added to the schema.
I hadn't thought about that.
So to test, we'd simply want to convert back and forth, and validate against a test schema at each step?
I did just check, and your example converts to this JSON (so a string):
```json
{
  "key": "2020-07-12"
}
```
> So to test, we'd simply want to convert back and forth
If you can read the YAML and export to JSON in Python, it should be okay. But just be aware that you can do things in YAML that create types that are not supported in JSON.
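For example, a sketch of that export with a round-trip check; PyYAML and the file names are assumptions:

```python
# Sketch of a round-trip check: YAML -> JSON -> reload -> compare.
# The file names csl-data.yaml and csl-data.json are hypothetical.
import json

import yaml

with open("csl-data.yaml") as f:
    schema = yaml.safe_load(f)

# json.dumps raises TypeError if the YAML produced JSON-incompatible
# types (such as datetime.date), so this line alone catches most
# YAML-only constructs.
serialized = json.dumps(schema, indent=2)

# Reload and confirm nothing was coerced along the way (e.g. non-string
# mapping keys, which json.dumps silently converts to strings).
assert json.loads(serialized) == schema

with open("csl-data.json", "w") as f:
    f.write(serialized + "\n")
```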
So for the GitHub Action: we'd just keep the JSON file, and update it from the YAML when it changes?
One option for CI is to generate the JSON file from the YAML file and fail if the JSON file has any changes according to git. This option requires contributors to regenerate and commit an up-to-date JSON file themselves.
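A rough sketch of that check as a script CI could run (file names again hypothetical):

```python
# check_sync.py - regenerate the JSON from the YAML, then fail the
# build if the committed copy is out of date (hypothetical file names).
import json
import subprocess
import sys

import yaml

with open("csl-data.yaml") as f:
    schema = yaml.safe_load(f)

with open("csl-data.json", "w") as f:
    json.dump(schema, f, indent=2)
    f.write("\n")

# --exit-code makes git return 1 if regeneration changed anything,
# which in turn fails the CI job.
result = subprocess.run(["git", "diff", "--exit-code", "csl-data.json"])
sys.exit(result.returncode)
```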
If you want CI to commit the changes to the JSON file, things get more tricky. Doing this only for pushes to branches is easier, but it's challenging to find an approach that works for both pull requests and pushes.
One final option is to not keep the JSON files on the master branch, but instead have them deployed to a separate branch of outputs.
> One final option is to not keep the JSON files on the master branch, but instead have them deployed to a separate branch of outputs.
This would work sort of like gh-pages then?
Perhaps we could bundle the JSON generation and publishing together with the document generation and publishing, then?
> This would work sort of like gh-pages then?
Yes, this is a common usage pattern for gh-pages.
> Perhaps we could bundle the JSON generation and publishing together with the document generation and publishing, then?
Yes, but isn't document generation far off? Conceptually, though, the generated JSON and the docs are the same kind of thing: they are both outputs.
> Yes, but isn't document generation far off?
I don't think so. We just need to add the remaining annotations to the properties; mostly just copying and pasting.
To be clear, though, I'm only talking about the input schema, not the spec and such on the documentation repo.
One wrinkle: how would we deal with versioning in that situation?
Just use the CI script to keep the "latest" up to date, and manually generate the versions?
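One possible shape for that, as a sketch; the paths, file names, and version string are all assumptions:

```python
# Hypothetical layout for deployed outputs: CI refreshes latest/ on
# every change, while numbered versions are copied once, by hand.
import shutil
from pathlib import Path

VERSION = "1.0"  # hypothetical release tag
SITE = Path("gh-pages")

# Always refreshed by the CI script:
(SITE / "latest").mkdir(parents=True, exist_ok=True)
shutil.copy("csl-data.json", SITE / "latest" / "csl-data.json")

# Generated manually at release time, then left untouched:
(SITE / VERSION).mkdir(parents=True, exist_ok=True)
shutil.copy("csl-data.json", SITE / VERSION / "csl-data.json")
```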
In any case, I just added the gh-pages branch, and the start of this.
This Action even allows cross-site Pages deployment.
https://github.com/marketplace/actions/deploy-to-github-pages
@rmzelle - if I were to do this, where would I push the HTML file to on the website repo?