run against JSON files when only schema has changed
This may well not be possible, but it doesn't hurt to ask.
I have a repo containing JSON files as well as schemas for those JSON files. When I change a JSON file, check-jsonschema diligently checks it. However, when I change a schema file, check-jsonschema doesn't check whether my JSON files still match the (now updated) schema.
When I add the always_run: yes toggle, it complains about there not being any instancefiles. I guess this means that git does not provide access to files that are not part of the commit. But perhaps there's a way?
Sorry for the delayed response -- I missed this when it was filed.
I think I would tend to solve this with pre-commit run -a to run on all files.
However, you could use always_run: true so long as you also set options to control the filenames passed to the hook.
Setting pass_filenames: false and args: ["path/to/file1.json", "path/to/file2.json"] should do the trick.
All that said, I'm going to leave this open and see if I can think of a good solution for this use-case.
It makes sense to me that you would want to specify that, for example, schemas/schema1.json applies to data/data1/**/*.json, and so forth. It may be that in order to support this, check-jsonschema would need to support configuration data, directory traversal/globbing, and other features similar to what pre-commit is doing. But I'm happy to tinker with it if it sounds useful.
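Until something like that exists, one possible stopgap is a small wrapper that expands the glob itself and hands the result to check-jsonschema, so newly added files are picked up without editing args. The following is only a sketch: `build_command` and `run_check` are hypothetical helper names, and the paths are the example ones from above. The only check-jsonschema feature it relies on is the `--schemafile` option.

```python
import subprocess
from pathlib import Path


def build_command(schema, pattern, root="."):
    """Build a check-jsonschema command line for every file matching `pattern`.

    `schema` and `pattern` are relative to `root`; `**` matches any depth.
    """
    root = Path(root)
    files = sorted(
        path.relative_to(root).as_posix()
        for path in root.glob(pattern)
        if path.is_file()
    )
    return ["check-jsonschema", "--schemafile", schema, *files]


def run_check(schema, pattern, root="."):
    """Run the command from `root` and return its exit code (0 if nothing matched)."""
    cmd = build_command(schema, pattern, root)
    if len(cmd) == 3:  # no instance files were found, so there is nothing to check
        return 0
    return subprocess.run(cmd, cwd=root).returncode
```

A script calling `run_check("schemas/schema1.json", "data/data1/**/*.json")` could then be wired up as a local hook with always_run: true and pass_filenames: false, assuming check-jsonschema is installed in the hook's environment.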
I think it does sound useful, as new files may be added all the time so it's not really possible to pass them all in the options. I'll watch this space :) And I have another suggestion I'll file in a separate issue :) Thanks!
I have continued to think about this issue, and how it could be solved. I want to share some of my thoughts for two reasons:
- to show what I think the usage will look like and let anyone point out potential problems
- to record the plan for myself, in case I don't get back to this for a while
Basically, the configuration data which would be needed is a map from schema filenames to glob paths. The basic plan is to add a JSON config file:
$ check-jsonschema --config .check_jsonschema.json foo/bar/schema.json foo/bar/items/instance1.json ignored-file.txt
where the contents of .check_jsonschema.json are
{
  "schema_map": {
    "foo/bar/schema.json": {
      "include": ["foo/bar/**/*.json", "foo/baz/items/bar/**/*.yaml"],
      "exclude": ["foo/bar/**/schema.json"]
    }
  }
}
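To make the intended semantics concrete, here is a rough sketch of how such a schema_map might be resolved against the filenames passed on the command line. Nothing here is implemented in check-jsonschema: `expand` and `select_instances` are hypothetical names, and the rule that a changed schema triggers a recheck of all of its instances is an assumption based on the use case in this issue.

```python
from pathlib import Path


def expand(root, patterns):
    """Return the root-relative POSIX paths of files matched by any glob pattern."""
    root = Path(root)
    matched = set()
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file():
                matched.add(path.relative_to(root).as_posix())
    return matched


def select_instances(config, changed, root="."):
    """Map each schema in config["schema_map"] to the instances that need checking."""
    changed = set(changed)
    selected = {}
    for schema, rules in config["schema_map"].items():
        included = expand(root, rules.get("include", []))
        excluded = expand(root, rules.get("exclude", []))
        instances = included - excluded
        # Assumption: a changed schema invalidates all of its instances;
        # otherwise only the instances that were passed in are rechecked.
        todo = instances if schema in changed else instances & changed
        if todo:
            selected[schema] = sorted(todo)
    return selected
```

With the config above, passing foo/bar/schema.json would select every matching instance under foo/bar/, while passing only foo/bar/items/instance1.json and ignored-file.txt would select just that one instance and silently skip the text file.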
If all of it shakes out as I hope, the result would be pre-commit usage like
- id: check-jsonschema
  name: "Check jsonschema on foo/ files"
  files: ^foo/
  types_or: [json, yaml]
  args: ["--config", ".check_jsonschema.json"]
Naturally, I would also provide a schema and hook for validating your check-jsonschema config! 😉
This is all highly speculative at this point. I haven't started on an implementation at all yet, and I might not be able to take time to work on this anytime soon, but this is the sort of direction I'd like to take things.
The way I solved this, which works well enough, is to accept pre-commit's behaviour and run pre-commit run check-jsonschema --all-files on GitHub Actions. We actually run it through tox:
[testenv:jsonschema]
skip_install = true
deps =
    pre-commit
commands =
    pre-commit run check-jsonschema --all-files
and on GitHub Actions:
steps:
  - uses: actions/checkout@v3
  - name: Set up Python ${{ matrix.python-version }}
    uses: actions/setup-python@v4
    with:
      python-version: ${{ matrix.python-version }}
  - name: Install dependencies
    run: |
      python -m pip install --upgrade pip
      pip install tox tox-gh-actions
  - name: json schema validation
    run: tox -e jsonschema
So locally you can always run the command yourself, of course, to check whether your changes affect the validation. Then, if any mistakes slip through the development process, GitHub will scream to let you know.