New builtin schema type: CITATION.cff?
Would it make sense to add CITATION.cff to the built-in schemas? It would be nice to have a flag and a pre-commit check for it. Currently it requires:
uvx check-jsonschema --schemafile https://raw.githubusercontent.com/citation-file-format/citation-file-format/main/schema.json --no-cache --default-filetype yaml CITATION.cff
and
- id: check-jsonschema
  name: Validate CITATION.cff
  files: ^CITATION.cff$
  types_or: [file]
  args:
    - --schemafile=https://raw.githubusercontent.com/citation-file-format/citation-file-format/main/schema.json
    - --default-filetype=yaml
    - --no-cache
(This also doesn't work on pre-commit.ci, which can't pull this file; being built-in should fix that.)
Ref for CITATION.cff: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files and links from there. Most notably, https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#definitionsreferencedate-published. The link above is from SchemaStore.
PS: I'm not sure why --no-cache is required, it gets a completely incorrect file if not passed. Maybe it's caching on the name schema.json?
I'm generally pretty open to adding new schemas, so sure!
(I'd like to get all of schemastore included at some point, but there are a few problems associated with that.)
If you want to open a PR, I think I have the process documented in the contrib doc. Or I can look at this when I'm back next week. The only thing which might need special handling is the .cff extension. Is the file actually JSON?
PS: I'm not sure why --no-cache is required, it gets a completely incorrect file if not passed. Maybe it's caching on the name schema.json?
It is using the name, yeah. 😢 I want to redo the caching mechanism at some point to fix this, which shouldn't be that hard. The CLI option for setting the filename to use in the cache could go away too, which is a win.
I would still like to have a cache which can be navigated without a special tool, so I will probably do something like...
- dir per schema, named as a url hash
- schema file and metadata file inside
Or maybe a small SQLite DB.
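The directory-per-schema idea could be sketched roughly like this. This is only an illustration, not check-jsonschema's code: the SHA-256 digest and the file names schema.json and metadata.json are assumptions chosen for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_dir_for(cache_root: Path, url: str) -> Path:
    """Return the per-schema directory, named by a hash of the full URL."""
    # Assumption: SHA-256 hex digest; any stable hash would do.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return cache_root / digest


def store_schema(cache_root: Path, url: str, schema_text: str) -> Path:
    """Write the schema and a small metadata file into its own directory."""
    schema_dir = cache_dir_for(cache_root, url)
    schema_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical file names, picked for this sketch only.
    (schema_dir / "schema.json").write_text(schema_text, encoding="utf-8")
    # Recording the source URL keeps the cache browsable without a special tool.
    (schema_dir / "metadata.json").write_text(
        json.dumps({"url": url}, indent=2), encoding="utf-8"
    )
    return schema_dir
```

Because the metadata file records the original URL, someone poking around the cache directory can still tell which hashed directory belongs to which schema.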
I'd like to get all of schemastore included at some point
I've got a mostly automatic mirror of the pyproject.toml part of SchemaStore at https://github.com/henryiii/validate-pyproject-schema-store, it might look something like that.
The .cff file is YAML. It's not recognized as YAML automatically since it's an overloaded extension. This file, though, will always have the base name CITATION.
I can do it eventually, but it might be a week or two.
Looking at https://citation-file-format.github.io/ and the implementation in https://github.com/citation-file-format/citation-file-format/blob/main/examples/validator.py, it seems safe to assume the contents of CITATION.cff are always valid YAML.
This issue completely fell off my radar. I'd be happy to take a PR to wire up support.
I still intend to add this at some point if nobody else does, but it's not really a priority for me because I'm not a user of CITATION.cff. (at least, not yet!)
Looking back over the issue, I also see mention of --no-cache and filename based caching, so I feel the need to mention:
The buggy/bad behavior in question has since been fixed, and caching is done based on full URL.
It's still not perfect/beautiful, but no longer incorrect in common cases. 😄
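For anyone reading later, here is a small sketch of why keying on the file name went wrong and keying on the full URL does not. It is not the actual check-jsonschema implementation: both helper functions are hypothetical, the hash choice is an assumption, and the second URL is made up.

```python
import hashlib
from urllib.parse import urlsplit


def name_based_key(url: str) -> str:
    """Cache key from the last path segment only (the collision-prone approach)."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def url_based_key(url: str) -> str:
    """Cache key from a hash of the full URL (SHA-256 is an assumption here)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


CFF = "https://raw.githubusercontent.com/citation-file-format/citation-file-format/main/schema.json"
OTHER = "https://example.com/some-other-project/schema.json"  # hypothetical second schema

# Both URLs end in schema.json, so a name-based cache treats them as the same entry.
same_name = name_based_key(CFF) == name_based_key(OTHER)
# Hashing the whole URL gives each schema its own entry.
distinct_url_keys = url_based_key(CFF) != url_based_key(OTHER)
```

With name-based keys, whichever schema.json was downloaded first would be served for every later URL with that file name, which matches the "completely incorrect file" symptom reported above.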
Here you go: https://github.com/python-jsonschema/check-jsonschema/pull/616.