json-schema-spec
Linter
Per @Relequestual's recommendation, I'm creating a ticket about a linter and to collect linting rules that people would want.
When authoring a JSON schema, it is desirable to have a linter to check the strictness, specificity, satisfiability, and descriptiveness of a schema. Maybe Ben envisions the JSON schema workgroup to produce a reference linter implementation?
Anyway, I'll start with a list of linting rules that would be desirable (assuming draft 07):
- Objects
  - Is `additionalProperties` explicitly set? Is it set to either `false` or a schema (not just `true`)?
    - If `additionalProperties` is defined, is `propertyNames` also defined?
  - Is `required` defined? Are all fields listed in `required`?
  - Is there any contradiction among `properties`, `additionalProperties`, and `required`? E.g. this schema is not satisfiable: `{ "type": "object", "properties": {}, "required": ["foo"], "additionalProperties": false }`. `foo` is required, but is not defined among `properties`, yet we also disallow additional properties.
  - If either `additionalProperties` is not `false` or some fields are missing from `required`:
    - Are `minProperties` and `maxProperties` also defined?
      - Is `maxProperties` >= the length of `required`?
  - Are (`$ref` or `$ref`-in-`allOf`) and `"additionalProperties": false` present together? Usually that's a sign of the author trying to do poor man's inheritance in JSON schema. The `"additionalProperties": false` will likely cause an object to not validate against the schema, counter to the user's expectations.
- Numbers and integers
  - Does every number or integer have:
    - `multipleOf`
    - `minimum`/`exclusiveMinimum` and `maximum`/`exclusiveMaximum`
      - `minimum` and `exclusiveMinimum` should not be defined together
      - `maximum` and `exclusiveMaximum` should not be defined together
      - Are the (exclusive) minimum and (exclusive) maximum satisfiable?
        - if `minimum` and `maximum` are defined, `minimum` should be <= `maximum`;
        - if `minimum` and `exclusiveMaximum` are defined, `minimum` should be < `exclusiveMaximum`;
        - if `exclusiveMinimum` and `maximum` are defined, `exclusiveMinimum` should be < `maximum`;
        - if `exclusiveMinimum` and `exclusiveMaximum` are defined:
          - if `type` is `"integer"`: `exclusiveMaximum - exclusiveMinimum` should be > 1
          - if `type` is `"number"`: `exclusiveMinimum` should be < `exclusiveMaximum`
- Strings
  - Do all strings have a `format`?
  - Do all strings have `minLength` and `maxLength`?
    - Is `minLength` <= `maxLength`?
- Arrays
  - Do all arrays have `items` schemas?
  - Are all arrays either:
    - tuple-validated - i.e. `items` as a fixed-length array of schemas; or
    - list-validated - i.e. `items` as a single schema
  - in the case of list validation, are `minItems` and `maxItems` defined?
    - Is `minItems` <= `maxItems`?
  - Is `uniqueItems` explicitly set? Either `true` or `false` is ok. We just want it to be explicit.
- General descriptiveness
  - Does everything have a `title`?
  - Does everything have a `description`?
  - Does everything have a `$comment`?
  - Does everything have a `default`?
    - Does the `default` satisfy the schema?
  - Does everything have `examples`?
    - Do all `examples` satisfy the schema?
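
As a rough illustration of the object contradiction rule above, here is a minimal Python sketch. The function name `unsatisfiable_required` is hypothetical, it only reacts to a literal `"additionalProperties": false`, and it uses Python's `re` module as a stand-in for the regex dialect that `patternProperties` assumes, so treat it as an approximation rather than a reference rule.

```python
import re


def unsatisfiable_required(schema: dict) -> list:
    """Return required names that the schema itself forbids.

    Assumption: a name is allowed only if it is listed in `properties`
    or matched by a `patternProperties` key.
    """
    # The contradiction only arises when extra properties are forbidden.
    if schema.get("additionalProperties") is not False:
        return []
    declared = set(schema.get("properties", {}))
    patterns = [re.compile(p) for p in schema.get("patternProperties", {})]
    return [
        name
        for name in schema.get("required", [])
        if name not in declared and not any(p.search(name) for p in patterns)
    ]


schema = {
    "type": "object",
    "properties": {},
    "required": ["foo"],
    "additionalProperties": False,
}
print(unsatisfiable_required(schema))
```

For the schema from the list, the sketch reports `foo`; adding `foo` to `properties` makes the report empty.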
Ideally, the linter shall be able to generate output that is machine-parsable, with the appropriate JSON pointer so that the author can choose to silence/ignore certain warnings.
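
To make the "machine-parsable output with a JSON pointer" idea concrete, here is a small Python sketch that applies the numeric bound rules from the list and emits one JSON object per finding. Everything named here (`lint_bounds`, `report`, the rule IDs, the shape of the `ignore` set) is made up for illustration, and the walk only descends into `properties` and `items`, assuming draft 07 semantics where the exclusive bounds are numbers.

```python
import json


def escape(token: str) -> str:
    """Escape one JSON Pointer reference token (RFC 6901)."""
    return token.replace("~", "~0").replace("/", "~1")


def lint_bounds(schema, pointer=""):
    """Yield findings for the numeric bound rules, tagged with a JSON pointer."""
    if not isinstance(schema, dict):
        return
    lo, hi = schema.get("minimum"), schema.get("maximum")
    xlo, xhi = schema.get("exclusiveMinimum"), schema.get("exclusiveMaximum")
    problems = []
    if lo is not None and xlo is not None:
        problems.append(("bounds/both-minimums", "minimum and exclusiveMinimum are both defined"))
    if hi is not None and xhi is not None:
        problems.append(("bounds/both-maximums", "maximum and exclusiveMaximum are both defined"))
    if lo is not None and hi is not None and lo > hi:
        problems.append(("bounds/unsatisfiable", "minimum > maximum"))
    if lo is not None and xhi is not None and lo >= xhi:
        problems.append(("bounds/unsatisfiable", "minimum >= exclusiveMaximum"))
    if xlo is not None and hi is not None and xlo >= hi:
        problems.append(("bounds/unsatisfiable", "exclusiveMinimum >= maximum"))
    if xlo is not None and xhi is not None:
        # Integers need a gap wider than 1; other numbers only need xlo < xhi.
        ok = (xhi - xlo > 1) if schema.get("type") == "integer" else (xlo < xhi)
        if not ok:
            problems.append(("bounds/unsatisfiable", "no value fits between the exclusive bounds"))
    for rule, message in problems:
        yield {"pointer": pointer, "rule": rule, "message": message}
    # Descend into subschemas (only `properties` and `items` in this sketch).
    for name, sub in schema.get("properties", {}).items():
        yield from lint_bounds(sub, f"{pointer}/properties/{escape(name)}")
    items = schema.get("items")
    if isinstance(items, dict):
        yield from lint_bounds(items, f"{pointer}/items")
    elif isinstance(items, list):
        for i, sub in enumerate(items):
            yield from lint_bounds(sub, f"{pointer}/items/{i}")


def report(schema, ignore=frozenset()):
    """Drop findings whose (pointer, rule) pair the author chose to silence."""
    return [f for f in lint_bounds(schema) if (f["pointer"], f["rule"]) not in ignore]


schema = {
    "type": "object",
    "properties": {
        "age": {"type": "integer", "exclusiveMinimum": 0, "exclusiveMaximum": 1},
        "a/b": {"type": "number", "minimum": 5, "maximum": 3},
    },
}
for finding in report(schema, ignore={("/properties/a~1b", "bounds/unsatisfiable")}):
    print(json.dumps(finding))
```

Keying the ignore list on the pointer plus a rule ID is just one possible design; it lets an author silence a single warning at a single location without turning the rule off everywhere.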
I know there are more ways for a schema to become unsatisfiable. E.g. a string could have a regex pattern that can only be 3 characters long, but the author also specifies "minLength": 10. This kind of satisfiability violation is too complex to check. I don't expect any linter to be able to check for that.
I'll move this elsewhere once an appropriate place exists! 😅
Based on w3c/wot-thing-description#1194 it would also be nice to be able to check this as well:
- General descriptiveness:
  - If a schema contains `const`, does the `const` satisfy the schema?
  - If a schema contains `enum`, does each item in `enum` satisfy the schema?
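
A full version of this check would run a real validator over each `const`/`enum` value against the rest of the schema. As a much smaller sketch of the idea, the Python below only compares the values against the sibling `type` keyword. The helper names are hypothetical, and the `integer` handling assumes the draft 06+ reading where `1.0` counts as an integer.

```python
def matches_type(value, type_name: str) -> bool:
    """Check one instance value against one JSON Schema primitive type name."""
    if type_name == "null":
        return value is None
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "object":
        return isinstance(value, dict)
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "number":
        return is_number
    if type_name == "integer":
        return is_number and (isinstance(value, int) or value.is_integer())
    return False


def fixed_values_violating_type(schema: dict) -> list:
    """Return const/enum values that cannot satisfy the sibling `type`."""
    declared = schema.get("type")
    if declared is None:
        return []
    types = [declared] if isinstance(declared, str) else list(declared)
    candidates = list(schema.get("enum", []))
    if "const" in schema:
        candidates.append(schema["const"])
    return [v for v in candidates if not any(matches_type(v, t) for t in types)]


print(fixed_values_violating_type({"type": "string", "enum": ["a", 1, None]}))
```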
Two more:
- If `$ref` points to a remote resource over HTTP, it should use the `https` scheme and not `http`.
- `anyOf`, `allOf`, `oneOf` with just one option (although this is often used as a workaround in old specs for extending a `$ref`-referenced schema).
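
Both of these are cheap to detect. Here is a minimal Python sketch, with a hypothetical function name and made-up rule IDs. It walks every nested dict and list naively, so it can raise false positives on data inside `enum`, `const`, `default`, or `examples`; a real linter would only visit subschema locations.

```python
from urllib.parse import urlsplit


def escape(token) -> str:
    """Escape one JSON Pointer reference token (RFC 6901)."""
    return str(token).replace("~", "~0").replace("/", "~1")


def lint_refs_and_combinators(node, pointer=""):
    """Return (pointer, rule) pairs for http `$ref`s and one-option combinators."""
    findings = []
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and urlsplit(ref).scheme.lower() == "http":
            findings.append((f"{pointer}/$ref", "ref-uses-http"))
        for keyword in ("anyOf", "allOf", "oneOf"):
            options = node.get(keyword)
            if isinstance(options, list) and len(options) == 1:
                findings.append((f"{pointer}/{keyword}", "single-option-combinator"))
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return findings
    # Naive recursion into every child value (see the caveat above).
    for key, child in children:
        findings += lint_refs_and_combinators(child, f"{pointer}/{escape(key)}")
    return findings


schema = {
    "allOf": [{"$ref": "http://example.com/base.json"}],
    "properties": {"x": {"$ref": "https://example.com/x.json"}},
}
for pointer, rule in lint_refs_and_combinators(schema):
    print(pointer, rule)
```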
> * If `$ref` points to a remote resource over HTTP, it should use the `https` scheme and not `http`.
@mitar Not trying to shoot your idea down, but I find this a bit debatable. If I understand the semantics correctly, the "URL" in the "$ref" isn't really a URL. It's more like an identifier. In the most general sense, you could have a resolver that takes that "URL" and resolves it to a schema, which isn't necessarily retrieved via http or https at all.
I agree. The linting tool should definitely have a way to enable/disable rules to use.
Lint that every string property defining a regex `pattern` is internally consistent, i.e. that its `default` and `examples` values actually match that pattern.
For example, the default and examples in this simple schema are no good! However, they seem to pass in this validator:

    {
      "$schema": "https://json-schema.org/draft/2019-09/schema",
      "properties": {
        "lowerLettersOnly": {
          "pattern": "^[a-z]*$",
          "default": "bad_default01",
          "examples": [
            "imatch",
            "i-do-NOT-match01"
          ]
        }
      }
    }

I actually think this should be an invalid schema spec, not just a linting issue, @kze should that be a separate GH issue?
Covering default and examples is a good idea.
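
A minimal Python sketch of such a rule, assuming Python's `re` is close enough to the regex dialect the schema author intended (true for simple patterns like the one above, not in general). The function name is hypothetical. It uses `re.search` because `pattern` is not implicitly anchored, and it skips non-string values because `pattern` only constrains strings.

```python
import re


def values_violating_pattern(subschema: dict) -> list:
    """Return (location, value) pairs for default/examples that miss `pattern`."""
    pattern = subschema.get("pattern")
    if pattern is None:
        return []
    regex = re.compile(pattern)
    candidates = []
    if "default" in subschema:
        candidates.append(("default", subschema["default"]))
    candidates += [
        (f"examples/{i}", value)
        for i, value in enumerate(subschema.get("examples", []))
    ]
    return [
        (where, value)
        for where, value in candidates
        if isinstance(value, str) and regex.search(value) is None
    ]


lower_letters_only = {
    "pattern": "^[a-z]*$",
    "default": "bad_default01",
    "examples": ["imatch", "i-do-NOT-match01"],
}
for where, value in values_violating_pattern(lower_letters_only):
    print(where, value)
```

On the schema from this thread, the sketch flags the `default` and the second example, while `imatch` passes.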
> I actually think this should be an invalid schema spec
This isn't currently possible with stock JSON Schema because we'd have to reference data (the pattern) and apply schema validations against that data. (It is supported with this vocabulary, though.)
> I'll move this elsewhere once an appropriate place exists! 😅 - @Relequestual
A place now exists 😄. I've added some things from here.
Is it OK for everyone if we create a GitHub Project dashboard to co-create and visualize all the rule proposals? I suggest this lifecycle:
- Proposal : Someone suggests a new rule.
- Confirmed : The community agrees to add it.
- Ready to Implement : The rule has all the documentation needed to implement it.
- Implemented : The rule has been implemented.
- Released : The rule has been released.
- Retired : The rule has been retired.
- Cancelled : The proposal has been cancelled.
> Is it OK for everyone if we create a GitHub Project dashboard to co-create and visualize all the rule proposals? - @benjagm
We need to do this in the linting repo, https://github.com/json-schema-org/json-schema-linting
> Based on w3c/wot-thing-description#1194 it would also be nice to be able to check this as well:
>
> - General descriptiveness:
>   - If a schema contains `const`, does the `const` satisfy the schema?
>   - If a schema contains `enum`, does each item in `enum` satisfy the schema?
>
> \- @FadySalama
I've commented on that issue.
tl;dr - `const` and `enum` should be defined in isolation. There's no reason to add more constraints to schemas that contain these keywords because they fully constrain the value.
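
Expressed as a lint rule, that could look roughly like the Python sketch below: flag any keyword that sits next to `const` or `enum` and is not on an allowlist. The function name and the allowlist contents are assumptions made for illustration, not an agreed set.

```python
# Assumption: keywords that do not constrain the value and may sit beside const/enum.
NON_CONSTRAINING = {
    "$schema", "$id", "$comment", "title", "description",
    "default", "examples", "readOnly", "writeOnly", "definitions",
}


def redundant_siblings(schema: dict) -> list:
    """Return keywords that add constraints next to `const` or `enum`."""
    fixed = {"const", "enum"} & schema.keys()
    if not fixed:
        return []
    return sorted(set(schema) - fixed - NON_CONSTRAINING)


print(redundant_siblings(
    {"type": "string", "enum": ["a", "b"], "minLength": 1, "title": "Letter"}
))
```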
> If `$ref` points to a remote resource over HTTP, it should use the `https` scheme and not `http` - @mitar
URIs are only identifiers, not locators. Whether http or https is used is inconsequential because implementations (by default) shouldn't be accessing the internet to resolve references. These resources should be pre-registered with the implementation prior to validation.
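
To illustrate the "pre-registered" model, here is a minimal Python sketch of a resolver that treats the URI purely as a dictionary key and never touches the network. `SchemaRegistry` and its methods are hypothetical names, not any implementation's API, and only JSON Pointer fragments are handled (no plain-name anchors, no relative references).

```python
from urllib.parse import unquote


def resolve_pointer(document, pointer: str):
    """Follow an RFC 6901 JSON Pointer ('' means the whole document)."""
    target = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        target = target[int(token)] if isinstance(target, list) else target[token]
    return target


class SchemaRegistry:
    """Maps schema URIs to documents that were registered ahead of time."""

    def __init__(self):
        self._schemas = {}

    def register(self, uri: str, schema: dict) -> None:
        self._schemas[uri] = schema

    def resolve(self, ref: str):
        base, _, fragment = ref.partition("#")
        if base not in self._schemas:
            # The URI is an identifier only, so an unknown one is an error, not a download.
            raise LookupError(f"schema {base!r} is not registered")
        return resolve_pointer(self._schemas[base], unquote(fragment))


registry = SchemaRegistry()
registry.register(
    "http://example.com/schemas/person.json",
    {"definitions": {"name": {"type": "string"}}},
)
print(registry.resolve("http://example.com/schemas/person.json#/definitions/name"))
```

Under this model the `http` in the identifier never causes a request, which is why the scheme is inconsequential for validation.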