Less Specific Validation Errors for "oneOf" Schemas with Nested Sub-Schemas On Transition from 4.7 -> 4.8

Open michaelschmit opened this issue 1 year ago • 3 comments

On upgrade from 4.7 to 4.8 we ended up getting much less specific errors on "oneOf" schema validation with nested sub-schemas. I assume this is tied to the commit bd5ea732d2571b6522ecd83d79bd185590bac27d - "Don't let best_match traverse into applicators with equally bad sub-errors".

Previously, if a sub-schema under the "oneOf" had something like a missing required field, the validation error would be something like "'[property name]' is a required property". After the upgrade to 4.8, the validation error returned is "[json string] is not valid under any of the given schemas".

This is a much less specific error: to locate what caused the validation issue, we would have to find the matching oneOf schema and then traverse each of its sub-schemas by hand.

michaelschmit avatar Jul 29 '22 15:07 michaelschmit

This should only be happening if you have schemas where all the branches are equally wrong (or equally right). I'd need to see a specific example to see whether it's improvable.

Note, by the way, that if you just want to descend into the sub-errors indiscriminately, you can probably do:

import jsonschema.exceptions
validator = jsonschema.Draft202012Validator({"oneOf": [{"required": ["foo"]}, {"required": ["bar"]}, {"required": ["baz"]}]})

error, = validator.iter_errors({})
while error.context:
    error = jsonschema.exceptions.best_match(error.context)
print(error)

Julian avatar Jul 29 '22 15:07 Julian

We are validating a very complex document that has nested "oneOf" and "allOf". I'll try to put a document and schema together today that will highlight what we are seeing.

michaelschmit avatar Jul 29 '22 15:07 michaelschmit

I really struggled with unspecific error messages when using "oneOf". What ended up working for me was patching jsonschema._validators.oneOf to print out all error messages:

for e in all_errors:
    print(str(e))
    print("\n\n" + "-" * 80 + "\n\n")

hauntsaninja avatar Aug 06 '22 21:08 hauntsaninja

I think due to a bug in the schema we built it was throwing a validation error for the wrong reason. This change then caused it to error for a more valid reason. Closing ...

michaelschmit avatar Aug 18 '22 15:08 michaelschmit

Hooray :) glad to hear, thanks for following up.

Julian avatar Aug 18 '22 15:08 Julian

Hmm, we're seeing a significant regression in user messages, presumably because of this. Granted, it's hard to provide good error messages for something like

{'anyOf': [{'additionalProperties': False,
                'properties': {'hostname': {'type': 'string'},
                               'network_type': {'enum': ['managed',
                                                         'unmanaged'],
                                                'type': 'string'},
                               'port_id': {'type': 'string'},
                               'switch_id': {'type': 'string'},
                               'switch_info': {'type': 'string'}},
                'required': ['port_id', 'switch_id'],
                'type': 'object'},
               {'additionalProperties': False,
                'properties': {'hostname': {'type': 'string'},
                               'network_type': {'enum': ['managed',
                                                         'unmanaged'],
                                                'type': 'string'},
                               'port_id': {'type': 'string'},
                               'switch_id': {'type': 'string'},
                               'switch_info': {'type': 'string'}},
                'required': ['port_id', 'hostname'],
                'type': 'object'},
               {'additionalProperties': False,
                'properties': {'hostname': {'type': 'string'},
                               'network_type': {'enum': ['unmanaged'],
                                                'type': 'string'},
                               'port_id': {'type': 'string'},
                               'switch_id': {'type': 'string'},
                               'switch_info': {'type': 'string'}},
                'required': ['network_type'],
                'type': 'object'},
               {'additionalProperties': False, 'type': 'object'}]}

in all cases, but, for example, all of them have 'additionalProperties': False. Previously (<= 4.7 I think) we used to get "Additional properties are not allowed"; now we only get "... is not valid under any of the given schemas".

dtantsur avatar Aug 29 '22 15:08 dtantsur

Another pretty unfortunate example from our API:

STANDARD_TRAITS = [...]  # ... some list
CUSTOM_TRAIT_PATTERN = "^CUSTOM_[A-Z0-9_]+$"
TRAITS_SCHEMA = {'anyOf': [
    {'type': 'string', 'minLength': 1, 'maxLength': 255,
     'pattern': CUSTOM_TRAIT_PATTERN},
    {'type': 'string', 'enum': STANDARD_TRAITS},
]}

{
    'type': 'object',
    'properties': {
        'description': {'type': ['string', 'null'], 'maxLength': 255},
        'extra': {'type': ['object', 'null']},
        'name': TRAITS_SCHEMA,
        'steps': {'type': 'array', 'items': api_utils.DEPLOY_STEP_SCHEMA,
                  'minItems': 1},
        'uuid': {'type': ['string', 'null']},
    },
    'required': ['steps', 'name'],
    'additionalProperties': False,
}

With CUSTOM_AAAAA (a really long string) we used to get "'CUSTOM_AAAAA....' is too long"; now we again get the generic error message. It basically means that we need to change our error handling to something like https://github.com/python-jsonschema/jsonschema/issues/977#issuecomment-1207283937 or migrate away from jsonschema.

dtantsur avatar Aug 29 '22 15:08 dtantsur

Actually, opened a new bug instead of poking the already closed one: https://github.com/python-jsonschema/jsonschema/issues/991

dtantsur avatar Aug 30 '22 09:08 dtantsur