jsonschema icon indicating copy to clipboard operation
jsonschema copied to clipboard

required, additionalProperties, propertyNames, dependentRequired and any other validators that reference property *names* need a way to indicate the erroring names in ValidationErrors

Open martinpengellyphillips opened this issue 12 years ago • 29 comments
trafficstars

Trying to figure out if there is a way to retrieve the name of a missing required property from an error (without hacking it out of error.message).

For example:

from jsonschema import Draft4Validator as Validator

schema = {
    'type': 'object',
    'properties': {
        'a': {
            'type': 'string'
        },
        'b': {
            'type': 'string',
            'pattern': '^hello$'
        }
    },
    'required': ['a', 'b']
}

errors = Validator(schema).iter_errors({'b': 'world'})
for error in errors:
    print error.path, error.schema_path, error.message

# deque([]) deque(['required']) 'a' is a required property
# deque(['b']) deque(['properties', 'b', 'pattern']) 'world' does not match '^hello$'

In error.path and error.schema_path there is no mention of 'a'. Is there a way to determine that the property missing was 'a' without using error.message?

martinpengellyphillips avatar Aug 16 '13 17:08 martinpengellyphillips

Hey.

There isn't unfortunately. There should be. It just doesn't exist because I haven't thought of a nice way to expose it yet.

There's an informal agreement that .path and .schema_path should always refer to things that exist, so there is not the right place.

I've wavered back and forth on adding an extra_info attribute to ValidationError which could store arbitrary extra info for each failure as decided by a validator, and there would be a reasonable place. Say, required would set error.extra_info to something like {"attributes" : {"all", "the", "missing", "ones}}.

I'm willing to live with something like that I think.

Julian avatar Aug 18 '13 22:08 Julian

Doesn't the property technically exist for .schema_path? As an error occurs for each missing property, it seems to me like that would be an appropriate place. So it would be schema_path == ['required', 'a'].

Adding the information elsewhere feels like it would make calling code more complicated (more code branches to deal with extra_info).

For context, I am building a GUI dynamically from the schemas and then validating the populated values as an object. From the errors I create an error tree that maps back to the GUI component structure and shows error icons (with tooltip) next to the correct field. For required properties I am choosing to display the error next to the missing field rather than at the group level, so this is why I need the property reference.

martinpengellyphillips avatar Aug 19 '13 12:08 martinpengellyphillips

I see. Yeah, putting it in schema_path would be reasonable.

(as ["required", 0] though – assuming what you had was just a typo, but just to clarify, the guarantee is that someone can always pull out the element that is being referred to in the path attributes).

Julian avatar Aug 19 '13 12:08 Julian

Ah, yes - I forgot it was an array not an object there.

So then the logic in my system would be to retrieve the associated schema from .schema and use .schema_path to find out that the property name was 'a'? If so, that sounds great.

martinpengellyphillips avatar Aug 19 '13 12:08 martinpengellyphillips

I agree this would be useful info to have. I'm not so sure about having it in schema_path though, as currently that always points to the json schema keyword, and I like that consistency. I sorta like the extra_info idea, as there are some other validators that could use it, for example, it would be nice to know with the additionalProperties validator which extra properties caused it to fail.

gazpachoking avatar Aug 19 '13 14:08 gazpachoking

Another idea would be that validator_value becomes the single value being evaluated (validator_value == 'a'). Note sure if that breaks conventions though.

Happy with any solution that makes it easy to get at the appropriate value though.

On that note I have noticed that .schema refers to the original schema whilst .schema_path may be a subschema path so my previous idea of using .schema_path to index into .schema wouldn't work.

martinpengellyphillips avatar Aug 19 '13 14:08 martinpengellyphillips

Hmm, I think .schema_path should always be traversable in the .schema of any given error. In the instance of a subschema, the .schema of the error should be the subschema. Is that not the case?

gazpachoking avatar Aug 19 '13 14:08 gazpachoking

I realise I actually meant the other way round, but, yeah, it doesn't seem to work. Have opened a new issue to avoid diverting this one (#120).

martinpengellyphillips avatar Aug 19 '13 16:08 martinpengellyphillips

I've came to this issue too, it would be nice to know what property caused an error, not only for the required validator.

Maybe a good approach is to return the JSON Path of the property: http://goessner.net/articles/JsonPath/

ricobl avatar Dec 04 '13 12:12 ricobl

Bumped into this too. What's the way to go here? I can send a PR with what we decide to do.

danmilon avatar Feb 19 '14 14:02 danmilon

I haven't looked at this carefully recently certainly, but I still don't really think there's a good way to do this without just doing ValidationError.extra_info

Julian avatar Feb 19 '14 18:02 Julian

Voting +1 for this issue.

alexanderad avatar Feb 24 '14 12:02 alexanderad

Having this issue as well.

I like the idea of adding the property name to error.schema_path.

Additionally I would suggest something like this:

 def required_draft4(validator, required, instance, schema):
     if not validator.is_type(instance, "object"):
         return
     for property in required:
         if property not in instance:
-            yield ValidationError("%r is a required property" % property)
+            yield ValidationError(
+                "%r is a required property" % property,
+                path=[property],
+            )

fre-sch avatar Mar 13 '14 08:03 fre-sch

For reference, I currently use the following workaround:

# Custom validators
def _required(validator, required, instance, schema):
    '''Validate 'required' properties.'''
    if not validator.is_type(instance, 'object'):
        return

    for index, requirement in enumerate(required):
        if requirement not in instance:
            error = jsonschema.ValidationError(
                '{0!r} is a required property'.format(requirement)
            )
            error.schema_path.append(index)
            yield error


# Construct validator as extension of Json Schema Draft 4.
Validator = jsonschema.validators.extend(
    validator=jsonschema.validators.Draft4Validator,
    validators={
        'required': _required
    }
)

martinpengellyphillips avatar Apr 15 '14 09:04 martinpengellyphillips

:+1: I too would like to pile on this issue. For the time being I will be looking into using the workaround above (thanks for providing that @thesociable)

abemusic avatar Apr 15 '14 17:04 abemusic

+1 (thanks for the workaround @thesociable).

It would also be nice to have the parent paths to the "required" attribute.

fredrikbonander avatar May 08 '14 13:05 fredrikbonander

The parents of the attribute is just error.absolute_path.

Julian avatar May 08 '14 13:05 Julian

We also bumped in a similar limitation, accessing the list of fields for errors like Additional properties are not allowed ('field1', 'field2' were unexpected).

As said earlier, with a minimum of indication on the expected strategy we can submit a PR :)

Thanks !

leplatrem avatar Apr 22 '16 08:04 leplatrem

+1, for now we show directly the error message via our API, but we'll need soon to able to translate and to give more human compliant error messages ;-)

ludovic-gasc avatar Apr 22 '16 10:04 ludovic-gasc

@leplatrem the expected strategy I think is still to just introduce extra_info, which is the most reasonable thing I think unfortunately of the current options.

Julian avatar Apr 22 '16 11:04 Julian

@fre-sch the sense of entitlement in your comment is unwelcome.

Your comment literally quotes the alternative I've suggested, continually, and consistently, over the same period of time.

Would you like to contribute a patch implementing it?

Julian avatar Oct 25 '16 11:10 Julian

@Julian I've removed my latest comments and apologize for the tone. I'm working on a few RESTful HTTP APIs for my employer, and verbose error reporting is one of the goals of these APIs. So I've come across this issue once again and I guess my repeated need for the afore mentioned workaround just made me irritable.

fre-sch avatar Oct 25 '16 19:10 fre-sch

@fre-sch appreciated. I certainly would like to see this resolved personally as well.

Julian avatar Oct 25 '16 23:10 Julian

@Julian Regarding .extra_info: considering ValidationError already has a lot of attributes, I would feel hesitant to add another new attribute.

Have you considered .context? I see it's currently used to collect nested validation errors, but to me the name .context also implies "value depends on validator context".

Perhaps specialising ValidationError into <validator>ValidationError would be an alternative?

That way you could create a subclass RequiredValidationError with a missing_properties attribute. This would also work well for additionalProperties: AdditionalPropertiesValidationError having a additional_properties attribute? Another example I see: ValidationError.cause, it's only used for format validation errors, this would also extend into FormatValidationError with a .cause attribute.

Another benefit of specialising ValidationError would be a "cleaner", "simpler" ValidationError class.

How do you feel about that?

fre-sch avatar Oct 26 '16 05:10 fre-sch

Sure, that sounds reasonable too -- it's trading cognitive burden on understanding the ValidationError class for needing to understand now an inheritance hierarchy, but that seems OK.

I don't like the fact that we used .context for as limited of a use case as we did, but deprecating ValidationError.context entirely and introducing FormatValidationError.nested_validation_errors instead sounds probably reasonable too.

If you're interested, I'd gladly review a PR implementing thse changes .

Julian avatar Oct 29 '16 16:10 Julian

You could add the "required" field into your schema; and once you validate; you know which properties are missing. Worth a try.

pampas93 avatar Jul 24 '17 15:07 pampas93

I too am interested in machine-readable properties describing the error. Seems like an ID (REQUIRED_ITEM_MISSING, TYPE_MISMATCH, etc.) and fields like "expected" and "actual" are appropriate:

  • required item missing: expected = path in the schema, actual = None
  • additional item: expected = None, actual = whatever item is found
  • type mismatch: expected = expected type, actual = type of whatever item is found

jason-s avatar Jul 25 '17 17:07 jason-s

I'm glad I found this issue. This has been my question for the last couple of weeks. I'd been working on a solution to this when I came across this thread.

My 2 cents: ValidationError is far too general. I like the idea that @fre-sch had about specializing the errors. The exceptions thrown should be specific to the case they refer to. For instance, I should be able to catch an AdditionalPropertiesError when I specifically looking for validation errors regarding additional properties (say I'm tallying the validation failures when an additional property is present). And since I'm catching an AdditionalProperitesError, the documentation for that error will show what attributes contain the information I'm after. I'm of the mind that errors can have arbitrary bits attached to them as long as they are documented. Other than the attributes that Exception defines, I don't feel like these specialized exceptions should have an attribute convention. The attributes should be specific to the exception and documented (like they are with the built-in exceptions). I envision each validation function having its own exception.

groutr avatar Jul 26 '17 02:07 groutr

Something like the approach suggested by @groutr sounds great.

It would also be great if the error would contain the affected "span" in the original JSON string (so that a GUI could highlight the location of the error), although I'm not sure if this is possible since jsonschema seems to work only on already parsed data (dict, not json string).

dbrgn avatar May 12 '18 08:05 dbrgn