jsonschema
jsonschema copied to clipboard
required, additionalProperties, propertyNames, dependentRequired and any other validators that reference property *names* need a way to indicate the erroring names in ValidationErrors
Trying to figure out if there is a way to retrieve the name of a missing required property from an error (without hacking it out of error.message).
For example:
from jsonschema import Draft4Validator as Validator
schema = {
'type': 'object',
'properties': {
'a': {
'type': 'string'
},
'b': {
'type': 'string',
'pattern': '^hello$'
}
},
'required': ['a', 'b']
}
errors = Validator(schema).iter_errors({'b': 'world'})
for error in errors:
print error.path, error.schema_path, error.message
# deque([]) deque(['required']) 'a' is a required property
# deque(['b']) deque(['properties', 'b', 'pattern']) 'world' does not match '^hello$'
In error.path and error.schema_path there is no mention of 'a'. Is there a way to determine that the property missing was 'a' without using error.message?
Hey.
There isn't unfortunately. There should be. It just doesn't exist because I haven't thought of a nice way to expose it yet.
There's an informal agreement that .path and .schema_path should always refer to things that exist, so there is not the right place.
I've wavered back and forth on adding an extra_info attribute to ValidationError which could store arbitrary extra info for each failure as decided by a validator, and there would be a reasonable place. Say, required would set error.extra_info to something like {"attributes" : {"all", "the", "missing", "ones}}.
I'm willing to live with something like that I think.
Doesn't the property technically exist for .schema_path? As an error occurs for each missing property, it seems to me like that would be an appropriate place. So it would be schema_path == ['required', 'a'].
Adding the information elsewhere feels like it would make calling code more complicated (more code branches to deal with extra_info).
For context, I am building a GUI dynamically from the schemas and then validating the populated values as an object. From the errors I create an error tree that maps back to the GUI component structure and shows error icons (with tooltip) next to the correct field. For required properties I am choosing to display the error next to the missing field rather than at the group level, so this is why I need the property reference.
I see. Yeah, putting it in schema_path would be reasonable.
(as ["required", 0] though – assuming what you had was just a typo, but just to clarify, the guarantee is that someone can always pull out the element that is being referred to in the path attributes).
Ah, yes - I forgot it was an array not an object there.
So then the logic in my system would be to retrieve the associated schema from .schema and use .schema_path to find out that the property name was 'a'? If so, that sounds great.
I agree this would be useful info to have. I'm not so sure about having it in schema_path though, as currently that always points to the json schema keyword, and I like that consistency. I sorta like the extra_info idea, as there are some other validators that could use it, for example, it would be nice to know with the additionalProperties validator which extra properties caused it to fail.
Another idea would be that validator_value becomes the single value being evaluated (validator_value == 'a'). Note sure if that breaks conventions though.
Happy with any solution that makes it easy to get at the appropriate value though.
On that note I have noticed that .schema refers to the original schema whilst .schema_path may be a subschema path so my previous idea of using .schema_path to index into .schema wouldn't work.
Hmm, I think .schema_path should always be traversable in the .schema of any given error. In the instance of a subschema, the .schema of the error should be the subschema. Is that not the case?
I realise I actually meant the other way round, but, yeah, it doesn't seem to work. Have opened a new issue to avoid diverting this one (#120).
I've came to this issue too, it would be nice to know what property caused an error, not only for the required validator.
Maybe a good approach is to return the JSON Path of the property: http://goessner.net/articles/JsonPath/
Bumped into this too. What's the way to go here? I can send a PR with what we decide to do.
I haven't looked at this carefully recently certainly, but I still don't really think there's a good way to do this without just doing ValidationError.extra_info
Voting +1 for this issue.
Having this issue as well.
I like the idea of adding the property name to error.schema_path.
Additionally I would suggest something like this:
def required_draft4(validator, required, instance, schema):
if not validator.is_type(instance, "object"):
return
for property in required:
if property not in instance:
- yield ValidationError("%r is a required property" % property)
+ yield ValidationError(
+ "%r is a required property" % property,
+ path=[property],
+ )
For reference, I currently use the following workaround:
# Custom validators
def _required(validator, required, instance, schema):
'''Validate 'required' properties.'''
if not validator.is_type(instance, 'object'):
return
for index, requirement in enumerate(required):
if requirement not in instance:
error = jsonschema.ValidationError(
'{0!r} is a required property'.format(requirement)
)
error.schema_path.append(index)
yield error
# Construct validator as extension of Json Schema Draft 4.
Validator = jsonschema.validators.extend(
validator=jsonschema.validators.Draft4Validator,
validators={
'required': _required
}
)
:+1: I too would like to pile on this issue. For the time being I will be looking into using the workaround above (thanks for providing that @thesociable)
+1 (thanks for the workaround @thesociable).
It would also be nice to have the parent paths to the "required" attribute.
The parents of the attribute is just error.absolute_path.
We also bumped in a similar limitation, accessing the list of fields for errors like Additional properties are not allowed ('field1', 'field2' were unexpected).
As said earlier, with a minimum of indication on the expected strategy we can submit a PR :)
Thanks !
+1, for now we show directly the error message via our API, but we'll need soon to able to translate and to give more human compliant error messages ;-)
@leplatrem the expected strategy I think is still to just introduce extra_info, which is the most reasonable thing I think unfortunately of the current options.
@fre-sch the sense of entitlement in your comment is unwelcome.
Your comment literally quotes the alternative I've suggested, continually, and consistently, over the same period of time.
Would you like to contribute a patch implementing it?
@Julian I've removed my latest comments and apologize for the tone. I'm working on a few RESTful HTTP APIs for my employer, and verbose error reporting is one of the goals of these APIs. So I've come across this issue once again and I guess my repeated need for the afore mentioned workaround just made me irritable.
@fre-sch appreciated. I certainly would like to see this resolved personally as well.
@Julian Regarding .extra_info: considering ValidationError already has a lot of attributes, I would feel hesitant to add another new attribute.
Have you considered .context? I see it's currently used to collect nested validation errors, but to me the name .context also implies "value depends on validator context".
Perhaps specialising ValidationError into <validator>ValidationError would be an alternative?
That way you could create a subclass RequiredValidationError with a missing_properties attribute. This would also work well for additionalProperties: AdditionalPropertiesValidationError having a additional_properties attribute? Another example I see: ValidationError.cause, it's only used for format validation errors, this would also extend into FormatValidationError with a .cause attribute.
Another benefit of specialising ValidationError would be a "cleaner", "simpler" ValidationError class.
How do you feel about that?
Sure, that sounds reasonable too -- it's trading cognitive burden on understanding the ValidationError class for needing to understand now an inheritance hierarchy, but that seems OK.
I don't like the fact that we used .context for as limited of a use case as we did, but deprecating ValidationError.context entirely and introducing FormatValidationError.nested_validation_errors instead sounds probably reasonable too.
If you're interested, I'd gladly review a PR implementing thse changes .
You could add the "required" field into your schema; and once you validate; you know which properties are missing. Worth a try.
I too am interested in machine-readable properties describing the error. Seems like an ID (REQUIRED_ITEM_MISSING, TYPE_MISMATCH, etc.) and fields like "expected" and "actual" are appropriate:
- required item missing: expected = path in the schema, actual = None
- additional item: expected = None, actual = whatever item is found
- type mismatch: expected = expected type, actual = type of whatever item is found
I'm glad I found this issue. This has been my question for the last couple of weeks. I'd been working on a solution to this when I came across this thread.
My 2 cents: ValidationError is far too general. I like the idea that @fre-sch had about specializing the errors. The exceptions thrown should be specific to the case they refer to. For instance, I should be able to catch an AdditionalPropertiesError when I specifically looking for validation errors regarding additional properties (say I'm tallying the validation failures when an additional property is present). And since I'm catching an AdditionalProperitesError, the documentation for that error will show what attributes contain the information I'm after. I'm of the mind that errors can have arbitrary bits attached to them as long as they are documented. Other than the attributes that Exception defines, I don't feel like these specialized exceptions should have an attribute convention. The attributes should be specific to the exception and documented (like they are with the built-in exceptions). I envision each validation function having its own exception.
Something like the approach suggested by @groutr sounds great.
It would also be great if the error would contain the affected "span" in the original JSON string (so that a GUI could highlight the location of the error), although I'm not sure if this is possible since jsonschema seems to work only on already parsed data (dict, not json string).