jsonschema
Make json look prettier in errors
I'm not sure if this is a real issue or not, but something that bothered me is that validation errors use "%r" as a format string which causes your json to look pretty ugly in the error message with all strings looking like {u'key':u'value'} instead of {"key":"value"}.
In my JSON API, when I return an error it might be confusing for the developer using my API that there's suddenly a u in a string, because that's not valid JSON.
Of course I could build a custom error message from all the errors I get, but that would be a bit tedious. Would it be something to consider to print the schemas and validated data as json strings instead of python dicts in error messages?
Perhaps it's something worth discussing.
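One way to get JSON-looking output without changing the library is to rebuild the message from the error's structured fields. The sketch below is not part of jsonschema: `json_style_message` is a hypothetical helper, and it assumes the error object exposes `instance`, `validator` and `validator_value` attributes, as jsonschema's ValidationError does. A `SimpleNamespace` stands in for a real error so the snippet runs on its own.

```python
import json
from types import SimpleNamespace


def json_style_message(error):
    """Describe a validation failure using JSON literals instead of reprs."""
    instance = json.dumps(error.instance, sort_keys=True)
    expected = json.dumps(error.validator_value, sort_keys=True)
    return "{} failed the {} check against {}".format(
        instance, json.dumps(error.validator), expected
    )


# Stand-in for a jsonschema ValidationError (assumption: same attribute names).
error = SimpleNamespace(
    instance={"key": "value"}, validator="type", validator_value="number"
)
print(json_style_message(error))
# {"key": "value"} failed the "type" check against "number"
```

Because every value goes through `json.dumps`, the text is the same on Python 2 and Python 3 and never contains a `u` prefix.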
I'll follow up with a longer reply when I get a sec (sorry) -- the general answer I've had for this question has been "error messages are for developers, not end users" -- showing reprs is important for developers because reprs are what's generally informative to know what's going on.
That being said, parametrizing the function used here has come up before for other reasons (truncating long instances via reprlib.repr), so adding that feature might let you do this too if you felt like it.
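For reference, this is roughly what truncation with the standard library's `reprlib` looks like on Python 3. It only illustrates `reprlib` itself; how (or whether) jsonschema would expose such a hook is not settled in this thread, and the variable names are made up for the example.

```python
import reprlib

long_instance = list(range(100))

# The module-level helper uses the default limits (six list items).
print(reprlib.repr(long_instance))
# [0, 1, 2, 3, 4, 5, ...]

# A configured Repr instance gives control over the limits.
short = reprlib.Repr()
short.maxlist = 3
print(short.repr(long_instance))
# [0, 1, 2, ...]
```

A pluggable formatter of this shape could equally be swapped for `json.dumps` by anyone who prefers JSON-style output.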
Also FWIW it'd be less than tedious I think depending on exactly what you're after, but yes a slight bit of copy pasting from ValidationError.__str__ if you wanted the exact same format.
I can follow up with some more, definitely worth discussing.
"Also FWIW it'd be less than tedious I think depending on exactly what you're after, but yes a slight bit of copy pasting from ValidationError.__str__ if you wanted the exact same format."
The problem with this is that the validators in jsonschema._validators also use __repr__ in their error messages, if I recall correctly. So sadly enough it's a bit less trivial.
For now I hacked in some regex that removes the u, which is of course a bit of a fragile hack.
I'm curious about your idea to parameterize stuff and would love to hear more!
+1! I'm brand new as a user of jsonschema; I used it yesterday to set up a validation for some data I'd refactored out of the HTML in a legacy static website I'm maintaining. The package validates correct data beautifully, so many thanks for that! But the errors that come out when the data do not conform to the schema will be intimidating to the often unsophisticated users who will be modifying the data in the future. I appreciate that the goal of this project may not be to serve as a lightweight data validation tool for end-users, but it's very close to achieving that goal as is.
On a similar note, one thing that annoyed me a bit was that the validation error didn't tell you exactly where in the JSON doc the error occurred. I wrote a small utility function which fixes that (see https://github.com/ccpgames/jsonschema-errorprinter). This is how the example error from the readme file would look:
>>> print check_json({"name" : "Eggs", "price" : "Invalid"}, schema)
Schema check failed for '?'
Error in line 2:
1: {
2: >>> "price": "Invalid",
3: "name": "Eggs"
4: }
'Invalid' is not of type 'number'
Failed validating 'type' in schema['properties']['price']:
{'type': 'number'}
On instance['price']:
'Invalid'
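A lighter-weight way to say where an error occurred, short of the line-number output above, is to render the error's path. This is only a sketch: `format_path` is a hypothetical helper, and it assumes the path is a sequence of keys and indices like the deque that jsonschema's ValidationError exposes as `path`.

```python
import json
from collections import deque


def format_path(path, root="instance"):
    """Render a sequence of keys and indices as e.g. instance["a"][0]."""
    return root + "".join("[{}]".format(json.dumps(part)) for part in path)


print(format_path(deque(["items", 0, "price"])))
# instance["items"][0]["price"]
print(format_path(deque()))
# instance
```

Using `json.dumps` for each part keeps string keys double-quoted regardless of Python version.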
+1 Actually, considering that the output will be different depending on the Python version used (u'value' in Python 2 vs. 'value' in Python 3), it is not just about "prettier" output, but about portability and consistency. So I'm wondering whether this should be classified as a bug, not an enhancement?
That's a feature, not a bug :) -- what makes you say it's a bug?
Tracebacks are not meant to be falsely consistent, they should tell the developer what's wrong so they can figure out how to fix it :)
:) What I was trying to suggest is that a developer who's a user of jsonschema (unlike one who develops jsonschema itself) is probably more interested in whether a given JSON object is of a certain JSON type (say, an array) than whether its internal string representation is of class "unicode" when jsonschema runs on Python 2.7 and "str" when jsonschema runs on Python 3.5. And, frankly, getting different error messages ("None is not of type u'array'" vs "None is not of type 'array'") depending on the Python version makes writing portable test cases, or web APIs that return error messages, more cumbersome without adding value...
@fzdarsky I'm all for making things more usable! Please don't take any of this as shooting down ideas :)
I think a developer who's using jsonschema should be just as interested in which Python type he/she has, since that affects the behavior they see. That's exactly what a repr is supposed to show: "here's as much information as you might need to be able to tell what kind of object you have".
As for differing error messages across versions: error messages are not part of jsonschema's public API, and tests that assert against them are "at your own risk" (usually it's just done because someone's unaware of the better way from what I've seen). The way to assert against validation errors is to use the programmatic API, and that does not differ across versions, i.e. self.assertEqual((error.validator, error.instance), ("type", None)), not an assertion against the message.
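Spelled out as a test case, that assertion could look like the sketch below. To keep it runnable without jsonschema installed, a `SimpleNamespace` stands in for the error (an assumption made for illustration); in real code the error would come from a validator, for example via its `iter_errors` method. `error_summary` is a hypothetical helper name.

```python
import unittest
from types import SimpleNamespace


def error_summary(error):
    """Reduce an error to fields that do not depend on repr formatting."""
    return (error.validator, error.instance)


class ErrorSummaryTest(unittest.TestCase):
    def test_none_is_not_an_array(self):
        # Stand-in error; the message text differs across Python versions,
        # which is exactly why the test does not look at it.
        error = SimpleNamespace(
            validator="type",
            instance=None,
            message="None is not of type u'array'",
        )
        self.assertEqual(error_summary(error), ("type", None))
```

The test passes whether the message says u'array' or 'array', since only the structured fields are compared.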
@mattig That's really cool, and it nicely solves what I was getting at in my above comment. Wish I'd known of it at the time. I no longer work at the job where my above comment was raised (time flies, haha), but they're open sourcey, so if I'm feeling ambitious in my free time I may add your functionality to their stack.
@Julian I am curious about your vision for the scope of jsonschema - it seems from this discussion that there are two modes of usage for this package. One is for developers, who may want to read exception dumps and/or catch them in flexible ways. Another is for "data librarians" for lack of a better term, who want pretty printing of validation failures so they can fix their data. jsonschema-errorprinter serves the second class of user by wrapping the exception with minimal Python (<100 lines of code). Is that the kind of thing you'd rather have in a separate project, or integrated into this project?
@ramanshah I forget if I've seen that library before, but from a quick 10 second read it's quite minimal in what it changes. I'm all for making the tracebacks as useful as possible to developers yeah, although there's certainly 2 schools of thought on how long tracebacks ought to be, and so I try to strike a bit of a balance. In theory though, if it's liked enough, that sort of thing could go here.
What I don't want to get into in this library is supporting messaging for end-users, because it's a can of worms. I can provide all the hooks necessary to create such a thing, and I believe all those hooks should exist already (would love to create any that don't, so if any are missing please file an issue), but I don't think that error messages emitted by jsonschema should be directly rendered for non-developer users at the moment -- jsonschema needs to pick one of the two, developers or end-users, and it currently picks the former.
Thanks for the clarification.
@Julian, thanks for the advice. I missed that error messages are not part of the public API, so your point is very valid.
For more examples here, see the schema+instance in #350 which produces a ridiculous output. The problem there is more about indentation, but let's be sure we cover it.