trafaret
trafaret copied to clipboard
Error classification
I'd like to start a discussion about everything that comes with errors.
At the moment we have only one type of error - DataError
, which is when something is invalid.
The problem I see here is when you want to give any extra details about what gone wrong, you can't.
In my application I've come up across this issue and the only way I could deal with it is something like this:
from enum import Enum
class ErrorCode(Enum):
required = 'required'
too_big = 'too_big'
too_short = 'too_short'
invalid = 'invalid'
def make_code(text: str) -> ErrorCode:
if text == 'is required':
return ErrorCode.required
if any(phrase in text for phrase in ('long', 'is greater', 'should be less', 'too many')):
return ErrorCode.too_big
if any(phrase in text for phrase in ('short', 'is less', 'should be greater')):
return ErrorCode.too_short
return ErrorCode.invalid
error: trafaret.DataError
# some logic to get an error
error_code = make_code(error.error)
This is ugly and absolutely not reliable. Trafarets tend to have different messages describing such cases, and next time I'm gonna add a trafaret I'm not using just yet, I can't be sure it works like expected. Not speaking about custom extensions...
In my opinion there should be internal error classification and all the trafarets we have so far should stick to it. And honestly, I think we need to drop error messages too because really they're out of scope of this library, it is simply too much specific context that generally lives on app level. Such system also has to be extendable for specific cases when trafaret's basics aren't enough. I mean you should be able to easily make your own error types and use them inside library ecosystem, not around it.
First approach I'd like to consider is introduce some classes like generic InvalidError
along with more specific RequiredError
, TooBigError
and so on, which all subclass DataError
. Also it would be handy to add some shortcuts for Trafaret
class as well. And maybe deal with OnError
trafaret in some way to go along with the changes?
I'm willing to make a pull request with those improvements, but of course we have to come to terms first about how it should be done. What do you think?
I think it is good to have DataError subclasses, due it is how exceptions in python designed to be. And code
property would be nice to have too. Like RequiredError().code == 'required'
.
Different error classes are for catching them. Do you have a use case for it? Maybe just comprehensive error reporting (error text and auxiliary attributes) would be enough?
Codes and exception hierarchy is good thing. Why not to have it? It will not hurt experience, but can provide more use cases of library. You can safely ignore hierarchy and codes.
Up to you
@asvetlov typical use case would be providing structured information (possibly in JSON) to frontend application reporting what validation problems were encountered.
Existing solution is somehow usable for English speaking applications, but it is rather limited.
Existing exception is good in that it provides nested information about the failure. What is missing is reliable distinction of error types and ability to address multiple errors at once (the later could be probably omited).
The concept of solution could be:
- raise nested exception as now
- each exception instance on any level would provide:
- error type (either by class name, error code or something similar)
- readable text, describing the problem
- of course expression addressing the part of data structure causing the problem (as is today)
Things are best developed under some real scenario, so I would think of taking any of existing exception and expressing them in form of JSON data structure (think of the the fact, keys can be only strings).
Taking an example from README.rst:
{'b': DataError({1: DataError(value is not a string)})}
it could be expressed e.g. as:
{
"address": [
{
"type": "key",
"value": "b"
},
{
"type": "index",
"value": 1
}
],
"message": "value is not a string",
"error_type": "InvalidType"
}
This is just very first proposal, there might be better ways for addressing the element.
Exact JSON schema is up to you, it will not be a part of trafaret. So I vote for exceptions codes and hierarchy. And probably we will need to modify OnError to be able to raise particular exception and code in addition to message.
I agree on what @asvetlov is talking about.
Rather than doing something like this:
import trafaret as t
try:
t.Int(gt=5, lt=10).check(some_var)
except t.TooShortError:
message = 'not big enough'
except t.TooBigError:
message = 'way too big'
except t.InvalidTypeError:
message = 'expecting an integer'
except t.DataError: # all other errors I don't care about
message = 'invalid value'
I would prefer this way:
import trafaret as t
errors = {
'too_short': 'not big enough',
'too_big': 'way too big'
}
try:
t.Int(gt=5, lt=10).check(some_var)
except t.DataError as exc:
message = errors.get(exc.code, 'invalid value')
It is much less hassle to deal with codes rather than specific exceptions. I can hardly imagine someone needs those, can you? Making subclasses for different types of errors assumes you need to handle those errors in different ways. But in reality you don't. All this lib is about is checking whether you can go with the data provided or not, at least in my point of view.
So if it's not the case, I'm going only for codes.
Quoting import this
There should be one-- and preferably only one --obvious way to do it.
I would strongly prefer the first way (using Exception classes) for following reasons:
- it seems much more readable
- it is common patter in Python to handle exceptions this way.
- error codes are linear, exceptions allow inheritance, so one can implement general case with specific specialization exceptions and catch it on arbitrary level of abstraction
- If you need just custom translation of error messages, you may use dictionary such as:
{t.TooShortError: "not big enough",
t.TooBigError: "way too big",
t.InvalidTyeError: "expecting an integer",
t.DataError: "invalid value"}
Alternatively there is also functools.singledispatch
which allows object-type based handler. But this is rather implementation detail.
To summarize: I would prefer class based solution as code-based one seems to introduce unnecessary complexity without significant benefit.
@vlcinsky if you want to play with words -- error codes could be the only and obvious way :)
- Validation exceptions have no native hierarchy.
TooBigError
is for number or for list/tuple? Should tuple use a different exception type than list? - singledispatch doesn't work well: for nested types like
Dict
the exception instance should be visited down to deepest nested error. - What is the reason for catching articular
TooBigError
? Most likely you need to catch all validation exceptions by the singleexcept
. - Error codes still are required for error serialization (JSON and others), classes are not serializable in most formats.
- What are specific attributes for concrete exceptions? How and when user has access it? A stairway with 50
try/except
clauses? Really?
@asvetlov I like your call for real use case as this could be the tool to evaluate, which approach works better.
Less important notes:
- Validation exception hierarchy: I do not necessarily ask for one. But if someone defines custom data types and starts raising custom exceptions, there could be some. The hierarchy could be useful here to keep different exceptions (from different modules) apart.
- singledispatch: this is also not a requirement. But even diving deep into nested exception, at any moment trafaret related exception should be handled somehow (serializad, printed...), it could serve.
- Catching exceptions by single
except
: no problem if we derive all exceptions fromtrafaret.DataError
. But then, further processing of (possibly nested) exception is challenging regardless of what approach (error codes or exception classes) is taken. - JSON serialization: why not using full class name (incl. modules). It is a string and this namespace is well organized in Python.
- here I got a bit lost. You probably have some use case in mind, which I am not aware of.
Why I prefer Exceptions over error codes
I did some refactoring of error handling in my web application and got a bit unhappy with error codes finding subclassing Exception better:
- as a class, it has natural name and it has natural place to put docstring to (and read about it when coding). Documenting a dictionary is much more difficult.
- as a class, one could customize attributes and even methods as needed (if needed).
- managing error codes could become a problem once one admits, some library user will subclass it and extend it. Codes could easily clash
Important: Searching for use cases
There are always multiple ways to do it, clearly defined use cases could help detecting the most practical approach.
I have already mentioned the use case with serializing the exception in such a way, JavaScript could use it for reporting the problem to user.
Any more cases?
And there are other questions about the exceptions:
- is there a need for exception hierarchy?
- are the exceptions to be nested? Probably yes. But are we expecting single path to single deep exception or we anticipate even multiple exceptions on different places of the data structure at once?
Exceptions can be used to write trafarets. Anyway I don't see anything harmful in exceptions hierarchy.
please, don't make PRs for master branch, due I don't want to merge it with version2.
OK, I'll take some time to think through everything posted here and will make a PR towards version2 branch hopefully soon.