cattrs
structure: how to report path to invalid data element
- cattrs version: 0.9.0
- Python version: 3.6.ř
- Operating System: Debian 9
Description
I want cattrs to load complex nested data and, if some validation or conversion fails, to provide reasonable context about which part of the data caused the failure.
What I Did
I have attrs-based classes: Config with attributes source, fetch and publish, each holding a value of a specific (attrs-based) class: Source, Fetch and Publish.
If some data element is wrong (e.g. a number is expected and the string "5a" is provided), structuring fails, raising ValueError("could not convert string to float: '5a'",)
However, the error does not include any contextual information about where in my nested input the problem was read from.
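To make the problem concrete, here is a minimal stdlib-only sketch. The functions structure_fetch and structure_config and the field names timeout and retries are hypothetical stand-ins for the attrs classes described above, not cattrs code.

```python
# Hypothetical stand-ins for structuring nested attrs classes (not cattrs itself).
def structure_fetch(data):
    # Convert the raw values of the nested "fetch" section.
    return {"timeout": float(data["timeout"]), "retries": int(data["retries"])}


def structure_config(data):
    # Structure the top-level config by delegating to the nested converter.
    return {"fetch": structure_fetch(data["fetch"])}


raw = {"fetch": {"timeout": "5a", "retries": "3"}}

try:
    structure_config(raw)
except ValueError as exc:
    # The message names the bad value but not where it sits in the input.
    message = str(exc)
    print(message)
```

The message mentions '5a' but neither "fetch" nor "timeout", so with larger inputs there is nothing to search for except the value itself.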
It would be nice to get some sort of path in the exception, which I could use. marshmallow and trafaret are examples of similar solutions providing contextual information.
Here is a possibly quite crazy idea for reporting an error together with the path within the input data that led to the failure.
Requirements:
- initially focus only on structure and assume input data in form of dict
- allow reporting path to input data element causing raised conversion error
- keep required changes to code (structure_hook functions) to bare minimum
- tolerate structure_hook implementations that do not implement the new approach (possibly at the cost of losing part of the path information)
- run fast, try to avoid any extra operations during happy scenario
Concept:
- path has form of list of __getitem__ arguments, e.g. ["oak", 1] for data["oak"][1]
- no need to cover non-iterable data types
- focus on iterables. Store current position of iteration in local variable with agreed name cattrs_i. This is the only required change to structure_hook function implementation.
- all path detection to be done within cattr.structure function
  - detection is done by traversing traceback stack, inspecting local variables and collecting all values of cattrs_i variables in resulting path list.
  - store the path in .path exception property and re-raise the caught exception
Here is code to demonstrate how to detect the path from an exception raised in a deeply nested call. If you store the code in test_path_detection.py, it can be run with pytest (Python 3.6+ expected).
```python
def int_structure_hook(val, dtype):
    return int(val)


def list_structure_hook(lst, dtype):
    # Explicit loop: comprehension variables are not reliably visible
    # in frame locals on Python 3.12+.
    result = []
    for cattrs_i, itm in enumerate(lst):
        result.append(int_structure_hook(itm, int))
    return result


def dict_structure_hook(dct, dtype):
    result = []
    for cattrs_i, val in dct.items():
        result.append(list_structure_hook(val, list))
    return result


def structure(val):
    try:
        return dict_structure_hook(val, dict)
    except ValueError as exc:
        path = []
        tb = exc.__traceback__
        while tb:
            f_locals = tb.tb_frame.f_locals
            # Membership test, so that index 0 is not dropped as falsy.
            if "cattrs_i" in f_locals:
                path.append(f_locals["cattrs_i"])
            tb = tb.tb_next
        exc.path = path
        raise


def test_it():
    try:
        res = structure({"oak": [1, "0aa", 3], "birch": [9, 2, 0]})
        print(f"Happy result is: {res}")
    except ValueError as exc:
        print(f"Path {exc.path}: has problem: {exc}")
```
When called:
$ pytest test_path_detection.py -sv
the printed output related to reported path is
Path ['oak', 1]: has problem: invalid literal for int() with base 10: '0aa'
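Because the reported path is just a list of __getitem__ arguments, it can be replayed against the original input to fetch the offending value. A small sketch, where the helper name lookup is made up for illustration:

```python
from functools import reduce
from operator import getitem


def lookup(data, path):
    """Follow a sequence of __getitem__ arguments into nested data."""
    return reduce(getitem, path, data)


data = {"oak": [1, "0aa", 3], "birch": [9, 2, 0]}
offending = lookup(data, ["oak", 1])
print(offending)  # 0aa
```

An empty path returns the input itself, which matches the case where no hook on the call stack recorded its position.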
What do you think? The results are not perfect, but in many cases this helps navigate close to the source of the problem. It would definitely require (small) modifications to existing converters.
This is something I would definitely like to support, since getting an error somewhere deep can be very annoying indeed. Need to think about it.
@Tinche take your time, it is not an easy problem.
Here is an alternative method: pass the path via an explicit argument to the conversion function:
```python
"""alternative passing path context via argument `path`

Converters have signature: func(val, dtype, *path)
where `path` is the path to the current element (list of values)

When calling, one uses original `path` value with * and adds new selector to the end

    fun(val, dtype, *path, index)

what results in extended `path` value within the deeper function.
"""


def int_structure_hook(val, dtype, *path):
    return int(val)


def list_structure_hook(lst, dtype, *path):
    return [int_structure_hook(itm, int, *path, i) for i, itm in enumerate(lst)]


def dict_structure_hook(dct, dtype, *path):
    return [list_structure_hook(val, list, *path, key) for key, val in dct.items()]


def structure(val, dtype):
    try:
        return dict_structure_hook(val, dict)
    except ValueError as exc:
        path = []
        tb = exc.__traceback__
        while tb:
            # The deepest frame holding a non-empty `path` wins.
            deeper_path = tb.tb_frame.f_locals.get("path")
            if deeper_path:
                path = deeper_path
            tb = tb.tb_next
        exc.path = path
        raise


def test_it():
    try:
        res = structure({"oak": [1, "0aa", 3], "birch": [9, 2, 0]}, dict)
        print(f"Happy result is: {res}")
    except ValueError as exc:
        print(f"Path {exc.path}: has problem: {exc}")
        assert exc.args[0] == "invalid literal for int() with base 10: '0aa'"
        assert isinstance(exc, ValueError)
        assert exc.path == ("oak", 1)
```
To avoid confusion with intermediate functions that also use a path argument, the __traceback__ traversal could check that the given locals belong to a function registered with the converter.
Incidentally, one of the hardest things to debug is when you have a NoneType that can't be converted into whatever the expected type is. Without a path, currently there's no way to even guess at which of the many nulls in your input it's failing on.
We could copy a few ideas from jsonschema, and potentially yield errors iteratively? Or maybe that's not really in scope, since cattrs needs to return the new result.
But have a look at their ValidationError, there are a few fields there that we could potentially use.
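To illustrate what yielding errors iteratively could look like, here is a stdlib-only sketch in the spirit of jsonschema's iter_errors. The function below is hypothetical, only validates (it does not build a structured result, which is the scoping concern raised above), and assumes every leaf should convert to int.

```python
def iter_errors(data, path=()):
    """Yield (path, exception) for every leaf that cannot be converted to int."""
    if isinstance(data, dict):
        for key, val in data.items():
            yield from iter_errors(val, path + (key,))
    elif isinstance(data, list):
        for i, val in enumerate(data):
            yield from iter_errors(val, path + (i,))
    else:
        try:
            int(data)
        except (TypeError, ValueError) as exc:
            yield path, exc


data = {"oak": [1, "0aa", 3], "birch": [None, 2, 0]}
errors = [(path, type(exc).__name__) for path, exc in iter_errors(data)]
print(errors)
# [(('oak', 1), 'ValueError'), (('birch', 0), 'TypeError')]
```

Each error carries its own path, so a stray None is reported with its location instead of stopping at the first failure.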
Hey, sorry for the kinda necro-post, but has there been any progress on this? It would really be very handy to have this feature :)
This is probably the next big feature I work on :)
Nice to hear it! Sorry for the question, but do you have any ETA for it?
So there's https://catt.rs/en/stable/validation.html#transforming-exceptions-into-error-messages in the last release, 23.1.x.
I'm going to close this as complete, let's open new tickets for any desired improvements!