msgspec icon indicating copy to clipboard operation
msgspec copied to clipboard

Should we make use of attrs' `field(converter=...)` when decoding attrs types?

Open jcrist opened this issue 2 years ago • 5 comments
trafficstars

Description

Split out from https://github.com/jcrist/msgspec/issues/491#issuecomment-1652031394

From @jcrist:

We currently don't do anything with attrs converters. My reasoning was that IMO converters are for handling flexibility on the python side (so users can call Example(1, "2") and have things "just work"). But when decoding they felt like the wrong place to handle this logic. You might want your python code to be able to construct an attrs object from a wider set of types than you'd want to support in a REST api - coupling the types supported in __init__ to the types supported in decode felt wrong. That said, I'm not an attrs user so I'm not 100% sure what's expected behavior here.

From @aedify-swi:

I can't speak for the wider attrs user community, but I would expect those converters to be called when using msgspec.convert. I for one would never use them for inputs from the python side, because with type hints and all I am pretty clear on what a class expects me to pass. Instead, I am primarily using it for converting non-standard inputs (str -> UUID7) from users (REST API users that is) or for converting remote API data, that needs to be converted into my schema or python types. There is one API we're calling to get tracking data, for example, that sends IDs as ints in their REST API, even though I know it is a String ID in their system. So my converter takes that int, converts it to str and left-pads it with zeros.

Should we be calling a converter if configured on a field when decoding into an attrs type? Or should we treat that as an attrs-specific mechanism (or something meant only for configuring how the __init__ works) and rely on #483 for customization decoding/validation/conversion?

jcrist avatar Jul 26 '23 16:07 jcrist

cc @Tinche for any thoughts on what a typical attrs/cattrs user might expect here.

import attrs

def maybe_hex(x):
    return hex(x) if isinstance(x, int) else str(x)

@attrs.define
class Example:
    x: int
    y: str = attrs.field(converter=maybe_hex)

# Should the converter be called here? Or should it error since `y` isn't a `str`?
msgspec.json.decode(b'{"x": 1, "y": 2}', type=Example)

Right now we don't do anything with attrs' converter field options when decoding into attrs types. Should we?

I'd been treating those as for providing more flexibility to callers from python (anything calling __init__), and instead strictly enforcing the type annotations when decoding. This means the above would error, since y is not a str, rather than calling the converter. This made sense to me at the time, since relying on the converter would couple the types supported by the attrs constructor to the types supported when decoding from JSON, when you might want more flexibility on the python side and more strictness on the JSON side.

In #483 we might add a general mechanism for customizing decoding/validation across any type, so there's not necessarily a reason to rely on the attrs functionality here unless it's what users expect.

What's your experience been with cattrs? Is calling the converter when decoding a good idea? Or is coupling __init__ and decode like that something that may result in issues down the line?

jcrist avatar Jul 26 '23 16:07 jcrist

Do you go through __init__ when decoding? You don't, right?

Tinche avatar Jul 26 '23 16:07 Tinche

Correct, __init__ is not called. We do call the __attrs_post_init__/__attrs_pre_init__ hooks though.

jcrist avatar Jul 26 '23 16:07 jcrist

cattrs goes through __init__ so converters run, and I think folks would expect them to (also probably safer). attrs validators too!

Tinche avatar Jul 27 '23 00:07 Tinche

Makes sense - thanks for the quick feedback @Tinche!

jcrist avatar Jul 29 '23 01:07 jcrist