cattrs icon indicating copy to clipboard operation
cattrs copied to clipboard

NamedTuple to dictionary serialization

Open karlicoss opened this issue 2 years ago • 7 comments

  • cattrs version: 23.1.2
  • Python version: 3.10
  • Operating System: Ubuntu 22.04

Description

For my project, cachew, I've been trying to switch from using my custom serialization cold to some existing library. After some evaluation, cattrs seems like the most promising since it works so well with python typing annotations. One thing that stops me is that seems like it doesn't support NamedTuple? I feel like it should be treated just like a @dataclass, but seems like cattrs just treats them as 'unknown' types and leaves intact.

What I Did

This is kinda the minimal snippet that shows the issue:

from dataclasses import dataclass
from typing import NamedTuple

import cattrs

@dataclass
class D:
    value: int


class N(NamedTuple):
    value: int


und = cattrs.unstructure(D(value=123), D)
unn = cattrs.unstructure(N(value=123), N)

print(und, type(und))
print(unn, type(unn))

The result is

{'value': 123} <class 'dict'>
N(value=123) <class '__main__.N'>

So the dataclass is serialized just as expected, but the namedtuple is just left intact.

I searched in cattrs source code, and it seems like NamedTuple isn't mentioned at all -- so perhaps it was just forgotten rather than deliberately left out? I know I can implement a custom adapter, but feels like it makes sense to have them treated as dataclasses by default?

karlicoss avatar Sep 11 '23 01:09 karlicoss

Howdy,

I think it's a combination of nobody having asked for it and supporting libraries probably being able to handle named tuples natively.

Most of the time, output from cattrs is directly piped to another library, like orjson or msgpack, and those probably support named tuples already (since they are tuples)?

Although now that I think about it, the current approach is inadequate because it will not unstructure deeply. So yeah, agreed on it needing to be fixed. A little bit unsure on how to unstructure named tuples though, my first instinct would be to unstructure to lists (like ordinary tuples), not dicts.

Tinche avatar Sep 11 '23 20:09 Tinche

Ah thanks, good to know! IMO they are more like dataclass rather than plain tuples since they are typed properly and have field names. But perhaps this could be configurable if people prefer 'legacy' behaviour for some reason.

karlicoss avatar Sep 17 '23 10:09 karlicoss

Hi,

I'd like to contribute to the implementation of this.

A "standard library" way to serialize a NamedTuple would be something like the below, which I adapted from [1] and [2]. (typing.NamedTuple is implemented as a function and so isinstance(a_named_tuple, typing.NamedTuple) throws a TypeError.)

def is_namedtuple(x):
    type_ = type(x)
    bases = type_.__bases__
    if len(bases) != 1 or bases[0] != tuple:
        return False
    fields = getattr(type_, '_fields', None)
    if not isinstance(fields, tuple):
        return False
    return all(isinstance(f, str) for f in fields)

def unstructure(obj):
    if isinstance(obj, dict):
        return {key: unstructure(value) for key, value in obj.items()}
    elif isinstance(obj, list):
        return list(map(unstructure, obj))
    elif is_namedtuple(obj):
        return {key: unstructure(value) for key, value in obj._asdict().items()}
    elif isinstance(obj, tuple):
        return tuple(map(unstructure, obj))
    else:
        return obj

I would suggest implementing the following way:

  1. Add is_namedtuple to src/_compat.py.
  2. Add gen_unstructure_namedtuple to cattrs.Converter.
  3. Add self.register_unstructure_hook_factory(is_namedtuple, self.gen_unstructure_namedtuple) to cattrs.Converter._init_.
  4. Add tests and documentation.

I would very happily put the work detailed in the list in a PR. Do let me know if you'd approach this any other way :)

References:

  1. https://stackoverflow.com/questions/2166818/how-to-check-if-an-object-is-an-instance-of-a-namedtuple
  2. https://stackoverflow.com/questions/33181170/how-to-convert-a-nested-namedtuple-to-a-dict?noredirect=1&lq=1

gorkaerana avatar Nov 29 '23 14:11 gorkaerana

Interesting!

A couple of points.

I think we will want to unstructure namedtuples into tuples for backwards compatibility. I see your code unstructures them into dictionaries. This is also a useful feature, but it should be a strategy instead. Then anyone can apply this strategy to any converter and get the behavior.

My first instinct would be to try to reuse the gen.make_dict_unstructure_fn functionality. It already supports attrs classes and dataclasses, and I think it can be adapted for NamedTuples easily. This would require a slight refactor, probably splitting part of that function into a private one that takes a list[Attribute]. Using this shares code and will be fast.

Then, write an adapter that can transform a NamedTuple class (not instance) into a list of attributes, like this.

Then, in the strategy, override the hooks to generate the proper unstructuring functions.

That's how I would go about it.

Happy to provide guidance if you decide to try!

Tinche avatar Nov 30 '23 17:11 Tinche

Sounds good! How about I give it a try in a PR and we take it from there?

gorkaerana avatar Jan 23 '24 10:01 gorkaerana

Go for it!

Tinche avatar Jan 23 '24 22:01 Tinche

Howdy, I've just merged https://github.com/python-attrs/cattrs/pull/491. This adds the namedtuples-as-tuples functionality to cattrs. I'll leave this issue open while other folks work on a strategy for namedtuples-as-dataclasses.

Tinche avatar Jan 28 '24 22:01 Tinche

this is fantastic!

do you plan a new release to include namedtuples?

gnzsnz avatar Jun 23 '24 08:06 gnzsnz

Yep, finishing it up now. The only thing left is to rework the readme

Tinche avatar Jun 23 '24 10:06 Tinche

Yep, finishing it up now. The only thing left is to rework the readme

I know that I'm a pain in the ... but I'm looking forward to have namedtuples in cattrs.

I'm happy to contribute with any changes needed, just let me know.

gnzsnz avatar Jul 23 '24 18:07 gnzsnz