cattrs icon indicating copy to clipboard operation
cattrs copied to clipboard

KeyError when using configure_tagged_union with existing field

Open bvitaz-ck opened this issue 1 year ago • 3 comments

  • cattrs version: 23.2.3
  • Python version: 3.11.5
  • Operating System: Windows 10

Description

I am trying to create a union type which uses an existing key to determine the type to use.

Looking at an example object, the "type" key word is used by the application to set up the correct type of device, and I want to use it to enable additional configuration settings:

    {
        "device": {
            "type": "special",
            "hostname": "hostname",
            "port": 1234
        }
    }

When I use configure_tagged_union(..., tag_name="type" ...), the "type" key seems to be consumed and is no longer available for the Device data class. This is causing a key error during the structure process.

What do I need to do to keep the "type" data available and use it for the tagged union?

What I Did

Example code:

import attrs
from cattrs import Converter
from cattrs.strategies import configure_tagged_union


@attrs.define
class SpecialDevice:
    hostname: str
    port: int
    type: str


@attrs.define
class StandardDevice:
    hostname: str
    type: str


Device = SpecialDevice | StandardDevice


@attrs.define
class Settings:
    device: Device


def tag_generator(t) -> str:
    return {
        SpecialDevice: "special",
        StandardDevice: "standard"
    }[t]


c = Converter()
configure_tagged_union(Device, c, tag_name="type", tag_generator=tag_generator, default=StandardDevice)
settings = c.structure(
    {
        "device": {
            "type": "special",
            "hostname": "hostname",
            "port": 1234
        }
    }, Settings)
print(settings)

Traceback:

  + Exception Group Traceback (most recent call last):
  |   File "AppData\Roaming\JetBrains\PyCharm2023.3\scratches\scratch.py", line 36, in <module>
  |     settings = c.structure(
  |                ^^^^^^^^^^^^
  |   File ".virtualenvs\gantry-control-1-NYP4DbyD\Lib\site-packages\cattrs\converters.py", line 332, in structure
  |     return self._structure_func.dispatch(cl)(obj, cl)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "<cattrs generated structure __main__.Settings>", line 9, in structure_Settings
  | cattrs.errors.ClassValidationError: While structuring Settings (1 sub-exception)
  +-+---------------- 1 ----------------
    | Exception Group Traceback (most recent call last):
    |   File "<cattrs generated structure __main__.Settings>", line 5, in structure_Settings
    |   File ".virtualenvs\gantry-control-1-NYP4DbyD\Lib\site-packages\cattrs\converters.py", line 632, in _structure_union
    |     return handler(obj, union)
    |            ^^^^^^^^^^^^^^^^^^^
    |   File ".virtualenvs\gantry-control-1-NYP4DbyD\Lib\site-packages\cattrs\strategies\_unions.py", line 106, in structure_tagged_union
    |     return _tag_to_hook[val.pop(_tag_name)](val)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File ".virtualenvs\gantry-control-1-NYP4DbyD\Lib\site-packages\cattrs\strategies\_unions.py", line 57, in structure_union_member
    |     return _h(val, _cl)
    |            ^^^^^^^^^^^^
    |   File "<cattrs generated structure __main__.SpecialDevice>", line 19, in structure_SpecialDevice
    | cattrs.errors.ClassValidationError: While structuring SpecialDevice (1 sub-exception)
    | Structuring class Settings @ attribute device
    +-+---------------- 1 ----------------
      | Traceback (most recent call last):
      |   File "<cattrs generated structure __main__.SpecialDevice>", line 15, in structure_SpecialDevice
      | KeyError: 'type'
      | Structuring class SpecialDevice @ attribute type
      +------------------------------------

bvitaz-ck avatar Jan 13 '24 00:01 bvitaz-ck

There's a way but it's somewhat complex so let me ask you a few questions first.

The type field is a constant, right? You'd expect all instances of SpecialDevice to have type set to the value special, right?

If that is true, maybe this is redundant information and instead of checking for device.type you could check isinstance(device, SpecialDevice) instead. Conceptually, at the Python level, the "tag" would be instance.__class__ (which the isinstance check would use).

A second easy approach you can take is turn your class into:

from typing import Literal

@attrs.define
class SpecialDevice:
    hostname: str
    port: int
    type: Literal["special"] = "special"

Then cattrs will still remove the field from the payload, but there is a default on the class level so the value will still be set.

You can take this one step further an make it a class variable instead of an instance variable:

from typing import ClassVar, Literal

@attrs.define
class SpecialDevice:
    hostname: str
    port: int
    type: ClassVar[Literal["special"]] = "special"

These change the data modeling strategy somewhat so let me know if they work for you, if not we can consider different solutions.

Tinche avatar Jan 13 '24 13:01 Tinche

Thank you for your response. You're correct that the type field is a constant that the application uses to determine which type of device to instantiate, but the constant is a descriptive name like "trio", "bosch", or "vimba". All of the devices up to this point only require the configuration provided by a StandardDevice, and a new device requires additional fields. If I understand your recommendation, I would set up a data class for each device type to compare the data class instance or add a literal value to specify the type name. I will consider that and I can see that there are benefits to specific data classes for each type of device.

If this is not possible to keep the value of the "type" field when using the key for configure_tagged_union, I can look into using JSON schema validation for the configuration file to validate that the necessary settings exist and cattrs will use the correct data class type. Ideally for my case, the tagged key's field would remain in the payload so it can be used, but I can see why the removal can be a good idea in certain circumstances.

bvitaz-ck avatar Jan 14 '24 01:01 bvitaz-ck

Ok, so the problem is that the tagged_union strategy deletes the type key out. If you want to keep the key in the payload, we can apply some good old function composition (i.e. wrapping functions). This is what cattrs is all about ;)

First, we change the tag name to _type instead of type:

configure_tagged_union(
    Device, c, tag_name="_type", tag_generator=tag_generator, default=StandardDevice
)

That way, the strategy will pop out _type and leave us with type.

Now, we need to actually copy type into _type.

First, we'll fetch the function the strategy generated for us. Then, we just wrap this function in our own function, and register that on the converter.

base_hook = c._union_struct_registry[Device]


def our_hook(value, _):
    value["_type"] = value["type"]
    return base_hook(value, _)


c.register_structure_hook(Device, our_hook)

And now structuring should work.

I think I'll rework how some of this works internally so this process will be cleaner in the next version of cattrs (you won't have to use an internal registry).

Tinche avatar Jan 17 '24 23:01 Tinche