fallback to dict for unknown type in tagged union?

Open tlambert03 opened this issue 1 year ago • 3 comments

Question

Hello, and thanks as always for the amazing library. I have a use case where I'm decoding a document with a huge number of types, only a handful of which I care about. The schema uses tagged unions, and I'm hoping there is a way to essentially ignore, or simply leave as a dict, any objects that have an unrecognized type. As a simple example, I'd like to be able to deal with the "type": "Other" object below:

import msgspec

class Get(msgspec.Struct, tag=True):
    key: str

class Put(msgspec.Struct, tag=True):
    key: str
    val: str

msg = msgspec.json.encode(
    [
        {"type": "Put", "key": "my key1", "val": "my val"},
        {"type": "Get", "key": "my key2"},
        {"type": "Other", "somekey": "who knows"},
    ]
)
dec = msgspec.json.Decoder(list[Get | Put])
print(dec.decode(msg))

Traceback (most recent call last):
  File "/Users/talley/dev/self/slydb/y.py", line 23, in <module>
    print(dec.decode(msg))
          ^^^^^^^^^^^^^^^
msgspec.ValidationError: Invalid value 'Other' - at `$[2].type`

Alternatives I have considered

I tried using something like msgspec.json.Decoder(list[Get | Put | dict]), but that results in:

TypeError: Type unions may not contain more than one dict-like type (`Struct`, `dict`, `TypedDict`, `dataclass`) - type `__main__.Get | __main__.Put | dict` is not supported

The only other thing I can think of is to (laboriously) define stub Structs for every type I ever encounter but don't care about, i.e. add:

class Other(msgspec.Struct, tag=True):
    ...
dec = msgspec.json.Decoder(list[Get | Put | Other])

and then hope I don't encounter something later that I haven't seen before...

tips?

tlambert03 • May 18 '24 14:05

Leaving Other empty seems to work:

class Other(msgspec.Struct, tag=True):
    pass

>>> msgspec.json.decode(msg, type=list[Get | Put | Other])
[Put(key='my key1', val='my val'), Get(key='my key2'), Other()]

dcwatson • May 19 '24 06:05

Yes, it works, but it means that you need to know ahead of time the literal string of every type name you're ever going to encounter, even if you don't care about them (and would be happy to leave them either as unstructured dicts or as empty structs).

(Assume that there are many additional names, not just Other)

tlambert03 • May 19 '24 10:05

Sorry, I misunderstood!

dcwatson • May 19 '24 12:05