msgspec
Omitting defaults does not work with array-like structs
Description
When setting both omit_defaults and array_like to True, defaults are not omitted.
For example, without array_like=True:
import msgspec

class Position(
    msgspec.Struct,
    frozen=True,
    forbid_unknown_fields=True,
    omit_defaults=True,
):
    longitude: float
    latitude: float
    altitude: float = 0.0
Performing a roundtrip works as expected:
>>> pos = msgspec.json.Decoder(Position).decode('{ "longitude": 1.0, "latitude": 2.0 }')
>>> pos
Position(longitude=1.0, latitude=2.0, altitude=0.0)
>>> msgspec.json.encode(pos)
b'{"longitude":1.0,"latitude":2.0}'
However, adding array_like=True to the Position definition above causes roundtripping to fail, because the default altitude is no longer omitted:
>>> pos = msgspec.json.Decoder(Position).decode('[1.0,2.0]')
>>> pos
Position(longitude=1.0, latitude=2.0, altitude=0.0)
>>> msgspec.json.encode(pos)
b'[1.0,2.0,0.0]'
Hi there!
Just noticed this yesterday as well.
From what I can see, this doesn't only affect the JSON encoding.
This is an example of MsgPack acting the same way:
from msgspec import Struct
from msgspec.msgpack import decode, encode

class User(Struct, tag=1, array_like=True):
    username: str
    is_admin: bool = False

class Comment(Struct, tag=2, array_like=True):
    user: User
    content: str
    likes: int = 0
    is_highlighted: bool = False

username = "ipseitas"
content = "Lovin' this lib, please add donations!"
# Full struct: every field is encoded, defaults included
comment = Comment(User(username), content)
# Equivalent tuple with the default values left out
commentlike = (2, (1, username), content)
>>> encode(comment)
... b"\x95\x02\x93\x01\xa8ipseitas\xc2\xd9&Lovin' this lib, please add donations!\x00\xc2"
>>> encode(commentlike)
... b"\x93\x02\x92\x01\xa8ipseitas\xd9&Lovin' this lib, please add donations!"
>>> decode(encode(commentlike), type=Comment) == comment
... True
Implementing this would fit the theme of efficiency that your package shines at. If this helps you deliver, please add donations (#542). I'll gladly give back some of the time your lib saved me, and I'm sure others would as well :)
Omitting defaults is not possible in some cases, because array elements are matched to fields by position. Imagine:
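import msgspec

ENCODER = msgspec.msgpack.Encoder()

class A(msgspec.Struct, array_like=True, omit_defaults=True):
    a: bool = False
    b: bool = False

# Dropping the default `a` would give [True], which decodes as a=True
ENCODER.encode(A(b=True))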
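To make the ambiguity concrete, here is a minimal sketch. It uses a plain dataclass as a stand-in for the struct so it runs with the standard library alone, and the "drop every default" encoder is hypothetical, not something msgspec does:

```python
from dataclasses import dataclass


@dataclass
class A:
    a: bool = False
    b: bool = False


original = A(b=True)

# Hypothetical encoder that drops every default value, wherever it sits.
naive = [v for v in (original.a, original.b) if v is not False]

# Positional decoding assigns the lone element to the first field.
decoded = A(*naive)

print(naive, decoded == original)  # [True] False
```

The decoded object ends up with a=True and b=False, so the roundtrip silently changes the data.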
It's possible if we assume a specific behavior, for example stripping defaults only from the end of the array. That way, the fields that are more likely to be modified could be put at the front.
This Struct:
class User(Struct, array_like=True):
    username: str
    profile_pic_url: str | None = None
    is_admin: bool = False
Could be:
["robert"] # has no pic, not an admin
["jon", None, True] # is an admin, so we need to keep the default pic value anyway
Another behavior could be a flag that allows the gap to be filled with None.
That way, the encoder would save space on non-nullable fields, and the decoder would use the default value.
So that this message:
from enum import StrEnum  # Python 3.11+

class TransportMethod(StrEnum):
    One = "one"
    Two = "two"
    AnotherOne = "another_one"

# lax_array_defaults is the flag proposed here, not an existing msgspec option
class Package(Struct, array_like=True, lax_array_defaults=True):
    content: str
    to_dispatcher: TransportMethod = TransportMethod.One
    from_dispatcher: TransportMethod = TransportMethod.One
Could encode to these:
["content"] # Uses default transport methods
["content", None, "two"] # Only one transport is modified
But then this message:
class Package(Struct, array_like=True, lax_array_defaults=True):
    content: str
    to_dispatcher: TransportMethod | None = TransportMethod.One
    from_dispatcher: TransportMethod = TransportMethod.One
Would be encoded as:
["content"] # Uses default transport methods
["content", "one", "two"] # Only one transport is modified, but setting the first one to None would change its meaning