msgspec
Callbacks to `Encoder`/`Decoder` are not respected in `datetime` objects
Description
Neither the dec_hook nor the enc_hook argument is respected by any encoder or decoder (tested with JSON and YAML) when datetime objects are involved. Note that the print calls in both hooks never run, and the variable buf contains an ISO 8601 duration string instead of the number that enc_hook would return.
Attached is a sample script to show that custom decoding of datetime.timedelta objects is not supported. It also doesn't work for datetime.datetime objects.
import msgspec
from typing import Any, Type
from datetime import timedelta


def enc_hook(obj: Any) -> Any:
    print("Encoding")
    if isinstance(obj, timedelta):
        # convert the timedelta to a number
        return obj.total_seconds()
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type(obj)} are not supported")


def dec_hook(type: Type, obj: Any) -> Any:
    print("Decoding", type)
    # `type` here is the value of the custom type annotation being decoded.
    if type is timedelta:
        # Convert ``obj`` (which should be a ``number``) to a timedelta
        return timedelta(seconds=obj)
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type} are not supported")


class MyMessage(msgspec.Struct):
    field_1: str
    field_2: timedelta


enc = msgspec.json.Encoder(enc_hook=enc_hook)
dec = msgspec.json.Decoder(MyMessage, dec_hook=dec_hook)
msg = MyMessage("some string", timedelta(seconds=5))

# Doesn't work for JSON decoder
buf = enc.encode(msg)
print(buf)
a = dec.decode(buf)
print(a)

# Doesn't work for YAML decoders either
buf = msgspec.yaml.encode(msg, enc_hook=enc_hook)
print(buf)
a = msgspec.yaml.decode(buf, type=MyMessage, dec_hook=dec_hook)
print(a)
Update: This was broken sometime between version 0.16.0 and version 0.17.0.
Update: It was this specific commit that broke the hook for datetime.timedelta objects: 2b72ebbf91ec0e294e049ba584e81400a71ef37a
Update: It seems that hooks for datetime.datetime objects have been broken since the start.
Under the hood, the .encode and .decode methods call the msgspec.to_builtins and msgspec.convert functions respectively.
Both functions have a builtin_types parameter, which disables msgspec's own processing of the specified builtin types. However, it does not pass those types on to the *_hook functions: only non-builtin types are passed to the hooks.
Whether this is a bug or by design, only @jcrist can tell (no pun intended :-)). But it definitely feels like a bug.
The above can be illustrated with:
import msgspec as ms
import datetime as dt
from typing import Any


def enc_hook(obj: Any) -> Any:
    print("Encoding")
    if isinstance(obj, T):
        return obj.name
    if isinstance(obj, dt.timedelta):
        # convert the timedelta to a number
        return obj.total_seconds()
    else:
        # Raise a NotImplementedError for other types
        raise NotImplementedError(f"Objects of type {type(obj)} are not supported")


class T:
    def __init__(self, name='some name'):
        self.name = name


class MyMessage(ms.Struct):
    field_1: T
    field_2: dt.timedelta


msg = MyMessage(T(), dt.timedelta(seconds=5))

# timedelta is declared as a builtin type, so msgspec leaves it untouched
# and enc_hook is only called for the custom type T
msg_encoded = ms.to_builtins(
    msg,
    builtin_types=(
        dt.timedelta,
    ),
    enc_hook=enc_hook
)
print(msg_encoded)
The above outputs:
Encoding
{'field_1': 'some name', 'field_2': datetime.timedelta(seconds=5)}
I can see 2 ways to overcome this behaviour until (if ever) it gets changed:
- Implement your own encode/decode method where you can control what happens to the dict produced by msgspec before it gets sent to the en/de-coders.
- Wrap the builtin type in a custom type to be handled by the *_hooks.