msgspec
What other examples should we add?
In #147 we added a new Examples section to the docs, along with an example of using msgspec to write a concise (and performant) GeoJSON implementation.
What other examples should we add? This might be other common schemas (JSON-RPC perhaps?) or integrations with other tools (ASGI/Starlette, requests, sqlite, ...)?
Depending on what kind of examples are being looked for, it might make sense to compare to common serialization frameworks (like Protobuf, Cap'n Proto, Flatbuffers, etc.). This could be useful for users weighing msgspec against some of the other options when making a decision, or for users porting to msgspec from something else. Just a thought 🙂
That might make sense? msgspec is mostly a nice way of working with JSON/msgpack, it's not really a format itself. The benchmarks already compare against the most performant version of those for Python (pyrobuf) - the protobuf/flatbuffers/capnproto implementations for Python are all fairly slow, even if the formats can be fast for other languages.
I have this example, which uses multiple features: https://gist.github.com/banteg/cee923b6616a8f70db1e18a61b91e39c
Sorry, I should have clarified above: I wasn't so much talking about a performance comparison as a comparison of the code written. In other words, showing an example with an implementation in protobuf and then in msgspec. I'm thinking about this in terms of showing users familiar with one message passing library how they can do similar things in msgspec.
More examples of dec_hook would be wonderful. My colleague and I were deciding between Pydantic and this, and from existing documentation, custom validators weren't clear.
If you'd accept it, I would be happy to volunteer a documentation PR.
The canonical example that was brought up was email validation:
import re
from typing import List, Optional, Type, Any

from msgspec import Struct
from msgspec.json import Decoder

regex_email = re.compile(r'([A-Za-z0-9]+[.-_])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Z|a-z]{2,})+')
regex_phone = re.compile(r'[0-9]{3}-[0-9]{3}-[0-9]{4}')


class phone_number(str):
    @staticmethod
    def validate(obj):
        if not regex_phone.fullmatch(obj):
            raise ValueError()
        return phone_number(obj)


class email(str):
    @staticmethod
    def validate(obj):
        if not regex_email.fullmatch(obj):
            raise ValueError()
        return email(obj)


class TestEmail(Struct):
    phone_number: phone_number
    email: email


# Variant 1: validate each custom type inline in the hook
def dec_hook(type: Type, obj: Any) -> Any:
    if type is email:
        if not regex_email.fullmatch(obj):
            raise ValueError()
        return type(obj)
    if type is phone_number:
        if not regex_phone.fullmatch(obj):
            raise ValueError()
        return type(obj)
    return type(obj)


# Variant 2: dispatch to the validate method defined on the custom type
def dec_hook2(type: Type, obj: Any) -> Any:
    return type.validate(obj)


t = '{"email": "[email protected]", "phone_number": "999-999-9999"}'
response_decoder = Decoder(TestEmail, dec_hook=dec_hook).decode(t)
print(response_decoder)
response_decoder2 = Decoder(TestEmail, dec_hook=dec_hook2).decode(t)
print(response_decoder2)
I think this is a good useful example case to add, I'd be happy to accept a PR if you wanted to get things going. I like the idea of establishing a convention of using validate methods on the custom types to handle the parsing and validation, then dispatching directly to those in the dec_hook (as you do in response_decoder2). I'd probably make those classmethod instead of staticmethod so you can reference cls in the method itself, but that's just a stylistic preference.
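A minimal sketch of that convention, using classmethod so the validator can build the result through cls. The names PhoneNumber and PHONE_RE are hypothetical, and the NotImplementedError fallback for types without a validate method is an assumption rather than something settled in this thread. The hook is called directly here to keep the snippet standard-library only; in real use it would be passed as dec_hook= to the Decoder, as in the example above.

```python
import re
from typing import Any, Type

# Hypothetical pattern, same shape as regex_phone above
PHONE_RE = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")


class PhoneNumber(str):
    """A str subclass that only accepts NNN-NNN-NNNN values."""

    @classmethod
    def validate(cls, obj: Any) -> "PhoneNumber":
        if not isinstance(obj, str) or not PHONE_RE.fullmatch(obj):
            raise ValueError(f"invalid phone number: {obj!r}")
        # cls (not a hard-coded class name) so subclasses get their own type back
        return cls(obj)


def dec_hook(typ: Type, obj: Any) -> Any:
    # Dispatch to the type's own validator when it defines one
    validate = getattr(typ, "validate", None)
    if validate is None:
        raise NotImplementedError(f"unsupported type: {typ!r}")
    return validate(obj)


# Calling the hook by hand, the way a decoder would for a custom type
phone = dec_hook(PhoneNumber, "999-999-9999")
print(type(phone).__name__, phone)
```

With this layout the hook itself never needs to change when a new custom type is added; the type only has to provide its own validate classmethod.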
Note that in this specific case, since all your validators are regex based you can make use of the existing constraints functionality and avoid using dec_hook altogether. This is also nice since the runtime type of phone_number and email is just str, only the extra validators are attached.
from typing import Annotated

import msgspec

Email = Annotated[
    str,
    msgspec.Meta(
        pattern=r"^([A-Za-z0-9]+[.-_])*[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Z|a-z]{2,})+$"
    ),
]
Phone = Annotated[str, msgspec.Meta(pattern=r"^[0-9]{3}-[0-9]{3}-[0-9]{4}$")]


class Example(msgspec.Struct):
    phone_number: Phone
    email: Email


t = '{"email": "[email protected]", "phone_number": "999-999-9999"}'
msg = msgspec.json.decode(t, type=Example)
print(msg)
#> Example(phone_number='999-999-9999', email='[email protected]')
Very nice, thanks! The Meta api is something I hadn’t explored yet, this looks super clean.