imap-codec
bindings(python): better message type
Congratulations on the release of the first version of the Python bindings! 🎉
As I mentioned in issue #559 and in the bindings/python documentation:
Access to data of message types (e.g., Greeting) is currently only available through dictionary representations.
To address this, I attempted to implement a library that facilitates the conversion between dict and Python classes using msgspec. You can check it out here.
I chose msgspec over pydantic because msgspec is lightweight and sufficient for the task. However, it can be easily replaced with pydantic if necessary.
Initially, I adjusted the original dict structure:
{
    "Ok": {
        "tag": "a001",
        "code": null,
        "text": "Message 17 is the first unseen message"
    }
}
Currently, enum variant names are used as keys. These names can serve as the tag of a tagged union, so I moved them into a codec_model field:
{
    "codec_model": "Ok",
    "tag": "a001",
    "code": null,
    "text": "Message 17 is the first unseen message"
}
As a special case, if the value isn't an object, or if it is itself a Rust enum variant object, the value is placed in a codec_data field (as implemented in utils.py):
{
    "Unseen": 17
}
// transforms to
{
    "codec_model": "Unseen",
    "codec_data": 17
}
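To make the two rules above concrete, here is a minimal sketch of the transformation for a single variant. It is not the code from utils.py: the function names are hypothetical, and the way it detects a nested enum variant (a one-key dict whose key starts with an uppercase letter, since Rust variants are PascalCase and fields are snake_case) is only a heuristic I am assuming for illustration.

```python
from typing import Any


def is_variant(value: Any) -> bool:
    """Heuristic (assumption): a one-key dict with a PascalCase key is an enum variant."""
    if not isinstance(value, dict) or len(value) != 1:
        return False
    (key,) = value
    return isinstance(key, str) and key[:1].isupper()


def tag_variant(variant: dict[str, Any]) -> dict[str, Any]:
    """Turn {"Name": payload} into a dict tagged with "codec_model" (top level only)."""
    ((name, payload),) = variant.items()
    if isinstance(payload, dict) and not is_variant(payload):
        # Plain object: merge its fields next to the tag.
        return {"codec_model": name, **payload}
    # Non-object value, or a nested enum variant: keep it under "codec_data".
    return {"codec_model": name, "codec_data": payload}
```

A real implementation would also need to recurse into nested values and would probably rely on type information rather than on key casing.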
Next, I created some msgspec structs in the models directory to build upon this structure:
Unlike Rust enums, Python enums cannot carry per-variant data, so they cannot express Algebraic Data Types (ADTs) directly. I used Union to simulate them:
from msgspec import Struct

# NoZeroUint, StatusKind, Code, and StatusBody are assumed to be defined elsewhere in the models.

class TaggedBase(Struct, tag_field="codec_model"):
    pass

# Example where value isn't an object
class Unseen(TaggedBase):
    codec_data: int

# Example where value is an object
class AppendUid(TaggedBase):
    uid_validity: NoZeroUint
    uid: NoZeroUint

# Example where the value is a Rust enum variant object
class Untagged(TaggedBase):
    kind: StatusKind
    code: Code | None
    text: str

class Tagged(TaggedBase):
    tag: str
    body: StatusBody

class Bye(TaggedBase):
    code: Code | None
    text: str

class Status(TaggedBase):
    codec_data: Untagged | Tagged | Bye
And so on...
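To illustrate why a tag field plus Union works as a stand-in for a Rust enum, here is a dependency-free sketch that uses standard-library dataclasses instead of msgspec. The registry, decode_code, and describe are hypothetical names for this example only; msgspec performs the equivalent dispatch on codec_model itself, and the plain int fields stand in for NoZeroUint.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Unseen:
    codec_data: int


@dataclass
class AppendUid:
    uid_validity: int
    uid: int


# The Union plays the role of the Rust enum; each class is one variant.
Code = Union[Unseen, AppendUid]

# Map each tag value (the class name) to the class it selects.
_REGISTRY = {cls.__name__: cls for cls in (Unseen, AppendUid)}


def decode_code(data: dict[str, Any]) -> Code:
    """Pick the variant class from "codec_model" and build it from the other fields."""
    fields = dict(data)  # copy so the caller's dict is not mutated
    tag = fields.pop("codec_model")
    try:
        cls = _REGISTRY[tag]
    except KeyError:
        raise ValueError(f"unknown codec_model: {tag!r}") from None
    return cls(**fields)


def describe(code: Code) -> str:
    """Branch on the concrete variant, much like a Rust match."""
    if isinstance(code, Unseen):
        return f"first unseen message: {code.codec_data}"
    return f"appended as UID {code.uid} (validity {code.uid_validity})"
```

With this, decode_code({"codec_model": "Unseen", "codec_data": 17}) yields Unseen(codec_data=17), and a type checker can narrow Code in each branch of describe.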
In my repository, I referred to imap-types to define all the structures that will be used in the Python bindings. I also defined some functions in validate.py for use:
_, command = type_codec_decode(CommandCodec, b"ABCD UID FETCH 1,2:* (BODY.PEEK[1.2.3.4.MIME]<42.1337>)\r\n")
>>> Command(
    tag="ABCD",
    body=Fetch(
        sequence_set=[Single(codec_data=Value(codec_data=1)), Range(codec_data=(Value(codec_data=2), "Asterisk"))],
        macro_or_item_names=MessageDataItemNames(
            codec_data=[NameBodyExt(section=Mime(codec_data=[1, 2, 3, 4]), partial=(42, 1337), peek=True)]
        ),
        uid=True,
    ),
)
type_codec_encode(command).dump()
>>> b"ABCD UID FETCH 1,2:* (BODY.PEEK[1.2.3.4.MIME]<42.1337>)\r\n"
model_dump(command)
>>> {
    "tag": "ABCD",
    "body": {
        "Fetch": {
            "sequence_set": [{"Single": {"Value": 1}}, {"Range": [{"Value": 2}, "Asterisk"]}],
            "macro_or_item_names": {
                "MessageDataItemNames": [
                    {"BodyExt": {"section": {"Mime": [1, 2, 3, 4]}, "partial": [42, 1337], "peek": True}}
                ]
            },
            "uid": True,
        }
    },
}
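The dump direction has to undo the codec_model / codec_data reshaping so that the result matches the dict layout the bindings expect. As a rough sketch of that reverse step on plain dicts (this is not the repository's model_dump, and untag is a hypothetical name), it could look like this:

```python
from typing import Any


def untag(value: Any) -> Any:
    """Recursively turn codec_model-tagged dicts back into {"Variant": payload} form."""
    if isinstance(value, list):
        return [untag(item) for item in value]
    if not isinstance(value, dict):
        return value  # scalars and unit variants such as "Asterisk" pass through
    fields = {key: untag(item) for key, item in value.items()}
    name = fields.pop("codec_model", None)
    if name is None:
        return fields  # plain struct without a tag
    if set(fields) == {"codec_data"}:
        # The payload was wrapped because it was not a plain object.
        return {name: fields["codec_data"]}
    return {name: fields}
```

For example, untag({"codec_model": "Unseen", "codec_data": 17}) gives back {"Unseen": 17}.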
You can find some tests in the tests directory, which can be run using pytest.
I haven’t tested all the structures yet, so there might be some structural mistakes. The main goal of this issue is to propose a potential solution to improve the typing experience on the Python side as I understand it. Additionally, I’m interested in exploring any effective methods to test the consistency of the structure between imap-types and imap-codec-model (it does seem a bit overwhelming).
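One idea I could imagine for such a consistency check, purely as a sketch: feed many dicts produced by the Rust side through the models and back, and report every sample that fails to load or does not survive the round trip. The helper below is hypothetical; load and dump are placeholders for whatever conversion functions end up being used.

```python
from typing import Any, Callable, Iterable


def find_round_trip_mismatches(
    samples: Iterable[Any],
    load: Callable[[Any], Any],
    dump: Callable[[Any], Any],
) -> list[tuple[Any, Any]]:
    """Return (sample, result_or_exception) for every sample that does not round-trip."""
    mismatches = []
    for sample in samples:
        try:
            result = dump(load(sample))
        except Exception as exc:  # a model that rejects valid input is also a mismatch
            mismatches.append((sample, exc))
            continue
        if result != sample:
            mismatches.append((sample, result))
    return mismatches
```

An empty result over a large, varied corpus of decoded messages would not prove the models match imap-types, but each reported mismatch points at a concrete structure to fix.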
If it's feasible, perhaps this could be integrated into the Python bindings library. I’d be happy to contribute further!