pycapnp
robust way to serialize to json
It is convenient in various scenarios to translate capnproto to and from JSON. The seemingly straightforward way to do so would be to call json.dumps(builder.to_dict()), which is basically what capnp-json.py in scripts does. Unfortunately, that doesn't really work for Data fields, because JSON text needs to be valid unicode and Data fields end up as bytes in the generated dictionary (see attached example code).
As far as I can tell, capnproto itself does not yet have an official JSON serialization (which would include a specification of how to encode Data fields as unicode). So rather than adding .to_json, something like to_dict(bytes=False) might work: it would serialize Data to e.g. base64-encoded unicode, and new_message/from_dict would then always base64-decode unicode values for Data fields.
import os
import capnp

if __name__ == '__main__':
    # Write a minimal schema with one integer field and one Data field.
    with open('test.capnp', 'w') as fh:
        fh.write('''
@0xc49a5731242fa476;
struct TestStruct {
  uint @0 :UInt64;
  blob @1 :Data;
}
''')
    schema = capnp.load('test.capnp')

    # Case 1: blob happens to be valid UTF-8.
    with open('test_ok.out', 'wb') as fh:
        schema.TestStruct.new_message(
            blob=b'valid utf8:\0\1\2"',
            uint=123,
        ).write(fh)

    # Case 2: blob contains an invalid UTF-8 sequence (\xc3\x28).
    with open('test_fail.out', 'wb') as fh:
        schema.TestStruct.new_message(
            blob=b'valid utf8:\0\1\2 invalid: \xc3\x28"',
        ).write(fh)

    # Decode both files with capnp-json.py; the second is the failing case.
    os.system('./capnp-json.py decode test.capnp TestStruct <test_ok.out')
    os.system('./capnp-json.py decode test.capnp TestStruct <test_fail.out')
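For illustration, the base64 idea proposed above could be sketched on a plain dictionary, without touching pycapnp at all. The helper names encode_bytes and decode_fields and the data_fields parameter are hypothetical, not part of pycapnp; a real implementation would presumably take the list of Data fields from the schema rather than from the caller.

```python
import base64
import json


def encode_bytes(value):
    """Recursively replace bytes with base64 text so json.dumps accepts it."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode('ascii')
    if isinstance(value, dict):
        return {key: encode_bytes(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [encode_bytes(item) for item in value]
    return value


def decode_fields(value, data_fields):
    """Base64-decode the top-level keys named in data_fields (hypothetical)."""
    return {
        key: base64.b64decode(item) if key in data_fields else item
        for key, item in value.items()
    }


# Plain dict shaped like the to_dict() output for TestStruct (an assumption).
message = {'uint': 123, 'blob': b'valid utf8:\0\1\2 invalid: \xc3\x28"'}
text = json.dumps(encode_bytes(message))
restored = decode_fields(json.loads(text), data_fields={'blob'})
```

Here the blob that is not valid UTF-8 survives the round trip, at the cost of the reader needing to know which fields are Data.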
capnproto (in git master) does have a JSON codec, though I'm not sure there's been a capnproto release since its addition (and I also don't think pycapnp wraps the functionality yet).
We canonically encode UInt64 and Int64 as strings of base-10 digits in ASCII, since Javascript's numeric type only affords 53 bits of precision.
We canonically encode Data as an array of numbers in the range [0, 255].
That said, I believe it is possible on the C++ side to provide your own encoding logic if you wanted to (for instance) base64 Data instead of making it an array of byte values.
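As a rough Python illustration of the encoding described above (this is not the C++ codec, and nothing here is a pycapnp API), a flat dictionary could be converted as follows. The function to_canonical and its int64_fields parameter are hypothetical; the real codec would know the field types from the schema.

```python
import json


def to_canonical(value, int64_fields=frozenset()):
    """Mimic the described encoding for a flat dict (hypothetical helper)."""
    result = {}
    for key, item in value.items():
        if isinstance(item, (bytes, bytearray)):
            # Data becomes an array of numbers in the range [0, 255].
            result[key] = list(item)
        elif key in int64_fields:
            # UInt64/Int64 become strings of base-10 digits.
            result[key] = str(item)
        else:
            result[key] = item
    return result


message = {'uint': 123, 'blob': b'\x00\x01\xc3\x28'}
text = json.dumps(to_canonical(message, int64_fields={'uint'}))
# text is '{"uint": "123", "blob": [0, 1, 195, 40]}'
```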
This sounds like a good idea, and I'm not opposed to structs having 'to_json/from_json' methods. Unfortunately I'm on vacation this week and I'm not sure when I'll be able to find time to work on this.
I'm always happy to review PRs though :)