dissect.cstruct

Make the Instance class dict compatible to allow for JSON serialization.

Open qkaiser opened this issue 2 years ago • 3 comments

I have a specific use case where I want to serialize dissect.cstruct instances to JSON.

This MR introduces two changes to make both Instance and EnumInstance inherit from dict, allowing users of the library to do things like:

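import json
from dissect.cstruct import cstruct

# definition, data and CustomEncoder are supplied by the caller
my_struct = cstruct()
my_struct.load(definition)
record = my_struct.Record(data)
print(json.dumps(record, cls=CustomEncoder))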

Since these instances can contain bytes attributes and JSON does not support bytes, the specifics of the JSON encoding are left to the user of dissect.cstruct. In the attached example, UTF-8 decoding with the surrogateescape error handler is used, but base64 encoding would also work if the structure holds lots of raw binary data.
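To make the two options concrete, here is a minimal sketch of what such an encoder could look like. It uses only the standard library and a plain dict as a stand-in for a parsed structure; the class name BytesEncoder and the use_base64 flag are made up for illustration and are not part of dissect.cstruct or of the attached example.

```python
import base64
import json


class BytesEncoder(json.JSONEncoder):
    """Hypothetical encoder that turns bytes into JSON-safe strings."""

    def __init__(self, *args, use_base64=False, **kwargs):
        super().__init__(*args, **kwargs)
        self.use_base64 = use_base64

    def default(self, o):
        # default() is only called for objects json cannot serialize itself
        if isinstance(o, (bytes, bytearray)):
            if self.use_base64:
                return base64.b64encode(o).decode("ascii")
            # Undecodable bytes become lone surrogates and stay reversible
            return bytes(o).decode("utf-8", errors="surrogateescape")
        return super().default(o)


# Plain dict standing in for a parsed structure (assumption for the demo)
record = {"name": b"mirai", "raw": b"\xff\x00"}

as_text = json.dumps(record, cls=BytesEncoder)
as_b64 = json.dumps(record, cls=BytesEncoder, use_base64=True)

# The surrogateescape variant can be turned back into the original bytes
restored = json.loads(as_text)["raw"].encode("utf-8", "surrogateescape")
```

The surrogateescape variant keeps text fields readable, while base64 is more compact and less surprising for fields that are mostly binary.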

A demo example is provided in examples/mirai_json.py

These changes do not introduce API changes, nor do they break the test suite. However, I'm happy to write unit tests if you're open to merging this MR :)

Thanks again for this wonderful project!

qkaiser, Apr 02 '23 14:04

Just fixed my code with your linter config.

qkaiser, Apr 04 '23 15:04

Thanks for the nice idea! Unfortunately there are some issues with subclassing Instance from dict as we have plans to merge the Instance and Structure classes into a single class. Having that new class be based off of dict would be difficult.

Would it be feasible to have an as_dict() function instead? The EnumInstance could then be dealt with in a CustomEncoder as I can imagine other users wanting to have just the value of an enum (possibly even just the name) instead of a key/value pair.
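As a rough sketch of that suggestion: the code below uses dataclasses and a standard-library enum as stand-ins for parsed structures, since the real classes are not shown in this thread. The as_dict() helper, the EnumNameEncoder class, and the Color, Header and Record types are all hypothetical names for illustration, not dissect.cstruct API.

```python
import enum
import json
from dataclasses import dataclass, fields, is_dataclass


class Color(enum.Enum):  # stand-in for an enum member in a parsed structure
    RED = 1
    GREEN = 2


@dataclass
class Header:  # stand-in for a nested structure
    magic: bytes
    color: Color


@dataclass
class Record:  # stand-in for the top-level structure
    header: Header
    sizes: list


def as_dict(obj):
    """Recursively convert stand-in structures to plain dicts and lists.

    Leaf values (bytes, enum members, ints) are returned untouched so the
    encoder can decide how to represent them.
    """
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: as_dict(getattr(obj, f.name)) for f in fields(obj)}
    if isinstance(obj, (list, tuple)):
        return [as_dict(item) for item in obj]
    return obj


class EnumNameEncoder(json.JSONEncoder):
    """Hypothetical encoder: enums become their name, bytes become hex."""

    def default(self, o):
        if isinstance(o, enum.Enum):
            return o.name  # use o.value here to emit the number instead
        if isinstance(o, bytes):
            return o.hex()
        return super().default(o)


record = Record(Header(b"\x7fELF", Color.GREEN), [1, 2])
plain = as_dict(record)
text = json.dumps(plain, cls=EnumNameEncoder)
```

Keeping the enum policy in the encoder rather than in as_dict() lets each caller choose between name, value, or a key/value pair without changing the conversion helper.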

I see. We'll probably implement the serialization on our end then, np :)

Since you plan on merging Instance and Structure, I just wanted to mention that, beyond JSON serialization, the lack of serialization support can also be a bottleneck for users of dissect.cstruct who want to transfer Instance objects between processes, since multiprocessing in Python relies on pickle.
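To illustrate the pickle point in general terms, here is a small standard-library sketch. It assumes the problematic object is an instance of a class created at runtime, which is one common reason pickling fails; it is not a statement about how dissect.cstruct builds its types. The make_dynamic_record() and can_pickle() helpers are hypothetical.

```python
import pickle


def make_dynamic_record():
    # The class is created at call time, so pickle cannot import it by name
    class Record:
        def __init__(self, magic, size):
            self.magic = magic
            self.size = size

    return Record(b"\x7fELF", 64)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


record = make_dynamic_record()

# Copying the fields into a plain dict gives something pickle can handle,
# which is what multiprocessing needs to send data between processes.
plain = dict(vars(record))
round_tripped = pickle.loads(pickle.dumps(plain))
```

A dict conversion of this kind would serve both use cases in this thread: the same plain data can be passed to json.dumps or sent through a multiprocessing queue.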

I'll close this once we're sure of the direction to take.

Again, thanks for all your work on this project!

qkaiser, Apr 18 '23 10:04

@qkaiser is this still an issue for you with the new dissect.cstruct v4?

Miauwkeru, Aug 23 '24 08:08