jsonbinpack-poc

Memory / GC use?

Open zellyn opened this issue 2 years ago • 3 comments

@jbr-square mentioned that here at Square, we had massive memory/GC issues with [one of our core services, written in Java] until we realized we were unnecessarily deserializing JSON instead of protos.

If possible, it would be very interesting to get some kind of measure of memory and GC impact. I expect JSON Binpack would be more akin to Protos than JSON, but it'd be interesting.
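As a rough idea of what such a measurement might look like for the plain-JSON baseline, here is a minimal sketch assuming a Node.js runtime. The helper names (`heapUsed`, `measureRetainedHeap`) and the synthetic payload are hypothetical and not part of JSON BinPack. The numbers are only indicative: heap readings are noisy unless Node is started with `--expose-gc`, and they describe this one parser implementation rather than the format.

```typescript
// Assumption: running under Node.js. `globalThis` is used so the sketch
// compiles without Node type definitions installed.
const nodeProcess = (globalThis as any).process;
// `gc` is only defined when Node is started with --expose-gc.
const forceGc: (() => void) | undefined = (globalThis as any).gc;

// Hypothetical helper: current heap usage in bytes, after a GC if one can be forced.
function heapUsed(): number {
  if (forceGc) forceGc();
  return nodeProcess.memoryUsage().heapUsed as number;
}

// Hypothetical helper: runs `work` and reports how much the heap changed while its
// result is still reachable. The value can be noisy (even negative) without --expose-gc.
function measureRetainedHeap<T>(work: () => T): { retainedBytes: number; result: T } {
  const before = heapUsed();
  const result = work();
  const after = heapUsed();
  return { retainedBytes: after - before, result };
}

// Synthetic payload, purely for illustration.
const records = Array.from({ length: 50_000 }, (_, i) => ({
  id: i,
  name: `item-${i}`,
  tags: ["a", "b"],
}));
const text = JSON.stringify(records);

const sample = measureRetainedHeap(() => JSON.parse(text) as unknown[]);
console.log(
  `JSON text: ${text.length} chars, heap change after parse: ${sample.retainedBytes} bytes`
);
```

Running the same harness around a JSON BinPack or protobuf decoder on equivalent data would give a like-for-like comparison within a single runtime.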

zellyn, Mar 25 '22 15:03

heh, sorry, that was years ago, so not really something to look at in present times.....

jbr-square, Mar 25 '22 15:03

> If possible, it would be very interesting to get some kind of measure of memory and GC impact. I expect JSON Binpack would be more akin to Protos than JSON, but it'd be interesting.

I'm definitely interested in this! In one of my papers (https://arxiv.org/abs/2201.02089), I explain my finding that the runtime-efficiency and memory-usage characteristics of a serialization format are actually characteristics of specific implementations, and rarely of the serialization formats themselves. Therefore, serialization formats are not easily comparable with each other. An implementation of one format in one programming language may outperform an implementation of another format, and vice versa.

The current state of JSON BinPack in this repo is a pre-production prototype of the tool that I implemented within the time limits of my dissertation. It is implemented in TypeScript and it's probably not very runtime or memory efficient.

I recently submitted my dissertation (https://www.jviotti.com/assets/dissertation.pdf) and the plan going forward is to spend more time implementing a production-ready and optimized version of JSON BinPack in C++ that is also usable in the context of embedded development. I hope for that implementation to result in a space-efficient format that is also pretty fast.

The C++ implementation is already in the works, so expect to see more stuff soon. As a spoiler, I'm building a custom lazy JSON parser in C++ and a JSON Schema C++ code generator to power it.

Once that production-ready implementation is done, I plan to run other types of benchmarks on it.

You can watch this repo (for releases only or more) if you are interested in staying up-to-date!

jviotti, Mar 26 '22 00:03

I definitely understand the frustration with the vast differences in implementation speed and other characteristics. All the speed comparisons I've been able to find between protos and gzipped JSON long predate the incredibly fast modern JSON parsers, some of which even use SIMD!

zellyn, Mar 26 '22 02:03