hyperjson
Zero-copy string deserialization
Over on Reddit, @mikeyhew mentioned that there might be an option to parse JSON strings without copying:
Just wanted to point out that serde-json isn't zero-copy because it will copy strings to turn escape sequences like "\n" and "" into the character they represent. To parse JSON without copying, you could make a custom string type, JsonStr, which is utf-8 like str but can contain escape sequences.
I forgot about that, but it's actually a great idea! Here's the upstream discussion on serde-json. We should give this custom string type some serious consideration, as string allocation accounts for a large part of the encoding/decoding process at the moment.
If anyone wants to give it a shot, go for it.
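To make the idea concrete, here is a minimal sketch of what such a type could look like. It is not the serde-json design or the proof-of-concept linked further down; the names `JsonStr`, `from_raw`, `as_raw` and `decode` are hypothetical, and `\uXXXX` escapes are deliberately left out to keep it short. The point is only that the raw slice is borrowed as-is, and a copy happens only when the caller asks for the decoded text and an escape is actually present.

```rust
use std::borrow::Cow;

/// Hypothetical borrowed JSON string: the text between the quotes,
/// with escape sequences left undecoded.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct JsonStr<'a> {
    raw: &'a str,
}

impl<'a> JsonStr<'a> {
    /// Wraps the text between the quotes without copying or validating it.
    pub fn from_raw(raw: &'a str) -> Self {
        JsonStr { raw }
    }

    /// Returns the undecoded text, still borrowed from the input buffer.
    pub fn as_raw(&self) -> &'a str {
        self.raw
    }

    /// Decodes escape sequences on demand.
    /// Borrows when there is no backslash, allocates only otherwise.
    /// Returns `None` for escapes this sketch does not handle.
    pub fn decode(&self) -> Option<Cow<'a, str>> {
        if !self.raw.contains('\\') {
            return Some(Cow::Borrowed(self.raw));
        }
        let mut out = String::with_capacity(self.raw.len());
        let mut chars = self.raw.chars();
        while let Some(c) = chars.next() {
            if c != '\\' {
                out.push(c);
                continue;
            }
            // A trailing backslash has nothing to escape: `?` yields `None`.
            match chars.next()? {
                '"' => out.push('"'),
                '\\' => out.push('\\'),
                '/' => out.push('/'),
                'b' => out.push('\u{0008}'),
                'f' => out.push('\u{000C}'),
                'n' => out.push('\n'),
                'r' => out.push('\r'),
                't' => out.push('\t'),
                // `\uXXXX` (including surrogate pairs) is omitted here.
                _ => return None,
            }
        }
        Some(Cow::Owned(out))
    }
}
```

With this shape, a string such as `hello` never allocates, while `line\nbreak` (as it appears in the JSON source) allocates only when `decode` is called. Code that just compares or forwards strings can stay on `as_raw` and skip the copy entirely.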
I don't think it was originally my idea, but thanks for the mention.
Here is a proof-of-concept, with a basic test: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=130c6682231d50c744b96f4c8e2ebd43. I used the diagrams on https://json.org as a reference.
EDIT: here's an updated version with some more tests and a link to the gist https://play.rust-lang.org/?gist=334122cd0104ad3509388074be4351ba