serde-json-core

Deserializing strings in place.

Open sammhicks opened this issue 1 year ago • 1 comments

De-escaping JSON strings will always produce a shorter (or equal length) string, so it's safe to de-escape strings in place, thus allowing types to borrow plaintext (not escaped) strings from the buffer after deserialization.

This is a semver-breaking change, as it requires the JSON input to be passed in mutably to allow for de-escaping in place. This is safe because once the deserializer has moved past the string, it never reads that part of the buffer again.
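To illustrate why de-escaping in place never overruns the input, here is a minimal sketch, not the code from this pull request. The function name `unescape_in_place` is hypothetical, and the sketch assumes it is handed only the bytes between the quotes. It handles the simple escapes and `\uXXXX` for non-surrogate code points; surrogate pairs are rejected rather than combined.

```rust
/// Hypothetical helper (not part of serde-json-core): de-escapes the body of
/// a JSON string, i.e. the bytes between the quotes, in place and returns
/// the de-escaped prefix of the buffer.
fn unescape_in_place(buf: &mut [u8]) -> Option<&str> {
    // `write` never overtakes `read`, because every escape sequence is
    // at least as long as the bytes it decodes to.
    let (mut read, mut write) = (0, 0);
    while read < buf.len() {
        if buf[read] != b'\\' {
            buf[write] = buf[read];
            read += 1;
            write += 1;
            continue;
        }
        let escape = *buf.get(read + 1)?;
        if escape == b'u' {
            // \uXXXX: six input bytes become at most three output bytes.
            let hex = buf.get(read + 2..read + 6)?;
            if !hex.iter().all(u8::is_ascii_hexdigit) {
                return None;
            }
            let code = u32::from_str_radix(core::str::from_utf8(hex).ok()?, 16).ok()?;
            // Assumption of this sketch: surrogates are rejected, not paired.
            let mut utf8 = [0u8; 4];
            let len = char::from_u32(code)?.encode_utf8(&mut utf8).len();
            buf[write..write + len].copy_from_slice(&utf8[..len]);
            read += 6;
            write += len;
        } else {
            // Two input bytes become one output byte.
            buf[write] = match escape {
                b'"' => b'"',
                b'\\' => b'\\',
                b'/' => b'/',
                b'b' => 0x08,
                b'f' => 0x0C,
                b'n' => b'\n',
                b'r' => b'\r',
                b't' => b'\t',
                _ => return None,
            };
            read += 2;
            write += 1;
        }
    }
    core::str::from_utf8(&buf[..write]).ok()
}
```

The returned `&str` borrows from the mutable input, which is why the input would need to be passed in as `&mut`.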

sammhicks avatar Dec 31 '23 13:12 sammhicks

Not to nag, but I've now fixed the tests; they run successfully on my fork, so it should now be ready to test and merge. Sorry for the repeated sync, and absolutely no rush :)

sammhicks avatar Jan 05 '24 09:01 sammhicks

How about the following design?

  • The deserializer takes a shared &str, as per the design before this pull request.
  • When deserializing JSON strings, the deserializer scans for escape sequences
    • If there are no escape sequences, it calls visitor.visit_borrowed_str(v), which will allow zero-copy deserialization
    • If there are escape sequences, it decodes the escape sequence using a provided buffer, and calls visitor.visit_str(v)
  • serde_json_core has an EscapedString newtype struct which contains an escaped &str, with utility methods to iterate over it, where the iterator returns either a &str of characters with no escape sequences, or an unescaped char
    • EscapedString is the only structure that is allowed to borrow escaped string data, and uses a special constant to signal to the deserializer that it's special.
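The borrow-or-copy decision in the first bullets could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation from the later pull request: `Decoded` and `decode_string` are hypothetical names, the two variants stand in for the calls to `visit_borrowed_str` and `visit_str`, and only the simple escapes are decoded (`\uXXXX` is treated as unsupported to keep the sketch short).

```rust
/// Hypothetical result type mirroring the two visitor calls in the proposal.
#[derive(Debug, PartialEq)]
enum Decoded<'de, 'buf> {
    /// No escape sequences: borrowed straight from the input
    /// (the case that would call `visit_borrowed_str`).
    Borrowed(&'de str),
    /// Escape sequences were decoded into the provided buffer
    /// (the case that would call `visit_str`).
    Copied(&'buf str),
}

/// Hypothetical helper: `raw` is the string body between the quotes,
/// `scratch` is the caller-provided decoding buffer.
fn decode_string<'de, 'buf>(
    raw: &'de str,
    scratch: &'buf mut [u8],
) -> Option<Decoded<'de, 'buf>> {
    // Fast path: nothing to decode, so the input can be borrowed as-is.
    if !raw.contains('\\') {
        return Some(Decoded::Borrowed(raw));
    }
    let mut len = 0;
    let mut bytes = raw.bytes();
    while let Some(byte) = bytes.next() {
        let decoded = if byte == b'\\' {
            match bytes.next()? {
                b'"' => b'"',
                b'\\' => b'\\',
                b'/' => b'/',
                b'b' => 0x08,
                b'f' => 0x0C,
                b'n' => b'\n',
                b'r' => b'\r',
                b't' => b'\t',
                // Assumption of this sketch: \uXXXX is not handled.
                _ => return None,
            }
        } else {
            byte
        };
        // Fails with `None` if the scratch buffer is too small.
        *scratch.get_mut(len)? = decoded;
        len += 1;
    }
    core::str::from_utf8(&scratch[..len]).ok().map(Decoded::Copied)
}
```

With this shape, a type that holds a `&'de str` can only be filled from the `Borrowed` case, while owned or fixed-capacity string types can accept either.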

I believe that this design will also solve #74

sammhicks avatar Apr 13 '24 11:04 sammhicks

That definitely sounds like a nice approach to me! It may be useful to try and look at some cases where string escaping is useful to see if this design is onerous for the end user. Do you have a sample use case in mind? That may be helpful in figuring out if this is a good approach.

ryan-summers avatar Apr 13 '24 20:04 ryan-summers

I've written a bare-metal HTTP server framework, and would like it to be able to deserialize JSON-encoded POST bodies into arbitrary data structures. For soundness and separation of concerns, I would like all decoding and unescaping to happen before the data is passed to the request handler.

sammhicks avatar Apr 14 '24 10:04 sammhicks

Is this open-sourced and/or something I could look at to get an idea of how you're interested in using it? What I'm really trying to figure out is what this API would look like in a real-world example with actual code.

ryan-summers avatar Apr 15 '24 08:04 ryan-summers

In case the message got lost in the post (hurrah for email), I've closed this pull request and opened a new one at https://github.com/rust-embedded-community/serde-json-core/pull/83 with my new design.

sammhicks avatar Jun 06 '24 16:06 sammhicks