jsonc-parser
JSONC parser fails to correctly parse non-BMP escape sequences
In accordance with RFC 8259 § 7, the non-BMP character 𝄞 (U+1D11E) is escaped as the surrogate pair \uD834\uDD1E. Therefore, I expect the following Rust code to compile and run successfully:
use jsonc_parser::JsonValue;
use jsonc_parser::parse_to_value;

fn main() {
    let src = r#""\uD834\uDD1E""#;
    let v = parse_to_value(src, &Default::default()).unwrap().unwrap();
    if let JsonValue::String(s) = v {
        assert_eq!("\u{1D11E}", s);
    } else {
        panic!();
    }
}
However, on the latest version of jsonc-parser (as of writing, this is version 0.21.0), this code panics at the unwrap on line 6 with the message "Invalid unicode escape sequence. 'D834' is not a valid UTF8 character".
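For reference, here is a minimal sketch of the decoding step I would expect for such a pair. The helper name `combine_surrogates` is hypothetical and is not part of jsonc-parser's API; it only illustrates the UTF-16 arithmetic that turns the two escaped code units into a single character, under the assumption that the parser has already read both `\uXXXX` escapes as `u16` values.

```rust
/// Hypothetical helper (not from jsonc-parser): combine a UTF-16 surrogate
/// pair into one Unicode scalar value.
/// Returns `None` if `high` is not a high surrogate or `low` is not a low surrogate.
fn combine_surrogates(high: u16, low: u16) -> Option<char> {
    // High surrogates occupy D800..=DBFF, low surrogates DC00..=DFFF.
    if !(0xD800..=0xDBFF).contains(&high) || !(0xDC00..=0xDFFF).contains(&low) {
        return None;
    }
    // Each surrogate carries 10 bits of the code point, offset by 0x10000.
    let code_point =
        0x10000 + (((high as u32) - 0xD800) << 10) + ((low as u32) - 0xDC00);
    char::from_u32(code_point)
}
```

With this sketch, `combine_surrogates(0xD834, 0xDD1E)` yields `Some('\u{1D11E}')`, whereas a lone `0xD834` is not a valid `char` on its own, which seems consistent with the error message above. The standard library's `char::decode_utf16` performs the same pairing over an iterator of `u16` units.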
I'm not entirely sure, but this recently merged RFC might be relevant. Ron adopted it in their v0.9 release, instead of base64, to properly support roundtripping byte strings. serde_json didn't have the issue, though.