Unneeded character escapes in multiline strings
Hello, I'm trying to write and load a string containing newlines and tabs with toml, and I'm seeing some strange behavior with carriage return (CR) and horizontal tab (HT), though this might be the intended behavior.
Here's a small reproduction:
// [dependencies]
// serde = { version = "1.0.204", features = ["derive"] }
// toml = "0.8.14"

#[derive(Debug, serde::Serialize, serde::Deserialize)]
struct Simple {
    message: String,
}

fn main() {
    let simple = Simple {
        message: "\tHello\r\nWorld!".into(),
    };

    // Serialize, print the TOML, then parse it back and compare.
    let output = toml::to_string(&simple).unwrap();
    println!("{output}");

    let round_tripped: Simple = toml::from_str(&output).unwrap();
    assert_eq!(simple.message, round_tripped.message);
}
I would expect (or prefer) the output to be:
message = """
	Hello
World!"""
Instead, the output is:
message = """
\tHello\r
World!"""
CR and HT are escaped.
According to the TOML spec, for multiline basic strings:
Any Unicode character may be used except those that must be escaped: backslash and the control characters other than tab, line feed, and carriage return (U+0000 to U+0008, U+000B, U+000C, U+000E to U+001F, U+007F)
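To make the quoted rule concrete, here is a minimal sketch of it as a predicate. The function name `must_escape` is made up for illustration and is not part of the toml crate; it only encodes the character ranges listed in the quote and ignores the separate rule about runs of quotation marks.

```rust
/// Hypothetical helper, not part of the `toml` crate.
/// Returns true if `c` must be escaped inside a multiline basic string,
/// according to the ranges quoted from the spec above.
fn must_escape(c: char) -> bool {
    matches!(
        c,
        '\\'                            // backslash
            | '\u{0000}'..='\u{0008}'   // control characters before tab
            | '\u{000B}'                // vertical tab
            | '\u{000C}'                // form feed
            | '\u{000E}'..='\u{001F}'   // control characters after CR
            | '\u{007F}'                // delete
    )
}
```

Under this reading, `must_escape('\t')` and `must_escape('\r')` are both false, while `must_escape('\\')` is true.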
It looks like this library escapes some characters that don't strictly need to be escaped. I request that at least CR be written unescaped, because the library currently splits CRLF pairs: the CR is escaped while the LF is written literally. This makes interaction with line-ending normalizers, such as git, very messy.
I would also prefer tabs to be written unescaped. I am using tabs to format the interior content of a multiline string, and the formatting is lost (at least visually) when passed through this library.
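For illustration, this is roughly the output I have in mind, written as a standalone sketch rather than a patch against the library. `to_multiline_basic` is a hypothetical name, not an existing toml API. The sketch assumes a CR should stay literal only when it is immediately followed by LF, and it escapes every quotation mark for simplicity even though only runs of three or more need it. One caveat I am aware of: the spec lets parsers normalize newlines, so a literal CRLF may not round-trip in every implementation.

```rust
/// Hypothetical helper, not part of the `toml` crate.
/// Formats `value` as a TOML multiline basic string, leaving tab, LF,
/// and CRLF unescaped and escaping everything else that needs it.
fn to_multiline_basic(value: &str) -> String {
    // The newline right after the opening delimiter is trimmed by parsers.
    let mut out = String::from("\"\"\"\n");
    let mut chars = value.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => out.push_str("\\\\"),
            // Escaping every quote is valid and avoids closing the string early.
            '"' => out.push_str("\\\""),
            '\t' | '\n' => out.push(c),
            // Keep CR literal only as part of a CRLF pair.
            '\r' if chars.peek() == Some(&'\n') => out.push('\r'),
            '\r' => out.push_str("\\r"),
            // Any other control character becomes a \uXXXX escape.
            c if c.is_control() => out.push_str(&format!("\\u{:04X}", c as u32)),
            c => out.push(c),
        }
    }
    out.push_str("\"\"\"");
    out
}
```

Calling `to_multiline_basic("\tHello\r\nWorld!")` keeps the tab and the CRLF as literal characters between the delimiters, which is the behavior requested above.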