Add a way to customize the literal-ness (single/double quotes) of a string
Hi everyone (and epage in particular), thanks for all your hard work on toml! Really appreciating all the work y'all have done integrating toml_edit and toml.
For guppy, I tried upgrading toml to 0.7 and sadly ran into a deal-breaking issue. Guppy currently produces output that has single quotes in it, e.g. this file. When I migrate to toml 0.7, the output becomes double-quoted, as seen in https://gist.github.com/sunshowers/0ed4ab2143ea1efa8afb10f1a500e61e#file-gistfile0-txt-L4-L28.
The output format cannot be changed for backwards compatibility reasons -- these files are checked into repositories and one of the promises made by hakari is to not change how checked-in files are recorded.
So it seems like what I would need is a way to change how strings are recorded so that they're presented as single quotes.
I spent some time looking around and it looks like while this is supported internally, there's no API to access this. Value::String is a Formatted<String>, which doesn't seem to be high-enough fidelity. It seems like what we would really need is a StringWithQuotes or similar, which would track both the string as well as the quoting style.
What do you think? Without this, or some other solution, I'll have to be stuck at toml 0.5 forever. This would be a shame :(
Huh, looks like this is the first time someone has requested customizing strings.
If backwards compatibility on formatting is important for you, I would recommend auditing any auto-formatting for it, because we do not guarantee compatibility on it.
> I spent some time looking around and it looks like while this is supported internally, there's no API to access this. Value::String is a Formatted<String>, which doesn't seem to be high-enough fidelity. It seems like what we would really need is a StringWithQuotes or similar, which would track both the string as well as the quoting style.
The way this would be handled is:
impl Formatted<String> {
    pub fn style(&mut self, ...);
}
which would overwrite the Repr stored inside the Formatted to preserve the information.
I'm fine with doing this, we just need to work out the details of the API and make sure the string formatting is hardened enough to support this. By keeping it internal, I'm less likely to run into corner cases like someone over-specifying a string style that cannot be used.
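To make the "over-specified style" corner case concrete, here is a minimal stdlib-only sketch of the StringWithQuotes idea. All names in it (QuoteStyle, UnsupportedStyle, fits_literal) are hypothetical and not part of toml or toml_edit; it only assumes the TOML rule that a single-line literal string cannot contain `'` or control characters other than tab.

```rust
/// Quoting styles for a single-line TOML string (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum QuoteStyle {
    /// "basic" string: double quotes, backslash escapes.
    Basic,
    /// 'literal' string: single quotes, no escapes at all.
    Literal,
}

/// Returned when the requested style cannot represent the value.
#[derive(Debug, PartialEq, Eq)]
pub struct UnsupportedStyle;

/// A string together with the quoting style it should be written in.
#[derive(Debug, PartialEq, Eq)]
pub struct StringWithQuotes {
    value: String,
    style: QuoteStyle,
}

impl StringWithQuotes {
    /// Rejects a style at construction time if the value cannot be
    /// written that way, so an impossible style is never stored.
    pub fn new(value: &str, style: QuoteStyle) -> Result<Self, UnsupportedStyle> {
        if style == QuoteStyle::Literal && !fits_literal(value) {
            return Err(UnsupportedStyle);
        }
        Ok(Self { value: value.to_owned(), style })
    }

    pub fn value(&self) -> &str {
        &self.value
    }

    pub fn style(&self) -> QuoteStyle {
        self.style
    }
}

/// A single-line literal string has no escapes, so it cannot contain
/// `'` or any control character other than tab.
fn fits_literal(s: &str) -> bool {
    s.chars().all(|c| {
        c != '\'' && !matches!(c, '\u{0}'..='\u{8}' | '\u{A}'..='\u{1F}' | '\u{7F}')
    })
}
```

With this shape, `StringWithQuotes::new(r"C:\Users\me", QuoteStyle::Literal)` succeeds, while `StringWithQuotes::new("it's", QuoteStyle::Literal)` returns `Err(UnsupportedStyle)`. Whether a real API should return an error or silently fall back to a basic string is exactly the kind of detail that would need working out.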
> If backwards compatibility on formatting is important for you, I would recommend auditing any auto-formatting for it, because we do not guarantee compatibility on it.
Absolutely, we have a number of tests which catch issues related to backwards compatibility (not just formatting, but also ordering of table entries, and so on). Those tests are how I found out about this issue.
#781 is having a similar discussion but for integers.
(Definitely something I still want, but toml 0.5 has been trucking along so it hasn't been a huge priority for me)
As a really terrible workaround, you can pass the string you want to toml_write::TomlKeyBuilder or toml_write::TomlStringBuilder to create a custom TOML representation and then parse it as a Key or a Value and overwrite the related value within toml_edit.
I need something like this as well
Another reason for supporting this is that literal strings are necessary to represent Windows paths. The specific issue is when a config struct contains:
use std::path::PathBuf;
use serde::{Deserialize, Serialize};

#[derive(Clone, Debug, Default, Deserialize, Serialize)]
struct MyConfig {
    some_file: PathBuf,
}
The PathBuf gets serialized by toml as a basic string, and without escapes. On Windows this produces TOML like:
some_file = "C:\Users\me\AppData\Local\Temp\.tmpLFS8VU\some_file.txt"
As you can see from the red in the syntax highlighting, this encoding produced by the toml crate cannot be parsed back by the toml crate, because \U is interpreted as the start of a TOML Unicode escape sequence. If there were some way to either tell toml that this should be a literal string, or have toml (or serde?) detect that a PathBuf needs to be handled this way, then the write would produce:
some_file = 'C:\Users\me\AppData\Local\Temp\.tmpLFS8VU\some_file.txt'
which should parse fine.
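For reference, a basic (double-quoted) string can also hold this path, but only if every backslash is doubled. The following is a minimal stdlib-only sketch of that escaping; `to_basic_string` is a hypothetical helper for illustration, not a function from the toml crate.

```rust
/// Writes `s` as a single-line TOML basic string, escaping the characters
/// that may not appear raw between double quotes.
/// (Hypothetical helper, not part of the toml crate.)
pub fn to_basic_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            // A lone backslash would start an escape sequence such as \U.
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 || c == '\u{7F}' => {
                out.push_str(&format!("\\u{:04X}", c as u32));
            }
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

Calling `to_basic_string(r"C:\Users\me")` yields `"C:\\Users\\me"`, which is valid TOML but noticeably harder to read and hand-edit than the single-quoted form.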
toml producing an invalid string would be a bug.
I'm unsure which version the playground uses, but
use std::path::PathBuf;

#[derive(Clone, Debug, Default, serde::Deserialize, serde::Serialize)]
struct MyConfig {
    some_file: PathBuf,
}

fn main() {
    let config = MyConfig {
        some_file: PathBuf::from(r#"C:\Users\me\AppData\Local\Temp\.tmpLFS8VU\some_file.txt"#),
    };
    let toml = toml::to_string(&config).unwrap();
    println!("{toml}");
}
produced:
some_file = 'C:\Users\me\AppData\Local\Temp\.tmpLFS8VU\some_file.txt'
If you have a reproduction case, please open an issue.
I spent the last hour trying and failing to figure out why I wasn't able to reproduce the issue I could clearly see in my main code, and finally figured out that the literal-string logic is not doing any detection of whether or not it's a Path or PathBuf, but instead whether or not a string contains a \. My main code has a default value that is just a filename in the same folder and has no path separators, so it was reliably getting printed with double quotes. And the one test where I had a Windows problem turns out to be because that default value's output (in an example config file) was copied and modified outside of toml by test code to include a Windows path.
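The behavior I observed can be approximated with the sketch below. This is my guess at the heuristic based on the output alone, not the toml crate's actual implementation, and `Quote` and `pick_quote` are hypothetical names.

```rust
/// Which quotes a writer would pick for a string (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Quote {
    Basic,
    Literal,
}

/// Approximates the observed behavior: single quotes are chosen only when
/// the value contains a backslash and can legally be a literal string.
/// The type of the Rust field (String vs. PathBuf) plays no role.
pub fn pick_quote(s: &str) -> Quote {
    // Literal strings have no escapes, so `'` and control characters
    // other than tab cannot be represented in them.
    let representable = s.chars().all(|c| {
        c != '\'' && !matches!(c, '\u{0}'..='\u{8}' | '\u{A}'..='\u{1F}' | '\u{7F}')
    });
    if s.contains('\\') && representable {
        Quote::Literal
    } else {
        Quote::Basic
    }
}
```

Under this sketch, `pick_quote("some_file.txt")` is `Quote::Basic` while `pick_quote(r"C:\Temp\some_file.txt")` is `Quote::Literal`, which matches what I saw: the same field switches quoting depending only on its content.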
So my original comment turns out to be irrelevant to this issue, but instead raises another motivation: it is confusing to users if a field sometimes uses double quotes (especially in examples) and sometimes needs to use single quotes. If I could customise this, I could ensure that all of the PathBufs in my example config are written using single quotes even when they are not "needed", so that when a user copies the example and modifies it, the result will do the right thing.