Quotes from doc comments are unnecessarily escaped
I literally copied godot/tests/docs.rs into a new project and got this:
The quotes are "double"-escaped because of Rust string literals:
https://github.com/godot-rust/gdext/blob/90d8064202f51d3c1a248868317ac0d3676d8e9c/godot-macros/src/docs.rs#L113-L123
Can probably be fixed with litrs:
.flat_map(|doc| {
doc.into_iter().map(|x| {
litrs::StringLit::parse(x.to_string()).unwrap().value().to_string()
})
})
but not sure adding a new dependency is a good idea
The discussion about litrs has come up before, not only for registered docs, but also for proc-macro attribute parsing. It's quite a lightweight crate, so if it helps simplify our code in many places, we can maybe consider it? :thinking:
What didn't convince me last time was the suggestion to create extra intermediate representations for input literals -- if at all, litr types should just be used at the parsing stage, and then converted to canonical Rust ones (String, int, etc).
Maybe we should check more concretely which other places could benefit from it?
I believe you linked wrong snippet: https://github.com/godot-rust/gdext/blob/90d8064202f51d3c1a248868317ac0d3676d8e9c/godot-macros/src/docs.rs#L139. The one you linked is crudely filtered (with .filter(|x| x.get_single_path_segment().is_some_and(|x| x == "doc"))) to handle inline attributes such as:
#[doc = r#"this is very documented"#]
#[var]
item: f32,
@Yarwin you've probably missed the comment before this snippet:
https://github.com/godot-rust/gdext/blob/90d8064202f51d3c1a248868317ac0d3676d8e9c/godot-macros/src/docs.rs#L102-L102
The doc comments are expanded to regular, non-raw strings. This means that quotes inside these doc strings are always escaped with \ (not "), and the linked snippet handles them, so it is correct and your example is not relevant here.
Also, note that some other characters are also escaped. For example, try using a tab inside a doc string:
/// a b
You'll get a\tb in Godot. This clearly demonstrates that xml_escape is absolutely unrelated here
You are right, sorry for doubting you! – it was a bit late. Alternatively, we can just convert all the escaped characters to non escaped ones :upside_down_face:. It works albeit is a bit crude and unelegant. (very naive ofc) PoC:
pub fn unescape(escaped: &str) -> String {
let mut unescaped_string = String::new();
let mut iterator = escaped.chars();
while let Some(c) = iterator.next() {
match c {
'\\' => match iterator.next() {
Some('t') => {
unescaped_string.push_str(r#" "#)
},
Some('"') => {
unescaped_string.push('"')
},
Some(other) => {
unescaped_string.push(c);
unescaped_string.push(other);
}
None => unescaped_string.push(c),
},
_ => unescaped_string.push(c)
}
}
unescaped_string
}
(…)
unescape(&x.to_string()
.trim_start_matches('r')
.trim_start_matches('#')
.trim_start_matches('"')
.trim_end_matches('#')
.trim_end_matches('"'))
})
(obviously replace(r#"\""#, r#"""#) replace(r#"\t"#, r#" "#) works too, albeit is a bit slower)
Or we try with litrs 😉
we can just convert all the escaped characters to non escaped ones
This is what I originally planned to do, but then I opened The Rust Reference and got #[doc = "sc\x61r\u{0065}d"] 🙃
marking as good first issue since it should be pretty straightforward if litrs just works, if litrs doesn't help then we can remove good first issue from this