Unsoundness in get_nodes
Hello, thank you for your contributions to this project. I am testing our static analysis tool on Rust projects hosted on GitHub, and I noticed the following code:
fn get_nodes(src: &str, path: Rc<Path>) -> MResult<Vec<SNode<'static>>, PError> {
let src = unsafe { mem::transmute::<&str, &'static str>(src) };
let nodes = parse(path, src.trim_end())?;
Ok(nodes)
}
The unsoundness is in get_nodes, which uses unsafe { mem::transmute::<&str, &'static str>(src) } to extend the lifetime of a string reference to 'static. If the original string is dropped or replaced while the resulting 'static references are still in use, those references dangle. A valid call path to this function: pub fn resolve_imports -> fn add_nodes -> fn get_nodes_from_path -> fn get_nodes
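For comparison, here is a minimal sketch of a signature that stays sound by tying the nodes to the input's lifetime instead of 'static. Path, PError, SNode and parse are simplified stand-ins assumed for illustration, not the crate's real definitions.

```rust
use std::rc::Rc;

// Hypothetical stand-ins for the crate's real types, for illustration only.
pub struct Path;
pub type PError = String;

#[derive(Debug, PartialEq)]
pub struct SNode<'a> {
    pub content: &'a str,
}

// Stand-in parser: the node borrows from `src` instead of copying it.
fn parse<'a>(_path: Rc<Path>, src: &'a str) -> Result<Vec<SNode<'a>>, PError> {
    Ok(vec![SNode { content: src }])
}

// The returned nodes borrow from `src` for exactly `'a`, so no `unsafe` is
// needed and the borrow checker rejects any use of the nodes after the
// source string has been dropped or overwritten.
pub fn get_nodes<'a>(src: &'a str, path: Rc<Path>) -> Result<Vec<SNode<'a>>, PError> {
    parse(path, src.trim_end())
}
```

With this signature, a caller that overwrites or drops the source string while the nodes are still alive fails to compile instead of reading freed memory. The cost is that the caller has to keep the source string alive for as long as it uses the nodes.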
POC
use std::mem;
use std::rc::Rc;
// Mock the types used in your code
struct Path;
type MResult<T, E> = Result<T, E>;
type PError = String;
#[derive(Debug)]
struct SNode<'a> {
content: &'a str
}
// The original problematic function (the one we want to show is unsound)
fn get_nodes(src: &str, path: Rc<Path>) -> MResult<Vec<SNode<'static>>, PError> {
let src = unsafe { mem::transmute::<&str, &'static str>(src) };
let nodes = parse(path, src)?;
Ok(nodes)
}
// Helper function
fn parse<'a>(_path: Rc<Path>, src: &'a str) -> MResult<Vec<SNode<'a>>, PError> {
Ok(vec![SNode { content: src }])
}
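// Demonstration using only safe code in the caller
fn main() {
println!("Starting test...");
// Create a vector that owns several strings
let mut strings = Vec::new();
// Add the first string
strings.push(String::from("string 1"));
// Create nodes from the first string
let nodes = {
let first_string = &strings[0];
get_nodes(first_string, Rc::new(Path)).unwrap()
};
// Print the initial content
println!("Initial node content: {}", nodes[0].content);
// Replace the first string, which drops the original allocation
strings[0] = String::from("modified string");
// If get_nodes were sound, nodes[0].content would still refer to the old string.
// Because the transmute in get_nodes created a dangling reference, this prints garbage or crashes.
println!("Current node content: {}", nodes[0].content);
// Another approach: push new strings to force fresh allocations
for i in 0..10 {
strings.push(format!("new string {}", i));
}
// Access the node content again.
// This is even more likely to misbehave, because the original memory may have been reused.
println!("Final node content: {}", nodes[0].content);
// If the program finishes normally, print a message
println!("Test finished - if you see this message with no crash or garbage output above, we got lucky");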
}
Run with Miri:
error: Undefined Behavior: constructing invalid value: encountered a dangling reference (use-after-free)
--> C:\Users\ROG\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\fmt\mod.rs:2403:1
|
2403 | fmt_refs! { Debug, Display, Octal, Binary, LowerHex, UpperHex, LowerExp, UpperExp }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value: encountered a dangling reference (use-after-free)
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE:
= note: inside `<&str as std::fmt::Display>::fmt` at C:\Users\ROG\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\fmt\mod.rs:2393:71: 2393:78
= note: inside `core::fmt::rt::Argument::<'_>::fmt` at C:\Users\ROG\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\fmt\rt.rs:177:76: 177:95
= note: inside `std::fmt::write` at C:\Users\ROG\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\fmt\mod.rs:1189:21: 1189:44
= note: inside `<std::io::StdoutLock<'_> as std::io::Write>::write_fmt` at C:\Users\ROG\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\std\src\io\mod.rs:1884:15: 1884:43
Yes, it does look unsound, but the owned and borrowed values always form a single pair, (String, Vec<SNode<'static>>):
https://github.com/zzau13/yarte/blob/316f579d31f0dfa931b0a0f60853845352a2ba75/yarte_hir/src/imports.rs#L18-L24
https://github.com/zzau13/yarte/blob/316f579d31f0dfa931b0a0f60853845352a2ba75/yarte_hir/src/imports.rs#L45-L57
It's a small hack for the zero-copy parser. The pair always travels together, so the lifetime is the same for both.
In other words, the use-after-free never happens.
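As a rough sketch of how that pairing invariant could be enforced by a type rather than by convention: the wrapper below owns the String and the nodes together and only hands the nodes out with a lifetime tied to the wrapper. The names Parsed, parse and this SNode are hypothetical stand-ins, not what yarte does today.

```rust
// Hypothetical stand-in for the parser's node type.
#[derive(Debug)]
pub struct SNode<'a> {
    pub content: &'a str,
}

// Stand-in zero-copy parser: one node per non-empty line.
fn parse<'a>(src: &'a str) -> Vec<SNode<'a>> {
    src.lines()
        .filter(|line| !line.is_empty())
        .map(|content| SNode { content })
        .collect()
}

/// Owns the source text together with the nodes that borrow from it.
pub struct Parsed {
    // Declared first so it is dropped before `src`
    // (struct fields are dropped in declaration order).
    nodes: Vec<SNode<'static>>,
    // Never mutated or replaced after construction. Its heap buffer stays
    // at the same address even when `Parsed` itself is moved.
    src: String,
}

impl Parsed {
    pub fn new(src: String) -> Self {
        // SAFETY: `text` points into the heap buffer of `src`, which is stored
        // next to the nodes, is never mutated, and is dropped after them.
        // The `'static` lifetime never escapes this type: `nodes()` shortens
        // it to the borrow of `self`.
        let text: &'static str = unsafe { &*(src.as_str() as *const str) };
        let nodes = parse(text);
        Parsed { nodes, src }
    }

    /// Nodes are only visible for as long as `self` is borrowed.
    pub fn nodes<'a>(&'a self) -> &'a [SNode<'a>] {
        self.nodes.as_slice()
    }

    /// The original source text.
    pub fn source(&self) -> &str {
        &self.src
    }
}
```

Because both fields are private and the 'static lifetime is shortened at the API boundary, safe callers cannot separate the nodes from the string they point into. The unsafe block is still there, but it is confined to one constructor with its invariant written down.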
The crate needs a deep refactoring anyway: inheritance needs to be managed differently, and the parser needs to be rewritten properly on top of a parsing library. v_escape, the library used for escaping, is being refactored (https://github.com/zzau13/v_escape/pull/150), and yarte will have to adapt to the new behavior. I would also like to implement a virtual machine so that it can work at runtime.
Thanks for reviewing my code.