tree-sitter
tree-sitter copied to clipboard
probably UB: `QueryMatches::next` in Rust library clobbers previously returned values.
Consider test.c
:
void
do_stuff()
{
int x = 42;
char const *y = "Hello, world!";
float z = 3.14159;
}
and main.rs
:
use tree_sitter::{Parser, Query, QueryCursor};
fn main() {
let mut parser = Parser::new();
let l = tree_sitter_c::language();
parser.set_language(l).unwrap();
let contents = std::fs::read("/home/brennan/test.c").unwrap();
let tree = parser.parse(&contents, None).unwrap();
let q = Query::new(l, "(declaration type: (primitive_type) @t)").unwrap();
let mut qc = QueryCursor::new();
let matches: Vec<_> = qc
.matches(&q, tree.root_node(), contents.as_slice())
.collect();
for m in matches {
println!(
"{}",
m.captures[0].node.utf8_text(contents.as_slice()).unwrap()
);
}
}
This prints:
float
float
float
rather than the expected
int
char
float
The issue is that each call to QueryMatches::next
constructs a QueryMatch
object that points to data returned by ts_query_cursor_next_match
. However, this data references buffers that become invalidated on each new call to ts_query_cursor_next_match
. So it's in general not okay to keep the old QueryMatch
objects alive when constructing new ones.
As an aside / tangentially related issue: the C functions should be documented to clearly explain the lifetime of validity of the data they return.
Closing as a duplicate of #2843.