tree-sitter
tree-sitter copied to clipboard
Is `node_has_error()` supposed to return `true` on terminal error nodes?
Problem
Consider the following example using the R bindings to tree-sitter (I can try and make a Rust tree-sitter test for this if needed)
I was expecting node_has_error() to return true on the inner terminal ERROR node produced here.
According to the docs it should:
Check if the node is a syntax error or contains any syntax errors
library(treesitter)
text <- "1 + }"
parser <- parser(treesitter.r::language())
tree <- parser_parse(parser, text)
root <- tree_root_node(tree)
root
#> <tree_sitter_node>
#>
#> ── Text ────────────────────────────────────────────────────────────────────────
#> 1 + }
#>
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 5)]
#> (float [(0, 0), (0, 1)])
#> (ERROR [(0, 2), (0, 5)]
#> "+" [(0, 2), (0, 3)]
#> (ERROR [(0, 4), (0, 5)])
#> )
#> )
outer_error <- root |>
node_child(2)
outer_error
#> <tree_sitter_node>
#>
#> ── Text ────────────────────────────────────────────────────────────────────────
#> + }
#>
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (ERROR [(0, 2), (0, 5)]
#> "+" [(0, 2), (0, 3)]
#> (ERROR [(0, 4), (0, 5)])
#> )
# These make sense
node_is_error(outer_error)
#> [1] TRUE
node_has_error(outer_error)
#> [1] TRUE
inner_error <- outer_error |>
node_child(2)
inner_error
#> <tree_sitter_node>
#>
#> ── Text ────────────────────────────────────────────────────────────────────────
#> }
#>
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (ERROR [(0, 4), (0, 5)])
# According to the docs, `node_has_error()` should return `TRUE`?
node_is_error(inner_error)
#> [1] TRUE
node_has_error(inner_error)
#> [1] FALSE
Steps to reproduce
See above
Expected behavior
Return true anytime node_is_error() also returns true
Tree-sitter version (tree-sitter --version)
tree-sitter 0.23
Operating system/version
macOS
Should ts_node_has_error() basically do ts_node_is_error(self) || <current behavior>, or is something deeper wrong?
Or maybe the docs are wrong and ts_node_has_error() is really only supposed to return true if it has children that are errors?
https://github.com/tree-sitter/tree-sitter/blob/7e3f57265549f26f4fe3ac1ee8ee3b1c6ee182f4/lib/src/node.c#L493-L500
Interestingly it seems like it is working right if you call node_has_error() directly on a MISSING node (which should also return true IIUC)
library(treesitter)
text <- "{ 1 + 2"
parser <- parser(treesitter.r::language())
tree <- parser_parse(parser, text)
root <- tree_root_node(tree)
root
#> <tree_sitter_node>
#>
#> ── Text ────────────────────────────────────────────────────────────────────────
#> { 1 + 2
#>
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 7)]
#> (braced_expression [(0, 0), (0, 7)]
#> open: "{" [(0, 0), (0, 1)]
#> body: (binary_operator [(0, 2), (0, 7)]
#> lhs: (float [(0, 2), (0, 3)])
#> operator: "+" [(0, 4), (0, 5)]
#> rhs: (float [(0, 6), (0, 7)])
#> )
#> close: "}" MISSING [(0, 7), (0, 7)]
#> )
#> )
# Makes sense, there was a syntax error but it was "recovered" with the MISSING
node_has_error(root)
#> [1] TRUE
missing <- root |>
node_child(1) |>
node_child_by_field_name("close")
missing
#> <tree_sitter_node>
#>
#> ── Text ────────────────────────────────────────────────────────────────────────
#>
#>
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> "}" MISSING [(0, 7), (0, 7)]
# This seems right
node_has_error(missing)
#> [1] TRUE
Created on 2024-09-12 with reprex v2.0.2
@DavisVaughan is this still an issue? I tried on the master branch and wasn't able to reproduce it. In R the example 1 + } parses to (program (float) (ERROR (UNEXPECTED '}'))) (with the latest version of the R grammar and the ERROR has has_error = true (at least, from the Rust bindings).
It would be easy enough to patch the C library to add ts_node_is_error(self) || as you describe, but I'd like to have at least one example where it's necessary before making a PR for it…