tree-sitter icon indicating copy to clipboard operation
tree-sitter copied to clipboard

Is `node_has_error()` supposed to return `true` on terminal error nodes?

Open DavisVaughan opened this issue 1 year ago • 2 comments

Problem

Consider the following example using the R bindings to tree-sitter (I can try and make a Rust tree-sitter test for this if needed)

I was expecting node_has_error() to return true on the inner terminal ERROR node produced here.

According to the docs it should:

Check if the node is a syntax error or contains any syntax errors

library(treesitter)

text <- "1 + }"

parser <- parser(treesitter.r::language())

tree <- parser_parse(parser, text)
root <- tree_root_node(tree)

root
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> 1 + }
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 5)]
#>   (float [(0, 0), (0, 1)])
#>   (ERROR [(0, 2), (0, 5)]
#>     "+" [(0, 2), (0, 3)]
#>     (ERROR [(0, 4), (0, 5)])
#>   )
#> )

outer_error <- root |>
  node_child(2)

outer_error
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> + }
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (ERROR [(0, 2), (0, 5)]
#>   "+" [(0, 2), (0, 3)]
#>   (ERROR [(0, 4), (0, 5)])
#> )

# These make sense
node_is_error(outer_error)
#> [1] TRUE
node_has_error(outer_error)
#> [1] TRUE

inner_error <- outer_error |>
  node_child(2)

inner_error
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> }
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (ERROR [(0, 4), (0, 5)])

# According to the docs, `node_has_error()` should return `TRUE`?
node_is_error(inner_error)
#> [1] TRUE
node_has_error(inner_error)
#> [1] FALSE

Steps to reproduce

See above

Expected behavior

Return true anytime node_is_error() also returns true

Tree-sitter version (tree-sitter --version)

tree-sitter 0.23

Operating system/version

macOS

DavisVaughan avatar Sep 12 '24 18:09 DavisVaughan

Should ts_node_has_error() basically do ts_node_is_error(self) || <current behavior>, or is something deeper wrong?

Or maybe the docs are wrong and ts_node_has_error() is really only supposed to return true if it has children that are errors?

https://github.com/tree-sitter/tree-sitter/blob/7e3f57265549f26f4fe3ac1ee8ee3b1c6ee182f4/lib/src/node.c#L493-L500

DavisVaughan avatar Sep 12 '24 18:09 DavisVaughan

Interestingly it seems like it is working right if you call node_has_error() directly on a MISSING node (which should also return true IIUC)

library(treesitter)

text <- "{ 1 + 2"

parser <- parser(treesitter.r::language())

tree <- parser_parse(parser, text)
root <- tree_root_node(tree)

root
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> { 1 + 2
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> (program [(0, 0), (0, 7)]
#>   (braced_expression [(0, 0), (0, 7)]
#>     open: "{" [(0, 0), (0, 1)]
#>     body: (binary_operator [(0, 2), (0, 7)]
#>       lhs: (float [(0, 2), (0, 3)])
#>       operator: "+" [(0, 4), (0, 5)]
#>       rhs: (float [(0, 6), (0, 7)])
#>     )
#>     close: "}" MISSING [(0, 7), (0, 7)]
#>   )
#> )

# Makes sense, there was a syntax error but it was "recovered" with the MISSING
node_has_error(root)
#> [1] TRUE

missing <- root |>
  node_child(1) |>
  node_child_by_field_name("close")

missing
#> <tree_sitter_node>
#> 
#> ── Text ────────────────────────────────────────────────────────────────────────
#> 
#> 
#> ── S-Expression ────────────────────────────────────────────────────────────────
#> "}" MISSING [(0, 7), (0, 7)]

# This seems right
node_has_error(missing)
#> [1] TRUE

Created on 2024-09-12 with reprex v2.0.2

DavisVaughan avatar Sep 12 '24 18:09 DavisVaughan

@DavisVaughan is this still an issue? I tried on the master branch and wasn't able to reproduce it. In R the example 1 + } parses to (program (float) (ERROR (UNEXPECTED '}'))) (with the latest version of the R grammar and the ERROR has has_error = true (at least, from the Rust bindings).

It would be easy enough to patch the C library to add ts_node_is_error(self) || as you describe, but I'd like to have at least one example where it's necessary before making a PR for it…

wetneb avatar Oct 04 '25 15:10 wetneb