NextCapture returns inaccurate index
I have a query with multiple captures like this:
(import_statement
  source: (string (string_fragment) @deps)
)
(call_expression
  function: (import)
  arguments: (arguments
    (string (string_fragment) @dynamic-deps)
  )
)
[
  "async"
  "await"
  ; '...'
  (spread_element)
  ; import * as blah from
  (import_statement
    (import_clause
      (namespace_import)
    )
  )
] @need-tslib
Then I use it to query a .tsx file as follows:
for {
	cap, idx, ok := qc.NextCapture()
	if !ok {
		break
	}
	name := q.CaptureNameForId(idx)
	switch name {
	case "deps":
		for _, c := range cap.Captures {
			i := c.Node.Content(data)
			fmt.Println("DEBUG", fileName, name, i)
		}
	case "dynamic-deps":
		for _, c := range cap.Captures {
			i := c.Node.Content(data)
			fmt.Println("DEBUG", fileName, name, i)
		}
	case "need-tslib":
	default:
		log.Fatalf("Unexpected capture name %s", name)
	}
}
I expected to eventually get all 3 capture groups.
The actual result is:
DEBUG file.ts deps rxjs
DEBUG file.ts deps ../router/router
DEBUG file.ts deps ./user
DEBUG file.ts deps async
DEBUG file.ts deps await
DEBUG file.ts deps async
DEBUG file.ts deps async
Running the same query with tree-sitter-cli yields the expected result:
(v0.20.1) ~/work/misc/tree-sitter-typescript/tsx> tree-sitter -V
tree-sitter 0.20.7
(v0.20.1) ~/work/misc/tree-sitter-typescript/tsx> tree-sitter query typescript.scm file.ts
file.ts
  pattern: 0
    capture: 0 - deps, start: (0, 25), end: (0, 29), text: `rxjs`
  pattern: 0
    capture: 0 - deps, start: (8, 20), end: (8, 36), text: `../router/router`
  pattern: 0
    capture: 0 - deps, start: (9, 22), end: (9, 28), text: `./user`
  pattern: 2
    capture: 2 - need-tslib, start: (178, 2), end: (178, 7), text: `async`
  pattern: 2
    capture: 2 - need-tslib, start: (192, 6), end: (192, 11), text: `await`
  pattern: 2
    capture: 2 - need-tslib, start: (198, 2), end: (198, 7), text: `async`
  pattern: 2
    capture: 2 - need-tslib, start: (203, 2), end: (203, 7), text: `async`
I changed it to:
cap, _, ok := qc.NextCapture()
if !ok {
	break
}
name := q.CaptureNameForId(cap.Captures[0].Index)
This works, but it feels like a hack, especially the cap.Captures[0] part.
Am I wrong to expect NextCapture to return the index of the capture?
The NextCapture API returns the capture's index within the match, i.e. its position in the Captures slice (always zero in your case), not the capture's ID within the query. This is the same behaviour as the underlying C API. I've not used this before, but it looks like the intention is to allow iterating over multiple captures in order, when they might not be in order in the Captures slice.
For your use case I would just use NextMatch and read .Captures[0], as you are doing.
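To make the distinction concrete, here is a minimal sketch that does not call the library at all. The `capture` and `match` types, the `captureNames` table, and the `nameOf` helper are hypothetical stand-ins for QueryCapture, QueryMatch, and q.CaptureNameForId; they only model the two kinds of index discussed above.

```go
package main

import "fmt"

// capture and match are simplified, hypothetical stand-ins for the
// library's QueryCapture and QueryMatch types.
type capture struct {
	ID   uint32 // query-wide capture ID (the role of QueryCapture.Index)
	Text string
}

type match struct {
	Captures []capture
}

// captureNames plays the role of q.CaptureNameForId: it maps a
// query-wide capture ID to its name.
var captureNames = []string{"deps", "dynamic-deps", "need-tslib"}

// nameOf resolves the name of the capture at position idx within m.
// idx is a position in m.Captures, not a capture ID.
func nameOf(m match, idx uint32) string {
	return captureNames[m.Captures[idx].ID]
}

func main() {
	// A match of the third pattern with one capture: position 0, ID 2.
	m := match{Captures: []capture{{ID: 2, Text: "async"}}}
	idx := uint32(0)

	fmt.Println("wrong:", captureNames[idx]) // treats the position as an ID
	fmt.Println("right:", nameOf(m, idx))    // looks the ID up via the position
}
```

Under the same assumption, the equivalent lookup with the real cursor would be q.CaptureNameForId(cap.Captures[idx].Index), which generalizes the cap.Captures[0] workaround to matches with more than one capture.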
import (
	"context"
	"fmt"
	"testing"

	sitter "github.com/smacker/go-tree-sitter"
	"github.com/smacker/go-tree-sitter/javascript"
)

func TestTreeSitterQueries(t *testing.T) {
	code := []byte(`
function hello() {
	// comment line
	console.log('hello')
	if (true) { console.log('true') }
	return "value"
}
`)
	query := `
[
	"function"
	"if"
	"return"
] @keyword
(comment) @comment
`
	// Parse source code
	lang := javascript.GetLanguage()
	n, _ := sitter.ParseCtx(context.Background(), code, lang)
	// Execute the query
	q, _ := sitter.NewQuery([]byte(query), lang)
	qc := sitter.NewQueryCursor()
	qc.Exec(q, n)
	for {
		m, ok := qc.NextMatch()
		if !ok {
			break
		}
		m = qc.FilterPredicates(m, code)
		for _, c := range m.Captures {
			name := q.CaptureNameForId(c.Index)
			content := c.Node.Content(code)
			fmt.Println(c.Node.StartPoint(), c.Node.EndPoint(), name, c.Node.Type(), content)
		}
	}
}
Output:
{1 0} {1 8} keyword function function
{2 1} {2 17} comment comment // comment line
{4 1} {4 3} keyword if if
{5 1} {5 7} keyword return return