
NextCapture returns inaccurate index

Open sluongng opened this issue 3 years ago • 3 comments

I have a query with multiple captures, like this:

(import_statement
	source: (string (string_fragment) @deps)
)

(call_expression
	function: (import)
	arguments: (arguments
		(string (string_fragment) @dynamic-deps)
	)
)

[
  "async"
  "await"
  ; '...'
  (spread_element)
  ; import * as blah from
  (import_statement
    (import_clause 
      (namespace_import)
    )
  )
] @need-tslib

Then I use it to query a .tsx file as follows:

		for {
			cap, idx, ok := qc.NextCapture()
			if !ok {
				break
			}

			name := q.CaptureNameForId(idx)
			switch name {
			case "deps":
				for _, c := range cap.Captures {
					i := c.Node.Content(data)
					fmt.Println("DEBUG", fileName, name, i)
				}
			case "dynamic-deps":
				for _, c := range cap.Captures {
					i := c.Node.Content(data)
					fmt.Println("DEBUG", fileName, name, i)
				}
			case "need-tslib":
			default:
				log.Fatalf("Unexpected capture name %s", name)
			}
		}

I expected to eventually get all 3 capture groups, each reported under its own name.

The actual result is:

DEBUG file.ts deps rxjs
DEBUG file.ts deps ../router/router
DEBUG file.ts deps ./user
DEBUG file.ts deps async
DEBUG file.ts deps await
DEBUG file.ts deps async
DEBUG file.ts deps async

Running the same query with tree-sitter-cli yields the expected result:

(v0.20.1) ~/work/misc/tree-sitter-typescript/tsx> tree-sitter -V
tree-sitter 0.20.7

(v0.20.1) ~/work/misc/tree-sitter-typescript/tsx> tree-sitter query typescript.scm file.ts
file.ts
  pattern: 0
    capture: 0 - deps, start: (0, 25), end: (0, 29), text: `rxjs`
  pattern: 0
    capture: 0 - deps, start: (8, 20), end: (8, 36), text: `../router/router`
  pattern: 0
    capture: 0 - deps, start: (9, 22), end: (9, 28), text: `./user`
  pattern: 2
    capture: 2 - need-tslib, start: (178, 2), end: (178, 7), text: `async`
  pattern: 2
    capture: 2 - need-tslib, start: (192, 6), end: (192, 11), text: `await`
  pattern: 2
    capture: 2 - need-tslib, start: (198, 2), end: (198, 7), text: `async`
  pattern: 2
    capture: 2 - need-tslib, start: (203, 2), end: (203, 7), text: `async`

sluongng avatar Mar 10 '23 19:03 sluongng

I changed it to:

			cap, _, ok := qc.NextCapture()
			if !ok {
				break
			}

			name := q.CaptureNameForId(cap.Captures[0].Index)

which works, but it feels like a hack, especially the cap.Captures[0] part.

Am I wrong to expect NextCapture to return the index of the capture?

sluongng avatar Mar 10 '23 19:03 sluongng

The NextCapture API returns the index within the match (always zero in your case), not within the query. This is the same behaviour as the underlying C API. I've not used this before, but it looks like the intention is to allow iterating over multiple captures in order, when they might not be in order in the Captures slice.

For your use case I would just use NextMatch and use .Captures[0] as you are doing.
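
To make the two kinds of index concrete, here is a minimal stdlib-only sketch. The `Capture` and `Match` types are hypothetical stand-ins that only mimic the shape of the library's capture and match values, and `captureNames` is an assumed name table matching the capture ids shown in the CLI output above (0 = deps, 2 = need-tslib). It is not the library's code, just a model of why the original loop printed "deps" for every capture.

```go
package main

import "fmt"

// Capture is a hypothetical stand-in for a query capture.
// Index is the capture's id within the whole query.
type Capture struct {
	Index uint32
	Text  string
}

// Match is a hypothetical stand-in for a query match.
type Match struct {
	PatternIndex uint16
	Captures     []Capture
}

// captureNames is an assumed name table for the query in this issue,
// using the capture ids printed by the CLI.
var captureNames = []string{"deps", "dynamic-deps", "need-tslib"}

// wrongName treats the position within the match as a query-wide capture id.
func wrongName(idxInMatch uint32) string {
	return captureNames[idxInMatch]
}

// rightName selects the capture inside the match first,
// then looks up the name by that capture's query-wide Index.
func rightName(m Match, idxInMatch uint32) string {
	return captureNames[m.Captures[idxInMatch].Index]
}

func main() {
	// A match of the third pattern with a single capture,
	// so the capture's position within the match is 0.
	m := Match{PatternIndex: 2, Captures: []Capture{{Index: 2, Text: "async"}}}

	fmt.Println("wrong:", wrongName(0), m.Captures[0].Text)    // wrong: deps async
	fmt.Println("right:", rightName(m, 0), m.Captures[0].Text) // right: need-tslib async
}
```

Assuming the real cursor behaves as described above, the equivalent lookup with NextCapture would be to index into the returned match with the returned position and pass that capture's Index to CaptureNameForId, rather than passing the position itself.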

didroe avatar Jul 25 '23 09:07 didroe

// The package name below is a placeholder; use the package of your own test file.
package example

import (
	"context"
	"fmt"
	"testing"

	sitter "github.com/smacker/go-tree-sitter"
	"github.com/smacker/go-tree-sitter/javascript"
)

func TestTreeSitterQueries(t *testing.T) {
	code := []byte(`
function hello() { 
	// comment line 
	console.log('hello') 
	if (true) { console.log('true') }
	return "value"
}
`)

	query := `
[
  "function"
  "if"
  "return"
] @keyword

(comment) @comment
`

	// Parse source code
	lang := javascript.GetLanguage()
	n, _ := sitter.ParseCtx(context.Background(), code, lang)

	// Execute the query
	q, _ := sitter.NewQuery([]byte(query), lang)
	qc := sitter.NewQueryCursor()
	qc.Exec(q, n)

	for {
		m, ok := qc.NextMatch()
		if !ok { break }
		m = qc.FilterPredicates(m, code)
		for _, c := range m.Captures {
			name := q.CaptureNameForId(c.Index)
			content := c.Node.Content(code)
			fmt.Println(c.Node.StartPoint(), c.Node.EndPoint(), name, c.Node.Type(), content)
		}
	}
}

Output:

{1 0} {1 8} keyword function function
{2 1} {2 17} comment comment // comment line 
{4 1} {4 3} keyword if if
{5 1} {5 7} keyword return return

vipmax avatar Aug 17 '23 11:08 vipmax