tree-sitter
Tree Sitter query from stdin instead of a file
It would be very useful to be able to use the Tree Sitter CLI query command on chunks of code provided on standard input.
~~It would also be useful to be able to manually select a language, instead of attempting to detect it automatically from the input file.~~ Done! See https://github.com/tree-sitter/tree-sitter/issues/1511#issuecomment-1040733058
The ability to manually select a language would also facilitate reading from stdin, because your only failure mode is "parse error", rather than accidentally detecting the wrong language.
I am not a Rust developer at all, but the code doesn't look that difficult to modify. I can try to put together a PR if the Tree Sitter authors are interested (or maybe this would be a "good first issue" for someone in general).
Apologies if this has been discussed elsewhere; I didn't see it in the issue tracker.
+1 to this.
I've tried to use tree-sitter-cli parse on stdin, either by passing /dev/stdin or - as the filename. The former "works", but the CLI tool believes the input is zero-length (perhaps it attempts to stat() the file before reading to determine its length?); the latter just tries to open a regular file called "-". Neither does what is required.
$ node_modules/tree-sitter-cli/tree-sitter parse /dev/stdin
print "Hello"
(source_file [0, 0] - [0, 0])
Workaround:
$ cat >tmpfile
print "Hello"
$ node_modules/tree-sitter-cli/tree-sitter parse tmpfile
(source_file [0, 0] - [1, 0]
  (ERROR [0, 0] - [0, 13]
    (call_expression [0, 0] - [0, 13]
      function_name: (identifier [0, 0] - [0, 5])
      args: (argument [0, 6] - [0, 13]
        (string_double_quoted [0, 6] - [0, 13])))))
tmpfile 0 ms (ERROR [0, 0] - [0, 13])
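In case it helps anyone, the temp-file workaround above can be wrapped in a small shell function. This is only a sketch: the name with_stdin_file is made up, it assumes mktemp is available, and it assumes the command accepts the file path as its last argument. Because the temp file has no extension, language detection may need help (for example, running it from inside the grammar directory as above).

```shell
# Hypothetical helper (not part of tree-sitter): spool stdin to a temp file,
# then run the given command with that file's path appended as the last argument.
with_stdin_file() {
  tmpfile=$(mktemp) || return 1
  cat >"$tmpfile"        # read everything from stdin until EOF
  "$@" "$tmpfile"        # e.g. tree-sitter parse /tmp/tmp.XXXXXX
  rc=$?
  rm -f "$tmpfile"       # clean up regardless of the command's result
  return "$rc"
}
```

Usage would then look like: echo 'print "Hello"' | with_stdin_file node_modules/tree-sitter-cli/tree-sitter parse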
> It would also be useful to be able to manually select a language, instead of attempting to detect it automatically from the input file.
You can use the --scope option to do this:
$ cat >python.txt
import foo
print(foo.x)
$ tree-sitter parse --scope source.python python.txt
The particular --scope value to use comes from the tree-sitter.scope field of the grammar's package.json file (e.g. python).
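As a rough sketch of looking that value up without opening the file by hand, the function below pulls the first "scope" entry out of a package.json using sed. The name grammar_scope is made up for illustration, and the pattern match is a shortcut that assumes the key and its string value sit on the same line; a real JSON parser would be more robust.

```shell
# Hypothetical helper: print the first "scope" value found in a grammar's package.json.
# Assumption: the file contains an entry of the form  "scope": "source.something"
grammar_scope() {
  sed -n 's/.*"scope"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}
```

The result could then be passed along, e.g.: tree-sitter parse --scope "$(grammar_scope path/to/package.json)" python.txt (the path here is a placeholder).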
Thanks @dcreager, that at least covers the first request! Although it's a bit clunky to have to browse through the output of tree-sitter dump-languages to make sure that you have the scope names right.
With respect to:
> I've tried to use tree-sitter-cli parse on stdin, either by passing /dev/stdin or - as the filename. The former "works" but the CLI tool believes the input is zero-length (perhaps it attempts to stat() the file before reading to determine its length?);
As a slightly different take, here, the following:
$ echo "(def a 1)" | tree-sitter parse /dev/stdin
or:
$ echo "(def a 1)" | tree-sitter parse /dev/fd/0
produces the output:
(source [0, 0] - [1, 0]
  (list_lit [0, 0] - [0, 9]
    value: (sym_lit [0, 1] - [0, 4]
      name: (sym_name [0, 1] - [0, 4]))
    value: (sym_lit [0, 5] - [0, 6]
      name: (sym_name [0, 5] - [0, 6]))
    value: (num_lit [0, 7] - [0, 8])))
Admittedly, that might be kind of awkward for longer source.
Perhaps this is not platform-agnostic, though.
In any case, I think this feature would be a nice one to have working via some means. Maybe treating - as a filename, as suggested, or an option like --stdin (perhaps even without an argument?).
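To make the "-" idea concrete, here is a sketch of how that convention could be emulated from the outside with a wrapper. It is not how the CLI does or would implement it; the name dash_as_stdin is hypothetical, and it simply swaps any "-" argument for a temp file filled from stdin before running the command.

```shell
# Hypothetical wrapper: run a command, replacing a "-" argument with a temp file
# that holds whatever was supplied on stdin.
dash_as_stdin() {
  tmpfile=$(mktemp) || return 1
  n=$#
  # Rotate through the arguments once, rebuilding "$@" in the same order.
  while [ "$n" -gt 0 ]; do
    arg=$1
    shift
    if [ "$arg" = "-" ]; then
      cat >"$tmpfile"            # spool stdin into the temp file
      set -- "$@" "$tmpfile"
    else
      set -- "$@" "$arg"
    fi
    n=$((n - 1))
  done
  "$@"
  rc=$?
  rm -f "$tmpfile"
  return "$rc"
}
```

With this, echo "(def a 1)" | dash_as_stdin tree-sitter parse - would hand the CLI a regular file, so the zero-length problem described earlier should not arise. As with any extension-less temp file, the language may still need to be chosen by other means.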
For query, I think the choice is less obvious regarding what input on standard input should count as. My initial thought was that it should be the query rather than source, but I suppose there might be a use for the reverse case...
There is an implementation of parsing via stdin in ahlinc's alpha branch mentioned here [1].
It has been working well for me and I'm glad to be able to use it :)
Sample uses:
$ echo ":a" | tree-sitter parse -
(source [0, 0] - [1, 0]
  (kwd_lit [0, 0] - [0, 2]
    name: (kwd_name [0, 1] - [0, 2])))
and:
$ tree-sitter parse -
(def a 1)
(source [0, 0] - [1, 0]
  (list_lit [0, 0] - [0, 9]
    value: (sym_lit [0, 1] - [0, 4]
      name: (sym_name [0, 1] - [0, 4]))
    value: (sym_lit [0, 5] - [0, 6]
      name: (sym_name [0, 5] - [0, 6]))
    value: (num_lit [0, 7] - [0, 8])))
Note that in the second example, after the invocation I typed (def a 1) followed by Enter and then Ctrl-D.
[1] If you already have rustup and friends, installation is straightforward (see link above for details).