tree-sitter 🐛 Hidden rule hides single anonymous string literal token

Problem

Lexical precedence is currently achieved by wrapping prec() with token(). The problem is that if the rule is hidden and contains a string literal, it can no longer be captured. Moreover, it seems to break the string literal when used on its own in other rules.
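A minimal sketch of the pattern described above, for illustration only. The rule names (`_fat_comma`, `pair`) are hypothetical, and `lit`, `token`, `prec`, and `seq` are simplified stand-ins that only mimic the node shapes of the tree-sitter DSL so that the snippet runs under plain Node; in a real grammar.js the DSL functions are provided as globals by the tree-sitter CLI.

```javascript
// Simplified stand-ins for the tree-sitter DSL globals (assumption: they only
// mimic the generated node shapes). Drop them in a real grammar.js.
const lit = (rule) => (typeof rule === "string" ? { type: "STRING", value: rule } : rule);
const token = (rule) => ({ type: "TOKEN", content: lit(rule) });
const prec = (value, rule) => ({ type: "PREC", value, content: lit(rule) });
const seq = (...members) => ({ type: "SEQ", members: members.map(lit) });

// Hypothetical rules illustrating the reported pattern.
const rules = {
  // Hidden rule (leading underscore) that is nothing but a string literal
  // given lexical precedence via token(prec(...)).
  _fat_comma: () => token(prec(1, "=>")),
  // A visible rule that uses the same literal on its own.
  pair: () => seq("key", "=>", "value"),
};

const hidden = rules._fat_comma();
console.log(JSON.stringify(hidden, null, 2));
```

With a grammar shaped like this, the report is that the `"=>"` literal can no longer be captured in a query, and that the bare `"=>"` in the other rule appears to be affected as well.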

Solution

The currently available solution (apart from unhiding the node) is aliasing it back to the literal. I propose introducing a prec.lexical function which, like prec, will not mask its contents.

References

https://github.com/tree-sitter-perl/tree-sitter-perl/pull/114#issuecomment-1682420642
https://github.com/nvim-treesitter/nvim-treesitter/pull/5301#issuecomment-1689045251

Aug 23 '23 11:08 ObserverOfTime

If it would be more comfortable for you then you may patch JS like:

prec.lex = function (_prec, rule) { return token(prec(_prec, rule)); };

Aug 23 '23 19:08 ahlinc

That would be a bodge.

Aug 23 '23 19:08 ObserverOfTime

I think improving the docs about lexical precedence is better than introducing another function that, although it might be a bit clearer, is unnecessary and does exactly the same thing.

Aug 23 '23 20:08 amaanq

Tree-sitter's DSL was built around the idea of being minimal and providing combinable actions where possible.

Many grammars add their own abstractions on top of the base DSL actions.

Aug 25 '23 15:08 ahlinc

If it would be more comfortable for you then you may patch JS like:
prec.lex = function (_prec, rule) { return token(prec(_prec, rule)); };

I think you're missing the main issue here. The problem is that in order to use lexical precedence, you need to use the token directive. This makes TS create a new token for the contents, which you then are unable to match in a query. In order to work around this, you need to then alias the literal token BACK TO ITSELF. I refer to what I wrote here https://github.com/tree-sitter-perl/tree-sitter-perl/pull/114/commits/2098bca9162e672d1e6be78418802d0f52be7f4d

In any case, the workaround that solves the issue would be

prec.lex = function (_prec, rule) { return alias(token(prec(_prec, rule)), rule); };

which smells to me like there should be better handling for this
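To make the workaround concrete, here is a sketch of what the helper produces. `lit`, `token`, `prec`, and `alias` are simplified stand-ins for the DSL globals, not the real implementation, and the guard for non-string rules is an assumption added here (the thread only discusses string literals).

```javascript
// Simplified stand-ins for the tree-sitter DSL globals (assumption: they only
// mimic the generated node shapes). Drop them in a real grammar.js.
const lit = (rule) => {
  if (typeof rule === "string") return { type: "STRING", value: rule };
  if (rule instanceof RegExp) return { type: "PATTERN", value: rule.source };
  return rule;
};
const token = (rule) => ({ type: "TOKEN", content: lit(rule) });
const prec = (value, rule) => ({ type: "PREC", value, content: lit(rule) });
const alias = (rule, value) => ({ type: "ALIAS", content: lit(rule), named: false, value });

// The workaround from the comment: wrap in token(prec(...)), then alias the
// new token back to the literal so queries can still match it.
// Assumption (not from the thread): non-string rules are left unaliased,
// because there is no literal name to alias back to.
prec.lex = function (_prec, rule) {
  const wrapped = token(prec(_prec, rule));
  return typeof rule === "string" ? alias(wrapped, rule) : wrapped;
};

const arrow = prec.lex(1, "=>");    // aliased back to the anonymous "=>" node
const digits = prec.lex(1, /\d+/);  // plain token, no alias
console.log(arrow.type, arrow.value, digits.type);
```

The design point is the one made above: the alias exists only to undo a side effect of `token()`, which is why it reads as a workaround rather than a feature.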

Aug 26 '23 18:08 rabbiveesh

In order to work around this, you need to then alias the literal token BACK TO ITSELF.

We discussed this with @amaanq and we think it's a bug and it will be fixed.

Aug 26 '23 22:08 ahlinc

that is, only if the rule is hidden

Aug 26 '23 22:08 amaanq

It's also a documentation issue. Lexical precedence is only briefly mentioned on the website and not at all in the prec() or token() function documentation. Ideally, the website should also explain why token(prec()) represents lexical precedence.
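As a rough illustration of the distinction, and as far as the existing documentation describes it: a bare `prec()` is used to resolve parse conflicts when the parser is generated, whereas a `prec()` placed inside `token()` tells the lexer which token to prefer. The sketch below uses simplified stand-ins for the DSL functions, and `isLexicalPrecedence` is a hypothetical helper written only for this example.

```javascript
// Simplified stand-ins for the tree-sitter DSL globals (assumption: they only
// mimic the generated node shapes). Drop them in a real grammar.js.
const lit = (rule) => (typeof rule === "string" ? { type: "STRING", value: rule } : rule);
const token = (rule) => ({ type: "TOKEN", content: lit(rule) });
const prec = (value, rule) => ({ type: "PREC", value, content: lit(rule) });

// Parse precedence: the PREC node is the outermost wrapper of the rule.
const parsePrec = prec(1, "if");

// Lexical precedence: the PREC node sits directly inside a TOKEN node.
const lexPrec = token(prec(1, "if"));

// Hypothetical helper: true when a precedence is nested directly in a token.
function isLexicalPrecedence(node) {
  return node.type === "TOKEN" && node.content.type === "PREC";
}

console.log(isLexicalPrecedence(parsePrec), isLexicalPrecedence(lexPrec));
```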

Aug 27 '23 08:08 ObserverOfTime
