Tree-sitter callback-parsing prevents correct query processing.

Open paul-ollis opened this issue 7 months ago • 5 comments

The TextArea code invokes tree_sitter.Parser.parse with a callback function as its first argument. Queries on the resulting tree_sitter.Tree do not support predicates (#eq?, #not-eq?, etc.), which are an important feature of queries and are used by several of Textual's query definitions (SCM files). In practice, "not supported" means that query patterns produce too many matches:

  • The intention of the query definition writer is obviously not met.

  • Syntax highlighting is not as rich or nuanced as it should be.

  • Unwanted captures are generated, which will have some impact on performance.

  • Users of Textual are limited when it comes to creating custom SCM files.
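As a rough illustration of the over-matching described above, here is a toy model in plain Python. It does not use the tree-sitter API; the node list and the `captures` helper are hypothetical stand-ins that only mimic the effect of an `#eq?`-style predicate being applied or ignored.

```python
# Toy model only: plain Python, not the tree-sitter API.
# Each "node" is a (type, text) pair standing in for a syntax-tree node.
NODES = [
    ("identifier", "self"),
    ("identifier", "count"),
    ("identifier", "cls"),
    ("string", "'self'"),
]


def captures(nodes, node_type, eq_text=None, apply_predicates=True):
    """Return the text of nodes of the given type.

    When apply_predicates is True and eq_text is given, keep only nodes
    whose text equals eq_text (a stand-in for an #eq? predicate).
    """
    matched = [text for kind, text in nodes if kind == node_type]
    if apply_predicates and eq_text is not None:
        matched = [text for text in matched if text == eq_text]
    return matched


# Predicate honoured: only the identifier whose text is "self" is captured.
with_predicates = captures(NODES, "identifier", eq_text="self")

# Predicate ignored: every identifier is captured, including unwanted ones.
without_predicates = captures(
    NODES, "identifier", eq_text="self", apply_predicates=False
)
```

In this model the second call returns every identifier, which corresponds to the unwanted captures and less precise highlighting listed above.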

It is fairly easy to change the code so that tree_sitter.Parser.parse is invoked with the full text of the TextArea as the first argument, in which case query definitions are then fully supported. I have tried this on my local textarea-speedup-2 branch - as used for #5645. There is no obvious detrimental impact on performance and the code is simpler, but...

Py-tree-sitter 0.23.2 has a bug in its processing of the #any-of predicate. For Textual's Python SCM file and the Monokai theme, this produces rather unpleasant results. Some re-working of the SCM file could work around this. Other SCM files might also need changes.

Py-tree-sitter 0.24.0 has a fix for the bug, which appears to work, based on a quick trial. (Py-tree-sitter does not have test coverage of #any-of.) But 0.24.0 drops support for Python 3.9!
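For readers unfamiliar with the two ways of supplying source text mentioned earlier, the following is a minimal sketch that does not call tree-sitter at all. It assumes a read callback shaped like `(byte_offset, point) -> bytes`, which is my understanding of what py-tree-sitter expects; `make_read_callback` and `drain` are hypothetical helpers, and `drain` merely simulates a parser pulling chunks until the callback returns nothing.

```python
from bisect import bisect_right

# Hypothetical document content, held as a non-empty list of lines.
LINES = ["def f():", "    return 1", ""]


def make_read_callback(lines):
    """Build a callback that serves the document one line-sized chunk at a time."""
    # Every line except the last is followed by a newline.
    encoded = [(line + "\n").encode("utf-8") for line in lines[:-1]]
    encoded.append(lines[-1].encode("utf-8"))

    # Byte offset at which each line starts.
    starts = []
    offset = 0
    for chunk in encoded:
        starts.append(offset)
        offset += len(chunk)
    total = offset

    def read(byte_offset, point=None):
        # An empty result signals the end of the document.
        if byte_offset >= total:
            return b""
        index = bisect_right(starts, byte_offset) - 1
        return encoded[index][byte_offset - starts[index]:]

    return read


def drain(read):
    """Simulate a parser: keep requesting chunks until none are left."""
    out = bytearray()
    while True:
        chunk = read(len(out), None)
        if not chunk:
            break
        out += chunk
    return bytes(out)


from_callback = drain(make_read_callback(LINES))
from_full_text = "\n".join(LINES).encode("utf-8")
```

Both routes deliver the same bytes, so the difference reported in this issue lies in how the resulting tree is queried, not in the text the parser sees.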

The best way forward does not seem obvious to me, but I am willing to do the work based on what you think is the correct approach.

paul-ollis avatar Apr 14 '25 19:04 paul-ollis

We found the following entry in the FAQ which you may find helpful:

Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.

This is an automated reply, generated by FAQtory

github-actions[bot] avatar Apr 14 '25 19:04 github-actions[bot]

My gut says we should drop syntax support for 3.9. Users that really need syntax support for 3.9 will be stuck at the current version of Textual.

@darrenburns thoughts?

willmcgugan avatar Apr 15 '25 12:04 willmcgugan

I think the syntax functionality is so dependent on tree-sitter that we should follow their lead and drop 3.9 if that's what they've decided.

darrenburns avatar Apr 20 '25 17:04 darrenburns

@paul-ollis I think that is conclusive. We are happy for syntax support to be for 3.10 onwards.

willmcgugan avatar Apr 21 '25 11:04 willmcgugan

@willmcgugan Thanks for the update.

I am happy to make a separate PR or roll this into #5645, assuming that you wish to pursue #5645 (some discussion and work still required).

paul-ollis avatar Apr 22 '25 09:04 paul-ollis

tree-sitter was bumped to v0.25.0 in #5977, so can this issue now be closed?

TomJGooding avatar Sep 18 '25 14:09 TomJGooding