tree-sitter-langs
Compilation requirements
I’m trying to understand a bit about how tree-sitter-langs compiles its shared libraries, in an attempt to use versions 0.20+. I’m looking at the tree-sitter-langs-compile function and it seems to call tree-sitter generate, tree-sitter test, as well as separate compilation commands. Is all of this necessary? My understanding was that tree-sitter test would compile the grammar automatically.
Additionally, if my PR https://github.com/emacs-tree-sitter/elisp-tree-sitter/pull/212 is accepted and you relied on the automatic compilation, then you would no longer need to tar and extract the bundle to tree-sitter-langs--bin-dir in tree-sitter-langs-create-bundle. I believe you’d only need the version file.
I was actually thinking of proposing the opposite: vendoring the build code from tree-sitter test so we’re not at the whim of whichever directory TS decides to place the libraries in, since the shared library generation from running test seems intended for internal use. This would prevent the current 0.20 issue.
Fully committing to one or the other seems like the best bet for maintainability, though I don’t have the full context of why we have a mix between custom build commands and tree-sitter test currently.
The directory that TS has decided to place them in is pretty standardized at XDG_CACHE_HOME/tree-sitter/lib. Having tree-sitter-langs place its libraries in that directory would be more consistent with user expectations. It would also make it simpler for users to add their own libraries.
If there were no shift to this, I can imagine there potentially being some race condition issues when emacs-tree-sitter is attempting to load from both the standard TS directory and the emacs-tree-sitter directory. I am not sure how that would be handled.
Either way, I agree fully committing would be ideal so we can support more current versions.
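For reference, here is a minimal sketch of how the XDG_CACHE_HOME/tree-sitter/lib location mentioned above could be resolved. It assumes the usual XDG fallback of $HOME/.cache when XDG_CACHE_HOME is unset, empty, or not absolute. The function name cli_lib_dir is hypothetical, and this is not taken from the tree-sitter CLI or from tree-sitter.el.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve `<cache>/tree-sitter/lib`.
/// Assumption: XDG rule, i.e. use XDG_CACHE_HOME when it is an absolute
/// path, otherwise fall back to `$HOME/.cache`.
pub fn cli_lib_dir(xdg_cache_home: Option<&str>, home: &str) -> PathBuf {
    let cache = match xdg_cache_home {
        // An empty or relative value is ignored, as the XDG spec asks.
        Some(dir) if Path::new(dir).is_absolute() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".cache"),
    };
    cache.join("tree-sitter").join("lib")
}
```

A caller would typically pass std::env::var("XDG_CACHE_HOME").ok().as_deref() and the value of HOME. Taking them as parameters keeps the function easy to test.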
The reasons we are using both custom compilation commands and tree-sitter test are:
- tree-sitter test's output location is an internal implementation detail that hasn't been stabilized, and cannot be queried/controlled.
- tree-sitter test doesn't support advanced needs. An example is linking libstdc++ statically, which is a major distribution hassle otherwise. Another example is cross compilation for Apple Silicon.
- Custom compilation commands don't cut it on Windows, where compilation is very involved.
The preferred direction at the moment is handling compilation on our own, instead of relying on the CLI. The biggest hurdle is compiling on Windows. A potential solution is using the cc crate (which the CLI also uses), putting the compilation logic in the dynamic module.
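To make "handling compilation on our own" concrete, here is a rough sketch of assembling a compiler invocation for one grammar on a Unix-like system with a GCC-style driver. It is an illustration under assumptions, not the command tree-sitter-langs actually runs: the function name compile_args, the src/parser.c and src/scanner.cc layout, and the exact flag set are all assumed, and it does nothing for the Windows case.

```rust
use std::path::Path;

/// Hypothetical helper: build `[driver, args...]` for compiling a grammar
/// into a shared library. Assumes GCC-style flags and a grammar laid out as
/// `src/parser.c` plus an optional C++ scanner at `src/scanner.cc`.
pub fn compile_args(grammar_dir: &Path, output: &Path, has_cxx_scanner: bool) -> Vec<String> {
    let src = grammar_dir.join("src");
    // A C++ scanner needs the C++ driver so libstdc++ gets linked at all.
    let driver = if has_cxx_scanner { "c++" } else { "cc" };
    let mut args: Vec<String> = vec![
        driver.to_string(),
        "-shared".to_string(),
        "-fPIC".to_string(),
        "-O2".to_string(),
        "-I".to_string(),
        src.display().to_string(),
        "-o".to_string(),
        output.display().to_string(),
    ];
    if has_cxx_scanner {
        // Link libstdc++ statically so the binary does not depend on the
        // user's system copy (the distribution hassle mentioned above).
        args.push("-static-libstdc++".to_string());
        args.push(src.join("scanner.cc").display().to_string());
    }
    // `-xc` forces parser.c to be compiled as C even under the C++ driver.
    args.push("-xc".to_string());
    args.push(src.join("parser.c").display().to_string());
    args
}
```

The first element would be passed to std::process::Command::new and the rest to .args(...). Keeping the argument list as plain data makes it straightforward to inspect or log before running anything.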
... you relied on the automatic compilation then you would no longer need to tar and extract the bundle to tree-sitter-langs--bin-dir in tree-sitter-langs-create-bundle. I believe you’d only need the version file.
The bundle is intended for people who don't want to compile the grammars on their own. It's one of the main purposes of tree-sitter-langs.
The directory that TS has decided to place them in is pretty standardized with XDG_CACHE_HOME/tree-sitter/lib. Having tree-sitter-langs place its libraries in that directory would be more consistent with user expectations. It would also make it simpler for users to add their own libraries.
There is no consensus on user expectations. It is tree-sitter.el that should use tree-sitter CLI's directory as the default, allowing users who use the CLI to reuse the binaries. (The fact that it doesn't support the location used by newer CLI versions is something to be fixed, and is an orthogonal discussion.) tree-sitter-langs is an Emacs package that should have its own binary location.
If there were no shift to this, I can imagine there potentially being some race condition issues when emacs-tree-sitter is attempting to load from both the standard TS directory and the emacs-tree-sitter directory.
It tries the directories sequentially, not concurrently, so there shouldn't be race conditions.
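A minimal sketch of what a sequential lookup looks like, written in Rust for illustration only (the real lookup lives in tree-sitter.el, and the function name find_grammar and the `<lang>.<ext>` file naming are assumptions). Each directory is checked in order and the first hit wins, so there is no concurrent access to race on.

```rust
use std::path::PathBuf;

/// Hypothetical helper: return the first existing `<dir>/<lang>.<ext>`,
/// checking `dirs` strictly in the order given.
pub fn find_grammar(dirs: &[PathBuf], lang: &str, ext: &str) -> Option<PathBuf> {
    dirs.iter()
        .map(|dir| dir.join(format!("{}.{}", lang, ext)))
        // Iterators are lazy: later directories are only probed when
        // every earlier one has missed.
        .find(|candidate| candidate.is_file())
}
```

With this shape, the order of dirs is the whole policy: putting the package's own directory before or after the CLI's directory decides which binary takes precedence when both exist.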
A potential solution is using the cc crate (which the CLI also uses)
Now that tree-sitter-loader has been extracted into a separate crate, we can also use that.
I've taken a shot at the tree-sitter-loader solution here: https://github.com/emacs-tree-sitter/elisp-tree-sitter/pull/220, lmk what you think!