Merge vscode-parse-tree extension into Cursorless monorepo
The problem
- Building tree-sitter-wasms locally is painful
- We need a way to use tree-sitter outside vscode, e.g. in JetBrains, Neovim, etc.
- Eventually, we could upgrade languages automatically, using CI to make sure no scopes break
The solution
1. Publish npm package in CI containing pre-built wasms
We need to modify https://github.com/cursorless-dev/tree-sitter-wasms as follows:
- use the languages and versions in vscode-tree-sitter. We might need to tweak https://github.com/cursorless-dev/tree-sitter-wasms/blob/main/build.ts to support languages that need special handling it doesn't already provide. The existing vscode-parse-tree makefile may be useful here
- remove dependabot.yml from that repo, as we don't currently want auto-updating because it could break our queries
- change the package name to @cursorless/tree-sitter-wasms
2. Create a cursorless-tree-sitter package in our monorepo
- Exposes a class CursorlessTreeSitter copied from JetBrainsTreeSitter
- Depends on web-tree-sitter
- Depends on @cursorless/tree-sitter-wasms
- Takes path to directory containing wasms as a constructor argument
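A rough sketch of what the class in step 2 could look like. This is not the actual JetBrainsTreeSitter code. The loader callback, the method names, and the tree-sitter-<languageId>.wasm file naming are all assumptions made for illustration. In the real package the loader would presumably wrap web-tree-sitter's language loading; it is injected here so the sketch runs without any wasm files on disk.

```typescript
import * as path from "node:path";

// Hypothetical loader signature: given the path to a .wasm file, resolve to
// a loaded language object. In the real package this would presumably wrap
// web-tree-sitter; it is a parameter here so the sketch stays self-contained.
type LanguageLoader<L> = (wasmPath: string) => Promise<L>;

class CursorlessTreeSitter<L> {
  private readonly wasmDirectory: string;
  private readonly loadLanguage: LanguageLoader<L>;
  // Cache the promise so concurrent requests share a single load.
  private readonly cache = new Map<string, Promise<L>>();

  // Takes the path to the directory containing the wasms, as in step 2.
  constructor(wasmDirectory: string, loadLanguage: LanguageLoader<L>) {
    this.wasmDirectory = wasmDirectory;
    this.loadLanguage = loadLanguage;
  }

  // Assumption: files are named tree-sitter-<languageId>.wasm.
  wasmPath(languageId: string): string {
    return path.join(this.wasmDirectory, `tree-sitter-${languageId}.wasm`);
  }

  // Loads a language on first request and reuses it afterwards.
  getLanguage(languageId: string): Promise<L> {
    let language = this.cache.get(languageId);
    if (language == null) {
      language = this.loadLanguage(this.wasmPath(languageId));
      this.cache.set(languageId, language);
    }
    return language;
  }
}
```

Because the wasm directory is passed in rather than hard-coded, the same class should be usable from vscode, JetBrains, Neovim, etc., with each host deciding where the wasms live.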
3. Modify our vscode setup to leverage the above
- Modify extension.ts to construct a CursorlessTreeSitter and pass packages/cursorless-tree-sitter/node_modules/@cursorless/tree-sitter-wasms/out as directory in development. That should be a subdir of cursorlessRepoRoot() or whatever the function is
- In production, it should pass the tree-sitter-wasms subdir of the assets root
- Modify our vscode packaging harness to copy packages/cursorless-tree-sitter/node_modules/@cursorless/tree-sitter-wasms/out to tree-sitter-wasms
- Remove extension dependency on vscode-parse-tree
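The development vs production directory choice in step 3 could be sketched roughly as below. The function name, the RunMode type, and both root parameters are hypothetical. In extension.ts the repo root would come from cursorlessRepoRoot() (or whatever the function is) and the assets root from wherever the packaged extension keeps its assets.

```typescript
import * as path from "node:path";

// Hypothetical type; the extension may detect its run mode differently.
type RunMode = "development" | "production";

// Returns the directory to hand to the CursorlessTreeSitter constructor.
function getWasmDirectory(
  mode: RunMode,
  repoRoot: string,
  assetsRoot: string,
): string {
  if (mode === "development") {
    // In development, read the wasms straight from the installed npm package.
    return path.join(
      repoRoot,
      "packages",
      "cursorless-tree-sitter",
      "node_modules",
      "@cursorless",
      "tree-sitter-wasms",
      "out",
    );
  }
  // In production, the packaging harness has copied them under the assets root.
  return path.join(assetsRoot, "tree-sitter-wasms");
}
```

Keeping this choice in one function means the packaging harness and extension.ts only need to agree on the tree-sitter-wasms subdir name.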
Future steps
- Once the above steps are done, we'd like to split the @cursorless/tree-sitter-wasms npm package into one package per language, e.g. @cursorless/tree-sitter-wasms-python. We will still ship them all from the same https://github.com/cursorless-dev/tree-sitter-wasms repo. This will enable us to pin versions individually to improve auto-upgrade (#1282)
Possibly useful links
- https://www.google.com/search?q=copy-webpack-plugin&rlz=1C5CHFA_enUS504US504&oq=copy-webpack-plugin&aqs=chrome..69i57j0i67j0i512j0i20i263i512j0i512l6.326j0j7&sourceid=chrome&ie=UTF-8
- https://github.com/Gregoor/tree-sitter-wasms
- https://github.com/tree-sitter/tree-sitter/issues/408
- https://github.com/milahu/vite-plugin-tree-sitter
- https://github.com/tree-sitter/tree-sitter/issues/730#issuecomment-736018228
- https://github.com/emacs-tree-sitter/tree-sitter-langs
- https://github.com/tree-sitter/tree-sitter/pull/1864
- https://github.com/verhovsky/curlconverter.github.io/blob/715ec4defd2ca661aa74fc9400e1fff9a183755b/webpack.config.js#L29-L32
I've gotten the buildscripts nearly functional locally (13 or so packages fail to build out of what, 40-50?) for step one, but it fails spectacularly for reasons I don't fully understand when run via github actions. https://github.com/cocona20xx/tree-sitter-wasms/tree/issue-1488-restructure
My only guess is that the way the buildscripts are set up is fundamentally incompatible with github actions because of how it stores dependencies temporarily? That'd explain the ENOENT on spawn when running the build commands, but idk how to fix that issue. Maybe storing the downloaded dependencies as part of the repo itself would work?
The github action works on the original repo, so presumably something broke from the tree-sitter version bump? Looks like this repo has it working, so maybe worth seeing if there's something to steal there? Can try just forking that branch onto your repo and running it to see if it also works for you for a start. Lmk if you wanna drop into a meet-up to debug
I found a solution after a lot of trial and error (16 to 18 commits worth of it! 🫠):
pnpm config set node-linker hoisted causes pnpm's dependency storage to mimic npm's, which fixes the ENOENT spam, though this requires adding some extra stuff to package.json to properly install the tree sitter CLI (and it does so inefficiently due to some deps also having the CLI as prod deps)
Now that build behavior is the same on my machine as it is on Actions, the focus is going to be on getting every dependency package to actually build, which isn't happening just yet. ~~I'll also probably add something to a prebuild script to handle running just the buildscripts for the root tree sitter CLI, since that'll save 10-20s per run.~~
Weird. Good work. Might be worth just switching to npm if it simplifies things 🤷♂️
https://github.com/cursorless-dev/tree-sitter-wasms/pull/1
After much ado, I think this is as far as I can take things for bullet point 1. Lmk if anything else is needed.