tree-sitter proof of concept

Open chrismwendt opened this issue 6 years ago • 0 comments

What is tree-sitter?

tree-sitter is a parser generator and incremental parsing library; grammars exist for ~15 programming languages (TypeScript, Go, Python, Java, etc.). It does no analysis such as type checking: it only constructs a syntax tree and provides an API for walking that tree.

What can basic-code-intel use it for?

  • Jump-to-definition and find-references on local parameters and variables without a language server
  • Increasing precision by looking at the token to the left of the . (e.g. Bar in foo.Bar probably comes from the foo package/class/namespace)
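To make the second point concrete, here is a minimal sketch of how the qualifier to the left of the . could narrow down search results. It is an illustration only, not code from this PR: the QNode shape, the member_expression and property_identifier type names, and the Candidate type are assumptions made for the example, not the tree-sitter API.

```typescript
// Hypothetical, simplified node shape; not the real tree-sitter API.
interface QNode {
  type: string
  text: string
  parent: QNode | null
  children: QNode[]
}

// Build a node and wire up parent pointers for its children.
function qnode(type: string, text: string, children: QNode[] = []): QNode {
  const n: QNode = { type, text, parent: null, children }
  for (const child of children) child.parent = n
  return n
}

// Return the text to the left of the '.' when the token is the property of a
// member access shaped like [object, '.', property]; otherwise undefined.
function qualifierOf(token: QNode): string | undefined {
  const parent = token.parent
  if (parent === null || parent.type !== 'member_expression') return undefined
  const index = parent.children.indexOf(token)
  if (index < 2 || parent.children[index - 1]?.text !== '.') return undefined
  return parent.children[index - 2]?.text
}

// Hypothetical search result: a symbol plus the package/class/namespace containing it.
interface Candidate {
  symbol: string
  container: string
}

// Prefer candidates whose container matches the qualifier; otherwise keep them all.
function preferQualified(candidates: Candidate[], qualifier: string | undefined): Candidate[] {
  const matching = candidates.filter(c => c.container === qualifier)
  return matching.length > 0 ? matching : candidates
}

// foo.Bar
const bar = qnode('property_identifier', 'Bar')
qnode('member_expression', 'foo.Bar', [qnode('identifier', 'foo'), qnode('.', '.'), bar])

const candidates: Candidate[] = [
  { symbol: 'Bar', container: 'baz' },
  { symbol: 'Bar', container: 'foo' },
]
console.log(preferQualified(candidates, qualifierOf(bar)))
```

Falling back to the full candidate list when nothing matches keeps the heuristic from hiding results when the guess about the qualifier is wrong.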

This PR

This PR supports jump-to-definition on local parameters and variables for one language (TypeScript) as a proof of concept. Similar follow-up work is needed to support find-references and other languages.

Implementation of jump-to-definition

  • Parse the file into a syntax tree
  • When the user hovers over a token, ask tree-sitter for the node at that position
  • Walk up the syntax tree and at each parent node, check for a binding with the same name as the token
  • Return the position of the first binding found, if any (taking the innermost match is correct because inner bindings shadow outer ones)
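The steps above can be sketched roughly as follows. This is a simplified illustration rather than the code in this PR: SyntaxNodeLike, Point, and the rule that a scope's bindings are its direct children of type 'binding' are all assumptions made for the example. In the real implementation the tree comes from tree-sitter and the binding check is language-specific.

```typescript
// Hypothetical, simplified node shape; the real tree-sitter node has a richer API.
interface Point {
  row: number
  column: number
}

interface SyntaxNodeLike {
  type: string
  text: string
  start: Point
  parent: SyntaxNodeLike | null
  children: SyntaxNodeLike[]
}

// Build a node and wire up parent pointers for its children.
function node(type: string, text: string, start: Point, children: SyntaxNodeLike[] = []): SyntaxNodeLike {
  const n: SyntaxNodeLike = { type, text, start, parent: null, children }
  for (const child of children) child.parent = n
  return n
}

// Assumption: a scope's bindings are its direct children of type 'binding'.
// Real grammars need per-language rules for this step.
function bindingsIn(scope: SyntaxNodeLike): SyntaxNodeLike[] {
  return scope.children.filter(c => c.type === 'binding')
}

// Walk up from the token. The first enclosing scope with a matching binding wins,
// which is how an inner declaration shadows an outer one.
function findDefinition(token: SyntaxNodeLike): Point | undefined {
  for (let scope = token.parent; scope !== null; scope = scope.parent) {
    const match = bindingsIn(scope).find(b => b.text === token.text)
    if (match) return match.start
  }
  return undefined
}

// An outer x on row 0, an inner x on row 1, and a use of x on row 2.
const use = node('identifier', 'x', { row: 2, column: 11 })
node('program', '', { row: 0, column: 0 }, [
  node('binding', 'x', { row: 0, column: 6 }),
  node('function', '', { row: 1, column: 0 }, [node('binding', 'x', { row: 1, column: 11 }), use]),
])
console.log(findDefinition(use))
```

In this example the use of x resolves to the inner binding on row 1, not the outer one on row 0, because the walk stops at the nearest enclosing scope that declares the name.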

"Checking for a binding" is implemented with a tiny CSS-like EDSL built around what I call "selectors", rather than with switch statements and recursion. To see what this looks like, check out the TypeScript selector defined in handler.ts.
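For readers without handler.ts at hand, here is a purely hypothetical sketch of what a CSS-like selector over a syntax tree could look like. It is not the EDSL from this PR: the select function, the > child combinator syntax, the SelNode shape, and the node type names are all invented for illustration.

```typescript
// Hypothetical, simplified node shape; not the real tree-sitter API.
interface SelNode {
  type: string
  text: string
  children: SelNode[]
}

// A selector is a path of node types joined by a child combinator, for example
// 'function_declaration > formal_parameters > identifier'. The first type must
// match the root; each later type selects matching direct children.
function select(root: SelNode, selector: string): SelNode[] {
  const [first, ...rest] = selector.split('>').map(s => s.trim())
  if (root.type !== first) return []
  let current: SelNode[] = [root]
  for (const type of rest) {
    const next: SelNode[] = []
    for (const n of current) {
      for (const c of n.children) {
        if (c.type === type) next.push(c)
      }
    }
    current = next
  }
  return current
}

// Roughly the shape of: function f(a, b) {}
const fn: SelNode = {
  type: 'function_declaration',
  text: 'function f(a, b) {}',
  children: [
    { type: 'identifier', text: 'f', children: [] },
    {
      type: 'formal_parameters',
      text: '(a, b)',
      children: [
        { type: 'identifier', text: 'a', children: [] },
        { type: 'identifier', text: 'b', children: [] },
      ],
    },
  ],
}

// Selects the parameter names but not the function name.
console.log(select(fn, 'function_declaration > formal_parameters > identifier').map(n => n.text))
```

The appeal of this style is that each kind of binding becomes one declarative line instead of a hand-written traversal per node type.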

Status of this work

I've been working on this on and off over the past 2 weeks. Now that we're considering investing effort in non-basic-code-intel approaches in the near future, I'm checkpointing this work as a PR in case we come back to it later.

TODO

  • [ ] Upload main .wasm and grammars to a file hosting site
  • [ ] Make sure the .wasm files are compressed
  • [ ] Add checksum validation
  • [ ] Only load the grammar for languages of open files
  • [ ] Only create one parser for each language
  • [ ] Cache grammars
  • [ ] Reorganize/refactor the code
  • [ ] Split tests into a separate file
  • [ ] .delete() the parser and tree when done with them (i.e. when a file is closed)
  • [ ] Also .delete() the parser and tree after a timeout to safeguard against memory leaks
  • [ ] Write a selector EDSL debugger or sandbox to more quickly write and troubleshoot selectors
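As a starting point for the "one parser for each language" and .delete() items, a small cache along these lines might work. This is a sketch under assumptions, not code from this PR: ParserCache, FakeParser, and the synchronous create factory are hypothetical, and the only thing taken from the list above is that parsers expose a .delete() method.

```typescript
// Anything with a .delete() method, such as the parser and tree mentioned above.
interface Deletable {
  delete(): void
}

class ParserCache<P extends Deletable> {
  private readonly parsers = new Map<string, P>()

  // `create` is a hypothetical factory that builds a parser for a language ID.
  constructor(private readonly create: (languageID: string) => P) {}

  // Reuse the parser for this language, creating it on first use only.
  get(languageID: string): P {
    let parser = this.parsers.get(languageID)
    if (parser === undefined) {
      parser = this.create(languageID)
      this.parsers.set(languageID, parser)
    }
    return parser
  }

  // Free the parser, e.g. when the last open file in that language is closed.
  release(languageID: string): void {
    const parser = this.parsers.get(languageID)
    if (parser !== undefined) {
      parser.delete()
      this.parsers.delete(languageID)
    }
  }
}

// Stand-in parser that only records whether it was deleted.
class FakeParser implements Deletable {
  deleted = false
  delete(): void {
    this.deleted = true
  }
}

const parserCache = new ParserCache(() => new FakeParser())
const firstParser = parserCache.get('typescript')
console.log(firstParser === parserCache.get('typescript'))
parserCache.release('typescript')
console.log(firstParser.deleted)
```

The timeout safeguard could be layered on top by scheduling a release call whenever a parser has gone unused for some period.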

chrismwendt, May 01 '19 08:05