tree-sitter-elixir icon indicating copy to clipboard operation
tree-sitter-elixir copied to clipboard

Support HEEx/Surface markup

Open jonatanklosko opened this issue 2 years ago โ€ข 40 comments

We need to handle these syntaxes. These should be standalone grammar handling .heex/.surface files, but we will also use tree-sitter injection to parse the relevant sigil nodes.

@connorlay already worked on both grammars, so we can just integrate them.

jonatanklosko avatar Oct 01 '21 22:10 jonatanklosko

Regarding code organisation, on one hand it seems like a tree-sitter convention to keep all grammars as separate repositories, but separating heex/surface feels quite scattered. I saw that inย tree-sitter-typescriptย they keep typescript and tsx grammars in separate directories, but they are bundled together (as the node/rust package), however that's a rather specific case because these are two variants of the same language and the grammars share most of the code. Since we are dealing with separate grammars, I feel that we should have separate tree-sitter projects, but we could place them in a single repo, each in its own directory. I'm just not sure if this will work well when installing the package directly from GitHub (which seems GitHub does). That said, I would be totally ok with separate repositories, just wouldn't want to spam elixir-lang with these :)

jonatanklosko avatar Oct 01 '21 23:10 jonatanklosko

My main concern about being a single repo or multiple repos is about further functionality. If we implement "Go to definition", what are the odds we want to share functionality between them? Especially when HEEx and Surface templates do allow calling local functions?

There is also EEx, which is definitely part of Elixir. If we want to support it, would we make it a separate language package? That sounds a bit too much to me.

If we are not sure, we can also wait until we start "Go to definition" before making a decision.

josevalim avatar Oct 02 '21 06:10 josevalim

To support code navigation for <Mod.fun> I think we would parse Mod.fun into a single node and then use injections to parse it as Elixir code, which would be a regular function call. Not sure about <.fun> though, if we skip the dot and parse fun then it's gonna be an identifier rather than call :thinking: Either way, we certainly want separate heex/surface grammars, hence injections seems like the only reasonable way to integrate go-to-definition there.

As for sharing functionality, as far as I understand we need to add Elixir support in github/semantic, so go-to-definition code will live outside tree-sitter-elixir anyway.

There is also EEx, which is definitely part of Elixir. If we want to support it, would we make it a separate language package? That sounds a bit too much to me.

There is already an official tree-sitter-embedded-template grammar for ERB and EJS, we could probably reuse that given how similar EEx is?

jonatanklosko avatar Oct 04 '21 08:10 jonatanklosko

Alright, so it is either three directories in this repo or three separate repositories. Both are fine by me. It depends on how many colocated changes we will have to perform.

There is already an official tree-sitter-embedded-template grammar for ERB and EJS, we could probably reuse that given how similar EEx is?

Sounds good to me!

josevalim avatar Oct 04 '21 08:10 josevalim

Alright, so it is either three directories in this repo or three separate repositories. Both are fine by me. It depends on how many colocated changes we will have to perform.

I don't expect any collocated changes, as the only integration point should be the injection queries.

So I'd lean towards separate repositories to adhere to the tree-sitter convention.

jonatanklosko avatar Oct 04 '21 09:10 jonatanklosko

So perhaps it is best to start a discussion about moving all of those to the tree-sitter organization. Or alternatively we move all of them to elixir-editors (or we start an elixir-tree org).

josevalim avatar Oct 04 '21 09:10 josevalim

Ideally we would move to the tree-sitter org yeah, otherwise the elixir-editors sounds good to me (just so we don't have elixir-tree-sitter/tree-sitter-elixir ^^).

jonatanklosko avatar Oct 04 '21 09:10 jonatanklosko

Alright, so let's give tree-sitter-elixir a try on neovim, and, if solid, we should request GitHub to move to tree-sitter-elixir/surface/heex for syntax highlighting.

josevalim avatar Oct 04 '21 09:10 josevalim

Hey all, I've been busy the past few days and now am catching up on this thread :)

To support code navigation for <Mod.fun> I think we would parse Mod.fun into a single node and then use injections to parse it as Elixir code, which would be a regular function call. Not sure about <.fun> though, if we skip the dot and parse fun then it's gonna be an identifier rather than call thinking Either way, we certainly want separate heex/surface grammars, hence injections seems like the only reasonable way to integrate go-to-definition there.

This is the approach I took when integrating Surface with Neovim. I suggest we try to keep the HEEx/Surface/EEx grammars as similar to each other as possible to keep the maintenance overhead low when wiring up these injections.

As for sharing functionality, as far as I understand we need to add Elixir support in github/semantic, so go-to-definition code will live outside tree-sitter-elixir anyway.

I found this merged PR that added Rust support that illustrates what might be involved integrating Elixir into semantic. I think tag definitions are what drive the code navigation capability?

Alright, so let's give tree-sitter-elixir a try on neovim, and, if solid, we should request GitHub to move to tree-sitter-elixir/surface/heex for syntax highlighting.

Can GitHub move to tree-sitter-elixir for syntax highlighting before semantic support is added? Or is semantic integration blocking that as well? Either way, I'll get started on a branch migrating Neovim to the new grammar and see how that turns out.

connorlay avatar Oct 04 '21 18:10 connorlay

Can GitHub move to tree-sitter-elixir for syntax highlighting before semantic support is added?

From my understanding, yes. :)

Either way, I'll get started on a branch migrating Neovim to the new grammar and see how that turns out.

Yes, beautiful.

josevalim avatar Oct 04 '21 19:10 josevalim

I think tag definitions are what drive the code navigation capability?

I think so, although code navigation is not supported for Rust at this point. I think the Rust tags are missing the normalization parts, see Ruby/Tags.hs. Generally Ruby seems like a minimal example, it has just three files, and AST.hs is automatically generated, so it's pretty much just Tags.hs as far as I can see.

jonatanklosko avatar Oct 04 '21 19:10 jonatanklosko

@josevalim My PR migrating Neovim to the new parser is open for full review now https://github.com/nvim-treesitter/nvim-treesitter/pull/1904 ๐Ÿš€

connorlay avatar Oct 09 '21 00:10 connorlay

@connorlay beautiful! ๐Ÿ˜

josevalim avatar Oct 09 '21 06:10 josevalim

Hi @connorlay, quick question: do you think we are ready to ask the GitHub team to move Elixir and Surface (and HEEx?) upstream?

josevalim avatar Oct 25 '21 13:10 josevalim

@josevalim For Elixir, I think we are ready to ask GitHub to switch over. @jonatanklosko I'm curious to hear what you think.

For Surface and HEEx, I have a couple questions:

  1. How often does GitHub update their parser versions? If new syntax is introduced to the parsers, what does the upgrade path look like? Surface and HEEx are pre-1.0, so I expect more changes will occur to those compared to Elixir.
  2. I wrote both Surface and HEEx parsers initially for neovim, which utilizes additional query capture definitions than what tree-sitter provides alone. Will we need to port the highlight queries from nvim-treesitter to tree-sitter-surface and tree-sitter-heex before GitHub can utilize them for syntax highlighting? @the-mikedavis has a PR integrating HEEx with the Helix editor that might solve this for us.
  3. If GitHub is going to use the Surface and HEEx parsers, would it make sense to transfer the those repositories out of my personal namespace and to the elixir-lang organization? I am happy to do so as long as I retain my status as a maintainer ๐Ÿ™‚

connorlay avatar Oct 25 '21 16:10 connorlay

@connorlay my plan is to ask to move all of them to the tree-sitter org. I will ask your questions upstream and follow up! :)

josevalim avatar Oct 25 '21 16:10 josevalim

@connorlay as for HEEx the only thing that I'm unsure about is how to parse <.tag>, if we want to use injection we need to parse "tag" into it's own node. But then, even changing the AST at any point should probably be fine, because the highlight queries will live in the same repo and can be updated together.

@josevalim when you get a chance please also ask if it's actually possible to integrate jump-to-definition with injections. I was looking at an example Rails repo and references don't seem to work for inside ERB templates, so maybe there's some limitation to that?

jonatanklosko avatar Oct 25 '21 17:10 jonatanklosko

as for HEEx the only thing that I'm unsure about is how to parse <.tag>, if we want to use injection we need to parse "tag" into it's own node. But then, even changing the AST at any point should probably be fine, because the highlight queries will live in the same repo and can be updated together.

@jonatanklosko that is a good point. The existing tests for the HEEx and Surface parsers are pretty bare and could use some work to cover more scenarios. I'm not sure how much time I'll have this week to dedicate to this ๐Ÿค”

That reminds me: we will also want to check to see if the HEEx parser works with the new slots feature.

connorlay avatar Oct 25 '21 17:10 connorlay

Hi, popping into this discussion. I have a friend who works on the jump-to-definition stuff at GitHub. He says if the highlighting queries are ready, he can submit a proposal to integrate it into the highlight service.

Do y'all already have a GitHub contact you're working with to try to get things moved down the pipe?

maco avatar Dec 13 '21 20:12 maco

Hi @maco! Help on this front would be appreciated. The highlight queries is ready but we do have one final question about repository organization. In particular, we are wondering how jump-to-definition works across multiple repositories.

For example, take https://github.com/tree-sitter/tree-sitter-embedded-template. In Ruby, you could do <%= SomeModule.method(123) %>. However, there is currently no go-to-definition for those. I am wondering if this is because this was not implemented, if this is a limitation of how injections work, or if it is a limitation on the GitHub side of things.

The reason we ask is because we have a similar setup. We have the Elixir language and we have our embedded templates, and we want to make sure our go-to-definitions will work for those embedded templates too. Maybe we need to keep all of those in the same repository and, if that's the case, we want to merge the repositories before we send it upstream!

josevalim avatar Dec 13 '21 20:12 josevalim

Good news, someone from GitHub reached out and they will enable tree-sitter-elixir on GitHub. They told me that both injections and definitions should work across repositories, so there is no reason to put everything together in the same repository.

We should tackle embedded template (.eex) definitions for Elixir and go-to definition in the future.

We should also work to have official integration for both HEEx and Surface on GitHub. @connorlay, please let us know if you need help on this front.

Also, if you want to move tree-sitter-heex to the Phoenix org, we would be glad to welcome you there! :)

josevalim avatar Dec 15 '21 18:12 josevalim

It is live, thanks everyone and @maco!

josevalim avatar Dec 15 '21 20:12 josevalim

@josevalim Exciting! To clarify, is GitHub now using tree-sitter-elixir and tree-sitter-heex, or just tree-sitter-elixir?

I'll get ready to move tree-sitter-heex to the Phoenix organization ๐Ÿ˜„

connorlay avatar Dec 15 '21 21:12 connorlay

@connorlay only tree-sitter-elixir because I believe the highlight queries are not ready for heex and surface? Or have I misunderstood?

josevalim avatar Dec 15 '21 21:12 josevalim

@josevalim No you are correct, I still need to port the highlight queries from nvim-treesitter to tree-sitter-heex and tree-sitter-surface before we can integrate them with GitHub. Thanks!

connorlay avatar Dec 15 '21 21:12 connorlay

@josevalim @jonatanklosko Thanks to the efforts of @the-mikedavis , we now have highlight queries for tree-sitter-heex! Is this ready to send to GitHub?

connorlay avatar Jan 26 '22 17:01 connorlay

I think on the heex side the queries are ready but we should probably await a resolution to #24 so that the injections are in good shape as well

the-mikedavis avatar Jan 26 '22 18:01 the-mikedavis

Revisiting this thread, do folks think it would be worth moving forward with adding tree-sitter-heex injections despite https://github.com/elixir-lang/tree-sitter-elixir/issues/24 reported as a known issue? I'm asking because today there is no syntax highlighting at all for ~H sigils in GitHub, which is making it difficult to review pull-requests for my team at work. Enabling language injection and syntax highlighting would be a good first step towards improving the Live View developer experience in GitHub, knowing that the scenarios detailed in https://github.com/elixir-lang/tree-sitter-elixir/issues/24 might look off when highlighted.

@jonatanklosko @the-mikedavis @josevalim I am happy to work on the PR to add tree-sitter-heex injections if you are comfortable sending it to GitHub once ready ๐Ÿ™‚

connorlay avatar Apr 12 '22 23:04 connorlay

I think it would be a good increment to start using the heex grammar now, especially since there currently isn't ~H highlighting ๐Ÿ‘

The cases in #24 might be pretty hard to fix so I think we shouldn't hold off for their sake. (Also thanks for reminding about it, I was dragging my feet on investigating the scoped combined injections part ๐Ÿ˜…)

What do you think, Jonatan and Josรฉ?

the-mikedavis avatar Apr 13 '22 01:04 the-mikedavis

Letโ€™s go ahead! Having HEEx support with go to definitions for components will be glorious!

josevalim avatar Apr 13 '22 05:04 josevalim