tree-sitter-dockerfile
tree-sitter-dockerfile copied to clipboard
Make language injections easier by making nodes anonymous
Without heredoc_line and shell_fragment being present in a tree it's much easier to add language injections, for example, with bash:
((shell_command) @injection.content
(#set! injection.language "bash"))
((run_instruction
(heredoc_block) @injection.content)
(#set! injection.language "bash"))
With the way it's implemented right now it doesn't seem to be possible to do these kinds of injections correctly. There is a Neovim issue with a bit more details for shell_command injections https://github.com/nvim-treesitter/nvim-treesitter/issues/6530
Here are the injections used at the moment in nvim-treesitter:
((comment) @injection.content
(#set! injection.language "comment"))
((shell_command
(shell_fragment) @injection.content)
(#set! injection.language "bash")
(#set! injection.combined))
((run_instruction
(heredoc_block) @injection.content)
(#set! injection.language "bash")
(#set! injection.include-children))
There doesn't seem to be a better way to "join" shell_fragment nodes than using injection.combined, which isn't quite great since it treats all of the scripts in the whole dockerfile as a single large script, breaking highlighting in a weird way. With these changes every RUN / CMD / ENTRYPOINT instruction can be injected independently, but without splitting it into different chunks between line_continuation nodes.
For heredoc, include-children is needed to highlight all heredoc_line nodes, but it also includes heredoc_end node, which isn't quite right. Injecting every heredoc as bash is not itself correct though, but that's a different issue I think.
Thanks for maintaining this project! :+1:
Playing with examples/1 I noticed one more thing. With code like this:
RUN apk update && apk add --no-cache \
# some comment here
bash
bash after the comment is treated as a command inside of an injection instead of an argument, since comment is consumed up until the newline (excluding newline itself), so effectively it is parsed by bash as:
apk update && apk add --no-cache \
bash
Note the blank line there.
It should be interpreted like this though:
apk update && apk add --no-cache \
bash
For that to work comment should also consume a newline. I tried adding a newline into regex there and all tests are still passing. I couldn't come up with any negative consequences of that yet, so it should probably be added as well.
See https://github.com/nvim-treesitter/nvim-treesitter/issues/6975 for a related issue. (The current setup makes it hard to parse multiline commands as bash, since we need to inject into fragments and combine them together -- and that can lead to false positives.)
Any progress on this?
I've rebased on top of main and made comment consume a newline.
Not sure what else I need to do, but let me know if there is something. I've never worked with treesitter grammars before.