tree-sitter-ruby

Nested heredocs are not parsed correctly

Open aibaars opened this issue 5 years ago • 4 comments

puts <<HERE
  hello #{<<HERE}
  world
HERE
HERE

In the following parse tree, the ranges of the heredoc bodies are wrong:

program [0, 0] - [6, 0])
  method_call [0, 0] - [0, 11])
    method: identifier [0, 0] - [0, 4])
    arguments: argument_list [0, 5] - [0, 11])
      heredoc_beginning [0, 5] - [0, 11])
  heredoc_body [0, 11] - [3, 4])
    interpolation [1, 8] - [1, 17])
      heredoc_beginning [1, 10] - [1, 16])
    heredoc_end [3, 0] - [3, 4])
  heredoc_body [3, 4] - [4, 4])
    heredoc_end [4, 0] - [4, 4])
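For comparison, here is a small sketch of how Ruby itself evaluates the snippet, as far as I understand its heredoc rules. The variable names `inner_only` and `nested` are made up for this illustration. The inner heredoc takes the line `  world` as its body and ends at the first `HERE`. The outer body then resumes after that terminator and ends at the second `HERE`.

```ruby
# The inner heredoc on its own: its body is the single line "  world".
inner_only = <<HERE
  world
HERE

# The snippet from this issue, assigned to a variable instead of printed.
nested = <<HERE
  hello #{<<HERE}
  world
HERE
HERE

p inner_only # => "  world\n"
p nested     # => "  hello   world\n\n"
```

So the outer body should consist of line 1 plus the (empty) remainder before the final `HERE`, and the inner body should be line 2 only, not the two consecutive ranges shown in the parse tree.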

aibaars • Dec 16 '20 16:12

I've run into this issue when trying to run semgrep on a Ruby file with nested heredocs (https://github.com/returntocorp/semgrep/issues/3151).

When I paste the following into https://tree-sitter.github.io/tree-sitter/playground:

output =
  <<~ABC
    Top
    #{
      <<~DEF
        Middle
      DEF
    }
    Bottom
  ABC

puts output

I get the following output:

program [0, 0] - [13, 0])
  assignment [0, 0] - [1, 8])
    left: identifier [0, 0] - [0, 6])
    right: heredoc_beginning [1, 2] - [1, 8])
  heredoc_body [1, 8] - [9, 5])
    heredoc_content [1, 8] - [3, 4])
    interpolation [3, 4] - [7, 5])
      heredoc_beginning [4, 6] - [4, 12])
      constant [5, 8] - [5, 14])
      constant [6, 6] - [6, 9]). <-- This is the closing DEF of the HEREDOC string
    heredoc_content [7, 5] - [9, 2])
    heredoc_end [9, 2] - [9, 5])
  heredoc_body [9, 5] - [13, 0])
    heredoc_content [9, 5] - [13, 0])
    heredoc_end [13, 0] - [13, 0])

aegarbutt-stripe • May 25 '21 18:05

I have the same issue with Bourne shell. Nested heredocs are parsed completely wrong and are therefore highlighted wrong, e.g. by Neovim.

Do you have any news on the state of heredocs in treesitter?

dumblob • Oct 01 '23 15:10

@dumblob heredocs are not a treesitter feature. Support for special lexical constructs such as heredocs is implemented by scanner.cc. I guess the scanner used by tree-sitter-bash has a bug similar to the one in tree-sitter-ruby.
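To illustrate the bookkeeping a scanner needs for this, here is a toy, line-based model in Ruby. It is only a sketch under heavy assumptions and is not the logic of scanner.cc: the names `OPENING`, `Heredoc`, `read_bodies` and `heredocs` are invented for this example, identifiers must be bare upper-case words, and any `<<ID` on a body line is treated as a nested opening. The point is that a heredoc opened inside a body owns the lines that follow, and the outer body resumes only after the inner terminator.

```ruby
# Toy model only: a line-based sketch, not the logic of scanner.cc.
OPENING = /<<[~-]?([A-Z_]+)/

Heredoc = Struct.new(:id, :opened_on, :body_lines, :end_line)

# Reads the bodies of all heredocs opened on line `opened_on`, starting at
# line `cursor`. Returns the index of the first line after the last terminator.
def read_bodies(lines, opened_on, cursor, found)
  lines[opened_on].scan(OPENING).flatten.each do |id|
    doc = Heredoc.new(id, opened_on, [], nil)
    found << doc
    while cursor < lines.length
      if lines[cursor].strip == id
        doc.end_line = cursor
        cursor += 1
        break
      end
      doc.body_lines << cursor
      # A heredoc opened on this body line owns the lines that follow it;
      # the outer body resumes after the inner terminator.
      cursor = read_bodies(lines, cursor, cursor + 1, found)
    end
  end
  cursor
end

# Returns every heredoc found in `source`, outer ones before nested ones.
def heredocs(source)
  lines = source.lines(chomp: true)
  found = []
  i = 0
  i = read_bodies(lines, i, i + 1, found) while i < lines.length
  found
end

# The first snippet from this issue, built line by line so that the
# interpolation stays literal text.
source = ["puts <<HERE", '  hello #{<<HERE}', "  world", "HERE", "HERE"].join("\n")
heredocs(source).each { |d| p [d.id, d.opened_on, d.body_lines, d.end_line] }
# prints:
# ["HERE", 0, [1], 4]
# ["HERE", 1, [2], 3]
```

In this model the outer heredoc owns line 1 and ends on line 4, while the nested one owns line 2 and ends on line 3, which is the interleaving that the reported parse trees get wrong.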

aibaars avatar Oct 02 '23 08:10 aibaars