tree-sitter-dockerfile

Segfault when file doesn't end in newline

Open kopecs opened this issue 3 years ago • 3 comments

Attempting to parse the file

LABEL A=$B

where the file does not end with a trailing newline results in a segfault. If I run with -d, I get the following:

trace

new_parse
process version:0, version_count:1, state:1, row:0, col:0
lex_internal state:157, row:0, column:0
  consume character:'L'
  consume character:'A'
  consume character:'B'
  consume character:'E'
  consume character:'L'
lexed_lookahead sym:LABEL, size:5
shift state:156
process version:0, version_count:1, state:156, row:0, col:5
lex_internal state:48, row:0, column:5
  skip character:' '
  consume character:'A'
lexed_lookahead sym:unquoted_string, size:2
shift state:234
process version:0, version_count:1, state:234, row:0, col:7
lex_internal state:157, row:0, column:7
  consume character:'='
lexed_lookahead sym:=, size:1
shift state:15
process version:0, version_count:1, state:15, row:0, col:8
lex_internal state:25, row:0, column:8
  consume character:'$'
lexed_lookahead sym:$, size:1
shift state:140
process version:0, version_count:1, state:140, row:0, col:9
lex_internal state:44, row:0, column:9
  consume character:'B'
lexed_lookahead sym:variable, size:1
shift state:82
process version:0, version_count:1, state:82, row:0, col:10
lex_internal state:13, row:0, column:10
lex_internal state:0, row:0, column:10
lexed_lookahead sym:end, size:0
detect_error
resume version:0
recover_with_missing symbol:
, state:5
recover_eof
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
process version:1, version_count:11, state:303, row:0, col:10
no_lookahead_after_non_terminal_extra
reduce sym:line_continuation, child_count:1

which seems to suggest an issue in error recovery.

On x86_64, I get a parse error instead of a segfault.

Environment:

  • tree-sitter: 0.20.6
  • OS: macOS 12.4
  • arch: aarch64

This may be better suited as an issue in https://github.com/tree-sitter/tree-sitter/, so let me know if it should be moved there.

kopecs, Aug 16 '22 20:08

Hi @kopecs! Thanks for the report.

When you say "attempting to parse the file", how exactly are you doing that? When I run tree-sitter parse on a file with that content, I can reproduce the parse error, but I'm unable to reproduce the segfault. I'm also using tree-sitter 0.20.6 on an aarch64 Mac.

camdencheek, Aug 17 '22 23:08

I'm also running tree-sitter parse. Running the following reproduces this for both me and a coworker:

git clone git@github.com:camdencheek/tree-sitter-dockerfile.git \
    && cd tree-sitter-dockerfile \
    && printf 'LABEL A=$B' > Dockerfile \
    && tree-sitter generate \
    && tree-sitter parse Dockerfile
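
Because the reproduction depends on the file having no final newline, it can help to confirm that before parsing. The following is a minimal sketch, assuming a POSIX-style shell with `tail` and `mktemp` available; the helper name `ends_with_newline` and the temporary directory are illustrative and not part of tree-sitter or this repository.

```shell
# Hypothetical helper: succeeds only if the file is non-empty and its
# last byte is a newline.
ends_with_newline() {
    # tail -c 1 prints the final byte; command substitution strips a
    # trailing newline, so an empty result means that byte was '\n'.
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Recreate the reproduction file in a scratch directory (assumption:
# mktemp -d is available, as it is on macOS and most Linux systems).
tmp=$(mktemp -d)
printf 'LABEL A=$B' > "$tmp/Dockerfile"

if ends_with_newline "$tmp/Dockerfile"; then
    echo "ends with newline"
else
    echo "no trailing newline"
fi
```

Writing the file with `printf` rather than `echo` matters here, since `echo` would append the newline that the reproduction needs to be absent.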

In case it's helpful, my C compiler is

Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: arm64-apple-darwin21.5.0
Thread model: posix

kopecs, Aug 18 '22 04:08

Oh, sure enough, a fresh clone reproduced this. Turns out my main branch wasn't up to date 🤦

The "doesn't handle newlines" part of this is almost definitely a bug in the Dockerfile grammar, but tree-sitter probably shouldn't be segfaulting. I opened an issue upstream with your reproduction steps, but I'm going to leave this issue open until I get around to fixing the newline handling causing the parse error.
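
Until the newline handling is fixed, one possible stopgap is to make sure the input ends in a newline before parsing it. This is only a sketch, assuming the crash is triggered solely by the missing final newline, as the issue title suggests; the helper name `ensure_trailing_newline` is hypothetical, and the final `tree-sitter parse` step is left commented out because it needs the grammar checkout.

```shell
# Hypothetical helper: append a newline only when the file is non-empty
# and its last byte is not already a newline.
ensure_trailing_newline() {
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n' >> "$1"
    fi
}

# Scratch copy of the reproduction file (assumption: mktemp -d exists).
tmp=$(mktemp -d)
printf 'LABEL A=$B' > "$tmp/Dockerfile"

ensure_trailing_newline "$tmp/Dockerfile"
# Calling it again is a no-op, since the file now ends in '\n'.
ensure_trailing_newline "$tmp/Dockerfile"

# tree-sitter parse "$tmp/Dockerfile"   # run from the grammar checkout
```

This changes the input rather than fixing the grammar, so it only sidesteps the problem for files you control.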

camdencheek, Aug 18 '22 15:08

Closing, since the upstream issue has been closed and I can no longer reproduce this with the latest version of tree-sitter.

camdencheek, Apr 19 '24 20:04