tree-sitter-dart
tree-sitter-dart copied to clipboard
Segmentation Fault and crashed neovim
Crashing neovim completely, Segmentation Fault (Core dumped).
+1, I can replicate when attempting to type at the bottom of the file.
Related open issue in neovim
https://github.com/nvim-treesitter/nvim-treesitter/issues/2972
Workaround For now add this commit
f95876f0ed3ef207bbd3c5c41987bc2e9cecfc97
to lockfile.json in nvim-treesitter plugin. And TSInstall dart
The issue is in scanner.c somewhere, reverting it to the version at f95876f0ed3ef207bbd3c5c41987bc2e9cecfc97 fixes the crash
I made a quick fork where I changed the version (feel free to use it until it's fixed) https://github.com/RobertBrunhage/nvim-treesitter
Just creating an empty dart file and adding one non-white-space character without a trailing semicolon is enough to cause the segmentation fault. If we write the semicolon first and then insert anything before it the parser won't crash.
As simple as this:
printf "p" > konstant.dart && nvim konstant.dart => [SIGSEV]
printf "p;" > konstant.dart && nvim konstant.dart => [works just fine]
Same issue here: Reverting to the older revision works :)
@UserNobody14 I've played around a bit with the scanner and the only way so far I have been able to prevent a segfault without a semi colon at the end is by changing the last return false; in tree_sitter_dart_external_scanner_scan to return lexer->lookahead == '\0'.
This however really sounds like a bug in tree sitter itself we stumbled upon, or it could even be in the grammar.
@UserNobody14 I've played around a bit with the scanner and the only way so far I have been able to prevent a segfault without a semi colon at the end is by changing the last
return false;intree_sitter_dart_external_scanner_scantoreturn lexer->lookahead == '\0'. This however really sounds like a bug in tree sitter itself we stumbled upon, or it could even be in the grammar.
Hey! I don't have idea of why, but it worked for me as well! @UserNobody14 is this a real fix or would it be just masking some other issue? Unfortunately the commit that introduced the bug is very big to allow attempts to fix it without a deep understanding of tree-sitter and Dart language specification.
Let me try adding that fix and running the grammar on a large list of files again, as far as I know it shouldn't cause a big issue. If it works, I'll merge it
Hmmm. After looking into the problem I can't seem to either replicate it or get the solution you posted to work with our tests. I don't have neovim setup with dart, so I tried running all of the tests I traditionally use, like parsing files and such. Sadly I was not able to recreate any of these issues, which I believe indicates it only happens on reruns of the grammer. Unfortunately without an easy way to replicate it I just don't know what I should do.
Near as I can tell the "lookahead == '\0'" line causes it to return true while the string has not yet ended, but this causes nearly all of our tests to fail, and on other projects' scanner.c files, the return false line is still present, so I don't see why it should be causing such an issue.
I found that this configuration below successfully runs the tests, but without an easy way to see if it segfaults, i have no way of knowing if it actually fixes the problem. Does anyone have some easy steps for reproducing the issue?
const bool *valid_symbols) {
// bool ret = false;
// if (lexer->lookahead == '/') {
// return scan_multiline_comments(lexer);
// }
if (lexer->lookahead == '\0') {
return false;
}
if (
valid_symbols[TEMPLATE_CHARS_DOUBLE] ||
valid_symbols[TEMPLATE_CHARS_SINGLE] ||
valid_symbols[TEMPLATE_CHARS_DOUBLE_SINGLE] ||
valid_symbols[TEMPLATE_CHARS_SINGLE_SINGLE]
) {
return scan_templates(lexer, valid_symbols);
}
if (valid_symbols[AUTOMATIC_SEMICOLON]) {
bool ret = scan_automatic_semicolon(lexer);
return ret;
}
while (iswspace(lexer->lookahead)) lexer->advance(lexer, true);
if (lexer->lookahead == '/') {
return scan_multiline_comments(lexer);
}
return false;
}```
You don't need a Neovim setup. Just let the tree-sitter parser run. Instructions https://github.com/ikatyang/tree-sitter-markdown/issues/14 (CC needs to be clang and python in the script python2). I've provided sample inputs that let the parser crash in the original issue, but I would recommend running the fuzzer (maybe even on CI)
If this is really a bug in tree-sitter, you could have a look at our open issue with tree-sitter-prisma which crashes without a custom parser @Kavantix . There is even a closed issue for tree-sitter-pug which hd the same issue that could be fixed by a check for \0
@theHamsta could you link the issues you mentioned?
@UserNobody14 I reproduced it by running tree-sitter highlight on a file containing a few random characters and no semicolon at the end. E.g. dhdh. So no need for using vim while testing
@Kavantix here you go: https://github.com/nvim-treesitter/nvim-treesitter/issues/2972#issuecomment-1141491120 As you say, no need to use vim. I bisected using git rebase -exec '...' using the setup in https://github.com/ikatyang/tree-sitter-markdown/issues/14 . There are some problems with the queries in this repo, so I had to modify the fuzzing main a bit to skip checking the queries. (basically skip all usages of https://github.com/tree-sitter/tree-sitter/blob/b729029a403046375a77a8e320211da64da00629/test/fuzz/fuzzer.cc#L7 in this file
I should really open a PR to document how to setup fuzzing in upstream tree-sitter. We found quite a lot parser issue with tree-sitter's fuzzing setup.
Is the current workaround is the one offered by @GenesisTMS , to revert to SHA f95876f0ed3ef207bbd3c5c41987bc2e9cecfc97 of this repo?
This is a manual step so all new users of Neovim + TS + Dart will have to perform it on their own. Would it be possible to revert the changes done on this repo to that SHA instead?
https://github.com/nvim-treesitter/nvim-treesitter/issues/2972 which was referenced earlier as the matching TS issue is now marked as closed.
Coming to this from nvim-treesitter.
Adding the following to test/corpus/dart.txt replicates the issue with tree-sitter test:
=====================
Weird file
=====================
d
---------------
This should at least help debugging the test.
The reason of the segfault is still a mistery to me, and the scanner.c is cryptic to me, especially scan_templates which does some kind of weird loop things.
Changing the scanner to always return false immediately gives the same crash, so it is not because of the scanner
Now that's interesting...
Any chance to revert back to the working commit until the problem is solved?
nvim-treesitter now uses an older working version for now: https://github.com/nvim-treesitter/nvim-treesitter/commit/d99b4cdd7591ab2e3ff3f962f7cc518908375805
@theHamsta a fuzzing setup guide would be tremendously appreciated, I'd love to fuzz my grammar.
@ahelwer Can't count how often I linked this post https://github.com/ikatyang/tree-sitter-markdown/issues/14#issue-725850717
@theHamsta maybe we could host a wiki page for this, although the comment is good, debugging the error is a bit more complicated than just running the fuzzer :)
Ideally such a documentation is contributed to tree-sitter upstream. On how to create safe, fuzzed parsers.
I guess so. I will look at this next week
Alright I got a commit working with @Kavantix 's highlight dhdh file as well as @vigoux 's "weird file" test (thanks btw!). Unfortunately it's still hard for me to be sure it won't crash again. Let me know if it keeps happening with the latest changes.
I'm not a genius user of scanner.c, sadly, I just kinda copied and pasted from various other tree sitter repos until I got it working. If anyone knowledgeable of C and tree-sitter wants to take a crack at it, I'd be happy to assist/merge your commits.
Seems to be ok now. Fuzzed for 55mins. Can the others please confirm that this is gone now also in Neovim?
I can confirm that this issue has been fixed on neovim, although it seems that the speed for the first load is back to normal
also, the flutter project does not crash now for me
For me the crash still happens on 53485a8f301254e19c518aa20c80f1bcf7cf5c62