tree-sitter-kotlin icon indicating copy to clipboard operation
tree-sitter-kotlin copied to clipboard

Add notes on contributing, especially w.r.t the parser verification check

Open fwcd opened this issue 11 months ago • 2 comments

New PRs regularly fail the CI check that verifies that the generated parser matches the committed parser.c, presumably because their development machine didn't generate a byte-for-byte-exact copy of the parser as the GitHub Actions runner.

Perhaps we could add some notes to clarify the motivation behind this check and how to successfully generate a conforming parser.c. Something along the lines of:

  • The check exists since it is infeasible to review a ~20 MB C source file. Suggestions for something less brittle than a byte-for-byte check are of course welcome, however.
  • Unfortunately, the output of tree-sitter generate only seems to be deterministic when run on the same platform. Since the GitHub-hosted runners use x86_64 Linux, contributors have to generate the parser from that platform too (even aarch64 Linux/macOS will produce a different output).
  • One option besides running Linux natively or e.g. through a Docker container would be to use the regenerate-parser.yml workflow in their own fork. This would potentially require customizing the branch it runs on.

fwcd avatar Jul 15 '23 16:07 fwcd

Do we really need parser.c in the repo. The tree-sitter-swift or tree-sitter-java don't have it in the repo either.

https://github.com/alex-pinkus/tree-sitter-swift#frequently-asked-questions https://github.com/tree-sitter/tree-sitter-java

ketkarameya avatar Jul 17 '23 03:07 ketkarameya

tree-sitter-java does in fact check it in (see here), but I do like the approach of tree-sitter-swift, that is, omitting the parser and publishing it as a CI artifact. That would probably also help with repo growth, which may become problematic due to the frequent changes in parser.c.

The only reason I am a bit hesitant is that it seems to deviate from the dominant convention of including the parser, even if it's comparatively large. See e.g.

fwcd avatar Jul 21 '23 00:07 fwcd