Split into core repo and grammars

Open · mjambon opened this issue 3 years ago · 1 comment

Goal: make it simple to use the ocaml-tree-sitter runtime library as a git submodule without pulling the big tree-sitter-* submodules that contain generated files for real-world languages.

Proposed split:

  • ocaml-tree-sitter-core repo containing:
    • src and scripts folders with all the machinery for code generation and runtime
    • tests folder with the end-to-end tests on simple grammars
  • ocaml-tree-sitter-semgrep repo containing:
    • lang folder which wraps around real-world grammars and includes tree-sitter-* repos as submodules
    • ocaml-tree-sitter-core repo as a submodule
  • ocaml-tree-sitter-languages repo: similar to ocaml-tree-sitter-semgrep but without the grammar extensions for semgrep patterns. This is for the community of tree-sitter and OCaml users, independent of semgrep.

Two-step plan

Phase 1

  1. Clone ocaml-tree-sitter to ocaml-tree-sitter-core.
  2. Remove the languages from the lang/ folder in ocaml-tree-sitter-core.
  3. Add ocaml-tree-sitter-core as a submodule of ocaml-tree-sitter. Remove the code it duplicates. Create symlinks into the submodule as needed.
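
For reference, the Phase 1 steps could be sketched roughly as the dry run below. It only builds the command lines and executes nothing. The base URL, the submodule path `core`, the choice of `src` as the duplicated folder, and the language names are all assumptions made for illustration; the actual steps may well differ (for instance, step 1 may be done through the hosting service rather than a local clone).

```python
import shlex
from typing import List


def phase1_commands(base_url: str, languages: List[str]) -> List[List[str]]:
    """Build the Phase 1 commands as argv lists (dry run: nothing is executed).

    base_url and languages are hypothetical inputs, not taken from the issue.
    """
    core_url = f"{base_url}/ocaml-tree-sitter-core.git"
    cmds = [
        # Step 1: clone ocaml-tree-sitter into a directory named ocaml-tree-sitter-core.
        ["git", "clone", f"{base_url}/ocaml-tree-sitter.git", "ocaml-tree-sitter-core"],
    ]
    # Step 2: remove each language from lang/ in the core clone.
    for name in languages:
        cmds.append(["git", "-C", "ocaml-tree-sitter-core", "rm", "-r", f"lang/{name}"])
    # Step 3: add core as a submodule (path "core" is an assumption),
    # drop one duplicated folder, and replace it with a symlink into the submodule.
    cmds.append(["git", "-C", "ocaml-tree-sitter", "submodule", "add", core_url, "core"])
    cmds.append(["git", "-C", "ocaml-tree-sitter", "rm", "-r", "src"])
    cmds.append(["ln", "-s", "core/src", "ocaml-tree-sitter/src"])
    return cmds


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in phase1_commands("https://example.com/some-org", ["java", "ruby"]):
        print(shlex.join(argv))
```

The symlink target `core/src` is relative to the link's own directory, so it resolves to the `src` folder inside the submodule.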

Phase 2

  1. Rename ocaml-tree-sitter → ocaml-tree-sitter-semgrep.
  2. Make semgrep use ocaml-tree-sitter-core instead of ocaml-tree-sitter.
  3. Create community repo ocaml-tree-sitter-languages on the same model as ocaml-tree-sitter-semgrep.

[Phase 3 - later]

Simplify the structure of the repos and minimize reliance on symlinks.

Progress

  • [x] phase 1
  • [x] phase 2.1
  • [ ] phase 2.2
  • [x] phase 2.3
  • [x] update documentation in ocaml-tree-sitter-core
  • [x] update documentation in ocaml-tree-sitter-semgrep
  • [x] update links to documentation from semgrep

mjambon · May 17 '21 19:05

Yep, sounds good

aryx · May 18 '21 06:05