ocaml-tree-sitter-semgrep
Split into core repo and grammars
Goal: make it simple to use the ocaml-tree-sitter runtime library as a git submodule without pulling the big tree-sitter-* submodules that contain generated files for real-world languages.
Proposed split:
- ocaml-tree-sitter-core repo containing:
  - `src` and `scripts` folders with all the machinery for code generation and runtime
  - `tests` folder with the end-to-end tests on simple grammars
- ocaml-tree-sitter-semgrep repo containing:
  - `lang` folder which wraps around real-world grammars and includes tree-sitter-* repos as submodules
  - ocaml-tree-sitter-core repo as a submodule
- ocaml-tree-sitter-languages repo: similar to ocaml-tree-sitter-semgrep but without the grammar extensions for semgrep patterns. It is meant for the community of tree-sitter and OCaml users, independently of semgrep.
Two-step plan
Phase 1
- Clone ocaml-tree-sitter to ocaml-tree-sitter-core.
- Remove the languages from the `lang/` folder in ocaml-tree-sitter-core.
- Create `core` submodule in ocaml-tree-sitter. Remove duplicate code. Create symlinks to `core` as needed.
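
The symlink step in Phase 1 could be sketched roughly as follows. This is only an illustration, not the script used in the repo: the helper name `link_into_core` is hypothetical, and the choice of `src` and `scripts` as the folders to link is an assumption based on the proposed split above. It creates relative links so that they keep working wherever the checkout lives.

```python
import os
import tempfile


def link_into_core(repo_root, names, core_dir="core"):
    """Create relative symlinks repo_root/<name> -> <core_dir>/<name>.

    Hypothetical helper: paths that already exist are left untouched
    and reported as skipped, so duplicate code must be removed first.
    """
    created, skipped = [], []
    for name in names:
        link_path = os.path.join(repo_root, name)
        # Relative target, resolved from the directory containing the link.
        target = os.path.join(core_dir, name)
        if os.path.lexists(link_path):
            skipped.append(name)
            continue
        os.symlink(target, link_path)
        created.append(name)
    return created, skipped


if __name__ == "__main__":
    # Stand-in for a checkout whose `core` submodule is already populated.
    with tempfile.TemporaryDirectory() as root:
        for name in ("src", "scripts"):  # assumed folder names
            os.makedirs(os.path.join(root, "core", name))
        created, skipped = link_into_core(root, ["src", "scripts"])
        print("created:", created, "skipped:", skipped)
```

Skipping existing paths rather than overwriting them keeps the helper safe to rerun while duplicate code is still being removed.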
Phase 2
- Rename ocaml-tree-sitter → ocaml-tree-sitter-semgrep.
- Make semgrep use ocaml-tree-sitter-core instead of ocaml-tree-sitter.
- Create community repo ocaml-tree-sitter-languages on the same model as ocaml-tree-sitter-semgrep.
[Phase 3 - later]
Simplify the structure of the repos, minimize reliance on symlinks.
Progress
- [x] phase 1
- [x] phase 2.1
- [ ] phase 2.2
- [x] phase 2.3
- [x] update documentation in ocaml-tree-sitter-core
- [x] update documentation in ocaml-tree-sitter-semgrep
- [x] update links to documentation from semgrep
Yep, sounds good