py-tree-sitter-languages
Language detection
:wave: first off, huge thanks for putting this package together!
i'm wondering, with all these languages available what is the recommended way to pick a parser/language for a given file?
i see that each language implementation has a package.json section for tree-sitter configuration:
https://github.com/tree-sitter/tree-sitter-python/blob/master/package.json#L28-L32 https://github.com/latex-lsp/tree-sitter-latex/issues/19
perhaps the build process could pluck out these entries and make them available? so then a user could simply apply the file_types and content_regex rules to figure out what language to use.
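for illustration, here's a rough sketch of what "plucking out" those entries could look like. it assumes the section lives under a top-level "tree-sitter" key and uses hyphenated keys such as "file-types" and "content-regex", which is how i read the linked package.json. the function name and the sample manifest are made up for this example and are not part of any existing package.

```python
import json
from typing import Any, Dict, List


def read_tree_sitter_configs(package_json_text: str) -> List[Dict[str, Any]]:
    """Return the entries of the "tree-sitter" section, or [] if it is absent.

    Hypothetical helper: the key names are assumptions based on the linked
    package.json, normalized here to underscore style.
    """
    manifest = json.loads(package_json_text)
    entries = manifest.get("tree-sitter", [])
    return [
        {
            "scope": entry.get("scope"),
            "file_types": list(entry.get("file-types", [])),
            "content_regex": entry.get("content-regex"),  # None when not set
        }
        for entry in entries
    ]


# Sample manifest, made up for illustration only.
SAMPLE_MANIFEST = """
{
  "name": "tree-sitter-python",
  "tree-sitter": [
    {"scope": "source.python", "file-types": ["py"]}
  ]
}
"""

configs = read_tree_sitter_configs(SAMPLE_MANIFEST)
```

a build step could run something like this over each vendored grammar and ship the combined list as package data.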
there's some logic in the tree-sitter cli that does this, but unfortunately it's not part of the actual library
i guess most people are using linguist or integrating into editor environments where they already have TextMate-compatible language detection.
seems like it may be beneficial to port and bundle the detection code in this python package, so users don't have to reimplement it. wdyt?
here's the reference impl:
https://github.com/tree-sitter/tree-sitter/blob/b8f7645ae2a5e240e67f968c89328af280055c9f/cli/loader/src/lib.rs#L207-L223
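a python version might look roughly like the sketch below. it is loosely modeled on my reading of that rust code, not a faithful port: match on the file name or extension first, and only when several languages claim the same file type use each content regex to break the tie. the scoring (length of the first match, -1 for a regex that doesn't match, 0 for no regex) is my assumption about the intent. the config table, the regex, and the function name are all hypothetical, and rust and python regex syntax differ, so real patterns may need translating.

```python
import re
from pathlib import Path
from typing import Any, Dict, List, Optional

# Hypothetical table; in practice it would be built from each grammar's config.
CONFIGS: List[Dict[str, Any]] = [
    {"language": "python", "file_types": ["py"], "content_regex": None},
    {"language": "c", "file_types": ["c", "h"], "content_regex": None},
    {
        "language": "cpp",
        "file_types": ["cc", "cpp", "h"],
        # Made-up pattern for illustration only.
        "content_regex": r"\bclass\s+\w+|\bnamespace\s+\w+",
    },
]


def detect_language(
    file_name: str,
    contents: str = "",
    configs: List[Dict[str, Any]] = CONFIGS,
) -> Optional[str]:
    """Pick a language for file_name, or return None if nothing matches."""
    path = Path(file_name)
    ext = path.suffix.lstrip(".")
    # A config is a candidate if it lists the full file name or the extension.
    candidates = [
        c for c in configs
        if path.name in c["file_types"] or (ext and ext in c["file_types"])
    ]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]["language"]

    # Several candidates: score each one against the file contents.
    best, best_score = None, -2
    for config in candidates:
        pattern = config.get("content_regex")
        if pattern is None:
            score = 0  # neutral: no regex to check
        else:
            match = re.search(pattern, contents)
            # A non-matching regex is penalized below the neutral score.
            score = len(match.group(0)) if match else -1
        if score > best_score:
            best, best_score = config, score
    return best["language"]
```

with this table, `detect_language("foo.h", "namespace x {}")` picks "cpp", while a header with no c++ markers falls back to "c" because a config without a regex outranks one whose regex fails to match.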
cc @nathansobo do i have this right?
I’m open to a PR but unlikely to do it myself.