py-tree-sitter-languages

Language detection

Open tmm1 opened this issue 2 years ago • 2 comments

:wave: first off, huge thanks for putting this package together!

i'm wondering, with all these languages available what is the recommended way to pick a parser/language for a given file?

i see that each language implementation has a package.json section for tree-sitter configuration:

https://github.com/tree-sitter/tree-sitter-python/blob/master/package.json#L28-L32

https://github.com/latex-lsp/tree-sitter-latex/issues/19

perhaps the build process could pluck out these entries and make them available? so then a user could simply apply the file_types and content_regex rules to figure out what language to use.
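A minimal sketch of what that could look like on the Python side, assuming the package.json of each grammar has already been read into a string. The helper names `load_rules` and `candidates_for` are hypothetical and not part of this package; the `file-types` and `content-regex` keys are the hyphenated spellings used inside the `tree-sitter` section of package.json.

```python
import json
from pathlib import PurePath


def load_rules(package_json_text, language_name):
    """Collect detection rules from one grammar's package.json text.

    `language_name` is supplied by the caller (hypothetical convention),
    since the manifest itself only carries a scope such as "source.python".
    """
    manifest = json.loads(package_json_text)
    rules = []
    for entry in manifest.get("tree-sitter", []):
        rules.append({
            "language": language_name,
            "file_types": entry.get("file-types", []),
            "content_regex": entry.get("content-regex"),  # None when absent
        })
    return rules


def candidates_for(path, rules):
    """Return the rules whose file types match the file name or its extension."""
    p = PurePath(path)
    name = p.name
    ext = p.suffix.lstrip(".")
    return [
        r for r in rules
        if name in r["file_types"] or (ext and ext in r["file_types"])
    ]


if __name__ == "__main__":
    sample = '{"tree-sitter": [{"scope": "source.python", "file-types": ["py"]}]}'
    rules = load_rules(sample, "python")
    print([r["language"] for r in candidates_for("src/app.py", rules)])
```

When more than one rule matches the same extension, the `content_regex` values would be needed to break the tie.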

tmm1, Jul 27 '23 17:07

> i'm wondering, with all these languages available what is the recommended way to pick a parser/language for a given file?

there's some logic in the tree-sitter cli that does this, but unfortunately it's not part of the actual library

i guess most people are using linguist or integrating into editor environments where they already have textmate compatible language detection.

seems like it may be beneficial to port and bundle the detection code in this python package, so users don't have to reimplement it. wdyt?

here's the reference impl:

https://github.com/tree-sitter/tree-sitter/blob/b8f7645ae2a5e240e67f968c89328af280055c9f/cli/loader/src/lib.rs#L207-L223
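A rough Python sketch of the tie-breaking step, based on my reading of the linked loader code and not verified against it line by line: when several configurations match a file name, each one is scored by its content regex, and the highest score wins. The function name `pick_language` and the shape of the candidate dicts are hypothetical. Note also that the Rust `regex` crate and Python's `re` differ in syntax, so patterns may need adjusting.

```python
import re


def pick_language(candidates, file_text):
    """Choose one candidate using content-regex scoring.

    Each candidate is a dict with "language" and an optional "content_regex".
    Scoring (an assumption modelled on the tree-sitter CLI loader):
      - regex present and matching: length of the first match
      - no regex: 0
      - regex present but not matching: -1 (penalized below "no regex")
    """
    best, best_score = None, -2
    for cand in candidates:
        pattern = cand.get("content_regex")
        if pattern is None:
            score = 0
        else:
            match = re.search(pattern, file_text)
            score = len(match.group(0)) if match else -1
        # Strict comparison: on a tie, the earlier candidate is kept.
        if score > best_score:
            best, best_score = cand, score
    return best["language"] if best else None


if __name__ == "__main__":
    # Made-up candidates for illustration only.
    candidates = [
        {"language": "alpha", "content_regex": None},
        {"language": "beta", "content_regex": r"^#!.*\bbeta\b"},
    ]
    print(pick_language(candidates, "#!/usr/bin/env beta\nprint 1\n"))
    print(pick_language(candidates, "plain text\n"))
```

Penalizing a non-matching regex means a configuration without any content regex is preferred over one whose regex failed to match.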

cc @nathansobo do i have this right?

tmm1, Jul 27 '23 18:07

I’m open to a PR but unlikely to do it myself.

grantjenks, Jul 27 '23 19:07