Curated list of languages
Extracting from #113 into its own issue.
Many people have expressed a desire to add support for new languages. The current story is unclear because the list of syntaxes comes from https://github.com/sublimehq/Packages/, whose README says:
Pull requests for new packages won't be accepted at this stage, as new packages can cause issues for users who have a package with the same name installed via Package Control. There are some planned changes that will address this in the future.
Decoupling the list of supported languages from sublimehq/Packages would allow us to move forward.
List of already discussed options:
- Take a dependency on some other curated list of syntaxes, e.g. https://github.com/Keats/gutenberg/tree/master/sublime_syntaxes or https://github.com/github/linguist/tree/master/vendor/grammars
- Create a new project for a list of curated grammar files and consume it from syntect
In https://github.com/trishume/syntect/pull/113#issuecomment-389018043 @trishume said
If someone's willing to do the work to automatically convert all those tmBundles to sublime-syntaxes and add a cargo feature to bundle the extra syntaxes, I'm probably willing to accept that. Might need to set up Git LFS for the packdumps for that so they don't bloat the Git repo too much, but that's fine.
I'm totally okay with some curation though, especially since we can use sublime-syntaxes; I bet there are some higher quality syntaxes for certain languages than the ones GitHub uses.
I'm opening this issue in a hope to continue this conversation and start some work.
I would definitely be interested in this as well. I currently also maintain a (small) list of syntaxes for bat: bat/assets/syntaxes.
I think a separate repository might be useful such that the syntect repository will not be polluted with issues/PRs for new syntaxes(?).
Concerning linguist, note that a lot of these are atom syntaxes. Still, this could be a useful resource.
Regarding logistics, I think Git submodules are a great way of bundling different repositories since we keep the link to the upstream source. Unfortunately, a lot of Sublime Syntax repositories do not contain a .sublime-syntax file. Ideally, I think, the .tmBundle => .sublime-syntax conversion should be done by some script when the syntax-bundle repository is "built".
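As a rough illustration of the build-time decision described above, here is a minimal sketch that looks at the file names in one package and decides whether it can be used as-is or needs a conversion step first. It uses only the standard library; `Action` and `plan_for_package` are hypothetical names, and the conversion itself is deliberately left out.

```rust
/// Hypothetical outcome of inspecting one syntax package at build time.
#[derive(Debug, PartialEq)]
enum Action {
    /// The package already ships a `.sublime-syntax` file.
    UseAsIs,
    /// Only a `.tmLanguage` file is present, so a conversion step is needed.
    Convert,
    /// Neither format was found.
    Skip,
}

/// Decide what to do with a package, given the file names it contains.
fn plan_for_package(files: &[&str]) -> Action {
    // True if any file name ends with the given extension.
    let has = |ext: &str| files.iter().any(|f| f.ends_with(ext));

    if has(".sublime-syntax") {
        Action::UseAsIs
    } else if has(".tmLanguage") {
        Action::Convert
    } else {
        Action::Skip
    }
}
```

A build script could run this over each submodule and hand only the `Convert` packages to whatever converter is chosen.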
Also, I just found this fork of sublimehq/Packages which is used by syntect_server. It has a lot more syntaxes than the default Packages repo.
Just saw this. I am indeed also interested in how we could make adding more language syntaxes a more official process (there was some prior discussion of this at https://github.com/sourcegraph/syntect_server/issues/3 but the issue is a bit stale unfortunately).
I saw you found my fork of sublimehq/Packages @sharkdp :)
In that repo all I've really done is taken existing .tmLanguage files and converted them to .sublime-syntax via ST3's builtin command (and added some SOURCE and VERSION files to keep track of where they came from), e.g. https://github.com/slimsag/Packages/tree/master/Swift
But one thing to consider is: how can we override what is provided by sublimehq/Packages, as well? For example, with JavaScript/JSX files we've found https://github.com/babel/babel-sublime to be superior to sublimehq/Packages' builtin JavaScript syntax, but using it requires either replacing the builtin package or having some type of package disablement feature like ST3 has.
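One way to picture the override question is a name-keyed merge: start from the builtin packages, drop anything on a disable list, then let extra packages replace builtins that share a name. The sketch below uses only the standard library and does not touch syntect's API; `Package`, `merge_packages`, and the `source` labels are hypothetical names for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical record describing where a syntax package comes from.
#[derive(Debug, Clone, PartialEq)]
struct Package {
    name: String,   // e.g. "JavaScript"
    source: String, // e.g. "sublimehq/Packages" or "babel/babel-sublime"
}

/// Merge builtin and extra packages, keyed by package name.
///
/// Names in `disabled` are dropped from the builtins first; after that,
/// every extra package replaces a builtin of the same name.
fn merge_packages(
    builtin: &[Package],
    extra: &[Package],
    disabled: &BTreeSet<String>,
) -> BTreeMap<String, Package> {
    let mut merged = BTreeMap::new();

    // Step 1: keep builtins that are not explicitly disabled.
    for pkg in builtin {
        if !disabled.contains(&pkg.name) {
            merged.insert(pkg.name.clone(), pkg.clone());
        }
    }

    // Step 2: extras win on a name clash (insert overwrites the old entry).
    for pkg in extra {
        merged.insert(pkg.name.clone(), pkg.clone());
    }

    merged
}
```

With this shape, replacing the builtin JavaScript package only needs an extra entry named "JavaScript", while the disable list covers builtins that should be removed without a replacement.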
I'm late to this discussion, but have reason to be looking for just such a thing right now, and it seems like a shared repo which we can all contribute to (combined with a relatively simple config file and build.rs to allow subsetting?) would be a better long-term solution than everyone maintaining their own forks and repositories.
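To make the subsetting idea a bit more concrete, here is a minimal sketch of what a build step could do with a simple config file: read an allowlist of language names and keep only the matching `.sublime-syntax` files. The config format (one name per line, `#` for comments), the directory layout, and the function names are all assumptions made for illustration, not an existing convention.

```rust
use std::collections::BTreeSet;
use std::path::Path;

/// Parse a hypothetical allowlist: one language name per line.
/// Blank lines and lines starting with '#' are ignored; names are
/// lowercased so that matching is case-insensitive.
fn parse_allowlist(config: &str) -> BTreeSet<String> {
    config
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(|line| line.to_lowercase())
        .collect()
}

/// Keep only the `.sublime-syntax` files whose file stem is on the allowlist.
/// The input order is preserved.
fn select_syntaxes<'a>(files: &[&'a str], allow: &BTreeSet<String>) -> Vec<&'a str> {
    files
        .iter()
        .copied()
        .filter(|file| {
            let path = Path::new(file);
            let is_syntax =
                path.extension().and_then(|e| e.to_str()) == Some("sublime-syntax");
            let stem = path.file_stem().and_then(|s| s.to_str()).unwrap_or("");
            is_syntax && allow.contains(&stem.to_lowercase())
        })
        .collect()
}
```

A `build.rs` could pass the selected paths on to whatever step produces the bundled syntax dump, so each consumer ships only the languages it asked for.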
If folks are interested, and @Keats is up for it, the work they've done with Zola (formerly Gutenberg) seems like it's pretty ready-made for this as a well-set-up starting point.
Thoughts?
Yup that sounds reasonable. Would be nice if it was just a crate that people could include that depended on syntect and just provided functions kind of like the existing ones for loading default syntaxes. Maybe with Cargo features to disable really obscure languages or something.
If someone makes such a crate/repo I will link to it prominently in the readme and docs.
How would I go about getting mdcat to highlight zig files? https://github.com/ziglang/sublime-zig-language/
@daurnimator probably mdcat would need to use a syntax dump created from the list of syntaxes of bat, zola or syntect_server. Then just add the Zig language file to the repo of syntaxes you're using.
Current blockers by syntect using sublimehq/Packages
Currently, syntect neither supports branching in ST4050+ (compare #271) nor v2 related changes of ST4075+.
Compare the new docs at:
- https://www.sublimetext.com/docs/syntax.html#ver-dev
- https://www.sublimetext.com/docs/syntax.html#ver-3.2
Recent changes at sublimehq/Packages
Major re-writes for ST4xxx+:
- CSS: sublimehq/Packages#2556
- Java: sublimehq/Packages#2654
- JavaScript:
  - JsDoc: sublimehq/Packages#2557
  - TypeScript/TSX: sublimehq/Packages@ec091f8
  - move to sublime-syntax v2: sublimehq/Packages#2418
  - for more compare: https://github.com/sublimehq/Packages/commits/master/JavaScript
- Erlang: sublimehq/Packages#2464
- Markdown: sublimehq/Packages#2339
- Haskell: sublimehq/Packages#2225
- PHP: sublimehq/Packages#2275
- Rust: sublimehq/Packages#2305
- there are probably more I missed
Missing languages at sublimehq/Packages
SublimeHQ would like to have support for Swift. Old related PRs that have stalled:
- sublimehq/Packages#11
- sublimehq/Packages#253
There are currently no plans to accept languages other than Swift to https://github.com/sublimehq/Packages as far as I know.
This looks like something the syntect project really needs. Broot deals with the problem by using the work done by the bat project, which looks like a very good starting point, but an alternative would be convenient for everybody, and it shouldn't be based on git submodules, which don't really make sense as a packaging solution. A crate with features looks like a good solution.
Have there been any decisions or progress made on this in the last year? I'm particularly interested in Elixir support (a la https://github.com/trishume/syntect/issues/134).
If there've been no updates, that's totally okay. I just wanted to check! 😄