Please allow building mitmproxy-rs without tree-sitter
Problem Description
I'm a Debian Developer and am currently working with the Debian Rust team and the other mitmproxy maintainers on getting mitmproxy back into Debian, from which it was removed in July 2024. What is holding this back is the packaging of the Rust parts in mitmproxy-rs. Specifically, packaging tree-sitter is proving very challenging.
Proposal
Add a Cargo feature, for example called "highlight", to mitmproxy-rs. The feature would be enabled by default. Disabling it would turn off syntax highlighting and remove mitmproxy-rs's dependency on tree-sitter.
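To make the proposal concrete, here is a minimal sketch of how such a feature gate could look. The feature name "highlight", the function names, and the `Chunk` type are all hypothetical and not taken from the mitmproxy-rs code base; the tree-sitter call itself is left as a placeholder.

```rust
// Hypothetical Cargo.toml fragment (names are assumptions, not the real manifest):
//
//   [features]
//   default = ["highlight"]
//   highlight = ["dep:tree-sitter"]
//
//   [dependencies]
//   tree-sitter = { version = "<version>", optional = true }

/// A piece of text paired with a style tag such as "text" or "keyword".
pub type Chunk = (&'static str, String);

/// Feature enabled: this is where the tree-sitter-backed highlighter would live.
#[cfg(feature = "highlight")]
pub fn highlight(source: &str) -> Vec<Chunk> {
    let _ = source;
    // Placeholder only; the real implementation would call into tree-sitter.
    unimplemented!("tree-sitter highlighting would go here")
}

/// Feature disabled: return the input as a single unstyled chunk,
/// so callers keep working without any tree-sitter dependency.
#[cfg(not(feature = "highlight"))]
pub fn highlight(source: &str) -> Vec<Chunk> {
    vec![("text", source.to_string())]
}

/// Lets callers check at runtime whether highlighting was compiled in.
pub const fn highlighting_available() -> bool {
    cfg!(feature = "highlight")
}
```

With a layout like this, a distribution could build with `--no-default-features` and get plain-text output, while everyone else keeps the current behavior.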
Alternatives
We can also remove tree-sitter with a Debian-specific patch, but it would be cleaner to have this happen upstream so that we do not have to carry the patch.
Additional context
Debian releases a stable version around every two years. Packages in the stable release are then supported for up to 10 years. Performing security and maintenance tasks for more than 30k source packages can only work well if each software package makes regular stable releases which we can ship and (if needed) easily fix. For the most part, this is not how the Rust ecosystem operates: embedding or vendoring copies of arbitrary versions of code is the norm, not the exception. Packaging the tree-sitter grammars for xml, javascript, css and yaml can be done, but it will be a very lengthy process, and since we are doing this work as volunteers, it is hard to estimate how long it would take. It would be much easier to package a version of mitmproxy without syntax highlighting today and add syntax highlighting later, once the respective tree-sitter grammars are packaged.
I am essentially done with packaging the remaining Rust crates that were missing and am only blocked by tree-sitter right now.
Thanks!
Hi @josch,
Thank you for your packaging work! Can you clarify what makes tree-sitter so particularly hard to support?
Hi @mhils, I'm one of josch's fellow Debian Developers and have been driving most of the tree-sitter packaging, so I'll try to fill in the details.
Creating the grammar packages themselves is actually fairly repetitive. They all follow the same basic structure, so it's mostly copy/paste from one to the next.
The bigger issues are around handling tree-sitter and the grammars from a distribution's perspective, where we typically only have one version of any given project packaged.
Given that the grammar itself is, in essence, part of the API that each tree-sitter-* package exposes, that can lead to incompatibilities between the particular version of the grammar that's packaged and the queries applications have written against the grammar. For example, Neovim has developed a number of queries around specific versions of the c, lua, markdown, etc. grammars. Updates to those grammars have, many times, required updates to Neovim's queries so they can continue to work.
There's also the issue, albeit not as interesting for Rust projects, that there's no standard location or naming for the parser shared objects. Neovim, Emacs, Helix and likely other projects all have their own ideas about that.
I tried to accommodate these aspects when I started packaging grammars in Debian by making the packaged grammars provide their source in versioned paths, so each application can build against what it specifically needs. So far Neovim is the only package using the parsers, so it's been fairly contained, but this isn't really scalable. I've been planning to undo this over-engineering and just ship the built parser shared objects, providing symlinks to handle the differences in how applications expect them to be named. That doesn't address the issue of incompatibilities between grammar versions, though.
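As a rough illustration of the symlink idea, the sketch below computes which links a grammar package would need for a given language. The per-application file names are assumptions based on my understanding of each project's conventions and may differ between versions, and the choice of `libtree-sitter-<lang>.so` as the canonical file is hypothetical, not a statement about the actual Debian packages.

```rust
/// Applications mentioned above that load tree-sitter parsers.
#[derive(Debug, Clone, Copy)]
pub enum App {
    Neovim,
    Emacs,
    Helix,
}

/// Relative path under which each application is assumed to look for a parser.
/// These names are assumptions and should be checked against each project.
pub fn expected_parser_path(app: App, lang: &str) -> String {
    match app {
        App::Neovim => format!("parser/{lang}.so"),
        App::Emacs => format!("libtree-sitter-{lang}.so"),
        App::Helix => format!("grammars/{lang}.so"),
    }
}

/// Returns (link, target) pairs for one shared object built per grammar.
/// The canonical target name is a hypothetical choice for this sketch.
pub fn symlink_plan(lang: &str) -> Vec<(String, String)> {
    let target = format!("libtree-sitter-{lang}.so");
    [App::Neovim, App::Emacs, App::Helix]
        .iter()
        .map(|&app| (expected_parser_path(app, lang), target.clone()))
        // No link is needed where the expected name already matches the target.
        .filter(|(link, t)| link != t)
        .collect()
}
```

This only covers naming; it does nothing about an application's queries being written against a different grammar version than the one that is installed.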
I'm hopeful that as the tree-sitter ecosystem matures, some of these rough edges will get cleaned up, but these are the main reasons I've been slow to spread use of tree-sitter within Debian.