Inconsistent/undocumented lowercase usage for grammars/languages

Open • kleinweby opened this issue 3 years ago • 1 comment

Summary

I tried to integrate a grammar for BitBake files into helix. This grammar uses BB as its name. While a bit odd, I don't think it's prohibited to use uppercase letters in a grammar name.

However, this proves difficult to use with helix, as it calls to_ascii_lowercase() for some operations. Furthermore, this is not even consistent: declaring a grammar with name = "BB" and building it produces a BB.so, but loading fails because it cannot find a bb.so. (I know that even if that had worked, it still would not find the tree_sitter_<name> symbol, as that is also lowercased.)
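To make the mismatch concrete, here is a minimal sketch of the behaviour as I understand it. The function names and the directory are hypothetical and are not helix's actual code; only the to_ascii_lowercase() calls and the resulting BB.so / bb.so / tree_sitter_bb names come from what I observed.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the build step: the library file is named
/// after the grammar exactly as declared in languages.toml.
fn built_library_path(dir: &Path, grammar_name: &str) -> PathBuf {
    dir.join(format!("{}.so", grammar_name))
}

/// Hypothetical stand-in for the load step: the name is lowercased
/// before the library file is looked up.
fn lookup_library_path(dir: &Path, grammar_name: &str) -> PathBuf {
    dir.join(format!("{}.so", grammar_name.to_ascii_lowercase()))
}

/// Hypothetical stand-in for the symbol lookup: the name is lowercased
/// here as well, so a parser exporting `tree_sitter_BB` is not found.
fn lookup_symbol(grammar_name: &str) -> String {
    format!("tree_sitter_{}", grammar_name.to_ascii_lowercase())
}
```

With name = "BB", the first function yields a path ending in BB.so while the second looks for bb.so, so on a case-sensitive filesystem the load fails even though the build succeeded.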

Note: when changing the tree-sitter grammar to use a lowercase name, and adjusting all my configs accordingly it works fine.

Is this forced lowercase there for a reason? If so, I think it should be documented, and at the very least hx --grammar build should be fixed.

In my opinion, helix should just use the case supplied.

Reproduction Steps

Put the following into languages.toml:

[[language]]
name = "BB"
scope = "source.BB"
file-types = ["bb", "bbclass", "bbappend", "inc"]
roots = []
grammar = "BB"
comment-token = "#"

[[grammar]]
name = "BB"
source = { git = "https://github.com/nateglims/tree-sitter-bb.git", rev = "d8c986da8cd043551e3084280c2f3c1edd6e49c8" }

Place the queries into runtime/grammars/BB/.

Fetch and build: hx --grammar fetch && hx --grammar build

Open a BitBake file and be happy. (Example file to be found here)

Platform

Linux

Terminal Emulator

alacritty 0.10.1

Helix Version

helix 22.08.1 (9b7f349f)

kleinweby, Oct 18 '22 12:10

There's certainly a convention for tree-sitter grammars to have lowercase language names, but it looks like it's not enforced by tree-sitter-cli. In this case we can't use lowercase because tree-sitter-bb's parser function is named tree_sitter_BB.

I don't think there's any reason to limit ourselves to lowercase here: the relevant to_ascii_lowercase calls should be removed.
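As a rough sketch of what dropping the lowercasing would mean (the helper names here are made up for illustration and are not the actual helix functions), both the library file name and the parser symbol would be derived from the grammar name exactly as configured:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: library path derived from the configured name,
/// with its case preserved.
fn library_path_preserving_case(dir: &Path, grammar_name: &str) -> PathBuf {
    dir.join(format!("{}.so", grammar_name))
}

/// Hypothetical helper: parser symbol derived from the configured name,
/// with its case preserved.
fn symbol_name_preserving_case(grammar_name: &str) -> String {
    format!("tree_sitter_{}", grammar_name)
}
```

For name = "BB" this gives BB.so and tree_sitter_BB, which matches what the build produces and what tree-sitter-bb exports. Lowercase grammar names are unaffected.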

the-mikedavis, Oct 18 '22 23:10