helix
helix copied to clipboard
Allow using path suffixes to associate language file-types
Related:
- #2443
- #1426
After playing around with shellcheck, I think it's not the way to go mostly because of the path separator between linux and windows (\ vs /). Hence this PR uses globbing instead
Now it's possible to associate for example .git/config to the git-config grammar by adding the following
[[language]]
name = "git-config"
file-globs = ["*.git/config"]

~~Also, I didn't manage to get the ssh-client-config to work. Anyone has any ideas.. ?~~
SSH client config
[[language]]
name = "sshclientconfig"
scope = "source.sshclientconfig"
file-globs = ["*.ssh/config"]
roots = []
[[grammar]]
name = "sshclientconfig"
source = { git = "https://github.com/metio/tree-sitter-ssh-client-config", rev = "769d7a01a2e5493b4bb5a51096c6bf4be130b024" }

Do we really need file globbing here? Trying to match the end of the file path to a partial path given in file-types covers most of the cases that I can think of. So something like this:
[[language]]
name = "ssh-client-config"
scope = "source.ssh-client-config"
file-types = [ ".ssh/config" ]
We will definitely lose flexibility without globbing, so dropping it might not be a good idea anyway. A suggestion is to use the current file-types key instead of introducing a new file-globs key; i.e. file-types could itself understand glob patterns.
@sudormrfbin
I thought about those approaches before opting for globbing, here is what I ran into when considering those approaches
Trying to match the end of the file path to a partial path given in
file-typescovers most of the cases that I can think of. So something like this:[[language]] name = "ssh-client-config" scope = "source.ssh-client-config" file-types = [ ".ssh/config" ]
This runs into 2 problems:
- Handling of the '/' in a cross-platform way (on windows, the path of the file will be .ssh\config which won't match)
.ssh/configis relative to what ? even if the file is opened from the home directoryhx .ssh/config, the path will be its absolute path aka/home/user/.ssh/configwhich won't match with.ssh/configwithout some sort of wacky string comparison. The alternative is to do something like described in comment which matches the exact path enabling something likefile-types = ["~/.ssh/config"]orfile-types = ["$root/.git/config"]. But I also discarded that approach because I couldn't think of a clean way to do it in a cross-platform manner (/vs\\)
A suggestion is to use the current
file-typeskey instead of introducing a newfile-globskey; i.e.file-typescould itself understand glob patterns.
Handling each of the strings in the file-types array as a glob feels kind of awkward and might result in undesirable behavior in some cases. Handling globs separately felt more natural
I'm a bit wary of importing another crate for something this small too.
- Handling of the '/' in a cross-platform way (on windows, the path of the file will be .ssh\config which won't match)
We can document we expect / in these strings, then replace / with https://doc.rust-lang.org/std/path/constant.MAIN_SEPARATOR.html internally
.ssh/configis relative to what
It doesn't have to be relative to anything. You'd take the absolute filepath and check for ends_with?
I'm a bit wary of importing another crate for something this small too.
- Handling of the '/' in a cross-platform way (on windows, the path of the file will be .ssh\config which won't match)
We can document we expect
/in these strings, then replace/with https://doc.rust-lang.org/std/path/constant.MAIN_SEPARATOR.html internally
.ssh/configis relative to whatIt doesn't have to be relative to anything. You'd take the absolute filepath and check for
ends_with?
I guess that would work, the behavior sounds a bit like a subset of globbing tho. But ok, I understand your point of not wanting to add dependencies that could be easily implemented by hand.
I'll update this PR soon
Yeah, it seemed confusing to me too, some of the cases could even overlap. I think a better way is to make use of globbing, so that is easier for users to learn and don't have overlaps.
- File name is an exact match -
file - File extension is an exact match -
*.ext - File path contains the value as a suffix -
**/*.extor**.ext - File path
.git/config
That is what I initially proposed in https://github.com/helix-editor/helix/pull/2455#issue-1231137605 but I ended up changing my mind because I agreed with https://github.com/helix-editor/helix/pull/2455#issuecomment-1123054205 because the other approaches seem to bring a lot more flexibility in order to handle use-cases that probably won't occur
1. File name is an exact match - `file` 2. File extension is an exact match - `*.ext` 3. File path contains the value as a suffix - `**/*.ext` or `**.ext` 4. File path `.git/config`
1 and 2 are already what's implemented; 3 and 4 are essentially the same thing - there is no exact path matching since when you open let's say hx .git/config, its path is going to be /home/user/Source/project/.git/config, so you would need to match the suffix for it to be detected (which is what's implemented).
3 approaches have been discussed so far:
- with
shellexpand
[[language]]
name = "ssh-client-config"
file-types = ["~/.ssh/config"]
[[language]]
name = "git-config"
file-types = ["~/.gitconfig", "${workspace}/.git/config"]
- globbing (note that we would use a attribute here)
[[language]]
name = "git-config"
file-globs = ["*.git/config"]
[[language]]
name = "ssh-client-config"
file-globs = ["*.ssh/config"]
- Manually doing suffix matching which essentially boils down to doing pseudo-globbing, allowing
[[language]]
name = "git-config"
file-types = [".git/config"]
[[language]]
name = "ssh-client-config"
file-types = [".ssh/config"]
IMO, the objectively cleanest approach would be to separate each strategy and add a field like this
enum FileMatchStrategy {
ByName(String),
ByExtension(String),
ByGlob(String),
ByShellExpression(String),
}
struct LanguageConfig {
//* fields *//
file_matchers: Vec<FileMatchStrategy>,
}
Which would allow the user to configure what strategy takes precedence over the other in a flexible way (also this way, we could add strategies without breaking existing configs)
[[language]]
name = "git-config"
file-match = [
{ by-shell-expression = "~/.gitconfig" },
{ by-shell-expression = "~/.config/git/config" },
{ by-glob = "*.git/config" }
]
But again, the problem with this solution is that the use cases don't warrant that kind of flexibility
I do think being explicit about the strategy would make the config a lot easier to comprehend, especially to newcomers. With this addition, I can't help but feel like you'd have to look at the source code to completely understand how the option behaves.
How about now ?
That looks good to me, thanks!
I wish there is glob for full path so it can be a prerequisite for https://github.com/helix-editor/helix/pull/3249#discussion_r933859990,
I stole part of this and put it there.
I was doing a fallback approach on https://github.com/helix-editor/helix/pull/3249 since the mechanism I originally think of can be confusing when it matches multiple records, maybe you might want to take a look at that?
So what's blocking here ? Are we waiting for another approval or do we need to change the design/approach ?