comby
Feature request: add Nix language
First of all, thank you for this amazing tool!
For context, there is a language called Nix, used in the Linux distribution NixOS, that is similar to Haskell except that it doesn't have types. Since it is used to build a distribution, strings often contain embedded bash.
As an example, there is a repo called nixpkgs (300k commits, so a little big), where I was trying to find uses of bash extglob. Unfortunately, the generic matcher did not work:
comby '!([:pattern])' '' -match-only -newline-separated
even though the following file does contain the pattern. Admittedly, the pattern is inside a string (delimited with ''), so it might be harder to detect.
Here is some more information about the Nix language:
comment delimiter #
# this is a comment
single line string delimiter "
" this is a single line string"
multiline string delimiter ''
''
this is a multiline
string
''
The hard part is that any string can contain valid shell code (almost always bash).
If you ever have any questions, I would be happy to supply more information.
Hi! Thanks for the context. Couple of things:
- It makes sense to add a matcher for .nix, so I'll do that in the next version.
- Due to the issues you point out, the matcher for .nix might be a little more 'coarse', in order to get around some of the issues you mention.
- For your specific pattern, I think there is a typo: it should be :[pattern], not [:pattern] (easy mistake to make). When you use the :[pattern] syntax, it will actually match: LINK (scroll down in the input to see the match)
But! I should caution that it is only matched in this case because we're "lucky". We're lucky because the generic matcher interprets the '' as an empty string, and the body of the multiline string as "code". This is not ideal; what we actually want is to treat the multiline string data as data, not code.
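A minimal sketch for trying the corrected hole syntax locally, assuming comby is installed and on the PATH. The scratch directory, the file name sample.nix, and its contents are made up for illustration; the comby flags are the ones from the original command. As cautioned above, any match here relies on the generic matcher treating the string body as code.

```shell
#!/usr/bin/env bash
# Scratch directory so comby only scans the sample file (hypothetical name).
workdir="$(mktemp -d)"

# Hypothetical Nix file with bash embedded in a multiline '' string.
cat > "$workdir/sample.nix" <<'NIX'
{
  # hypothetical build step
  installPhase = ''
    shopt -s extglob
    rm -rf !(keep|README)
  '';
}
NIX

# Holes are written :[name]; the original attempt used [:pattern] instead.
pattern='!(:[pattern])'

if command -v comby >/dev/null 2>&1; then
  # Same flags as the original command, run inside the scratch directory.
  (cd "$workdir" && comby "$pattern" '' -match-only -newline-separated)
else
  echo "comby not installed; skipping the match step"
fi
```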
In comby, when we want to match inside a string, we specify the kind of quotes for the string and then a pattern inside it. So if a real .nix matcher existed, you would match the string you are interested in as '':[before]!(:[pattern]):[after]'', with the quotes, and you would use :[before] and :[after] to explicitly match anything before or after in the string. You could also take a shortcut and just use ..., like ''...!(:[pattern])...''.
As you note, the "data" of a string may actually be code in Nix files! So sometimes we don't want a pattern that treats the contents of a string as data; we want to treat them as code. For that, comby has some experimental syntax in the works that will let you match string data as code, for any kind of code (bash, C, whatever). This isn't possible yet, but it would involve a match pattern plus a rule. Here is an example that I'd like to support at some point:
'':[data]''
(this matches all multiline string contents)
and a rule:
where match :[data] as .sh { !(:[pattern]) -> true }
# this will parse/match any contents inside :[data] as shell/bash syntax.
This 'match ... as' rule is also useful when code embeds strings that represent structured GQL queries, for example:
where match :[data] as .gql { ... }
Thanks a lot for the detailed answer!
I was wondering if multiline matching was possible. One thing I tried to do, but failed, was to check for extglob and then match things of the shape !(:[pattern]). extglob is just a flag that enables some features. I tried with
comby 'extglob...!(:[pattern])' '' -match-only -newline-separated
Perhaps multiline matching is not available yet?
I liked the rules as well. However, there were no details on how to use them with the CLI. They always appear as a separate string, so I didn't know how to try them on a huge repo.
I was wondering if multiline matching was possible.
In the current version, to do the style of multiline matching in that example, you need to add the command-line argument -match-newline-at-toplevel.
The live site does this for you. When you click on the "Run in terminal" button, it generates a command that includes that argument:
COMBY_M="$(cat <<"MATCH"
extglob...!(:[pattern])
MATCH
)"
# Install comby with `bash <(curl -sL get.comby.dev)` or see github.com/comby-tools/comby && \
comby "$COMBY_M" '' -stats -match-newline-at-toplevel
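To see the effect of -match-newline-at-toplevel in isolation, here is a small sketch that runs the same pattern with and without the flag. It assumes comby is on the PATH; the scratch directory and build.nix are hypothetical. Going by the explanation above, only the second run is expected to let ... span the line break between extglob and !(...).

```shell
#!/usr/bin/env bash
# Scratch directory so comby only scans the sample file (hypothetical name).
workdir="$(mktemp -d)"

# Hypothetical Nix file: extglob is enabled on one line, used on the next.
cat > "$workdir/build.nix" <<'NIX'
{
  postInstall = ''
    shopt -s extglob
    rm -rf $out/share/!(man)
  '';
}
NIX

pattern='extglob...!(:[pattern])'

if command -v comby >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    echo "without -match-newline-at-toplevel:"
    comby "$pattern" '' -match-only -newline-separated
    echo "with -match-newline-at-toplevel:"
    comby "$pattern" '' -match-only -newline-separated -match-newline-at-toplevel
  )
else
  echo "comby not installed; skipping the comparison"
fi
```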
I liked the rules as well. However, there were no details on how to use them with the CLI. They always appear as a separate string,
On the command line, you can either use -rule '<rule>' on a single line, or use a multiline rule via a bash variable and heredoc. The latter is tricky to craft by hand, but the live site will generate the bash command for you. Here is an example generated when you click "Run in terminal" on that site:
COMBY_M="$(cat <<"MATCH"
extglob...!(:[pattern])
MATCH
)"
COMBY_RULE="$(cat <<"RULE"
where :[pattern] == lightwalletd
RULE
)"
# Install comby with `bash <(curl -sL get.comby.dev)` or see github.com/comby-tools/comby && \
comby "$COMBY_M" '' -rule "$COMBY_RULE" -stats -match-newline-at-toplevel
The where :[pattern] == lightwalletd part can then be a multiline string (and so can the MATCH part, etc.).
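For comparison, the single-line -rule '<rule>' form mentioned above might look like the sketch below. It assumes comby is on the PATH; the scratch directory and service.nix are hypothetical, and the rule text is copied from the generated example.

```shell
#!/usr/bin/env bash
# Scratch directory so comby only scans the sample file (hypothetical name).
workdir="$(mktemp -d)"

# Hypothetical Nix file whose extglob pattern names lightwalletd.
cat > "$workdir/service.nix" <<'NIX'
{
  preStart = ''
    shopt -s extglob
    rm -rf /var/lib/!(lightwalletd)
  '';
}
NIX

match='extglob...!(:[pattern])'
rule='where :[pattern] == lightwalletd'

if command -v comby >/dev/null 2>&1; then
  # Rule passed on a single line instead of through a heredoc variable.
  (cd "$workdir" && comby "$match" '' -rule "$rule" \
    -match-only -newline-separated -match-newline-at-toplevel)
else
  echo "comby not installed; skipping the rule example"
fi
```

The heredoc form generated by the live site is mainly worth the extra ceremony when the rule or match template itself spans several lines.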
Thank you that is super helpful! It's working wonders!