v9.0.0 wishlist
Hey all, thanks for all the support and contributions. I'm thinking about doing a breaking change, namely merging protect into the detect command. I haven't committed totally to this change but it's something that's been in the back of my mind for a while now.
If you have other suggestions or features you'd like to see added, throw 'em in this issue.
Thanks!
Zach
@zricethezav I'm not sure when the v9 is planned, but I think #1059 should definitely be in there, as it currently makes using .gitleaksignore in a CI environment rather unusable. It seems like a relatively easy fix, but I'm not familiar with Go.
I've been meaning to respond to this for weeks. To prevent me from putting this off further, I'm going to incrementally update this comment.
Right now this isn't in any particular order and is a mix of "I feel really strongly about this" or "this is an idea I had". I will try to refine and categorize it later, as some of these are half-baked.
-
#1162 In my opinion, the current
fingerprintimplementation is brittle and trying to add everything you need to.gitleaksignorecan be a bit of a pain-point. It would be more straightforward to "fingerprint" the secret itself and use that for.gitleaksignore. -
Clarify If
keywordsMatches Path I wrote this down in a note a long time ago. I forget why, and I will remove this if I can't figure it out. -
Clarify or Rework "Stopwords" In my experience using Gitleaks, as well as helping other members of the community, the concept of "stopwords" is confusing and difficult to wrap your head around. Does it match the entire secret? Is it a regular expression or a static string? Is it case sensitive? etc.
-
#1212 I don't feel particularly strongly about this, but I know it's caused issues for some people.
-
Support Multiple Allowlist Targets Right now you can only chose a single
regexTargetper allowlist (global or rule). This makes it awkward when you want to ignore multiple patterns, where one is the secret (e.g.,{{ secrets.MY_SECRET}}) and another is the match or line (e.g.,passwordFile: ...ortoken_id: ...). The ability to definite separate regexes forsecretandmatchorlinewould be a nice quality-of-life improvement. -
Support Allowlist matching
path && regexPresently,rule.path+rule.regexandrule.allowlist.paths+rule.allowlist.regexesbehave differently. Unlikerulewhich matchespath && regex, theallowlistwill matchpath || regexwhich can lead to unexpected results. -
Make
rule.idMandatory and the Primary Identifier Right nowrule.idis not required. This has lead to some bugs like #1261, as well as confusing error messages like "Failed to load config error="$DESCRIPTION invalid regex secret group..." that rely on therule.description(there was one about the sarif output but I can't find it atm). -
Support Multiple Secret Groups As mentioned in this comment, I was experimenting with changing
secretGroup: ntosecretGroups: [ n1, n2, ...].There are two potential use cases for this: 1. combing multiple pieces of input (e.g.,
username: "john"\npassword: "titor"returning bothjohnandtitorrather than justtitor) 2. returning the first non-empty capture group when you have mutually exclusive ones. -
Support Grouping Detection by Secret Rather than outputting a CSV line/JSON object per result, it would be cool to have an output that show a secret with all the locations grouped together. I imagine this would be especially beneficial for SARIF to reduce the number of alerts. I know this is possible with GitHub's first-party alerts, but would have to confirm that it's possible with third-party ones.
Non-Breaking Changes
Config
- Add stricter validation for gitleaks.toml to prevent typos from causing silent failures (there's a PR for this but I can't find it at the moment).
Tests
- Move true/false positive test cases into actual
x_test.gofiles. The main benefit of this is leveraging Go's test caching, which will be necessary with more/larger test cases. - Require at least 2 (arbitrary number) test cases for true and false positives for each rule.
- Make generation and validation logic public so that people can reference those methods from their own homebrew scripts to generate configs (maybe I'm the only one).
In my opinion, the current fingerprint implementation is brittle and trying to add everything you need to .gitleaksignore can be a bit of a pain-point. It would be more straightforward to "fingerprint" the secret itself and use that for .gitleaksignore.
@rgmz ahh, interesting. So if the secret is password123 the idea would be to hash that and use that as the fingerprint. Effectively leaving location and source data out of it?
Clarify or Rework "Stopwords" In my experience using Gitleaks, as well as helping other members of the community, the concept of "stopwords" is confusing and difficult to wrap your head around. Does it match the entire secret? Is it a regular expression or a static string? Is it case sensitive? etc.
I can see how stopwords is confusing. Stopwords are an efficient way to check if the secret contains any stopwords rather than using regexes in an allowlist. Allowlist.regexes could be applied to the line, match, or secret depending on allowlist.regextarget. The idea was, I think, to provide users the ability to define a regex filter that could be applied anywhere and provide stopwords. Typing this all out reveals to me that it is a bit confusing.... maybe removing stopwords and instead allowing multiple allowlists per rule would be a better design... I'll have to think on this one
Support Allowlist matching path && regex Presently, rule.path + rule.regex and rule.allowlist.paths + rule.allowlist.regexes behave differently. Unlike rule which matches path && regex, the allowlist will match path || regex which can lead to unexpected results.
I think this could be accomplished pretty easily by adding something like
[[rules]]
# ... my rule
[rules.allowlist]
regexTarget = "line"
regexes = [
"blah",
]
paths = ["foo", "bar"]
pathsAndTargetOperator = "or" # or "and"
Any chance we could source the .gitleaksignore via GIT if path does not exist? I'm thinking in terms of pre-receive hooks where gitleaks is getting called from basically a git config directory, with none of the actual files, just info/objects/refs etc.
Current workaround is something like this:
git show $to:.gitleaksignore > .gitleaksignore
Perhaps the detection logic could fall back to sourcing from the repo directly unless no-git flag is set
r, err := git.PlainOpen(".git")
if err != nil {
log.Fatal(err)
}
ref, err := r.Head()
if err != nil {
log.Fatal(err)
}
commit, err := r.CommitObject(ref.Hash())
if err != nil {
log.Fatal(err)
}
tree, err := commit.Tree()
if err != nil {
log.Fatal(err)
}
f, err := tree.File(".gitleaksignore")
if err != nil {
log.Fatal(err)
}
ignore, err := f.Contents()
if err != nil {
log.Fatal(err)
}
fmt.Println(ignore)
Any way to disable an inherited rule from an extended base config would be very helpful 🙂
It's great to be able to ignore/exclude secret-fingerprints, path-regex, commits and single lines (via comment), but if there is just one rule in the base config which regularly creates false positives you're pretty much forced to stop using the extension mechanism and lose the ability to be kept "up-to-date" automatically 😔