Add a reasonable -seed default
Right now, if the -seed flag isn't given, we use no seed at all for the tool's deterministic behavior (hashes, etc.). The user can specify -seed=<base64> to supply their own, or -seed=random to get a random one.
We could do better for the vast majority of cases by default, though. For example, the default for -seed when building inside a Go module could be sha256(contents("go.mod") + contents("go.sum")). If one builds in GOPATH mode, or in ad-hoc mode like cd /tmp; garble build foo.go, then the current default of no seed would be kept.
The original idea is by @glenjamin, actually, from almost a year ago when I briefly talked about this tool at the local meetup :) Initially I didn't think it was a great idea to lock the tool to only work inside a module, but recently I realised it can just affect the default behavior, which is entirely reasonable.
What if a module only has a go.mod file and no go.sum?
That is indeed an edge case where the default seed wouldn't be as strong, but it also seems like a very barebones use case. The module would need to use zero external dependencies, which is only feasible if you're building something relatively simple.
You could always add a comment to your go.mod too, just for the sake of making it unique without having to add external dependencies.
Actually, something came to mind; if external dependencies aren't obfuscated, an attacker could fairly easily deduce the version of each dependency by building multiple publicly-available versions and comparing the binary contents. Bit by bit, they could write a copy of the original go.mod, and by extension go.sum too. So I'm actually no longer sure that this idea helps us with the default behavior's strength all that much...
Like before, one could always make the go.mod file unique by writing a custom comment, but that entirely defeats the purpose of a sane and strong default behavior. It's the default for a reason: so that it doesn't require further action from the user.
What if we instead use all .go files under a certain file size (e.g. 100 KB) from the "entry" package, plus the go.mod and go.sum of the module if present?
That certainly lowers the chances that the default seed can be predicted. But one could still imagine a main package that's just:
package main
import "current/module/cmd"
func main() { cmd.Run() }
I personally dislike this pattern, but some projects and people use it.
I think, in general, this issue is somewhat low priority. If someone wants a strong seed, they should define an entirely random and reasonably long one. And we could always say "there isn't a strong enough default seed we could rely on all the time", and instead document that the default is generally not as strong given the lack of a random/custom seed.
https://github.com/burrowers/garble/issues/275 will improve the recommendations around -seed. I would still like to have a sane seed by default, though I have yet to find a mechanism that I think would work well.
Here's another idea for a sane seed default: the action ID of the original main package. We already have it, and it will change if any of the Go files in the build change, even indirect dependencies. It's virtually impossible for a third party to guess this hash, and it should change with any change to the input source code.
Not sure what I was thinking - that does not work. If I separately build two binaries that depend on the same library package, each build would use a different seed, and so each would produce a different obfuscated version of that library package. That would mean build cache mismatches.
We could say that the build cache only works for a build of a specific main module version, kinda like -seed=random or when manually changing seeds. This could result in the default garble build getting significantly slower though, as a single line change in main.go would mean rebuilding all dependencies.
I think "use a custom or random seed if you prefer security over convenience and fast builds" is a better tradeoff than "use a static seed if you want to use the build cache at all". Go without the build cache is painfully slow to build large projects.
"We could say that the build cache only works for a build of a specific main module version"
Another downside of any stronger default seed is that garble build ./cmd/foo ./cmd/bar would no longer work, because we'd need to build two versions of the shared dependencies, but the build only compiles one.
"a single line change in main.go would mean rebuilding all dependencies."
This would also affect the earlier idea of "use the hash of all Go files in the main package as the default seed". It also affects the idea of using go.sum, but at least one can assume that go.sum changes less frequently.
What about a git commit hash if available?
I think most CI build systems make the commit hash available as an env variable.
At any rate, I think a commit hash would be a good recommendation to pass to -seed.
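As an illustration of that recommendation, here is a small sketch that turns the current commit hash into a value for -seed=<base64>. The helper seedFromCommit is hypothetical, and the use of unpadded standard base64 is an assumption about what -seed accepts. It does not check for uncommitted changes.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// seedFromCommit is a hypothetical helper, not part of garble. It decodes a
// hexadecimal commit hash and re-encodes the raw bytes as unpadded base64.
func seedFromCommit(hexHash string) (string, error) {
	raw, err := hex.DecodeString(strings.TrimSpace(hexHash))
	if err != nil {
		return "", fmt.Errorf("invalid commit hash %q: %w", hexHash, err)
	}
	return base64.RawStdEncoding.EncodeToString(raw), nil
}

func main() {
	// Ask git for the hash of the commit that is currently checked out.
	out, err := exec.Command("git", "rev-parse", "HEAD").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "git rev-parse failed:", err)
		os.Exit(1)
	}
	seed, err := seedFromCommit(string(out))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("-seed=" + seed)
}
```

On CI, the hash could come from whatever environment variable the build system provides instead of running git.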
Personally, I would prefer a git commit hash as a seed for easily reversing panics for a specific app version, without relying on inputs such as garble version, go version, etc. It's also different for each build, so the security aspect is there.
Also, you can't lose the git hash.
(I don't really care about this issue.) A commit hash may be used if the current state is identical to the last commit.
I think for most "production" cases, using -seed=random and then saving the generated seed should be recommended; the current default behavior is fine for testing.
I guess I don't understand why random is better than the commit hash for a production use case. You also have to go to the trouble of saving the seed, and somehow correlating it to a specific version of your code (e.g. a commit) if you want to reverse a panic.