Vault binary size increase (+44%) from 1.14.1 to 1.14.2
The binary file has increased by 44%, and I don't see a rational explanation for this.
247M Jul 26 14:25 vault_1.14.1
355M Sep 8 14:25 vault_1.14.2
linux amd64
Describe the bug: N/A
To Reproduce: N/A
Expected behavior: N/A
Environment: N/A
Additional context: N/A
We are aware of the size of the binary. Many design decisions factor into it, for example the debug symbols, which many of our customers rely on for information and troubleshooting. We are constantly evaluating how we can improve here; however, no firm decisions have been made yet, and none may be for some time. Are there issues that you are facing here, such as with container sizing? Please let us know the details of any operational headaches you may be experiencing as a result. Thanks!
Even so, that's a remarkable change in size for a patch release.
I was curious, so I started git bisecting. It turns out most of it is down to pulling in new/more code from the Azure SDK in 486f7d0fda27b057959d5c907749e6f291237778, with a smaller increase from 5a37c6f0d74a11ad614f9bb2f3354fd79071617e (more Azure SDK code, and some other things).
EDIT: It turns out that 27% of the binary size of Vault 1.14.2 (self-compiled, no web UI) is just the Azure secrets engine, the Azure auth method, and Azure auth support for Vault agent/proxy! If you take out enough other code to get rid of all use of github.com/Azure/azure-sdk-for-go/... (Azure auto-seal, the Snowflake database plugin, Azure metadata node discovery for Raft joining), the size reduction is 35%. That's rather remarkable.
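For anyone who wants to sanity-check this kind of breakdown without bisecting, here is a rough sketch of the commands that can show which modules and packages end up in a Go binary (the binary path and grep pattern are illustrative, and exact numbers will vary by version and build flags):

$ # list the modules compiled into the binary and confirm the Azure SDK packages are present
$ go version -m ./vault | grep azure-sdk-for-go
$ # dump symbols with their sizes, sorted largest-first, to see which packages dominate
$ go tool nm -size -sort size ./vault | head -n 40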
We did update the Azure SDK as part of work on an HTTP/2 bug Microsoft found; we had to update that SDK in the seal subsystem as well as in the secrets plugin. I'll take a second look, but I wonder if/why we're bringing in more of the SDK now.
@evsasha : It is indeed due to the change of usage of the Azure SDK but not where I thought. We're looking at it.
@sgmiller Thank you. I just wanted to point out the negative trend of package size growth and find out what is directly causing it. Is this growth deliberate, or is it an unintended side effect?
I have created a histogram of releases and versions for myself, and the trend of package size growth is clearly evident.
My main concern and discomfort related to the package size growth is the use of the Vault CLI. CLI shouldn't weigh 300+ megabytes. Perhaps it's time to separate the server and the CLI.
Is there any chance we could selectively opt-in/out of these features? We personally use Azure, but there's a bunch of secrets engines that we do not.
I was investigating the sizes of some container images and stumbled over this: 370 MB for a Go binary is pretty impressive. :S
I'd also love a stripped and more sanely composed binary that is smaller. =/
Related issues:
- #10180 opened Oct 20, 2020
- #21069 opened Jun 8, 2023
As of 1.16.2 the Vault (linux_amd64) binary has now grown to over 400M.
$ ./vault --version
Vault v1.16.2 (c6e4c2d4dc3b0d57791881b087c026e2f75a87cb), built 2024-04-22T16:25:54Z
$ ls -hog vault | cut -d' ' -f3-
402M Apr 22 20:30 vault
HashiCorp Vault's Plugin system currently consists of built-in and external plugins:
Built-in plugins are shipped with Vault... External plugins are not shipped with Vault and require additional operator intervention to run. To run an external plugin, a binary or container image of the plugin is required. Plugin binaries can be obtained from releases.hashicorp.com or they can be built from source.
I personally would prefer that all the secret, auth, and database plugins were "external", with the Vault binary containing only the core components, similar to Terraform and its use of providers.
In one use case of Vault, I need only 4-5 of the 19 included auth methods. I imagine most Vault deployments are similar in needing only a subset of the default auth methods and secrets engines. Moving all plugins to be "external" would remove bloat from the binary and eliminate code that need not be there.
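For reference, running an engine as an external plugin today is already only a couple of commands; here is a minimal sketch, assuming a hypothetical plugin binary my-secrets-plugin already placed in the server's configured plugin_directory (the plugin name, path, and mount point are placeholders):

$ # register the plugin with its SHA-256 so Vault will trust and execute it
$ vault plugin register -sha256="$(sha256sum /etc/vault/plugins/my-secrets-plugin | cut -d' ' -f1)" secret my-secrets-plugin
$ # mount it like any built-in secrets engine
$ vault secrets enable -path=my-secrets my-secrets-plugin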
It is abhorrent that a utility often used simply to extract secrets for an environment has grown to over 400 MiB in size.
The size itself isn't the main issue; it's that it keeps growing with every patch release. It's been almost a year since #21069 was raised, and in that time the size has more than doubled. An action plan from the team might help address concerns about increasing pipeline slowness for potential enterprise customers.
While I am not able to share specifics at this time, please be assured that the engineering teams are examining this issue and trying to determine the best ways to rectify it. In my own personal opinion, the size of the binary grew organically as Vault became a more complex product. As a result, our engineers want to make sure that we can approach this issue carefully and with a minimum of unintended consequences. I understand that this may not be a very satisfying response to your valid concerns, so I appreciate your patience with us. :) Thanks!
I have a docker image that uses vault to automatically generate SSH Certificates - rewriting the tool to just use curl to make the relevant requests massively reduces image size:
Before:
ssh_key_signer 1.1.1 447MB
After:
ssh_key_signer 2.0.0 28.3MB
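For anyone curious, a replacement like this is roughly the following; a minimal sketch assuming the SSH secrets engine is mounted at ssh/ with a signing role named my-role (both placeholders), using the engine's sign endpoint:

$ # submit the public key for signing and save the returned certificate next to the key
$ curl -sf -X POST \
    -H "X-Vault-Token: $VAULT_TOKEN" \
    -d "{\"public_key\": \"$(cat ~/.ssh/id_ed25519.pub)\"}" \
    "$VAULT_ADDR/v1/ssh/sign/my-role" \
  | jq -r '.data.signed_key' > ~/.ssh/id_ed25519-cert.pub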
The fact that the vault binary is both the server and the client tool seems like a good first place to tackle. Having to package the entire Vault server binary into a container just to sign some keys seems silly, and it also makes me somewhat nervous that I've created a large attack surface for exploits when I really don't need to.
I'd be interested to see what size reduction would be possible if there were just a Vault client binary available for cases like this, which is potentially the biggest use case and where the growing binary has the most impact.
This came back to bite us. On AWS, since AMIs are not warmed up and copied to EBS automatically, the 400 MB Vault binary took over 30 seconds just to run vault --version. Reading the binary once before executing it brings that down to roughly 1 second (and is much faster than starting the binary and reading parts of it on demand). Loading it into memory/the page cache (since we are already reading it to warm it anyway) reduces that to the millisecond range.
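In case it helps others, the warm-up itself is trivial; a minimal sketch (the binary path is illustrative, and vmtouch is only an option if it happens to be installed):

$ # read the whole binary once so EBS lazily fetches the blocks and the page cache is warm
$ cat /usr/local/bin/vault > /dev/null
$ # or, if vmtouch is available, touch the pages into memory explicitly
$ vmtouch -t /usr/local/bin/vault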
Having a Vault Client / Agent binary would be awesome. The option to have stripped binaries without debug symbols distributed in the same place as the normal releases would be great as well.
The increase in the binary size over the last year is very suboptimal. It is inflating our Docker image sizes and results in significant delays when downloading these images. Could you evaluate the following options?
- Distribute builds without the debug symbols (a self-build sketch follows after this list).
- Often we only need the Vault client/agent, so a separate binary containing just the client/agent would be very useful.
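On the first point, a self-compiled build already shows roughly what stripping would save; a minimal sketch from a Vault source checkout (note this drops the symbol table and DWARF data, i.e. exactly the debug info HashiCorp said customers rely on, so it is a trade-off rather than a free win, and the savings vary by version):

$ # -s strips the symbol table, -w strips DWARF debug info
$ go build -ldflags="-s -w" -o vault-stripped .
$ ls -hog vault-stripped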
I agree with everyone else that the size of the binary is not ideal for running in environments where startup time is important, in our case CI/CD.
If it helps anyone, we had success replacing vault (CLI) with a bit of curl & jq. https://blog.marco.ninja/notes/technology/vault/vault-cli-in-containers/
While this works, it is far from ideal; we would certainly prefer a reasonably sized official client.
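For completeness, the core of that approach is just a single request; a minimal sketch assuming a KV v2 engine mounted at secret/, with the secret path and field name as placeholders:

$ # read one field of a KV v2 secret without the Vault CLI
$ curl -sf -H "X-Vault-Token: $VAULT_TOKEN" \
    "$VAULT_ADDR/v1/secret/data/myapp/config" \
  | jq -r '.data.data.db_password'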
This is really silly…
I absolutely understand your frustration with this situation, as it can cause a lot of problems. Our engineering teams are aware (and can relate, I think), and I know we've been looking for ways to tackle this without causing other deleterious effects. I'll keep bringing it up for further research and possible fixes, too. We appreciate your understanding in the meantime - thanks!