Vault binary size increase (+44%) from 1.14.1 to 1.14.2

Open evsasha opened this issue 2 years ago • 14 comments

The binary file has increased by 44%, and I don't see a rational explanation for this.

247M Jul 26 14:25 vault_1.14.1
355M Sep  8 14:25 vault_1.14.2

linux amd64

Describe the bug N/A

To Reproduce N/A

Expected behavior N/A

Environment: N/A

Additional context N/A

evsasha avatar Sep 08 '23 11:09 evsasha
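
As a quick check on the headline figure, the growth can be recomputed from the two sizes listed in the report. This is only an arithmetic sketch; the helper name `pct_increase` is made up for illustration and the inputs are the rounded MB values from the `ls` output above.

```shell
# Percentage growth between two sizes given in the same unit.
# pct_increase is a hypothetical helper name, not part of any Vault tooling.
pct_increase() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# 247M (1.14.1) versus 355M (1.14.2), as listed in the report.
pct_increase 247 355   # prints 43.7
```

The result, 43.7%, rounds to the 44% quoted in the title.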

We are aware of the size of the binary. Many design decisions are a factor, for example the debug symbols upon which many of our customers rely for information and troubleshooting. We are constantly evaluating how we can improve here; however, no firm decisions have been made yet, and none may be made for some time. Are there issues that you are facing here, such as with container sizing? Please let us know the details of any operational headache that you may be experiencing as a result. Thanks!

akshya96 avatar Sep 08 '23 22:09 akshya96

Even so, that's a remarkable change in size for a patch release.

I was curious, so I started git bisecting. Turns out most of it is down to pulling in new/more code from the Azure SDK in 486f7d0fda27b057959d5c907749e6f291237778, and then a smaller increase from 5a37c6f0d74a11ad614f9bb2f3354fd79071617e (more Azure SDK code, and some other things)

EDIT: It turns out that 27% of the binary size of vault 1.14.2 (self-compiled, no web UI) is just the Azure secrets engine, Azure auth method, and Azure auth support for Vault agent/proxy! If you take out enough other code to get rid of all use of github.com/Azure/azure-sdk-for-go/... (Azure auto-seal, Snowflake database plugin, Azure metadata node discovery for Raft joining) the size reduction is 35%. That's rather remarkable.

maxb avatar Sep 09 '23 18:09 maxb
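
The comment above does not say how the bisection was run. One way such a size bisection could be automated is with `git bisect run` and a small size check, sketched below. The helper name `binary_under_limit`, the file name `size_check.sh`, the output path, and the 300000000-byte threshold are all assumptions for illustration, and self-compiled sizes will differ from the official release sizes.

```shell
# Exit 0 ("good" for git bisect) if FILE is at most LIMIT bytes,
# exit 1 ("bad") if it is larger, and 125 ("skip") if FILE is missing.
# binary_under_limit is a hypothetical helper name.
binary_under_limit() {
  file=$1
  limit=$2
  [ -f "$file" ] || return 125
  size=$(( $(wc -c < "$file") ))
  [ "$size" -le "$limit" ]
}

# Possible usage, assuming this file is saved as ./size_check.sh and the
# main package builds from the repository root:
#
#   git bisect start v1.14.2 v1.14.1
#   git bisect run sh -c '
#     . ./size_check.sh
#     go build -o /tmp/vault-bisect . || exit 125
#     binary_under_limit /tmp/vault-bisect 300000000
#   '
```

Returning 125 for a failed build tells `git bisect` to skip that commit rather than mark it good or bad.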

We did update the Azure SDK as part of work on an HTTP/2 bug Microsoft found, and had to update that SDK in the seal subsystem as well as the secrets plugin. I'll take a second look, but I wonder if/why we're bringing in more of the SDK now.

sgmiller avatar Sep 13 '23 14:09 sgmiller

@evsasha : It is indeed due to the change of usage of the Azure SDK but not where I thought. We're looking at it.

sgmiller avatar Sep 13 '23 20:09 sgmiller

@sgmiller Thank you. I just wanted to point out the negative trend of package size growth and find out what is directly causing this growth. Is there any deliberate malicious intent behind it?

I have created a histogram of releases and versions for myself, and the trend of package size growth is clearly evident.

My main concern and discomfort related to the package size growth is the use of the Vault CLI. CLI shouldn't weigh 300+ megabytes. Perhaps it's time to separate the server and the CLI.

evsasha avatar Sep 28 '23 12:09 evsasha

> @evsasha : It is indeed due to the change of usage of the Azure SDK but not where I thought. We're looking at it.

Is there any chance we could selectively opt-in/out of these features? We personally use Azure, but there's a bunch of secrets engines that we do not.

franciscoabsampaio avatar Oct 16 '23 14:10 franciscoabsampaio

I was investigating the sizes of some container images and stumbled over this: 370 MB for a Go binary is pretty impressive. :S

I'd also love a stripped and more sanely composed binary that is smaller. =/

dragetd avatar Feb 19 '24 18:02 dragetd

Related issues:

  • #10180 opened Oct 20, 2020
  • #21069 opened Jun 8, 2023

As of 1.16.2 the Vault (linux_amd64) binary has now grown to over 400M.

$ ./vault --version
Vault v1.16.2 (c6e4c2d4dc3b0d57791881b087c026e2f75a87cb), built 2024-04-22T16:25:54Z

$ ls -hog vault | cut -d' ' -f3-
402M Apr 22 20:30 vault

HashiCorp Vault's Plugin system currently consists of built-in and external plugins:

Built-in plugins are shipped with Vault... External plugins are not shipped with Vault and require additional operator intervention to run. To run an external plugin, a binary or container image of the plugin is required. Plugin binaries can be obtained from releases.hashicorp.com or they can be built from source.

I personally would prefer all the secret, auth, and database plugins were "external" with the Vault binary containing only the core components, similar to Terraform and its use of Providers.

In one use case of Vault, I need only 4-5 of the 19 included Auth Methods. I imagine most use cases of Vault would be similar in needing only a subset of the default included Auth Methods and Secret Engines. Moving all plugins to be "external" would remove bloat from the binary, along with additional code that need not be there.

111a5ab1 avatar May 07 '24 09:05 111a5ab1

It is abhorrent that a utility often purposed for extracting secrets for an environment has grown to over 400MiB in size.

The size itself isn't the main issue, it's that it keeps growing with every patch release. It's been almost a year since #21069 was raised, in which the size has more than doubled. An action plan from the team might help address concerns about increasing pipeline slowness for potential enterprise customers.

Stealthii avatar May 24 '24 00:05 Stealthii

> It is abhorrent that a utility often purposed for extracting secrets for an environment has grown to over 400MiB in size.
>
> The size itself isn't the main issue, it's that it keeps growing with every patch release. It's been almost a year since #21069 was raised, in which the size has more than doubled. An action plan from the team might help address concerns about increasing pipeline slowness for potential enterprise customers.

While I am not able to share specifics at this time, please be assured that the engineering teams are examining this issue and trying to determine the best ways to rectify it. In my own personal opinion, the size of the binary grew organically as Vault became a more complex product. As a result, our engineers want to make sure that we can approach this issue carefully and with a minimum of unintended consequences. I understand that this may not be a very satisfying response to your valid concerns, so I appreciate your patience with us. :) Thanks!

heatherezell avatar May 24 '24 21:05 heatherezell

I have a docker image that uses vault to automatically generate SSH Certificates - rewriting the tool to just use curl to make the relevant requests massively reduces image size:

Before:

ssh_key_signer  1.1.1  447MB

After:

ssh_key_signer  2.0.0  28.3MB

The fact that the vault binary is both the server and the client tool seems like a good first place to tackle. Needing to package the whole vault server binary into a container just to sign some keys seems silly, and it also makes me somewhat nervous that I've created a large attack surface for exploits when I really don't need to.

I'd be interested to see what size reduction would be possible if there were just a vault client binary available for use in cases like this, which is potentially the biggest use case and the one most affected by the growing binary.

jk464 avatar Jun 12 '24 11:06 jk464
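
The comment above does not include the replacement script. A minimal sketch of signing an SSH public key over Vault's HTTP API with `curl` and `jq` might look like the following. The function names, the mount name `ssh-client-signer`, and the role name `my-role` are placeholders, and the sketch assumes `VAULT_ADDR` and `VAULT_TOKEN` are already set in the environment.

```shell
# Build the URL of the SSH secrets engine signing endpoint.
# Arguments: Vault address, mount path, role name (all caller-supplied).
vault_ssh_sign_url() {
  printf '%s/v1/%s/sign/%s\n' "${1%/}" "$2" "$3"
}

# Sign a public key file and print the signed certificate.
# Requires curl and jq; reads VAULT_ADDR and VAULT_TOKEN from the environment.
vault_ssh_sign() {
  mount=$1
  role=$2
  pubkey_file=$3
  # jq builds the JSON body so the key is escaped correctly.
  jq -n --arg key "$(cat "$pubkey_file")" '{public_key: $key}' |
    curl --silent --show-error --fail \
      --header "X-Vault-Token: $VAULT_TOKEN" \
      --request POST --data @- \
      "$(vault_ssh_sign_url "$VAULT_ADDR" "$mount" "$role")" |
    jq -r '.data.signed_key'
}

# Example (placeholder mount and role names):
#   vault_ssh_sign ssh-client-signer my-role ~/.ssh/id_ed25519.pub > ~/.ssh/id_ed25519-cert.pub
```

An image built this way needs only `curl` and `jq` rather than the full Vault binary, which is the trade-off the before/after sizes above illustrate.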

This bit us in the back. On AWS, since AMIs are not warmed up and copied to EBS automatically, the 400MB Vault binary took over 30 seconds to do vault --version. Reading the binary once before executing it reduces that time to ~1s range (and is way faster than starting the binary and reading part of it when needed). Loading it into memory/page cache (since we are already reading it to warm it anyway) reduces that to ms range.

Having a Vault Client / Agent binary would be awesome. The option to have stripped binaries without debug symbols distributed in the same place as the normal releases would be great as well.

Sayrus avatar Jul 29 '24 14:07 Sayrus
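
The warm-up described above amounts to reading the binary sequentially once before its first execution. A minimal sketch, assuming `vault` is on `PATH`; the helper name `warm_file` is made up for illustration:

```shell
# Read a file once, front to back, discarding the contents. On a lazily
# loaded volume this pulls the blocks in with one sequential read and
# leaves them in the page cache for the exec that follows.
warm_file() {
  cat -- "$1" > /dev/null
}

# Example:
#   warm_file "$(command -v vault)" && vault --version
```

Whether this helps depends on the storage backend; the timings quoted above are from the commenter's environment and are not reproduced here.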

The increase in the binary size over the last year is very suboptimal. It increases our Docker image sizes and results in significant delays when downloading these images. Could you evaluate the following options?

  • Distribute builds without the debug symbols.
  • Often we only need the vault client/agent, so a separate binary just for the client/agent would be very useful.

sbandadd avatar Aug 12 '24 18:08 sbandadd
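
For the first option in the list above, the Go linker can already omit the symbol table and DWARF debug information when building from source. The sketch below shows the general technique only; it is not how official Vault releases are built, and the function names and paths are assumptions for illustration.

```shell
# Build a Go main package with the symbol table (-s) and DWARF debug
# information (-w) omitted by the linker.
# Arguments: package path, output file.
build_stripped() {
  pkg=$1
  out=$2
  go build -trimpath -ldflags '-s -w' -o "$out" "$pkg"
}

# Print a file's size in bytes, for comparing builds.
file_size_bytes() {
  echo $(( $(wc -c < "$1") ))
}

# Example, run from a checkout of the source tree:
#   go build -o /tmp/vault-full .
#   build_stripped . /tmp/vault-stripped
#   file_size_bytes /tmp/vault-full
#   file_size_bytes /tmp/vault-stripped
```

As noted earlier in the thread, some users rely on debug symbols for troubleshooting, so a stripped build trades that information for size.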

I agree with everyone else that the size of the binary is not ideal for running in environments where startup time is important, in our case CI/CD.

If it helps anyone, we had success replacing vault (CLI) with a bit of curl & jq. https://blog.marco.ninja/notes/technology/vault/vault-cli-in-containers/

While this works, it is far from ideal; we would certainly prefer a reasonably sized official client.

ProfessorLogout avatar Aug 12 '24 20:08 ProfessorLogout

This is really silly…

dragetd avatar Sep 24 '24 21:09 dragetd

I absolutely understand your frustration with this situation, as it can cause a lot of problems. Our engineering teams are aware (and can relate, I think), and I know we've been looking for ways to tackle this without causing other deleterious effects. I'll keep bringing it up for further research and possible fixes, too. We appreciate your understanding in the meantime - thanks!

heatherezell avatar Sep 24 '24 21:09 heatherezell