Store keyring in Raft
Background
Since the introduction of Variables in Nomad 1.4, Nomad has had an internal keyring consisting of:
- A Data Encryption Key (DEK) for encrypting Variables in Raft and on disk
- A Key Encryption Key (KEK) for encrypting the DEK
- A Workload Identity signing key (see #18882 for details)
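The DEK/KEK relationship above is a form of envelope encryption. The following is a minimal, generic sketch of that idea, assuming AES-256-GCM for both layers; the cipher choice and the helper names (newKey, seal, open, roundTrip) are illustrative assumptions, not Nomad's actual implementation.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newKey returns 32 random bytes, usable as an AES-256 key.
func newKey() ([]byte, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return key, nil
}

// newGCM builds an AES-GCM AEAD for the given key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under key and prepends the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal: it splits off the nonce, then authenticates and decrypts.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed data too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

// roundTrip encrypts secret with a fresh DEK, wraps the DEK with a fresh KEK,
// then recovers the secret using only the KEK and the two ciphertexts.
func roundTrip(secret []byte) ([]byte, error) {
	kek, err := newKey()
	if err != nil {
		return nil, err
	}
	dek, err := newKey()
	if err != nil {
		return nil, err
	}
	wrappedDEK, err := seal(kek, dek) // safe to store next to the data
	if err != nil {
		return nil, err
	}
	encrypted, err := seal(dek, secret)
	if err != nil {
		return nil, err
	}
	unwrapped, err := open(kek, wrappedDEK) // impossible without the KEK
	if err != nil {
		return nil, err
	}
	return open(unwrapped, encrypted)
}

func main() {
	out, err := roundTrip([]byte("db-password"))
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes.Equal(out, []byte("db-password")))
}
```

In this sketch the wrapped DEK and the encrypted data are useless without the KEK, which is why the location of the KEK determines whether a leaked snapshot exposes secrets.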
The KEK is stored on disk separately from Raft logs and snapshots so that secret material cannot be extracted from leaked snapshots.
As of #23580, Nomad will support using an external KMS for the KEK. However, the wrapped (encrypted) DEK and WI keys will still be in the on-disk keyring, outside of snapshots.
The on-disk keyring, even when using a KMS, poses an unfortunate challenge for Nomad operators: they must back up the keyring just as they back up snapshots, but separately. If a single script backs up both snapshots and keyring to the same location, the only remaining benefit of keeping the two separate is when sharing snapshots with a third party (e.g. HashiCorp) for debugging.
Nomad engineering has encountered multiple outages caused by users who were unaware that they need to back up and restore the keyring separately from, but alongside, the snapshot.
Proposal
The downsides of storing the keyring separately from snapshots outweigh the potential security benefits. Those benefits are likely rarely realized because splitting backups across multiple processes and locations is difficult. This user experience should be fixed by doing the following:
- By default, without a KMS configured, Nomad should store the unencrypted KEK and other wrapped keys in Raft and snapshots. Server agents should log a warning telling users that their snapshots are functionally unencrypted and that using a KMS is desirable (or linking to such docs).
- When using a KMS, wrapped key material should be stored in Raft and snapshots.
- The existing behavior of storing the KEK on disk separately from Raft and snapshots should remain available for users who do not want to rely on an external KMS but also do not want to risk exposing their keys in Raft or snapshots.
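The three cases above can be read as a small decision table. The sketch below is one hedged way to express it; the KeyringConfig and Placement types, their field names, and the precedence of the KMS setting over the on-disk opt-in are all assumptions made for illustration and do not come from Nomad's configuration or code.

```go
package main

import "fmt"

// KeyringConfig is a hypothetical summary of the two settings the proposal
// depends on. Neither field name is taken from Nomad.
type KeyringConfig struct {
	KMSConfigured bool // an external KMS holds the KEK
	KeepKEKOnDisk bool // operator opts in to the existing on-disk behavior
}

// Placement describes where the KEK lives and any warning the server logs.
type Placement struct {
	KEKLocation string // "kms", "disk", or "raft"
	Warning     string // empty when no warning is needed
}

// placeKEK applies the proposal's three cases. Checking the KMS first is an
// assumption; the proposal does not say what happens if both are set.
func placeKEK(cfg KeyringConfig) Placement {
	switch {
	case cfg.KMSConfigured:
		// Only wrapped key material goes into Raft and snapshots.
		return Placement{KEKLocation: "kms"}
	case cfg.KeepKEKOnDisk:
		// Existing behavior: the KEK stays out of Raft and snapshots.
		return Placement{KEKLocation: "disk"}
	default:
		// Proposed default: the unencrypted KEK is stored in Raft.
		return Placement{
			KEKLocation: "raft",
			Warning:     "snapshots are functionally unencrypted; consider configuring a KMS",
		}
	}
}

func main() {
	configs := []KeyringConfig{{}, {KMSConfigured: true}, {KeepKEKOnDisk: true}}
	for _, cfg := range configs {
		fmt.Printf("%+v -> %+v\n", cfg, placeKEK(cfg))
	}
}
```

Only the default case produces a warning, matching the proposal's point that snapshots are functionally unencrypted only when the KEK itself is stored in Raft.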
In order to support storing key material separate from snapshots, the following features should be added:
- The Snapshot API and operator snapshot save CLI should gain a parameter that allows generating snapshots without the KEK included (for users without a KMS). Users with a KMS configured will never receive their KEK in a snapshot.
- A new operator snapshot redact command should remove the KEK from a snapshot if one is present. This command could optionally store the extracted key to a file for backing up distinctly from the snapshot.
- operator snapshot inspect should warn users if a snapshot contains a KEK.
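The intended behavior of these three features can be sketched against a toy snapshot model. Everything here is hypothetical: the Snapshot struct, the save, redact, and inspect functions, and the excludeKEK parameter are stand-ins for illustration and say nothing about Nomad's real snapshot format or the eventual CLI flags.

```go
package main

import "fmt"

// Snapshot is a toy model: opaque Raft state plus an optional KEK.
// A nil or empty KEK means the snapshot carries no KEK.
type Snapshot struct {
	State []byte
	KEK   []byte
}

// save models the proposed save parameter: when excludeKEK is true the KEK
// is left out. A KMS user would pass a nil kek, so none is ever included.
func save(state, kek []byte, excludeKEK bool) Snapshot {
	snap := Snapshot{State: state}
	if !excludeKEK {
		snap.KEK = kek
	}
	return snap
}

// redact returns a copy of snap without its KEK, along with the extracted
// key so the caller can back it up somewhere else. The key is nil if the
// snapshot had none.
func redact(snap Snapshot) (Snapshot, []byte) {
	kek := snap.KEK
	snap.KEK = nil // snap is a copy, so the caller's value is untouched
	return snap, kek
}

// inspect returns a warning if the snapshot contains a KEK, else "".
func inspect(snap Snapshot) string {
	if len(snap.KEK) > 0 {
		return "warning: snapshot contains a KEK"
	}
	return ""
}

func main() {
	full := save([]byte("raft-state"), []byte("kek-bytes"), false)
	fmt.Println(inspect(full))

	clean, kek := redact(full)
	fmt.Println(inspect(clean) == "", len(kek))
}
```

Returning the extracted key from redact mirrors the proposal's optional "store the extracted key to a file" step while leaving the choice of destination to the caller.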
Attempted Solutions
The Nomad team has attempted to document proper keyring handling, but has encountered multiple instances of operators being unaware of how to properly handle keyrings during backups and restores.