vault-autounseal
Unsealer cannot find vault instance
After some recent changes, auto-unsealing stopped working the way I would expect. I looked through the code and I think I see a mistake at: https://github.com/pyToshka/vault-autounseal/blob/5bd1d941980b2b3b4b358c76c6d0db21459c05f1/app.py#L291-L324
More specifically, at lines 291 and 321. It seems that only pods with the label vault-selector: "true" will be unsealed (line 291), and the Vault URL parameter is completely ignored (since the list is wiped at line 321).
To manually fix this issue, I have to add the label to the pod and restart the unsealer. Then it tries to unlock the pods, which succeeds and the vault starts up.
It works as expected for me. Please share your setup and how you deployed the unsealer.
I have deployed Vault and the Unsealer via helm and it is managed with ArgoCD. I am running Vault in standalone mode.
@SimonWoidig did you try the latest chart?
I have the Vault chart at version v0.27.0 and the unsealer at 0.5.2. Both seem to be the latest.
@SimonWoidig, when configuring the vault Helm chart, you need to make sure that you have in your values configuration (https://github.com/hashicorp/vault-helm/blob/main/values.yaml#L664):
server:
  extraLabels:
    vault-sealed: true
This will ensure that the correct label is applied to the vault pods, and the unsealer can identify them.
@sfc-gh-gmerticariu Yes, I have figured as much from the code. But I don't think this is the wanted behaviour. I would expect the unsealer to just look at the vault instances and if they are sealed, unseal them. Nevertheless, if it works, good enough for me.
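The behaviour described above, asking each instance for its seal state instead of relying on a label, could look roughly like the following. This is a minimal sketch, not the code in app.py: it assumes Vault's `/v1/sys/seal-status` endpoint and its `sealed` field, and the function names `is_sealed` and `fetch_seal_status` are hypothetical.

```python
import json
import urllib.request


def is_sealed(seal_status_body: str) -> bool:
    """Return True if a /v1/sys/seal-status JSON body reports a sealed Vault."""
    status = json.loads(seal_status_body)
    # Assumption: a missing "sealed" field is treated as sealed, so the
    # caller errs on the side of attempting an unseal.
    return bool(status.get("sealed", True))


def fetch_seal_status(vault_url: str, timeout: float = 5.0) -> str:
    """Fetch the raw seal-status body from one Vault instance (hypothetical helper)."""
    url = vault_url.rstrip("/") + "/v1/sys/seal-status"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With this split, the unsealer would only send unseal requests to instances for which `is_sealed(fetch_seal_status(url))` is true, regardless of pod labels.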
It also looks like the lookup of the Vault replicas is useless, because the list gets cleared and rewritten with the pods that carry the vault-sealed label. Also, what happens if I add the label to some non-Vault pod?
@SimonWoidig, according to the code, that's correct; the list gets cleared and rewritten, which means that the hostname lookup is not working.
I cannot speak about the wanted behaviour, as @pyToshka is the one defining it.
Any labelled pod will be selected, and an unseal request will then be sent to that pod, which will most likely fail and raise an exception.
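To make the "cleared and rewritten" pattern concrete, here is a small illustration. It is not the code from app.py; the function names are hypothetical and only mirror the behaviour described in this thread, next to one possible alternative that keeps both sources.

```python
def targets_overwritten(resolved_ips, labelled_pod_ips):
    """Behaviour as described in the thread: resolved addresses are discarded."""
    targets = list(resolved_ips)
    targets.clear()  # the result of the hostname lookup is lost here
    targets.extend(labelled_pod_ips)
    return targets


def targets_merged(resolved_ips, labelled_pod_ips):
    """One possible alternative: keep both sources, de-duplicated, order preserved."""
    return list(dict.fromkeys([*resolved_ips, *labelled_pod_ips]))
```

In the first variant, an unlabelled standalone Vault yields an empty target list even though the hostname resolved, which matches the symptom reported at the top of this issue.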
@sfc-gh-gmerticariu Yes, I understand. But I have no idea why there is hostname resolution at all, since the resolved IP is overwritten by the pod IP(s).
@pyToshka could you please chime in? Would it be OK if we removed the hostname resolution?
The vault-sealed label is used to determine the current state.
Feel free to propose improvements to the algorithm.
P.S.: This repository was created as a POC; using it in production is not a good idea from a security standpoint :-)
I agree that the vault-sealed label is being used to determine the current state, and that this will cause the auto-unseal pod to fail its health check when all Vault pods are already unsealed (vault-sealed=false).
A custom label for the Vault Server and a matchingSelector for vault-autounseal is the solution, in my opinion.
I propose merging this PR: https://github.com/pyToshka/vault-autounseal/pull/47 and publishing a new release to accommodate the existing but unreleased feature at https://github.com/pyToshka/vault-autounseal/blob/main/app.py#L285, if that's OK with @pyToshka.
We can then specify the settings like so on the autounseal side:
settings:
  # Replace **hashicorp-vault** with Vault release name (if installed using helm)
  vault_label_selector: app.kubernetes.io/instance=hashicorp-vault,component=server
and on the HashiCorp Vault side (this is the default behavior: https://github.com/hashicorp/vault-helm/blob/v0.28.1/values.yaml#L697-L701):
server:
  service:
    enabled: true
    instanceSelector:
      enabled: true
instanceSelector should add the label app.kubernetes.io/instance=<helm-release-name> to each Vault server.
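How an equality-based selector like the one above narrows the set of pods can be sketched in a few lines. This is only an illustration of the matching rule, not the unsealer's implementation: `parse_selector` and `matches` are hypothetical helpers, and only simple `key=value` terms are handled (no set-based expressions).

```python
def parse_selector(selector: str) -> dict:
    """Parse an equality-based selector such as "a=b,c=d" into a dict."""
    required = {}
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep:
            # Terms without "=" (for example set-based ones) are out of scope here.
            raise ValueError(f"unsupported selector term: {part!r}")
        required[key.strip()] = value.strip()
    return required


def matches(labels: dict, selector: str) -> bool:
    """Return True if a pod's labels satisfy every term of the selector."""
    return all(labels.get(k) == v for k, v in parse_selector(selector).items())


selector = "app.kubernetes.io/instance=hashicorp-vault,component=server"
server_pod = {"app.kubernetes.io/instance": "hashicorp-vault", "component": "server"}
other_pod = {"vault-sealed": "true"}

print(matches(server_pod, selector))
print(matches(other_pod, selector))
```

Because every term must match, a pod that merely carries vault-sealed is no longer picked up, which addresses the earlier concern about labelling a non-Vault pod.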
@panteparak thank you for the PR; merged.
@pyToshka Just a friendly bump
I believe you will need to bump the Helm chart version to 0.5.3. Based on this Helm Chart Releaser CI, it will refuse to publish a new chart if a chart with that version already exists: https://github.com/pyToshka/vault-autounseal/blob/main/.github/workflows/release_chart.yaml#L34
I have created a PR to bump the version to 0.5.3 and fix the default vault_label_selector value to match the discussion above.
PR: https://github.com/pyToshka/vault-autounseal/pull/48