vault-autounseal

Unsealer cannot find vault instance

Open SimonWoidig opened this issue 11 months ago • 10 comments

After some recent changes, auto-unsealing stopped working the way I would expect. I looked through the code and I think I see a mistake at: https://github.com/pyToshka/vault-autounseal/blob/5bd1d941980b2b3b4b358c76c6d0db21459c05f1/app.py#L291-L324

More specifically, at lines 291 and 321. It seems that only pods with the label vault-selector: "true" are considered for unsealing (line 291), and the Vault URL parameter is completely ignored (since the list is wiped at line 321).

To manually fix this issue, I have to add the label to the pod and restart the unsealer. Then it tries to unlock the pods, which succeeds and the vault starts up.
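
To make the reported behaviour concrete, here is a minimal sketch. It is not the code from app.py: the function names and IP addresses are hypothetical, and it only assumes what is described above (addresses resolved from the Vault URL are discarded and replaced by the IPs of labelled pods). The second function shows one possible alternative that keeps both sources.

```python
from typing import Iterable, List


def targets_as_reported(resolved_ips: Iterable[str],
                        labelled_pod_ips: Iterable[str]) -> List[str]:
    """Mimic the behaviour described in this issue (hypothetical helper)."""
    targets = list(resolved_ips)      # result of the hostname lookup
    targets = []                      # the list is wiped ...
    targets.extend(labelled_pod_ips)  # ... and refilled from labelled pods only
    return targets


def targets_merged(resolved_ips: Iterable[str],
                   labelled_pod_ips: Iterable[str]) -> List[str]:
    """One possible alternative: keep both sources, de-duplicated, in order."""
    return list(dict.fromkeys([*resolved_ips, *labelled_pod_ips]))
```

With no labelled pods, the first function returns an empty list even when the URL resolved to an address, which matches the symptom that the unsealer cannot find the Vault instance.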

SimonWoidig avatar Mar 11 '24 11:03 SimonWoidig

It works as expected for me. Please share your setup and how you deployed the unsealer.

pyToshka avatar Mar 11 '24 11:03 pyToshka

I have deployed Vault and the Unsealer via helm and it is managed with ArgoCD. I am running Vault in standalone mode.

SimonWoidig avatar Mar 11 '24 11:03 SimonWoidig

@SimonWoidig did you try the latest chart?

pyToshka avatar Mar 11 '24 11:03 pyToshka

I have the Vault chart at version v0.27.0 and the unsealer at 0.5.2. Both seem to be the latest.

SimonWoidig avatar Mar 11 '24 12:03 SimonWoidig

@SimonWoidig, when configuring the vault Helm chart, you need to make sure that you have in your values configuration (https://github.com/hashicorp/vault-helm/blob/main/values.yaml#L664):

server:
  extraLabels:
    vault-sealed: "true"

This will ensure that the correct label is applied to the vault pods, and the unsealer can identify them.

sfc-gh-gmerticariu avatar Apr 26 '24 10:04 sfc-gh-gmerticariu

@sfc-gh-gmerticariu Yes, I figured as much from the code. But I don't think this is the intended behaviour. I would expect the unsealer to just look at the Vault instances and, if they are sealed, unseal them. Nevertheless, if it works, that is good enough for me.

It also looks like the lookup of the Vault replicas is useless, because the list gets cleared and rewritten with the pods that carry the vault-sealed label. Also, what happens if I add the label to some non-Vault pod?
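
The "look at the instance and unseal it only if it is sealed" idea could be sketched as follows. This is an illustration, not the unsealer's code: it assumes the response body of Vault's GET /v1/sys/seal-status endpoint, which is JSON containing a boolean "sealed" field, has already been fetched. The function name needs_unseal is hypothetical.

```python
import json


def needs_unseal(seal_status_body: str) -> bool:
    """Return True if a seal-status response body reports a sealed Vault.

    Hypothetical helper: the caller is assumed to have fetched the body
    from GET /v1/sys/seal-status on the instance being checked.
    """
    status = json.loads(seal_status_body)
    # A missing "sealed" field is treated as "do not touch this instance".
    return status.get("sealed") is True
```

Checking the instance itself rather than a pod label would also avoid sending unseal requests to pods that are not Vault at all, since those would not return a valid seal-status document.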

SimonWoidig avatar Apr 26 '24 10:04 SimonWoidig

@SimonWoidig, according to the code, that's correct; the list gets cleared and rewritten, which means that the hostname lookup is not working.

I cannot speak about the wanted behaviour, as @pyToshka is the one defining it.

Any pod labelled will be selected, and then an unseal request will be sent to that pod, which will most likely fail and raise an exception.
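
If a mislabelled pod should not take the whole run down, the failure could be contained per pod. The sketch below is only an assumption about how that might look: unseal_all and the unseal callable are hypothetical names, and the actual unseal request is injected so that the example stays self-contained.

```python
from typing import Callable, Dict, Iterable


def unseal_all(pod_ips: Iterable[str],
               unseal: Callable[[str], None]) -> Dict[str, str]:
    """Try every pod and record the outcome instead of stopping at the first error."""
    results: Dict[str, str] = {}
    for ip in pod_ips:
        try:
            unseal(ip)  # hypothetical callable that sends the unseal request
            results[ip] = "ok"
        except Exception as exc:  # broad on purpose: a non-Vault pod can fail in many ways
            results[ip] = f"failed: {exc}"
    return results
```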

sfc-gh-gmerticariu avatar Apr 26 '24 11:04 sfc-gh-gmerticariu

@sfc-gh-gmerticariu Yes, I understand. But I have no idea why there is any hostname resolution at all, since the resolved IP is overwritten by the pod IP(s).

SimonWoidig avatar Apr 26 '24 11:04 SimonWoidig

@pyToshka could you please chime in? Would it be OK if we removed the hostname resolution?

sfc-gh-gmerticariu avatar Apr 29 '24 08:04 sfc-gh-gmerticariu

The vault-sealed label is used to determine the current state. Feel free to propose improvements to the algorithm. P.S.: This repository was created as a POC; using it in production is not a good idea from a security standpoint :-)

pyToshka avatar May 03 '24 06:05 pyToshka

I agree that the vault-sealed label is being used to determine the current state, and that this will cause the auto-unseal pod to fail its health check when all Vault pods are already unsealed (vault-sealed=false).

In my opinion, the solution is a custom label on the Vault server pods and a matching selector for vault-autounseal.

I propose merging this PR: https://github.com/pyToshka/vault-autounseal/pull/47 and publishing a new release to accommodate the existing but unreleased feature at https://github.com/pyToshka/vault-autounseal/blob/main/app.py#L285, if that's OK with @pyToshka.

We can then specify the settings like this on the autounseal side:

settings:
  # Replace **hashicorp-vault** with Vault release name (if installed using helm)
  vault_label_selector: app.kubernetes.io/instance=hashicorp-vault,component=server

and on the HashiCorp Vault side (this is the default behavior: https://github.com/hashicorp/vault-helm/blob/v0.28.1/values.yaml#L697-L701):

server:
  service:
    enabled: true
    
    instanceSelector:
      enabled: true

instanceSelector should add the label app.kubernetes.io/instance=<helm-release-name> to each Vault server.
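
For readers unfamiliar with label selectors, the following sketch shows how a selector string such as the vault_label_selector value above picks pods. It is a simplified illustration with hypothetical helper names, not the unsealer's implementation: it only handles comma-separated key=value pairs, not the full Kubernetes selector syntax (for example != or set-based expressions).

```python
from typing import Dict


def parse_selector(selector: str) -> Dict[str, str]:
    """Parse 'k1=v1,k2=v2' into a dict (equality-based pairs only)."""
    pairs: Dict[str, str] = {}
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"unsupported selector term: {part!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def matches(pod_labels: Dict[str, str], selector: str) -> bool:
    """A pod matches only if every selector pair is present in its labels."""
    return all(pod_labels.get(k) == v for k, v in parse_selector(selector).items())
```

A pod must carry every listed label to match, so adding component=server to the selector narrows the match to the server pods of that release rather than anything belonging to the release.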

panteparak avatar Jul 20 '24 09:07 panteparak

@panteparak thank you for the PR; merged.

pyToshka avatar Jul 26 '24 05:07 pyToshka

@pyToshka Just a friendly bump: I believe you will need to bump the Helm chart version to 0.5.3.

Based on this Helm Chart Releaser CI workflow, it will refuse to publish a new chart if that chart version already exists: https://github.com/pyToshka/vault-autounseal/blob/main/.github/workflows/release_chart.yaml#L34

I have created a PR to bump the version to 0.5.3 and fix the default vault_label_selector value to match the discussion above.

PR: https://github.com/pyToshka/vault-autounseal/pull/48

panteparak avatar Aug 01 '24 10:08 panteparak