error: leader election lost

Open jascsch opened this issue 1 year ago • 4 comments

The vault-secrets-operator container is frequently restarting with the following error messages:

{"level":"error","ts":"2024-01-15T09:46:26Z","logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"main.main\n\t/workspace/main.go:135\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"}
E0115 09:46:26.613160 1 leaderelection.go:332] error retrieving resource lock vault-secrets-operator/vaultsecretsoperator.ricoberger.de: Get "https://192.168.64.1:443/apis/coordination.k8s.io/v1/namespaces/vault-secrets-operator/leases/vaultsecretsoperator.ricoberger.de": context deadline exceeded
{"level":"error","ts":"2024-01-15T08:36:58Z","logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"main.main\n\t/workspace/main.go:135\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"}
E0115 08:36:58.095320 1 leaderelection.go:332] error retrieving resource lock vault-secrets-operator/vaultsecretsoperator.ricoberger.de: Get "https://192.168.64.1:443/apis/coordination.k8s.io/v1/namespaces/vault-secrets-operator/leases/vaultsecretsoperator.ricoberger.de": context deadline exceeded

Can you please check and advise how to fix this issue?

jascsch, Jan 15 '24 14:01

Hi @jascsch, most of the time this indicates a problem with your Kubernetes API server. The VaultSecrets operator does not handle leader election in any special way, so there is nothing we can really do here.

We had the same issue with our old Kubernetes provider and decided to run the operator with 1 replica, since it was OK for us if the operator is unavailable for a short period of time. Maybe this is also a solution for you.

ricoberger, Jan 15 '24 17:01
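
To illustrate why a slow API server can end in "leader election lost", here is a minimal Go sketch of the renewal timing. It is a simplified model, not the client-go implementation: the function `LeadershipLost` and the sample attempt sequences are hypothetical, and the 2s retry period and 10s renew deadline are the controller-runtime defaults, which this operator is only assumed to leave unchanged.

```go
package main

import (
	"fmt"
	"time"
)

// controller-runtime defaults; assumed (not confirmed) to apply to this operator.
const (
	retryPeriod   = 2 * time.Second
	renewDeadline = 10 * time.Second
)

// LeadershipLost is a simplified model: the leader retries the lease renewal
// every retryPeriod and gives up once renewDeadline has elapsed without a
// successful renewal. attempts[i] reports whether the i-th renewal call to
// the API server succeeded.
func LeadershipLost(attempts []bool) bool {
	sinceSuccess := time.Duration(0)
	for _, ok := range attempts {
		if ok {
			sinceSuccess = 0
			continue
		}
		sinceSuccess += retryPeriod
		if sinceSuccess >= renewDeadline {
			return true
		}
	}
	return false
}

func main() {
	flaky := []bool{true, false, false, true, false, true}    // short hiccups only
	outage := []bool{true, false, false, false, false, false} // 10s without a renewal
	fmt.Println(LeadershipLost(flaky))  // false
	fmt.Println(LeadershipLost(outage)) // true
}
```

In this model, isolated timeouts are tolerated, but a run of failed calls spanning the renew deadline makes the manager exit, and the container is then restarted.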

Hi @ricoberger, thanks for the quick feedback. There is nothing we can do about the Kubernetes API, which is fully managed. We already use 1 replica and the error still occurs. Is there any way to disable the leader election? It should not be needed if only one replica is running.

jascsch, Jan 16 '24 08:01

Is there a way to add proxy environment variables? This would be needed for corporate proxy servers if the operator communicates with an external Vault service.

jascsch, Jan 16 '24 08:01

Hi, we are using the following values in the Helm chart:

deploymentStrategy:
  type: Recreate

args:
  - -leader-elect=false

ricoberger, Jan 16 '24 08:01
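
A note on the `-leader-elect=false` argument: if the operator declares this option with Go's standard `flag` package (an assumption, though common for operators built on controller-runtime), the `=false` form matters, because a boolean flag followed by a separate `false` argument is still read as true. The sketch below shows this with a hypothetical `ParseLeaderElect` helper; the default value of `true` is also an assumption.

```go
package main

import (
	"flag"
	"fmt"
)

// ParseLeaderElect parses args the way a main.go using the standard flag
// package would. The flag name comes from the thread; the helper itself and
// the default of true are assumptions made for illustration.
func ParseLeaderElect(args []string) (bool, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	enabled := fs.Bool("leader-elect", true, "enable leader election")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}

func main() {
	// Form used in the Helm values above: leader election is disabled.
	off, _ := ParseLeaderElect([]string{"-leader-elect=false"})
	fmt.Println(off) // false

	// Space-separated form: the bare flag means true, and "false" is left
	// over as a positional argument.
	stillOn, _ := ParseLeaderElect([]string{"-leader-elect", "false"})
	fmt.Println(stillOn) // true
}
```

With leader election disabled, only one replica should run at a time, which is presumably why the values above also set the deployment strategy to `Recreate`.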