[kube-prometheus-stack] Prometheus Operator pod cannot come up when admission hook is disabled
Describe the bug
The Prometheus Operator pod cannot come up when the admission hook is disabled; it fails with a missing admission hook secret error.
What's your helm version?
v3.7.0
What's your kubectl version?
v1.22.2
Which chart?
kube-prometheus-stack
What's the chart version?
19.0.2
What happened?
The operator pod cannot come up, with the following error message:

```
MountVolume.SetUp failed for volume "tls-secret" : secret "prometheus-kube-prometheus-admission" not found
```

This message is displayed because the admission hook is disabled, so the secret is not present.
What did you expect to happen?
I expect the operator to come up.
How to reproduce it?
Install the chart with the values below
Enter the changed values of values.yaml?
```yaml
prometheusOperator:
  admissionWebhooks:
    enabled: false
```
Enter the command that you executed that is failing/misfunctioning.
```shell
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring -f prometheus-values.yaml
```
Anything else we need to know?
No response
I am also encountering this when trying to deploy prometheus-operator with admissionWebhooks disabled.
This is due to the prometheus-operator deployment referencing the secret (https://github.com/prometheus-community/helm-charts/blob/0a55b7319e0397c2f4a82eb5f680b6a260301e8c/charts/kube-prometheus-stack/templates/prometheus-operator/deployment.yaml#L125), while the secret will only be created by the admission-create job (https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/job-createSecret.yaml#L1).
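For context, the relevant part of the deployment template mounts the admission secret as a volume whenever operator TLS is enabled. Roughly (paraphrased from the linked template; exact helper names may differ by chart version):

```yaml
# Sketch of the volume block in the operator deployment template
# (paraphrased; see the linked deployment.yaml for the authoritative version).
volumes:
  - name: tls-secret
    secret:
      defaultMode: 420
      secretName: {{ template "kube-prometheus-stack.fullname" . }}-admission
```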
A workaround (or maybe intended behaviour?) will be to set
```yaml
prometheusOperator:
  tls:
    enabled: false
```
This will prevent helm from generating the volume and volumeMount blocks (https://github.com/prometheus-community/helm-charts/blob/0a55b7319e0397c2f4a82eb5f680b6a260301e8c/charts/kube-prometheus-stack/templates/prometheus-operator/deployment.yaml#L116).
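For clarity, combining the reproduction values from the original report with this workaround gives a values file like:

```yaml
prometheusOperator:
  admissionWebhooks:
    enabled: false
  tls:
    enabled: false
```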
However, this revealed another set of issues.
- Missing role & rolebindings for Prometheus-operator
```
level=error ts=2021-10-26T14:55:40.266312816Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Probe: failed to list *v1.Probe: probes.monitoring.coreos.com is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-operator\" cannot list resource \"probes\" in API group \"monitoring.coreos.com\" in the namespace \"test-tenant\""
level=error ts=2021-10-26T14:55:42.581554026Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.PrometheusRule: failed to list *v1.PrometheusRule: prometheusrules.monitoring.coreos.com is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-operator\" cannot list resource \"prometheusrules\" in API group \"monitoring.coreos.com\" in the namespace \"test-tenant\""
```
The workaround is to create a Role and RoleBinding with permissions matching what's stated in https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus-operator/clusterrole.yaml; a sketch follows below.
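A minimal sketch of such a Role/RoleBinding, using the service account and namespace from the logs above (copy the full rule set from the linked clusterrole.yaml; only the `monitoring.coreos.com` group is shown here):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-foo-operator
  namespace: test-tenant
rules:
  # Mirror the rules from the chart's operator clusterrole.yaml.
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["prometheuses", "alertmanagers", "servicemonitors", "podmonitors", "probes", "prometheusrules", "thanosrulers"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-foo-operator
  namespace: test-tenant
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tenant-foo-operator
subjects:
  - kind: ServiceAccount
    name: tenant-foo-operator
    namespace: test-tenant
```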
- Missing role & rolebindings for Prometheus
```
level=error ts=2021-10-26T14:56:42.254Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-prometheus\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"test-tenant\""
level=error ts=2021-10-26T14:56:54.706Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-prometheus\" cannot list resource \"services\" in API group \"\" in the namespace \"test-tenant\""
level=error ts=2021-10-26T14:56:58.193Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-prometheus\" cannot list resource \"pods\" in API group \"\" in the namespace \"test-tenant\""
```
Similar to the above, the workaround is to create a Role & RoleBinding separately, using https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus/clusterrole.yaml as reference; again, a sketch follows.
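A minimal sketch, derived from the forbidden resources in the logs above (consult the linked clusterrole.yaml for the complete rules):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-foo-prometheus
  namespace: test-tenant
rules:
  # The resources Prometheus was denied in the logs above.
  - apiGroups: [""]
    resources: ["endpoints", "services", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-foo-prometheus
  namespace: test-tenant
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tenant-foo-prometheus
subjects:
  - kind: ServiceAccount
    name: tenant-foo-prometheus
    namespace: test-tenant
```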
Questions to maintainers:
- Should we be setting `tls.enabled: false` if we are not intending to use `admissionWebhooks`?
- Any issue with creating `role`/`rolebindings` when `clusterrole`/`clusterrolebindings` are not needed or not applicable (e.g. multi-tenant environments)?
I can help create a PR to fix this.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
+1
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
+1
Works for me.
@monotek do you have the `prometheus-kube-prometheus-admission` secret?
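You can check with something like this (assuming the `monitoring` namespace from the original report):

```shell
kubectl get secret prometheus-kube-prometheus-admission -n monitoring
```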
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Same issue with kube-prometheus-stack `31.0.0` on a fresh cluster. I disabled the admission webhooks because I do not configure Prometheus in this way and there is no need for it to be running.
```yaml
prometheusOperator:
  enabled: true
  admissionWebhooks:
    enabled: false
```
While looking at the code, it seems conceptually wrong that the prometheus-operator uses the same TLS certificates intended to be used by the admission webhooks. It should generate its own certificates if needed, or there should be instructions on how to set this up.
- https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus-operator/deployment.yaml#L126
Workaround 1: Disable TLS (traffic to the operator is now unencrypted?):

```yaml
prometheusOperator:
  tls:
    enabled: false
```
Workaround 2: Enable the generation of admission webhook certificates with cert-manager despite the webhooks being disabled (generated by certmanager.yaml#L42):

```yaml
prometheusOperator:
  admissionWebhooks:
    enabled: false
    certManager:
      enabled: true
```
Workaround 3: Manually create the needed TLS secrets/certificates.
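For workaround 3, something along these lines should work. A rough sketch only: it assumes you already have a certificate/key pair on disk, and the exact secret key names the chart expects may differ by chart version, so check the deployment template first.

```shell
# Create the admission secret the operator deployment expects to mount.
# Secret name taken from the error message in the original report;
# cert/key file paths are placeholders for your own files.
kubectl create secret tls prometheus-kube-prometheus-admission \
  -n monitoring --cert=tls.crt --key=tls.key
```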
+1
@gw0 thanks for recommending option 2, that seems to work for `33.1.0`.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
recent activity
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
More recent activity
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Even more recent activity
@monotek can you or anyone take a look?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Hello there
I'm facing the same issue.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
remove stale
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
/remove-lifecycle stale
This issue is being automatically closed due to inactivity.
Re-opened as https://github.com/prometheus-community/helm-charts/issues/2742 since it was closed by the bot.