helm-charts [kube-prometheus-stack] Prometheus Operator pod cannot come up when admission hook is disabled

Describe the bug: a clear and concise description of what the bug is.

When the admission webhook is disabled, the Prometheus Operator pod fails to start with a "missing admission hook secret" error.

What's your helm version?

v3.7.0

What's your kubectl version?

v1.22.2

Which chart?

kube-prometheus-stack

What's the chart version?

19.0.2

What happened?

The operator pod fails to start with the following error message: MountVolume.SetUp failed for volume "tls-secret" : secret "prometheus-kube-prometheus-admission" not found. The error appears because the admission webhook is disabled and the secret is not present.

What you expected to happen?

I expect the operator to come up.

How to reproduce it?

Install the chart with the values below.

Enter the changed values of values.yaml?

prometheusOperator:
  admissionWebhooks:
    enabled: false

Enter the command that you execute and failing/misfunctioning.

helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring -f prometheus-values.yaml

Anything else we need to know?

No response

Oct 18 '21 00:10 AndrewSav

I am also encountering this when trying to deploy prometheus-operator with admissionWebhooks disabled.

This happens because the prometheus-operator deployment references the secret (https://github.com/prometheus-community/helm-charts/blob/0a55b7319e0397c2f4a82eb5f680b6a260301e8c/charts/kube-prometheus-stack/templates/prometheus-operator/deployment.yaml#L125), but the secret is only created by the admission-create job (https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/job-createSecret.yaml#L1).

A workaround (or maybe the intended behaviour?) is to set

prometheusOperator:
  tls:
    enabled: false

This prevents Helm from rendering the volume and volumeMount blocks (https://github.com/prometheus-community/helm-charts/blob/0a55b7319e0397c2f4a82eb5f680b6a260301e8c/charts/kube-prometheus-stack/templates/prometheus-operator/deployment.yaml#L116).

However, this revealed another set of issues.

Missing role & rolebindings for Prometheus-operator

level=error ts=2021-10-26T14:55:40.266312816Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Probe: failed to list *v1.Probe: probes.monitoring.coreos.com is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-operator\" cannot list resource \"probes\" in API group \"monitoring.coreos.com\" in the namespace \"test-tenant\""
level=error ts=2021-10-26T14:55:42.581554026Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.PrometheusRule: failed to list *v1.PrometheusRule: prometheusrules.monitoring.coreos.com is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-operator\" cannot list resource \"prometheusrules\" in API group \"monitoring.coreos.com\" in the namespace \"test-tenant\""

The workaround is to create a Role and RoleBinding with permissions matching those stated here: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus-operator/clusterrole.yaml.
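
A minimal sketch of such a namespaced Role and RoleBinding, assuming the namespace and service account that appear in the log lines above (test-tenant, tenant-foo-operator). The object name tenant-foo-operator-namespaced is hypothetical, and the rules cover only the two resources named in the errors; the operator will most likely need the remaining rules from the linked clusterrole.yaml as well.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-foo-operator-namespaced  # hypothetical name
  namespace: test-tenant
rules:
  # Only the resources from the errors above; copy the remaining rules
  # from the chart's prometheus-operator clusterrole.yaml as needed.
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["probes", "prometheusrules"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-foo-operator-namespaced  # hypothetical name
  namespace: test-tenant
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tenant-foo-operator-namespaced
subjects:
  # Service account taken from the "forbidden" messages in the log.
  - kind: ServiceAccount
    name: tenant-foo-operator
    namespace: test-tenant
```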

Missing role & rolebindings for Prometheus

level=error ts=2021-10-26T14:56:42.254Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-prometheus\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"test-tenant\""
level=error ts=2021-10-26T14:56:54.706Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-prometheus\" cannot list resource \"services\" in API group \"\" in the namespace \"test-tenant\""
level=error ts=2021-10-26T14:56:58.193Z caller=klog.go:116 component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:test-tenant:tenant-foo-prometheus\" cannot list resource \"pods\" in API group \"\" in the namespace \"test-tenant\""

As above, the workaround is to create a Role and RoleBinding separately, using this as a reference: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus/clusterrole.yaml
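
A comparable sketch for the Prometheus service account, again assuming the names from the log lines (test-tenant, tenant-foo-prometheus). The object name tenant-foo-prometheus-namespaced is hypothetical, and only the three core resources named in the errors are listed; the linked clusterrole.yaml remains the reference for anything beyond that.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-foo-prometheus-namespaced  # hypothetical name
  namespace: test-tenant
rules:
  # Core API group ("") resources that the log reports as forbidden.
  - apiGroups: [""]
    resources: ["endpoints", "services", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-foo-prometheus-namespaced  # hypothetical name
  namespace: test-tenant
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tenant-foo-prometheus-namespaced
subjects:
  # Service account taken from the "forbidden" messages in the log.
  - kind: ServiceAccount
    name: tenant-foo-prometheus
    namespace: test-tenant
```

Because a Role is namespaced, this grants access only inside test-tenant, which is the point in a multi-tenant setup.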

Question to maintainers:

Should we set tls.enabled: false if we do not intend to use admissionWebhooks?
Is there any issue with creating Roles/RoleBindings when ClusterRoles/ClusterRoleBindings are not needed or not applicable (e.g. in a multi-tenant environment)?

I can help create a PR to fix this.

Oct 26 '21 15:10 jaanhio

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Nov 26 '21 05:11 stale[bot]

+1

Nov 26 '21 10:11 AndrewSav

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Dec 26 '21 13:12 stale[bot]

+1

Dec 28 '21 19:12 AndrewSav

Works for me.

Dec 28 '21 20:12 monotek

@monotek do you have the prometheus-kube-prometheus-admission secret?

Dec 28 '21 23:12 AndrewSav

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Jan 28 '22 03:01 stale[bot]

Same issue with kube-prometheus-stack 31.0.0 on a fresh cluster. I disabled the admission webhooks, because I do not configure Prometheus in this way and there is no need for it to be running.

prometheusOperator:
  enabled: true
  admissionWebhooks:
    enabled: false

Looking at the code, it seems conceptually wrong that the prometheus-operator uses the same TLS certificates that are intended for the admission webhooks. It should generate its own certificates if needed, or there should be instructions on how to set it up.

https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus-operator/deployment.yaml#L126

Workaround 1: Disable TLS (traffic to operator is now unencrypted?):

  tls:
    enabled: false

Workaround 2: Enable the generation of admission webhook certificates with cert-manager even though the webhooks themselves are disabled (generated by certmanager.yaml#L42):

  admissionWebhooks:
    enabled: false
    certManager:
      enabled: true

Workaround 3: Manually create the needed TLS secrets/certificates.
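
A rough sketch of what such a manually created secret might look like, assuming the release name and namespace from the original report (so the secret name is prometheus-kube-prometheus-admission in the monitoring namespace). The data key names cert and key are an assumption about what the operator deployment expects when cert-manager is not used; check the --web.cert-file and --web.key-file arguments in the rendered deployment before relying on them. The PEM contents are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Name taken from the MountVolume.SetUp error in the original report.
  name: prometheus-kube-prometheus-admission
  namespace: monitoring
type: Opaque
stringData:
  # Key names are assumed; verify against the rendered operator deployment.
  cert: |
    <PEM-encoded certificate>
  key: |
    <PEM-encoded private key>
```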

Feb 05 '22 16:02 gw0

+1

Feb 18 '22 12:02 obvionaoe

@gw0 thanks for recommending option 2, that seems to work for 33.1.0.

Mar 02 '22 04:03 danmanners

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Apr 02 '22 07:04 stale[bot]

recent activity

Apr 02 '22 20:04 AndrewSav

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

May 03 '22 02:05 stale[bot]

More recent activity

May 03 '22 03:05 danmanners

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Jun 04 '22 01:06 stale[bot]

Even more recent activity

Jun 04 '22 02:06 AndrewSav

@monotek can you or anyone take a look?

Jun 13 '22 12:06 obvionaoe

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Jul 13 '22 20:07 stale[bot]

Hello there

Jul 13 '22 21:07 AndrewSav

I'm facing the same issue.

Sep 01 '22 00:09 jkleinkauff

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Oct 12 '22 05:10 stale[bot]

remove stale

Oct 12 '22 07:10 AndrewSav

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Nov 12 '22 12:11 stale[bot]

/remove-lifecycle stale

Nov 12 '22 20:11 AndrewSav

This issue is being automatically closed due to inactivity.

Nov 27 '22 14:11 stale[bot]

Re-opened as https://github.com/prometheus-community/helm-charts/issues/2742 since it was closed by the bot.

Nov 27 '22 20:11 AndrewSav
