helm-charts
[kube-prometheus-stack] stuck on Back-off restarting failed container create in pod monitor-kube-prometheus-st-admission-create
Describe the bug
Command:

```
helm install --create-namespace monitor -n monitor -f kube-prometheus-stack-config.yaml prometheus-community/kube-prometheus-stack --debug
```
BTW: in kube-prometheus-stack-config.yaml, I only changed the prometheusOperator > admissionWebhooks > patch > image > registry from registry.k8s.io to ****forsecure******.dkr.ecr.cn-northwest-1.amazonaws.com.cn, because the machines are located in China.
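For reference, that override would look roughly like this in the values file (a sketch; the actual ECR registry is redacted in this report, so a placeholder is used here):

```yaml
prometheusOperator:
  admissionWebhooks:
    patch:
      image:
        # Placeholder: substitute your own mirror registry here.
        # The reporter's real ECR registry is redacted in this issue.
        registry: <your-mirror>.dkr.ecr.cn-northwest-1.amazonaws.com.cn
```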
Debug Info
It got stuck at the end:
```
install.go:193: [debug] Original chart version: ""
install.go:210: [debug] CHART PATH: /Users/kain/.cache/helm/repository/kube-prometheus-stack-48.1.2.tgz
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD alertmanagerconfigs.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD alertmanagers.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD podmonitors.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD probes.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD prometheusagents.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD prometheuses.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD prometheusrules.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD scrapeconfigs.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD servicemonitors.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD thanosrulers.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" ServiceAccount
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" ClusterRole
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" ClusterRoleBinding
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" Role
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" RoleBinding
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission-create" Job
client.go:133: [debug] creating 1 resource(s)
client.go:703: [debug] Watching for changes to Job monitor-kube-prometheus-st-admission-create with timeout of 5m0s
client.go:731: [debug] Add/Modify event for monitor-kube-prometheus-st-admission-create: ADDED
client.go:770: [debug] monitor-kube-prometheus-st-admission-create: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:141
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/[email protected]/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/[email protected]/command.go:1044
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/[email protected]/command.go:968
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:250
runtime.goexit
	runtime/asm_arm64.s:1270
```
helm status
```
helm list -n monitor -a
```

After a period of time, the status changes to failed, and the pod gets deleted.
| NAME | NAMESPACE | REVISION | UPDATED | STATUS | CHART | APP VERSION |
|---|---|---|---|---|---|---|
| monitor | monitor | 1 | 2023-07-24 16:24:25.278429 +0800 CST | pending-install | kube-prometheus-stack-48.1.2 | monitor |
Event Msg
Back-off restarting failed container create in pod monitor-kube-prometheus-st-admission-create-98brv_monitor(f3c33947-a284-43f3-a372-e561baf165d8)
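When the admission-create Job's pod cannot pull or run its kube-webhook-certgen image, one workaround sometimes used with this chart (an assumption on my part, not something the reporter confirmed) is to disable the admission webhook patching in the values file entirely, so the pre-install hook Job is never created:

```yaml
# Sketch of a possible workaround: skips the pre-install
# admission-create Job so the install no longer depends on
# pulling the webhook certgen patch image.
prometheusOperator:
  admissionWebhooks:
    enabled: false
  tls:
    enabled: false
```

Note this trades away TLS-verified admission webhooks, so it is a diagnostic/workaround step rather than a recommended production setting.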
What's your helm version?
v3.11.0
What's your kubectl version?
v5.0.1
Which chart?
kube-prometheus-stack
What's the chart version?
48.1.2
What happened?
No response
What you expected to happen?
No response
How to reproduce it?
No response
Enter the changed values of values.yaml?
No response
Enter the command that you execute and failing/misfunctioning.
```
helm install --create-namespace monitor -n monitor -f kube-prometheus-stack-config.yaml prometheus-community/kube-prometheus-stack --debug
```
Anything else we need to know?
No response
Any update on this? Facing the same issue :(
Me too.
I could solve the issue with these values:

```yaml
prometheus:
  prometheusSpec:
    terminationGracePeriodSeconds: 90
    ## Give 10 minutes
    maximumStartupDurationSeconds: 600
    minReadySeconds: 90
    containers:
      - name: prometheus
        readinessProbe:
          periodSeconds: 30
          initialDelaySeconds: 30
        livenessProbe:
          periodSeconds: 30
          initialDelaySeconds: 60
```