
[kube-prometheus-stack] stuck on Back-off restarting failed container create in pod monitor-kube-prometheus-st-admission-create

Open Kain-90 opened this issue 2 years ago • 2 comments

Describe the bug

Command:

>>> helm install --create-namespace monitor -n monitor -f kube-prometheus-stack-config.yaml prometheus-community/kube-prometheus-stack --debug

BTW: I only modified prometheusOperator > admissionWebhooks > patch > image > registry from registry.k8s.io to ****forsecure******.dkr.ecr.cn-northwest-1.amazonaws.com.cn in kube-prometheus-stack-config.yaml, because the machines are located in China.
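For reference, the same registry override can also be passed on the command line instead of editing the values file. A minimal sketch, assuming the standard kube-prometheus-stack values layout; the mirror hostname below is a placeholder, not the actual registry from the report:

>>> helm install --create-namespace monitor -n monitor \
      --set prometheusOperator.admissionWebhooks.patch.image.registry=<your-mirror-registry> \
      prometheus-community/kube-prometheus-stack --debug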

Debug Info

It got stuck at the end:

install.go:193: [debug] Original chart version: ""
install.go:210: [debug] CHART PATH: /Users/kain/.cache/helm/repository/kube-prometheus-stack-48.1.2.tgz

client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD alertmanagerconfigs.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD alertmanagers.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD podmonitors.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD probes.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD prometheusagents.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD prometheuses.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD prometheusrules.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD scrapeconfigs.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD servicemonitors.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
install.go:152: [debug] CRD thanosrulers.monitoring.coreos.com is already present. Skipping.
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" ServiceAccount
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" ClusterRole
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" ClusterRoleBinding
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" Role
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission" RoleBinding
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "monitor-kube-prometheus-st-admission-create" Job
client.go:133: [debug] creating 1 resource(s)
client.go:703: [debug] Watching for changes to Job monitor-kube-prometheus-st-admission-create with timeout of 5m0s
client.go:731: [debug] Add/Modify event for monitor-kube-prometheus-st-admission-create: ADDED
client.go:770: [debug] monitor-kube-prometheus-st-admission-create: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
        helm.sh/helm/v3/cmd/helm/install.go:141
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/[email protected]/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/[email protected]/command.go:1044
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/[email protected]/command.go:968
main.main
        helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
        runtime/proc.go:250
runtime.goexit
        runtime/asm_arm64.s:1270

helm status

>>> helm list -n monitor -a

After a period of time, the status changes to failed, and the pod gets deleted.

NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
monitor monitor 1 2023-07-24 16:24:25.278429 +0800 CST pending-install kube-prometheus-stack-48.1.2 monitor

Event Msg

Back-off restarting failed container create in pod monitor-kube-prometheus-st-admission-create-98brv_monitor(f3c33947-a284-43f3-a372-e561baf165d8)
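The event only reports the back-off, so the actual reason the container keeps failing (image pull error, wrong architecture, and so on) has to be read from the Job's pod while the install is still waiting on it. A hedged debugging sketch, not part of the original report; the namespace and Job name match the ones above:

>>> kubectl -n monitor get pods -l job-name=monitor-kube-prometheus-st-admission-create
>>> kubectl -n monitor describe pod -l job-name=monitor-kube-prometheus-st-admission-create
>>> kubectl -n monitor logs job/monitor-kube-prometheus-st-admission-create --all-containers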

What's your helm version?

v3.11.0

What's your kubectl version?

v5.0.1

Which chart?

kube-prometheus-stack

What's the chart version?

48.1.2

What happened?

No response

What you expected to happen?

No response

How to reproduce it?

No response

Enter the changed values of values.yaml?

No response

Enter the command that you execute and failing/misfunctioning.

helm install --create-namespace monitor -n monitor -f kube-prometheus-stack-config.yaml prometheus-community/kube-prometheus-stack --debug

Anything else we need to know?

No response

Kain-90 · Jul 24 '23 08:07

Any update on this? Facing the same issue :(

geolffreym · Feb 28 '24 15:02

me, too

I could solve the issue with these values:

prometheus:
  prometheusSpec:
    terminationGracePeriodSeconds: 90
    ## Give 10 minutes
    maximumStartupDurationSeconds: 600
    minReadySeconds: 90
    containers:
      - name: prometheus
        readinessProbe:
          periodSeconds: 30
          initialDelaySeconds: 30
        livenessProbe:
          periodSeconds: 30
          initialDelaySeconds: 60

hypery2k · May 21 '24 05:05