
prometheus-k8s-0 keeps terminating and restarting

Open wwdz opened this issue 3 years ago • 6 comments

k8s cluster version: 1.22
kube-prometheus version: 0.10.0
OS: CentOS 7

The error log is as follows, from this command:
kubectl logs -f prometheus-k8s-0 -n monitoring --all-containers

level=info ts=2022-02-25T16:36:34.191323313Z caller=main.go:147 msg="Starting prometheus-config-reloader" version="(version=0.43.2, branch=refs/tags/v0.43.2, revision=b86ab77239f2a11ee69ad05b24122958d8b2df5b)"
ts=2022-02-25T16:36:34.054Z caller=main.go:434 level=error msg="Error loading config (--config.file=/etc/prometheus/config_out/prometheus.env.yaml)" err="open /etc/prometheus/config_out/prometheus.env.yaml: no such file or directory"
level=info ts=2022-02-25T16:36:34.191441854Z caller=main.go:148 build_context="(go=go1.14.10, user=simonpasquier, date=20201109-10:56:57)"
level=info ts=2022-02-25T16:36:34.191697484Z caller=main.go:182 msg="Starting web server for metrics" listen=:8080
level=error ts=2022-02-25T16:36:34.194758389Z caller=runutil.go:98 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post "http://localhost:9090/-/reload": dial tcp [::1]:9090: connect: connection refused"

wwdz avatar Feb 25 '22 16:02 wwdz
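Editor's note: with --all-containers the output mixes lines from several containers, so it can help to keep only the error-level entries. The following is a minimal sketch, assuming the logfmt-style level=error field seen in the log above; filter_errors is a hypothetical helper name, not part of kubectl or kube-prometheus.

```shell
# Hypothetical helper: read log lines on stdin and keep only those
# whose "level" field is exactly "error" (logfmt-style key=value pairs).
filter_errors() {
  grep -E '(^|[[:space:]])level=error([[:space:]]|$)'
}

# Usage (requires access to the cluster):
#   kubectl logs prometheus-k8s-0 -n monitoring --all-containers | filter_errors
```

For the log in this issue, that would leave the "no such file or directory" line and the "connection refused" line, which are the two entries worth reading first.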

Seeing the same error.
k8s cluster version: 1.23.5
kube-prometheus version: 0.10.0
OS: Ubuntu 20

ssista avatar Apr 04 '22 22:04 ssista

K8s Server Version: v1.18.6
kube-prometheus version: kube-prometheus-0.8.0
OS: CentOS 7.3

jackyliusohu avatar May 16 '22 03:05 jackyliusohu

Port: 9090/TCP
Host Port: 0/TCP
Args:
  --web.console.templates=/etc/prometheus/consoles
  --web.console.libraries=/etc/prometheus/console_libraries
  --config.file=/etc/prometheus/config_out/prometheus.env.yaml
  --storage.tsdb.path=/prometheus
  --storage.tsdb.retention.time=15d
  --web.enable-lifecycle
  --storage.tsdb.no-lockfile
  --web.route-prefix=/
State: Running
  Started: Mon, 16 May 2022 10:44:44 +0800
Last State: Terminated
  Reason: Error
  Message: "Start listening for connections" address=0.0.0.0:9090
    level=info ts=2022-05-16T02:44:42.984Z caller=head.go:575 component=tsdb msg="replaying WAL, this may take awhile"
    level=info ts=2022-05-16T02:44:42.987Z caller=head.go:624 component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
    level=info ts=2022-05-16T02:44:42.987Z caller=head.go:627 component=tsdb msg="WAL replay completed" duration=2.474151ms
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:683 fs_type=XFS_SUPER_MAGIC
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:684 msg="TSDB started"
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:788 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:535 msg="Stopping scrape discovery manager..."
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:549 msg="Stopping notify discovery manager..."
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:571 msg="Stopping scrape manager..."
    level=info ts=2022-05-16T02:44:42.987Z caller=manager.go:875 component="rule manager" msg="Stopping rule manager..."
    level=info ts=2022-05-16T02:44:42.987Z caller=manager.go:885 component="rule manager" msg="Rule manager stopped"
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:531 msg="Scrape discovery manager stopped"
    level=info ts=2022-05-16T02:44:42.987Z caller=main.go:545 msg="Notify discovery manager stopped"
    level=info ts=2022-05-16T02:44:42.988Z caller=main.go:565 msg="Scrape manager stopped"
    level=info ts=2022-05-16T02:44:42.988Z caller=notifier.go:598 component=notifier msg="Stopping notification manager..."
    level=info ts=2022-05-16T02:44:42.988Z caller=main.go:738 msg="Notifier manager stopped"
    level=error ts=2022-05-16T02:44:42.988Z caller=main.go:747 err="error loading config from "/etc/prometheus/config_out/prometheus.env.yaml": couldn't load configuration (--config.file="/etc/prometheus/config_out/prometheus.env.yaml"): open /etc/prometheus/config_out/prometheus.env.yaml: no such file or directory"

jackyliusohu avatar May 16 '22 03:05 jackyliusohu

Seeing same error on k8s version 1.20. Prometheus v2.36.2

Barceloniak1 avatar Mar 08 '23 15:03 Barceloniak1

It looks like the maintainers are not looking after this part; I have not seen any support from official staff.

justinmans avatar Mar 21 '23 01:03 justinmans

I encountered this error today. Following the README, including the wait, corrected the issue for me.

# Create the namespace and CRDs, and then wait for them to be available before creating the remaining resources
# Note that due to some CRD size we are using kubectl server-side apply feature which is generally available since kubernetes 1.22.
# If you are using previous kubernetes versions this feature may not be available and you would need to use kubectl create instead.
kubectl apply --server-side -f manifests/setup
kubectl wait \
  --for condition=Established \
  --all CustomResourceDefinition \
  --namespace=monitoring
kubectl apply -f manifests/

tomsherrod avatar Jul 26 '23 20:07 tomsherrod
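Editor's note: one way to make sure the wait step is never skipped is to wrap the three README commands in a function that stops at the first failure. This is only a sketch under stated assumptions: it is run from the root of a kube-prometheus checkout, and both the function name apply_in_order and the KUBECTL override variable are hypothetical, introduced here so the sequence can be dry-run without a cluster.

```shell
# Hypothetical wrapper around the README sequence quoted above.
# KUBECTL is an assumed override (defaults to kubectl); set KUBECTL=echo
# to print the commands instead of running them.
apply_in_order() {
  kubectl_cmd="${KUBECTL:-kubectl}"

  # 1. Create the namespace and CRDs first.
  $kubectl_cmd apply --server-side -f manifests/setup || return 1

  # 2. Block until every CRD reports the Established condition.
  $kubectl_cmd wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring || return 1

  # 3. Only then create the remaining resources.
  $kubectl_cmd apply -f manifests/ || return 1
}

# Dry run:   KUBECTL=echo apply_in_order
# Real run:  apply_in_order
```

Because each step returns early on failure, the remaining manifests are never applied while the CRDs are still unavailable.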