splunk-kubernetes-logging: Liveness probe failed
What happened: The liveness probe continues to fail, but the pod is active and collecting logs.
What you expected to happen: The liveness probe to succeed.
How to reproduce it (as minimally and precisely as possible): Deploy splunk-kubernetes-logging with default values
Anything else we need to know?: See the pod event log:

Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  2m1s  default-scheduler  Successfully assigned default/era-splunk-connect-splunk-kubernetes-logging-66rmc to ip-10-240-10-168.ec2.internal
  Normal   Pulled     118s  kubelet            Container image "docker.io/splunk/fluentd-hec:1.3.0" already present on machine
  Normal   Created    118s  kubelet            Created container splunk-fluentd-k8s-logs
  Normal   Started    118s  kubelet            Started container splunk-fluentd-k8s-logs
  Warning  Unhealthy  1s    kubelet            Liveness probe failed: Get "http://[2607:f220:41a:5022:45d8::4]:24220/api/plugins.json": dial tcp [2607:f220:41a:5022:45d8::4]:24220: connect: connection refused
Environment:
- Kubernetes version (use kubectl version): 1.23
- Ruby version (use ruby --version):
- OS (e.g. cat /etc/os-release): Linux
- Splunk version: 9.0.1 (latest)
- Splunk Connect for Kubernetes helm chart version: 1.5.0 and develop branch
- Others:
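For context, the probe in the event above is the HTTP liveness check that the chart's daemonset points at fluentd's monitoring endpoint on port 24220. A rough sketch of what that probe looks like on the splunk-fluentd-k8s-logs container, with the path and port taken from the event message; the timing fields are illustrative placeholders, not the chart's actual defaults:

livenessProbe:
  httpGet:
    # Path and port come from the failing request in the event log above
    path: /api/plugins.json
    port: 24220
  # Placeholder timing values, not the chart's defaults
  initialDelaySeconds: 30
  periodSeconds: 10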
Getting a similar error:

Events:
  Type     Reason     From               Message
  ----     ------     ----               -------
  Normal   Scheduled  default-scheduler  Successfully assigned splunk/esg-splunk-splunk-kubernetes-metrics-df977 to ip-10---.ap-southeast-2.compute.internal
  Normal   Pulled     kubelet            Container image "docker.io/splunk/k8s-metrics:1.2.0" already present on machine
  Normal   Created    kubelet            Created container splunk-fluentd-k8s-metrics
  Normal   Started    kubelet            Started container splunk-fluentd-k8s-metrics
  Warning  Unhealthy  kubelet            Liveness probe failed: Get "http://...:24220/api/plugins.json": dial tcp ...*:24220: connect: connection refused
I am also having a similar error.
Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Normal   Killing    21m (x30 over 4h13m)    kubelet  Container splunk-fluentd-k8s-logs failed liveness probe, will be restarted
  Warning  Unhealthy  5m49s (x96 over 4h15m)  kubelet  Liveness probe failed: Get "http://198.18.134.62:24220/api/plugins.json": dial tcp 198.18.134.62:24220: connect: connection refused
I have not seen anything related in the container logs (with kubectl logs) and the container is sending logs to the targeted splunk index.
I have not been able to find documentation about what should be behind ":24220/api/plugins.json". Maybe @rockb1017 can share information about this to see if it is relevant to this issue.
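One way to check the endpoint by hand is a throwaway pod that curls the logging pod's IP, which at least tells you whether anything is listening on 24220 inside the cluster. A minimal sketch; the pod name and image are arbitrary choices, and the target IP should be replaced with the logging pod's IP from kubectl get pod -o wide:

apiVersion: v1
kind: Pod
metadata:
  name: monitor-agent-check   # arbitrary name for this one-off check
spec:
  restartPolicy: Never
  containers:
    - name: curl
      image: curlimages/curl:8.5.0   # any image with curl works
      command: ["curl"]
      # Replace the IP with your logging pod's IP; the path matches the liveness probe target
      args: ["-sS", "http://198.18.134.62:24220/api/plugins.json"]

If the request is refused from inside the cluster as well, that points to the monitoring agent's bind address rather than the probe configuration, which matches the fix below.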
I have made the monitoring_agent bind address configurable in https://github.com/splunk/splunk-connect-for-kubernetes/pull/818.
Earlier, it was binding to 0.0.0.0, which only accepts IPv4 requests. If you are using an IPv6 cluster, you can set monitoring_agent_bind_address to :: in the global config:
global:
monitoring_agent_bind_address: "::"
This change should fix the issue.
Got the issue fixed by adding the configuration below:

# The port that is used to get the metrics using apiserver proxy using ssl for the metrics aggregator
kubeletPortAggregator:
# This option is used to get the metrics from summary api on each kubelet using ssl
useRestClientSSL: true
insecureSSL: true
clusterName: <>
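If this is set through the umbrella splunk-connect-for-kubernetes chart rather than in the metrics subchart's own values file, the same keys would normally be nested under the subchart's name, following standard Helm subchart overrides. A minimal sketch reusing the key names from the comment above, with a placeholder cluster name:

splunk-kubernetes-metrics:
  # This option is used to get the metrics from summary api on each kubelet using ssl
  useRestClientSSL: true
  insecureSSL: true
  clusterName: my-cluster   # placeholder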
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.