containers-roadmap
[EKS] [request]: Add kube-dns service port for metrics
Tell us about your request
The Kubernetes Service resource for kube-dns does not expose the metrics port for Prometheus, even though the port is defined on the Pod.
Service
```yaml
kind: Service
apiVersion: v1
metadata:
  name: kube-dns
  # omitted information here
  annotations:
    prometheus.io/port: '9153'
    prometheus.io/scrape: 'true'
spec:
  ports:
    - name: dns
      protocol: UDP
      port: 53
      targetPort: 53
    - name: dns-tcp
      protocol: TCP
      port: 53
      targetPort: 53
  # omitted information here
```
Pod
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: coredns-5fd8748bdd-dpzts
  # omitted information here
spec:
  containers:
    - name: coredns
      image: '602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/coredns:v1.6.6'
      args:
        - '-conf'
        - /etc/coredns/Corefile
      ports:
        - name: dns
          containerPort: 53
          protocol: UDP
        - name: dns-tcp
          containerPort: 53
          protocol: TCP
        - name: metrics
          containerPort: 9153
          protocol: TCP
      # omitted information here
```
Which service(s) is this request for? EKS
- 1.14.9
- 1.15.11
- 1.16.8
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I'm trying to use a prometheus-operator ServiceMonitor to get the metrics
ServiceMonitor
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-dns
  labels:
    k8s-app: kube-dns
    prometheus-visibility.monitoring: "enabled"
spec:
  jobLabel: kube-dns
  namespaceSelector:
    matchNames:
      - kube-system
  selector:
    matchLabels:
      k8s-app: kube-dns
  endpoints:
    - port: metrics
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      interval: 15s
```
and it doesn't work because the metrics port is not exposed through the Service.
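For context, prometheus-operator resolves a ServiceMonitor against the Endpoints object generated from the Service, and that object only carries the ports the Service declares, so there is no endpoint port named `metrics` to select. A rough sketch of what the generated Endpoints looks like, assuming the Service shown above (the pod IP is a placeholder):

```yaml
# Illustrative only: Endpoints mirror the Service's declared ports
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-dns
  namespace: kube-system
subsets:
  - addresses:
      - ip: 10.0.0.10        # placeholder CoreDNS pod IP
    ports:
      - name: dns
        port: 53
        protocol: UDP
      - name: dns-tcp
        port: 53
        protocol: TCP
      # no "metrics" port here, because the Service does not declare one
```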
Are you currently working around this issue?
I manually added the metrics port to the Service.
Service with the metrics port
```yaml
kind: Service
apiVersion: v1
metadata:
  name: kube-dns
  # omitted information here
  annotations:
    prometheus.io/port: '9153'
    prometheus.io/scrape: 'true'
spec:
  ports:
    - name: dns
      protocol: UDP
      port: 53
      targetPort: 53
    - name: dns-tcp
      protocol: TCP
      port: 53
      targetPort: 53
    - name: metrics
      protocol: TCP
      port: 9153
      targetPort: 9153
  # omitted information here
```
Additional context
Not relevant, but I'm using the kube-visibility project to monitor the cluster.
An alternative to a ServiceMonitor is a PodMonitor.
We tend to go with PodMonitors by default now, since not all apps take inbound traffic; it's simpler than adding a Service (and ServiceMonitor) only for Prometheus discovery.
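For illustration, a minimal PodMonitor for CoreDNS might look like the sketch below. This assumes prometheus-operator is installed and that its Prometheus resource is configured to select PodMonitors in kube-system; the name and interval are arbitrary:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kube-dns              # illustrative name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns       # matches the CoreDNS pods directly
  podMetricsEndpoints:
    - port: metrics           # the named container port from the Pod spec
      interval: 30s
```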
Any news on this?
> An alternative to a ServiceMonitor is a PodMonitor. We tend to go with PodMonitors by default now as not all apps take inbound traffic, thus simpler than adding a service (and service monitor) for only prom discovery.
As per my understanding, in order to use a PodMonitor one needs to use prometheus-operator. What if someone does not use it? In my case, I install my Prometheus and Grafana charts separately.
As of now, I am using this one-liner as a workaround:

```sh
kubectl -n kube-system patch svc kube-dns -p '{"spec": {"ports": [{"port": 9153,"targetPort": 9153,"name": "metrics"}]}}'
```
If I could find where the core-dns.yaml (or similar) lives, presumably on GitHub, I would raise a PR to add the metrics port and try to progress this.
We ended up exporting the resources and maintaining our own YAML file in Git for updating CoreDNS, but it would still be nice to confirm what AWS recommends/installs as part of the add-ons.
We'd also love it if this gets implemented. We use an external Prometheus (on EC2) to monitor EKS and rely on service port 9153 to gather metrics. For now, we have created a separate kube-dns-metrics Service just for this.
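A dedicated metrics Service like that could look roughly like the following sketch (the name and selector are assumptions based on the standard kube-dns labels, not the commenter's actual manifest):

```yaml
kind: Service
apiVersion: v1
metadata:
  name: kube-dns-metrics      # illustrative name
  namespace: kube-system
spec:
  selector:
    k8s-app: kube-dns         # standard CoreDNS/kube-dns pod label
  ports:
    - name: metrics
      protocol: TCP
      port: 9153
      targetPort: 9153
```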
Any update on this? This is clearly a bug since the port is mentioned in the annotation but not available in the list of ports. I have to manually patch this in all the clusters and monitor that it stays that way.
Same here. Manually patching out-of-the-box solutions for everyone doesn't look right 😞 it's like trying to catch the train and duct-tape it.
It's incredible that in 2022 there is still this misconfiguration.
For reference, this is still an issue in AWS EKS 1.23.
And EKS 1.26
Can someone share a working PodMonitor to scrape core-dns?
I used this one and it's not working:
```yaml
---
apiVersion: monitoring.coreos.com/v1  # The API version of the custom resource.
kind: PodMonitor                      # The kind of custom resource.
metadata:
  name: coredns-monitor
  labels:
    app: grafana-agent
spec:
  jobLabel: coredns-stats
  selector:
    matchLabels:
      k8s-app: kube-dns               # The label selector to match Pods to scrape.
  podMetricsEndpoints:
    # A list of endpoints to scrape from matching Pods.
    - port: metrics                   # The name of the port to scrape from each Pod.
      interval: 30s                   # How frequently to scrape this endpoint.
```
This works for core-dns:

```yaml
- job_name: 'kube-dns'
  scheme: http
  kubernetes_sd_configs:
    - role: pod
      selectors:
        - role: "pod"
          label: "k8s-app=kube-dns"
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: keep
      regex: coredns
    - source_labels: [__address__]
      action: replace
      regex: ([^:]+)(?::\d+)?
      replacement: $1:9153
      target_label: __address__
  metric_relabel_configs:
    - action: labeldrop
      regex: id|name|image|boot_id|machine_id|system_uuid
    - action: keep
      regex: up|coredns_dns_responses_total|coredns_forward_request_duration_seconds_bucket|coredns_forward_request_duration_seconds_count|coredns_forward_responses_total|coredns_cache_requests_total|coredns_cache_misses_total|coredns_cache_hits_total|coredns_dns_request_duration_seconds.*|coredns_dns_requests_total|coredns_reload_failed_total
      source_labels:
        - __name__
```
Still an issue. This should be fixed...
This is a literal BUG.
The following tells Prometheus to scrape this target and to use the specified port, but there is no corresponding ports entry on the Service:
```
...
Annotations:
  prometheus.io/port: 9153
  prometheus.io/scrape: true
...
```
Edit: I created an issue with a "better" title to highlight this being a bug. Hopefully that helps...
This is now fixed in coredns v1.10.1-eksbuild.5, v1.9.3-eksbuild.9 and v1.8.7-eksbuild.8.