containers-roadmap
EKS Fargate [Bug]: Containers running in Fargate cannot get their own metrics from the kubelet
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Tell us about your request What do you want us to build?
Containers running in Fargate cannot get their own metrics from the kubelet
Which service(s) is this request for?
This could be Fargate, EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
Currently, calls against https://<Fargate_IP>:10250/metrics/resource fail, whether made with curl or by the metrics server; in both cases the result is a connection refused error. Below is an example from the metrics server.
E0804 18:26:43.486945 1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.165.181:10250/metrics/resource\": dial tcp 192.168.165.181:10250: connect: connection refused" node="fargate-ip-192-168-165-181.ec2.internal"
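For illustration, this is roughly how the failure can be reproduced from inside the cluster (the pod name my-app and namespace default are placeholders, and the container image is assumed to include curl):

# Placeholder pod/namespace; the image is assumed to ship curl.
NODE_IP=$(kubectl get pod my-app -n default -o jsonpath='{.status.hostIP}')
kubectl exec -n default my-app -- curl -sSk "https://${NODE_IP}:10250/metrics/resource"
# On EKS Fargate this fails with something like:
#   curl: (7) Failed to connect to 192.168.165.181 port 10250: Connection refused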
The goal of this feature request/bug report is to allow a Fargate pod to retrieve its own kubelet metrics.
Are you currently working around this issue? How are you currently solving this problem?
I do not see a workaround.
Additional context Anything else we should know?
As far as I can tell, this mainly impacts the metrics-server application. The reasons for this are detailed here, and the issue was previously raised in the metrics-server GitHub repository here. I was unable to find it raised in this repository, so I am filing it here to give it more visibility and make it easier to search for.
Attachments If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)
I appreciate this doesn't answer @pptb-aws's question about a Fargate Pod being able to reach its own kubelet, but a monitoring server (running inside or outside of the cluster) can access all kubelet metrics via the API server:
kubectl get --raw /api/v1/nodes/fargate-ip-10-1-213-127.eu-west-1.compute.internal/proxy/metrics/cadvisor
It's how the EKS Fargate OpenTelemetry Collector blog and the EKS Fargate Prometheus blog work.
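The same proxied endpoint can also be called from inside a pod using the mounted service account credentials. A minimal sketch, assuming the service account is bound to a role that allows get on nodes/proxy (the node name is the example one above):

TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
NODE=fargate-ip-10-1-213-127.eu-west-1.compute.internal
# Call the kubelet's cAdvisor metrics through the API server proxy.
curl -s --cacert "${CACERT}" -H "Authorization: Bearer ${TOKEN}" \
  "https://kubernetes.default.svc/api/v1/nodes/${NODE}/proxy/metrics/cadvisor"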
A snippet from the OpenTelemetry Collector Config:
scrape_configs:
  - job_name: 'kubelets-cadvisor-metrics'
    sample_limit: 10000
    scheme: https
    kubernetes_sd_configs:
      - role: node
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      # Only for Kubernetes ^1.7.3.
      # See: https://github.com/prometheus/prometheus/issues/2916
      - target_label: __address__
        # Changes the address to the Kube API server's default address and port
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        # Changes the default metrics path to the kubelet's proxied cAdvisor metrics endpoint
        replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor
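Scraping through the API server proxy this way also needs RBAC on the collector's service account. A rough sketch (the names kubelet-proxy-reader and otel-collector, and the default namespace, are placeholders, not taken from the blogs):

# Placeholder role/service-account names; adjust to your deployment.
kubectl create clusterrole kubelet-proxy-reader \
  --verb=get,list,watch \
  --resource=nodes,nodes/proxy
kubectl create clusterrolebinding kubelet-proxy-reader \
  --clusterrole=kubelet-proxy-reader \
  --serviceaccount=default:otel-collector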
facing the same issue