
EKS Fargate [Bug]: Containers running in Fargate cannot get their own metrics from the kubelet

Open pptb-aws opened this issue 2 years ago • 4 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request What do you want us to build? Containers running in Fargate cannot get their own metrics from the kubelet

Which service(s) is this request for? This could be Fargate, EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.

Currently, calls to https://<Fargate_IP>:10250/metrics/resource fail whether they are made with curl or by the metrics server; in both cases the result is a connection refused error. Below is an example from the metrics server.

E0804 18:26:43.486945       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.165.181:10250/metrics/resource\": dial tcp 192.168.165.181:10250: connect: connection refused" node="fargate-ip-192-168-165-181.ec2.internal"
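
For reference, the curl failure can be reproduced from inside a Fargate pod roughly like this (a sketch: it assumes curl and a mounted service account token are available, and that hostname -i returns the pod IP, which on Fargate is also the node IP):

# Sketch: reproduce the failure from inside a Fargate pod.
# On Fargate the pod and its node share an address, so this targets the pod's own kubelet port.
NODE_IP=$(hostname -i)   # assumed to resolve to the pod/node IP
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sk -H "Authorization: Bearer ${TOKEN}" "https://${NODE_IP}:10250/metrics/resource"
# Currently fails with a "connection refused" error.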

The goal of this feature request/bug report is to allow a Fargate pod to read its own kubelet metrics.

Are you currently working around this issue? How are you currently solving this problem? I do not see a workaround.

Additional context Anything else we should know?

This mainly impacts the metrics-server application, as far as I can tell. The reasons for this are detailed here, and the issue was previously raised with the metrics-server project on GitHub here. I was unable to find it raised in this repository, so I am filing it here to give it more visibility and make it easier to search for.

Attachments If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

pptb-aws avatar Aug 04 '22 18:08 pptb-aws

I appreciate this doesn't answer @pptb-aws's question about a Fargate Pod being able to reach its own kubelet, but a monitoring server (running inside or outside of the cluster) can access all kubelet metrics via the API server:

kubectl get --raw /api/v1/nodes/fargate-ip-10-1-213-127.eu-west-1.compute.internal/proxy/metrics/cadvisor
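
The same node proxy exposes the kubelet's other metrics paths as well; for example, the resource metrics endpoint that metrics-server normally scrapes directly on port 10250 should be reachable like this (an untested sketch, using the same illustrative node name):

kubectl get --raw /api/v1/nodes/fargate-ip-10-1-213-127.eu-west-1.compute.internal/proxy/metrics/resource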

It's how the EKS Fargate OpenTelemetry Collector blog and the EKS Fargate Prometheus blog work.

A snippet from the OpenTelemetry Collector Config:

scrape_configs:
- job_name: 'kubelets-cadvisor-metrics'
  sample_limit: 10000
  scheme: https

  kubernetes_sd_configs:
  - role: node
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
      # Only for Kubernetes ^1.7.3.
      # See: https://github.com/prometheus/prometheus/issues/2916
    - target_label: __address__
      # Changes the address to Kube API server's default address and port
      replacement: kubernetes.default.svc:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      # Changes the default metrics path to the kubelet's cadvisor metrics endpoint (exposed via the API server proxy)
      replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor
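
One thing the snippet doesn't show (and the exact manifests differ between the blogs): the collector's service account needs RBAC that permits node discovery and access to the nodes/proxy subresource. A rough sketch with illustrative names:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubelet-proxy-reader            # hypothetical name
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy"]   # node discovery + API server proxy to the kubelet
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubelet-proxy-reader            # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubelet-proxy-reader
subjects:
- kind: ServiceAccount
  name: otel-collector                  # hypothetical; use your collector's service account
  namespace: monitoring                 # hypothetical namespace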

ollypom avatar Aug 11 '22 14:08 ollypom

facing the same issue

DilipCoder avatar May 07 '24 21:05 DilipCoder