aws-otel-collector

Separate Kubernetes Job metrics

Open tomiszili opened this issue 2 years ago • 5 comments

Hello, is it possible to configure AWS OTel with Container Insights on EKS so that it does not aggregate Jobs together as "one pod", and instead shows the Job pods separately under their unique names?

Environment

Current config:

extensions:
  health_check:

receivers:
  awscontainerinsightreceiver:

processors:
  batch/metrics:
    timeout: 60s

exporters:
  awsemf:
    namespace: ContainerInsights
    log_group_name: '/aws/containerinsights/{ClusterName}/performance'
    log_stream_name: '{NodeName}'
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    parse_json_encoded_attr_values: [Sources, kubernetes]
    metric_declarations:
      # node metrics
      - dimensions: [[NodeName, InstanceId, ClusterName]]
        metric_name_selectors:
          - node_cpu_utilization
          - node_memory_utilization
          # - node_network_total_bytes
          # - node_cpu_reserved_capacity
          # - node_memory_reserved_capacity
          # - node_number_of_running_pods
          # - node_number_of_running_containers
      - dimensions: [[ClusterName]]
        metric_name_selectors:
          - node_cpu_utilization
          - node_memory_utilization
          # - node_network_total_bytes
          # - node_cpu_reserved_capacity
          # - node_memory_reserved_capacity
          # - node_number_of_running_pods
          # - node_number_of_running_containers
          # - node_cpu_usage_total
          # - node_cpu_limit
          # - node_memory_working_set
          # - node_memory_limit

      # pod metrics
      # - dimensions: [[PodName, Namespace, ClusterName], [Service, Namespace, ClusterName], [Namespace, ClusterName], [ClusterName]]
      - dimensions: [[PodName, Namespace, ClusterName], [Service, Namespace, ClusterName]]
        metric_name_selectors:
          - pod_cpu_utilization
          - pod_memory_utilization
          # - pod_network_rx_bytes
          # - pod_network_tx_bytes
          # - pod_cpu_utilization_over_pod_limit
          # - pod_memory_utilization_over_pod_limit
      - dimensions: [[PodName, Namespace, ClusterName], [ClusterName]]
        metric_name_selectors:
          - pod_cpu_reserved_capacity
          - pod_memory_reserved_capacity
      - dimensions: [[PodName, Namespace, ClusterName]]
        metric_name_selectors:
          - pod_number_of_container_restarts

      # # cluster metrics
      # - dimensions: [[ClusterName]]
      #   metric_name_selectors:
      #     - cluster_node_count
      #     - cluster_failed_node_count

      # # service metrics
      # - dimensions: [[Service, Namespace, ClusterName], [ClusterName]]
      #   metric_name_selectors:
      #     - service_number_of_running_pods

      # # node fs metrics
      # - dimensions: [[NodeName, InstanceId, ClusterName], [ClusterName]]
      #   metric_name_selectors:
      #     - node_filesystem_utilization

      # # namespace metrics
      # - dimensions: [[Namespace, ClusterName], [ClusterName]]
      #   metric_name_selectors:
      #     - namespace_number_of_running_pods

service:
  pipelines:
    metrics:
      receivers: [awscontainerinsightreceiver]
      processors: [batch/metrics]
      exporters: [awsemf]

  extensions: [health_check]

tomiszili May 24 '22 13:05

We are seeing this too. We are using the PodName dimension, but it is an aggregate, not individual pods. I would expect individual pods, given that the dimension is named PodName.

lorelei-rupp-imprivata May 27 '22 14:05

Could you try enabling the prefer_full_pod_name field in the awscontainerinsightreceiver?

bryan-aguilar Jun 07 '22 17:06

We tried enabling

    receivers:
      awscontainerinsightreceiver:
        add_full_pod_name_metric_label: true
        prefer_full_pod_name: true

It seemed to get a little better. Some pods come through, apparently those from StatefulSets, but pods named like aws-node-<RANDOMSTRING> do not: all you see in Grafana is "aws-node". I attached a picture to show what I'm seeing. (image)

Similarly, we have a pod named external-secrets-79cff596cd-zv258, but we just see "external-secrets".

lorelei-rupp-imprivata Jun 07 '22 19:06

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] Aug 07 '22 20:08

Any update on this?

lorelei-rupp-imprivata Aug 08 '22 12:08

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] Oct 16 '22 20:10

This is fixed if you use

    receivers:
      awscontainerinsightreceiver:
        add_full_pod_name_metric_label: true
        prefer_full_pod_name: true

as well as FullPodName instead of PodName in your dimensions in the metrics exporter section.
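For readers applying this fix, here is a minimal sketch of how the two changes might fit together. It reuses the receiver settings and the pod metric names from the config posted at the top of this issue. The exact final config was not posted in the thread, so the choice of dimension sets below is an assumption, and only the pod-level declarations are shown.

```yaml
receivers:
  awscontainerinsightreceiver:
    # Both flags come from the comments in this thread.
    add_full_pod_name_metric_label: true
    prefer_full_pod_name: true

exporters:
  awsemf:
    namespace: ContainerInsights
    log_group_name: '/aws/containerinsights/{ClusterName}/performance'
    log_stream_name: '{NodeName}'
    resource_to_telemetry_conversion:
      enabled: true
    dimension_rollup_option: NoDimensionRollup
    parse_json_encoded_attr_values: [Sources, kubernetes]
    metric_declarations:
      # Assumption: FullPodName simply replaces PodName in the
      # pod-level dimension sets from the original config.
      - dimensions: [[FullPodName, Namespace, ClusterName]]
        metric_name_selectors:
          - pod_cpu_utilization
          - pod_memory_utilization
          - pod_number_of_container_restarts
```

With a dimension set like this, each pod (for example external-secrets-79cff596cd-zv258) should appear as its own series rather than being grouped under a shortened name such as "external-secrets".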

lorelei-rupp-imprivata Oct 17 '22 13:10

Thanks for the update @lorelei-rupp-imprivata!

bryan-aguilar Oct 17 '22 14:10