
Unable to extract `k8s.container.name` and use as a Resource Attribute

Open SeamusGrafana opened this issue 1 year ago • 12 comments

Component(s)

processor/k8sattributes

What happened?

Description

I am unable to extract k8s.container.name and use it as a Resource Attribute in my telemetry data.

Steps to Reproduce

1. Deploy the OpenTelemetry Collector on Kubernetes (I am using GCP)
2. Deploy the OpenTelemetry Demo
3. Set k8s.container.name and container.id to be extracted in the k8sattributes processor
4. Observe that k8s.container.name is not present as a resource attribute

Expected Result

The k8s.container.name should be a Resource Attribute

Actual Result

The k8s.container.name is not available as a Resource Attribute

Collector version

0.93.0

Environment information

Environment

Kubernetes on GCP, 1.27.8-gke.1067004

OpenTelemetry Collector configuration

    exporters:
      debug: {}
      # debug:
      #   verbosity: detailed
      #   sampling_initial: 10
      #   sampling_thereafter: 200
      otlphttp:
        auth:
          authenticator: basicauth
        endpoint: ""
    extensions:
      basicauth:
        client_auth:
          password: ""
          username: ""
      health_check: {}
      memory_ballast:
        size_in_percentage: 40
    processors:
      batch: {}
      filter/ottl:
        error_mode: ignore
        metrics:
          metric:
          - name == "rpc.server.duration"
      attributes:
        actions:
          - action: insert
            key: cluster
            value: supportlab-dev-gcp
          - action: insert
            key: source
            value: otelcol-opentelemetry-demo
      attributes/fix_upstream_attribute:
        actions:
          - key: upstream.cluster
            from_attribute: upstream_cluster
            action: upsert
      k8sattributes:
        extract:
          metadata:
          - container.id
          # - container.image.name
          # - container.image.tag
          - k8s.namespace.name
          - k8s.deployment.name
          - k8s.statefulset.name
          - k8s.daemonset.name
          - k8s.cronjob.name
          - k8s.job.name
          - k8s.node.name
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.pod.start_time
          - k8s.container.name
        filter:
          node_from_env_var: K8S_NODE_NAME
        passthrough: false
        pod_association:
        - sources:
          - from: resource_attribute
            name: k8s.pod.ip
        - sources:
          - from: resource_attribute
            name: k8s.pod.uid
        - sources:
          - from: connection
      memory_limiter:
        check_interval: 5s
        limit_percentage: 80
        spike_limit_percentage: 25
      resource:
        attributes:
        - action: insert
          from_attribute: k8s.pod.uid
          key: service.instance.id
        - action: insert
          key: k8s.cluster.name
          value: supportlab-dev-gcp
        - action: insert
          key: service.namespace
          from_attribute: k8s.namespace.name
        - action: insert
          key: deployment.environment
          value: production
      resourcedetection:
        detectors: ["env", "system"]
        override: false
        system:
          hostname_sources: ["os"]
      resourcedetection/gcp:
        detectors: ["env", "gcp"]
        override: false
      transform:
        metric_statements:
        - context: datapoint
          statements:
          - set(attributes["cluster"], resource.attributes["k8s.cluster.name"])
          - set(attributes["container"], resource.attributes["k8s.container.name"])
          - set(attributes["deployment.environment"], resource.attributes["deployment.environment"])
          - set(attributes["deployment.name"], resource.attributes["k8s.deployment.name"])
          - set(attributes["pod"], resource.attributes["k8s.pod.name"])
          - set(attributes["service.namespace"], resource.attributes["k8s.namespace.name"])
          - set(attributes["service.version"], resource.attributes["service.version"])
    receivers:
      hostmetrics:
        collection_interval: 1m
        root_path: /
        scrapers:
          cpu: {}
          load:
            cpu_average: true
          memory: {}
          disk: {}
          filesystem: {}
          network: {}
          paging: {}
          process: {}
          processes: {}
      otlp:
        protocols:
          grpc:
            endpoint: $${env:MY_POD_IP}:4317
          http:
            endpoint: $${env:MY_POD_IP}:4318
      prometheus:
        config:
          scrape_configs:
          - job_name: opentelemetry-collector
            scrape_interval: 10s
            static_configs:
            - targets:
              - $${env:MY_POD_IP}:8888
    service:
      extensions:
      - basicauth
      - health_check
      - memory_ballast
      pipelines:
        logs:
          exporters:
          # - debug
          - otlphttp
          processors:
          - k8sattributes
          - resourcedetection/gcp
          - memory_limiter
          - resource
          - batch
          receivers:
          - otlp
        metrics:
          exporters:
          # - debug
          - otlphttp
          processors:
          - attributes
          - k8sattributes
          - filter/ottl
          - resource
          - resourcedetection
          - transform
          - memory_limiter
          - batch
          receivers:
          - hostmetrics
          - otlp
          - prometheus
        traces:
          exporters:
          # - debug
          - otlphttp
          processors:
          - k8sattributes
          - memory_limiter
          - resource
          - resourcedetection/gcp
          - batch
          receivers:
          - otlp
      telemetry:
        metrics:
          address: $${env:MY_POD_IP}:8888
          level: detailed
        logs:
          level: "debug"

Log output

2024-02-29T09:40:46.827Z	debug	[email protected]/processor.go:140	getting the pod	{"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "pod": {"Name":"opentelemetry-demo-adservice-8674c5c-sw2n7","Address":"10.40.0.231","PodUID":"bb3383d3-cd60-4238-afe4-699bec00c5b3","Attributes":{"k8s.deployment.name":"opentelemetry-demo-adservice","k8s.namespace.name":"opentelemetry-demo","k8s.node.name":"gke-supportlab-dev-supportlab-dev-492967d6-lnjn","k8s.pod.name":"opentelemetry-demo-adservice-8674c5c-sw2n7","k8s.pod.start_time":"2024-02-29T09:36:54Z","k8s.pod.uid":"bb3383d3-cd60-4238-afe4-699bec00c5b3"},"StartTime":"2024-02-29T09:36:54Z","Ignore":false,"Namespace":"opentelemetry-demo","NodeName":"gke-supportlab-dev-supportlab-dev-492967d6-lnjn","HostNetwork":false,"Containers":{"ByID":{"d27e0b01ff6c7c5b5ad2c751c8d49f117190bd9a157c252e7e9098596a7cae2c":{"Name":"adservice","ImageName":"ghcr.io/open-telemetry/demo","ImageTag":"1.8.0-adservice","Statuses":{"0":{"ContainerID":"d27e0b01ff6c7c5b5ad2c751c8d49f117190bd9a157c252e7e9098596a7cae2c"}}}},"ByName":{"adservice":{"Name":"adservice","ImageName":"ghcr.io/open-telemetry/demo","ImageTag":"1.8.0-adservice","Statuses":{"0":{"ContainerID":"d27e0b01ff6c7c5b5ad2c751c8d49f117190bd9a157c252e7e9098596a7cae2c"}}}}},"DeletedAt":"0001-01-01T00:00:00Z"}}


     -> container.id: Str(8d936168d137ef81c741e0807de0986a5fba395e1cd1302692ba9db7ad822959)
     -> host.arch: Str(amd64)
     -> host.name: Str(opentelemetry-demo-adservice-8674c5c-q5422)
     -> os.description: Str(Linux 5.15.133+)
     -> os.type: Str(linux)
     -> process.command_line: Str(/opt/java/openjdk/bin/java -javaagent:/usr/src/app/opentelemetry-javaagent.jar oteldemo.AdService)
     -> process.executable.path: Str(/opt/java/openjdk/bin/java)
     -> process.pid: Int(1)
     -> process.runtime.description: Str(Eclipse Adoptium OpenJDK 64-Bit Server VM 21.0.2+13-LTS)
     -> process.runtime.name: Str(OpenJDK Runtime Environment)
     -> process.runtime.version: Str(21.0.2+13-LTS)
     -> service.name: Str(adservice)
     -> service.namespace: Str(opentelemetry-demo)
     -> telemetry.distro.name: Str(opentelemetry-java-instrumentation)
     -> telemetry.distro.version: Str(2.0.0)
     -> telemetry.sdk.language: Str(java)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(1.34.1)
     -> k8s.pod.ip: Str(10.40.0.81)
     -> k8s.node.name: Str(gke-supportlab-dev-supportlab-dev-492967d6-lnjn)
     -> k8s.pod.name: Str(opentelemetry-demo-adservice-8674c5c-q5422)
     -> k8s.namespace.name: Str(opentelemetry-demo)
     -> k8s.pod.start_time: Str(2024-02-29T10:04:15Z)
     -> k8s.pod.uid: Str(fc6d9ae0-bbb6-4c67-baf3-31689857cfd0)
     -> k8s.deployment.name: Str(opentelemetry-demo-adservice)
     -> service.instance.id: Str(fc6d9ae0-bbb6-4c67-baf3-31689857cfd0)
     -> k8s.cluster.name: Str(supportlab-dev-gcp)
     -> deployment.environment: Str(production)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope io.opentelemetry.sdk.logs
Metric #0
Descriptor:
     -> Name: queueSize
     -> Description:
     -> Unit: 1
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> processorType: Str(BatchLogRecordProcessor)
     -> cluster: Str(supportlab-dev-gcp)
     -> source: Str(otelcol-opentelemetry-demo)
     -> deployment.environment: Str(production)
     -> deployment.name: Str(opentelemetry-demo-adservice)
     -> pod: Str(opentelemetry-demo-adservice-8674c5c-q5422)
     -> service.namespace: Str(opentelemetry-demo)
StartTimestamp: 2024-02-29 10:04:46.278470849 +0000 UTC
Timestamp: 2024-02-29 10:42:46.336978757 +0000 UTC
Value: 0
Metric #1
Descriptor:

Additional context

I don't see the attribute when using the debug exporter.

I can see in the logs that it's able to extract the container ID and container name.

SeamusGrafana avatar Feb 29 '24 11:02 SeamusGrafana

Pinging code owners:

  • processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Feb 29 '24 11:02 github-actions[bot]

In order to extract container.id the data must contain the resource attribute k8s.container.restart_count: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/k8sattributesprocessor#configuration. Does your data contain that resource attribute?

TylerHelmuth avatar Mar 12 '24 17:03 TylerHelmuth

Hello @TylerHelmuth

Thanks for that.

I can see that container.id is already available as a resource attribute in my data (it can be seen above too). The documentation mentions k8s.container.restart_count, but it says nothing about k8s.container.name, which is the attribute I'm missing.

However, I added it to my config as a test, and it caused an error:

Error: invalid configuration: processors::k8sattributes: "k8s.container.restart_count" is not a supported metadata field

Is there something specific I need to do for this, or somewhere else I should add it?

Thanks

SeamusGrafana avatar Mar 14 '24 17:03 SeamusGrafana

I got mixed up on what you wanted to extract.

In order to extract k8s.container.name, the resource attributes must already contain container.id when the data arrives at the processor. Is that true for your data, or is the k8sattributes processor adding it?

TylerHelmuth avatar Mar 14 '24 18:03 TylerHelmuth
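The dependency described above can be sketched as a config fragment (illustrative only; the comments summarize the behaviour as described in this thread and the processor README):

```yaml
processors:
  k8sattributes:
    extract:
      metadata:
        # Resolved once the pod association succeeds:
        - k8s.pod.name
        # Container-level fields are only resolved when the incoming
        # resource already carries a container identity (container.id
        # or k8s.container.name) that matches a container in the
        # associated pod:
        - k8s.container.name
        - container.id
```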

Thanks @TylerHelmuth

From what I can see I believe it is, but I'm not 100% certain.

How can I verify this, please? Thanks

SeamusGrafana avatar Mar 14 '24 18:03 SeamusGrafana

Remove the k8sattributes processor, then add a debug exporter with verbosity: detailed.

TylerHelmuth avatar Mar 14 '24 18:03 TylerHelmuth
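A minimal sketch of that verification setup (the receiver and pipeline names are carried over from the configuration above):

```yaml
exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [otlp]
      # k8sattributes removed for this test, so the debug output shows
      # the resource attributes exactly as they arrive at the Collector
      processors: [batch]
      exporters: [debug]
```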

Hello @TylerHelmuth

Thanks for that. Yes, they do have a container.id attribute:

[screenshot]

SeamusGrafana avatar Mar 15 '24 11:03 SeamusGrafana

If you inspect your pod does that container.id match the real container id?

TylerHelmuth avatar Mar 15 '24 13:03 TylerHelmuth

Thanks @TylerHelmuth

No, the debug exporter shows an ID that I can't correlate with anything:

[screenshot]

But the k8sattributes processor does show the correct ContainerID value:

2024-03-15T16:37:24.921Z	debug	[email protected]/processor.go:140	getting the pod	{"kind": "processor", "name": "k8sattributes", "pipeline": "metrics", "pod": {"Name":"opentelemetry-demo-adservice-8674c5c-bbnkc","Address":"10.40.0.148","PodUID":"ae11eb0b-08ae-4e0e-bd90-d90000033f02","Attributes":{"k8s.deployment.name":"opentelemetry-demo-adservice","k8s.namespace.name":"opentelemetry-demo","k8s.node.name":"gke-REDACTED-cvbc","k8s.pod.name":"opentelemetry-demo-adservice-8674c5c-bbnkc","k8s.pod.start_time":"2024-03-15T15:51:54Z","k8s.pod.uid":"ae11eb0b-08ae-4e0e-bd90-d90000033f02"},"StartTime":"2024-03-15T15:51:54Z","Ignore":false,"Namespace":"opentelemetry-demo","NodeName":"gke-REDACTED-cvbc","HostNetwork":false,"Containers":{"ByID":{"c684a7f643dc3cd2a1816ffb40ba8591e0c106a0ea038e46d79c5ffc0f55de23":{"Name":"adservice","ImageName":"","ImageTag":"","Statuses":{"0":{"ContainerID":"c684a7f643dc3cd2a1816ffb40ba8591e0c106a0ea038e46d79c5ffc0f55de23"}}}},"ByName":{"adservice":{"Name":"adservice","ImageName":"","ImageTag":"","Statuses":{"0":{"ContainerID":"c684a7f643dc3cd2a1816ffb40ba8591e0c106a0ea038e46d79c5ffc0f55de23"}}}}},"DeletedAt":"0001-01-01T00:00:00Z"}}

[screenshot]

SeamusGrafana avatar Mar 15 '24 16:03 SeamusGrafana

It looks like the incoming metrics/logs have a container.id value which does not match any of the actual containers in the pod in question. The k8sattributes processor can't look up the container name, since it finds no match in the pod object for the container.id resource attribute. I'd suggest investigating where that container.id comes from in the instrumented service, and why it does not reflect the ID of the running container.

jinja2 avatar Mar 15 '24 18:03 jinja2
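If the SDK-reported container.id can't be fixed at the source, one possible workaround (a sketch based on the processor README, not something confirmed in this thread) is to set k8s.container.name directly on the workload via the standard OpenTelemetry SDK environment variable, so the attribute is present on the data before it ever reaches the processor and any further container enrichment can match by name rather than by the mismatched ID:

```yaml
# Pod-spec fragment for the workload's container (value is hypothetical).
# OTEL_RESOURCE_ATTRIBUTES is the standard OpenTelemetry SDK variable
# for injecting extra resource attributes at startup.
env:
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "k8s.container.name=adservice"
```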

Thanks.

The service is the AdService from the OpenTelemetry Demo. The logs show that the k8sattributes processor does have the right container ID; if that's the case, shouldn't it be able to add the container name attribute, or does it still defer to the container.id on the metric?

SeamusGrafana avatar Mar 21 '24 15:03 SeamusGrafana

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar May 21 '24 03:05 github-actions[bot]

This issue has been closed as inactive because it has been stale for 120 days with no activity.

github-actions[bot] avatar Jul 20 '24 05:07 github-actions[bot]