opentelemetry-collector-contrib
Unable to extract `k8s.container.name` and use as a Resource Attribute
Component(s)
processor/k8sattributes
What happened?
Description
I am unable to extract k8s.container.name and use it as a Resource Attribute in my telemetry data
Steps to Reproduce
Deploy OpenTelemetry Collector on Kubernetes (I am using GCP)
Deploy the OpenTelemetry Demo
Configure the k8sattributes processor to extract k8s.container.name and container.id
Observe that k8s.container.name is not present as a resource attribute
Expected Result
The k8s.container.name should be a Resource Attribute
Actual Result
The k8s.container.name is not available as a Resource Attribute
Collector version
0.93.0
Environment information
Environment
Kubernetes on GCP, 1.27.8-gke.1067004
OpenTelemetry Collector configuration
exporters:
  debug: {}
  # debug:
  #   verbosity: detailed
  #   sampling_initial: 10
  #   sampling_thereafter: 200
  otlphttp:
    auth:
      authenticator: basicauth
    endpoint: ""
extensions:
  basicauth:
    client_auth:
      password: ""
      username: ""
  health_check: {}
  memory_ballast:
    size_in_percentage: 40
processors:
  batch: {}
  filter/ottl:
    error_mode: ignore
    metrics:
      metric:
        - name == "rpc.server.duration"
  attributes:
    actions:
      - action: insert
        key: cluster
        value: supportlab-dev-gcp
      - action: insert
        key: source
        value: otelcol-opentelemetry-demo
  attributes/fix_upstream_attribute:
    actions:
      - key: upstream.cluster
        from_attribute: upstream_cluster
        action: upsert
  k8sattributes:
    extract:
      metadata:
        - container.id
        # - container.image.name
        # - container.image.tag
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.cronjob.name
        - k8s.job.name
        - k8s.node.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
        - k8s.container.name
    filter:
      node_from_env_var: K8S_NODE_NAME
    passthrough: false
    pod_association:
      - sources:
          - from: resource_attribute
            name: k8s.pod.ip
      - sources:
          - from: resource_attribute
            name: k8s.pod.uid
      - sources:
          - from: connection
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 25
  resource:
    attributes:
      - action: insert
        from_attribute: k8s.pod.uid
        key: service.instance.id
      - action: insert
        key: k8s.cluster.name
        value: supportlab-dev-gcp
      - action: insert
        key: service.namespace
        from_attribute: k8s.namespace.name
      - action: insert
        key: deployment.environment
        value: production
  resourcedetection:
    detectors: ["env", "system"]
    override: false
    system:
      hostname_sources: ["os"]
  resourcedetection/gcp:
    detectors: ["env", "gcp"]
    override: false
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["cluster"], resource.attributes["k8s.cluster.name"])
          - set(attributes["container"], resource.attributes["k8s.container.name"])
          - set(attributes["deployment.environment"], resource.attributes["deployment.environment"])
          - set(attributes["deployment.name"], resource.attributes["k8s.deployment.name"])
          - set(attributes["pod"], resource.attributes["k8s.pod.name"])
          - set(attributes["service.namespace"], resource.attributes["k8s.namespace.name"])
          - set(attributes["service.version"], resource.attributes["service.version"])
receivers:
  hostmetrics:
    collection_interval: 1m
    root_path: /
    scrapers:
      cpu: {}
      load:
        cpu_average: true
      memory: {}
      disk: {}
      filesystem: {}
      network: {}
      paging: {}
      process: {}
      processes: {}
  otlp:
    protocols:
      grpc:
        endpoint: $${env:MY_POD_IP}:4317
      http:
        endpoint: $${env:MY_POD_IP}:4318
  prometheus:
    config:
      scrape_configs:
        - job_name: opentelemetry-collector
          scrape_interval: 10s
          static_configs:
            - targets:
                - $${env:MY_POD_IP}:8888
service:
  extensions:
    - basicauth
    - health_check
    - memory_ballast
  pipelines:
    logs:
      exporters:
        # - debug
        - otlphttp
      processors:
        - k8sattributes
        - resourcedetection/gcp
        - memory_limiter
        - resource
        - batch
      receivers:
        - otlp
    metrics:
      exporters:
        # - debug
        - otlphttp
      processors:
        - attributes
        - k8sattributes
        - filter/ottl
        - resource
        - resourcedetection
        - transform
        - memory_limiter
        - batch
      receivers:
        - hostmetrics
        - otlp
        - prometheus
    traces:
      exporters:
        # - debug
        - otlphttp
      processors:
        - k8sattributes
        - memory_limiter
        - resource
        - resourcedetection/gcp
        - batch
      receivers:
        - otlp
  telemetry:
    metrics:
      address: $${env:MY_POD_IP}:8888
      level: detailed
    logs:
      level: "debug"
Log output
2024-02-29T09:40:46.827Z debug [email protected]/processor.go:140 getting the pod {"kind": "processor", "name": "k8sattributes", "pipeline": "traces", "pod": {"Name":"opentelemetry-demo-adservice-8674c5c-sw2n7","Address":"10.40.0.231","PodUID":"bb3383d3-cd60-4238-afe4-699bec00c5b3","Attributes":{"k8s.deployment.name":"opentelemetry-demo-adservice","k8s.namespace.name":"opentelemetry-demo","k8s.node.name":"gke-supportlab-dev-supportlab-dev-492967d6-lnjn","k8s.pod.name":"opentelemetry-demo-adservice-8674c5c-sw2n7","k8s.pod.start_time":"2024-02-29T09:36:54Z","k8s.pod.uid":"bb3383d3-cd60-4238-afe4-699bec00c5b3"},"StartTime":"2024-02-29T09:36:54Z","Ignore":false,"Namespace":"opentelemetry-demo","NodeName":"gke-supportlab-dev-supportlab-dev-492967d6-lnjn","HostNetwork":false,"Containers":{"ByID":{"d27e0b01ff6c7c5b5ad2c751c8d49f117190bd9a157c252e7e9098596a7cae2c":{"Name":"adservice","ImageName":"ghcr.io/open-telemetry/demo","ImageTag":"1.8.0-adservice","Statuses":{"0":{"ContainerID":"d27e0b01ff6c7c5b5ad2c751c8d49f117190bd9a157c252e7e9098596a7cae2c"}}}},"ByName":{"adservice":{"Name":"adservice","ImageName":"ghcr.io/open-telemetry/demo","ImageTag":"1.8.0-adservice","Statuses":{"0":{"ContainerID":"d27e0b01ff6c7c5b5ad2c751c8d49f117190bd9a157c252e7e9098596a7cae2c"}}}}},"DeletedAt":"0001-01-01T00:00:00Z"}}
    -> container.id: Str(8d936168d137ef81c741e0807de0986a5fba395e1cd1302692ba9db7ad822959)
    -> host.arch: Str(amd64)
    -> host.name: Str(opentelemetry-demo-adservice-8674c5c-q5422)
    -> os.description: Str(Linux 5.15.133+)
    -> os.type: Str(linux)
    -> process.command_line: Str(/opt/java/openjdk/bin/java -javaagent:/usr/src/app/opentelemetry-javaagent.jar oteldemo.AdService)
    -> process.executable.path: Str(/opt/java/openjdk/bin/java)
    -> process.pid: Int(1)
    -> process.runtime.description: Str(Eclipse Adoptium OpenJDK 64-Bit Server VM 21.0.2+13-LTS)
    -> process.runtime.name: Str(OpenJDK Runtime Environment)
    -> process.runtime.version: Str(21.0.2+13-LTS)
    -> service.name: Str(adservice)
    -> service.namespace: Str(opentelemetry-demo)
    -> telemetry.distro.name: Str(opentelemetry-java-instrumentation)
    -> telemetry.distro.version: Str(2.0.0)
    -> telemetry.sdk.language: Str(java)
    -> telemetry.sdk.name: Str(opentelemetry)
    -> telemetry.sdk.version: Str(1.34.1)
    -> k8s.pod.ip: Str(10.40.0.81)
    -> k8s.node.name: Str(gke-supportlab-dev-supportlab-dev-492967d6-lnjn)
    -> k8s.pod.name: Str(opentelemetry-demo-adservice-8674c5c-q5422)
    -> k8s.namespace.name: Str(opentelemetry-demo)
    -> k8s.pod.start_time: Str(2024-02-29T10:04:15Z)
    -> k8s.pod.uid: Str(fc6d9ae0-bbb6-4c67-baf3-31689857cfd0)
    -> k8s.deployment.name: Str(opentelemetry-demo-adservice)
    -> service.instance.id: Str(fc6d9ae0-bbb6-4c67-baf3-31689857cfd0)
    -> k8s.cluster.name: Str(supportlab-dev-gcp)
    -> deployment.environment: Str(production)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope io.opentelemetry.sdk.logs
Metric #0
Descriptor:
    -> Name: queueSize
    -> Description:
    -> Unit: 1
    -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
    -> processorType: Str(BatchLogRecordProcessor)
    -> cluster: Str(supportlab-dev-gcp)
    -> source: Str(otelcol-opentelemetry-demo)
    -> deployment.environment: Str(production)
    -> deployment.name: Str(opentelemetry-demo-adservice)
    -> pod: Str(opentelemetry-demo-adservice-8674c5c-q5422)
    -> service.namespace: Str(opentelemetry-demo)
StartTimestamp: 2024-02-29 10:04:46.278470849 +0000 UTC
Timestamp: 2024-02-29 10:42:46.336978757 +0000 UTC
Value: 0
Metric #1
Descriptor:
Additional context
I don't see the attribute when using the Debug Exporter.
I can see in the logs that it is able to extract the Container ID and Container Name.
Pinging code owners:
- processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth
See Adding Labels via Comments if you do not have permissions to add labels yourself.
In order to extract container.id the data must contain the resource attribute k8s.container.restart_count: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/k8sattributesprocessor#configuration. Does your data contain that resource attribute?
Hello @TylerHelmuth
Thanks for that.
I can see that container.id is already available as a resource attribute in my data (it can be seen above too), and for k8s.container.restart_count the documentation does not mention anything about k8s.container.name, which is the attribute that is missing.
However, I added it to my config as a test, and it caused an error:
Error: invalid configuration: processors::k8sattributes: "k8s.container.restart_count" is not a supported metadata field
Is there something specific I need to do for this, or somewhere else I should add it?
Thanks
I got mixed up on what you wanted to extract.
In order to extract k8s.container.name the resource attributes must already contain container.id when it arrives at the processor. Is that true for your data or is the k8sattributesprocessor adding it?
Thanks @TylerHelmuth
I believe it is, from what I can see, but I'm not 100% certain.
How could I verify this, please? Thanks
Remove the k8sattributesprocessor and then add a debugexporter with verbosity: detailed.
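A minimal sketch of what that temporary change could look like, based on the config above (only the relevant fragments and the metrics pipeline are shown for illustration, with a deliberately shortened processor list; the other pipelines would be adjusted the same way):

exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    metrics:
      receivers:
        - otlp
      processors:
        # k8sattributes removed temporarily so the incoming resource attributes are visible unmodified
        - memory_limiter
        - batch
      exporters:
        - debug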
Hello @TylerHelmuth
Thanks for that. Yes, they do have a container.id attribute:
If you inspect your pod, does that container.id match the real container ID?
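For example, a sketch using the adservice pod name from the debug output above (the exact pod name will differ as pods are recreated), the container IDs Kubernetes itself reports can be listed with kubectl:

kubectl get pod opentelemetry-demo-adservice-8674c5c-q5422 -n opentelemetry-demo \
  -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.containerID}{"\n"}{end}'

This prints each container name alongside its runtime-prefixed ID (for example containerd://<id>), which can be compared against the container.id resource attribute shown by the debug exporter.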
Thanks @TylerHelmuth
No, the debug exporter shows an ID that I can't correlate to anything.
But the k8sattributes processor does show the correct ContainerID value:
2024-03-15T16:37:24.921Z debug [email protected]/processor.go:140 getting the pod {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics", "pod": {"Name":"opentelemetry-demo-adservice-8674c5c-bbnkc","Address":"10.40.0.148","PodUID":"ae11eb0b-08ae-4e0e-bd90-d90000033f02","Attributes":{"k8s.deployment.name":"opentelemetry-demo-adservice","k8s.namespace.name":"opentelemetry-demo","k8s.node.name":"gke-REDACTED-cvbc","k8s.pod.name":"opentelemetry-demo-adservice-8674c5c-bbnkc","k8s.pod.start_time":"2024-03-15T15:51:54Z","k8s.pod.uid":"ae11eb0b-08ae-4e0e-bd90-d90000033f02"},"StartTime":"2024-03-15T15:51:54Z","Ignore":false,"Namespace":"opentelemetry-demo","NodeName":"gke-REDACTED-cvbc","HostNetwork":false,"Containers":{"ByID":{"c684a7f643dc3cd2a1816ffb40ba8591e0c106a0ea038e46d79c5ffc0f55de23":{"Name":"adservice","ImageName":"","ImageTag":"","Statuses":{"0":{"ContainerID":"c684a7f643dc3cd2a1816ffb40ba8591e0c106a0ea038e46d79c5ffc0f55de23"}}}},"ByName":{"adservice":{"Name":"adservice","ImageName":"","ImageTag":"","Statuses":{"0":{"ContainerID":"c684a7f643dc3cd2a1816ffb40ba8591e0c106a0ea038e46d79c5ffc0f55de23"}}}}},"DeletedAt":"0001-01-01T00:00:00Z"}}
It looks like the incoming metrics/logs have a container.id value that does not match any of the actual containers in the pod in question. The k8sattributes processor can't look up the container name, since it does not find a match for the container.id resource attribute among the containers in the pod object. I'd suggest looking at the source of that container.id in the instrumented service and why it does not reflect that of the running container.
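One way to investigate that (a sketch, not taken from this thread: the OpenTelemetry Java agent's container resource detector typically derives container.id from the cgroup information visible inside the container, and the exact layout depends on the container runtime and cgroup version) is to look at what the process itself can see and compare it with the ID in the debug exporter output:

# cgroup info as seen from inside the adservice container (cgroup v1 layout;
# on cgroup v2 the container ID may only appear in /proc/self/mountinfo)
kubectl exec -n opentelemetry-demo deploy/opentelemetry-demo-adservice -- cat /proc/self/cgroup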
Thanks.
The service is the AdService from the OpenTelemetry Demo. However, the logs show that the k8sattributes processor does have the right container ID. If that's the case, should it not be able to add the container name attribute, or does it still defer to the container.id on the metric?
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
- processor/k8sattributes: @dmitryax @rmfitzpatrick @fatsheep9146 @TylerHelmuth
See Adding Labels via Comments if you do not have permissions to add labels yourself.
This issue has been closed as inactive because it has been stale for 120 days with no activity.