docs: Known issue on missing dependency in service map for Kong OpenTelemetry Plugin
Add documentation for the known issue where spans from the Kong OpenTelemetry plugin result in missing dependency links in the APM Service Map.
Description of problem
To reproduce
I have a playground based on @1pkg's playground: https://github.com/carsonip/kong-quickstart-otel/tree/carson
Tested with kong/kong-gateway:3.9.0.0
Setup: curl sends a request to service A behind Kong. Service A calls service B via Kong, and service B calls service C via Kong. See illustration
Result:
- resulting OTel docs from debugexporter: collector.log
- docs indexed in ES: docs.json
Highlight:
The exit span from Kong lacks the fields that APM requires to interpret it as a service dependency link:
opentelemetry-collector-1 | Span #4
opentelemetry-collector-1 | Trace ID : bed5e28cfe42774e8ef1c9509685001d
opentelemetry-collector-1 | Parent ID : 052dc3538172140c
opentelemetry-collector-1 | ID : 3b8c984527a7f95e
opentelemetry-collector-1 | Name : kong.balancer
opentelemetry-collector-1 | Kind : Client
opentelemetry-collector-1 | Start time : 2025-03-27 10:03:09.028379392 +0000 UTC
opentelemetry-collector-1 | End time : 2025-03-27 10:03:09.030311168 +0000 UTC
opentelemetry-collector-1 | Status code : Unset
opentelemetry-collector-1 | Status message :
opentelemetry-collector-1 | Attributes:
opentelemetry-collector-1 | -> try_count: Double(1)
opentelemetry-collector-1 | -> net.peer.name: Str(172.17.0.1)
opentelemetry-collector-1 | -> net.peer.port: Double(10082)
opentelemetry-collector-1 | -> net.peer.ip: Str(172.17.0.1)
which is translated into the following Elasticsearch document:
{
  "_index": ".ds-traces-apm-default-2025.03.27-000001",
  "_id": "oscO15UBVFQppTdlRJwT",
  "_score": 1,
  "_source": {
    "parent": {
      "id": "052dc3538172140c"
    },
    "agent": {
      "name": "otlp",
      "version": "unknown"
    },
    "destination": {
      "address": "172.17.0.1",
      "ip": "172.17.0.1"
    },
    "processor": {
      "event": "span"
    },
    "tags": [
      "_geoip_database_unavailable_GeoLite2-City.mmdb"
    ],
    "observer": {
      "hostname": "e76267643b45",
      "type": "apm-server",
      "version": "8.17.4"
    },
    "trace": {
      "id": "bed5e28cfe42774e8ef1c9509685001d"
    },
    "@timestamp": "2025-03-27T10:03:09.028Z",
    "data_stream": {
      "namespace": "default",
      "type": "traces",
      "dataset": "apm"
    },
    "numeric_labels": {
      "net_peer_port": 10082,
      "try_count": 1
    },
    "service": {
      "node": {
        "name": "f8636234-6a5d-4420-86df-4d89ec1c35a4"
      },
      "framework": {
        "name": "kong-internal",
        "version": "0.1.0"
      },
      "name": "kong-dev",
      "language": {
        "name": "unknown"
      },
      "version": "3.9.0.0"
    },
    "event": {
      "ingested": "2025-03-27T10:03:11Z",
      "success_count": 1,
      "outcome": "success"
    },
    "span": {
      "duration": {
        "us": 1931
      },
      "representative_count": 1,
      "name": "kong.balancer",
      "id": "3b8c984527a7f95e",
      "type": "unknown"
    },
    "timestamp": {
      "us": 1743069789028379
    }
  }
}
The result is a service map that lacks every arrow from Kong to the services:
Workaround
The goal of the workaround is to add the minimum set of fields that APM needs to draw an arrow in the service map.
The workaround is available in the branch https://github.com/carsonip/kong-quickstart-otel/tree/carson-workaround
The key is to add a transform processor to the traces pipeline of the OTel Collector that receives data from Kong. The processor fills in peer.service from net.peer.name, and only for kong.balancer spans:
processors:
  transform:
    trace_statements:
      - context: span
        statements:
          - set(attributes["peer.service"], attributes["net.peer.name"]) where attributes["net.peer.name"] != "" and name == "kong.balancer"
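Defining the processor is not enough on its own: it only runs once it is listed in the traces pipeline. A minimal sketch of that wiring is shown here, assuming a receiver named otlp and an exporter named otlp/elastic are already defined in the same collector config. Both names are placeholders and are not taken from the playground, and the transform processor is assumed to be available in the collector distribution in use (it ships with the contrib distribution).

```yaml
# Sketch only: "otlp" and "otlp/elastic" are assumed component names.
# Replace them with the receiver and exporter defined in your own config.
service:
  pipelines:
    traces:
      receivers: [otlp]            # receives spans from the Kong OpenTelemetry plugin
      processors: [transform]      # the transform processor defined above
      exporters: [otlp/elastic]    # forwards spans to Elastic APM
```

If the pipeline already lists other processors, such as batch, transform can be added alongside them. Processors run in the order listed.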
With that, the service map looks like this:
Long term fix
Kong exit spans should follow the OTel HTTP client span spec (https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-client). Once they do, Elastic APM should work out of the box. See https://github.com/Kong/kong/issues/14381
Task
Update docs to mention the known issue briefly. Candidate location for docs: https://www.elastic.co/guide/en/observability/current/apm-open-telemetry-known-limitations.html