
OpenTelemetry spans received from ADOT appear as having different `service.name` and coloured differently

Open michaelhyatt opened this issue 3 years ago • 9 comments

APM Server version (apm-server version): 7.13.1

Description of the problem including expected versus actual behavior: I am collecting OpenTelemetry traces from AWS Distro for OpenTelemetry (ADOT) Lambda using the APM Server OTel intake. The recorded spans are coloured differently from the parent transaction, potentially because they don't have the OTel equivalent of the service.name field.

Steps to reproduce: Attached:

  1. Instructions on how to set up Java or Python Lambdas to send the APM data into Elastic Cloud (Environment setup.pdf)
  2. JSON documents for the transaction and 2 underlying spans (3docs.json.txt)
  3. Screenshot of the APM UI with 1 transaction and 2 spans (Screen Shot 2021-06-11 at 7 21 45 pm)
  4. CloudWatch debug events for a similar call (log-events-viewer-result.csv)

michaelhyatt avatar Jun 11 '21 09:06 michaelhyatt

Sample trace

Transaction JSON Document
{
    "_index": "apm-7.13.0-transaction-000001",
    "_type": "_doc",
    "_id": "n8LRx3kBE5dbnzlq3bNC",
    "_version": 1,
    "_score": null,
    "fields": {
      "transaction.name.text": [
        "lambda_function.lambda_handler"
      ],
      "service.framework.version": [
        "0.20b0"
      ],
      "labels.faas_name": [
        "aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "service.language.name": [
        "python"
      ],
      "labels.faas_id": [
        "arn:aws:lambda:ap-southeast-2:401722391821:function:aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "transaction.sampled": [
        true
      ],
      "transaction.id": [
        "2119d8efcc1eee15"
      ],
      "trace.id": [
        "60b63874bfcddf1479deae000bb97df6"
      ],
      "processor.event": [
        "transaction"
      ],
      "agent.name": [
        "opentelemetry/python"
      ],
      "labels.faas_execution": [
        "72a1dc04-684a-466a-9955-181b964bde29"
      ],
      "event.outcome": [
        "unknown"
      ],
      "cloud.region": [
        "ap-southeast-2"
      ],
      "service.name": [
        "aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "service.framework.name": [
        "opentelemetry.instrumentation.aiohttp_client"
      ],
      "processor.name": [
        "transaction"
      ],
      "transaction.duration.us": [
        884967
      ],
      "labels.faas_version": [
        "$LATEST"
      ],
      "observer.version_major": [
        7
      ],
      "observer.hostname": [
        "ip-172-31-18-228"
      ],
      "transaction.type": [
        "custom"
      ],
      "cloud.provider": [
        "aws"
      ],
      "event.ingested": [
        "2021-06-01T13:44:50.242Z"
      ],
      "timestamp.us": [
        1622554740413283
      ],
      "@timestamp": [
        "2021-06-01T13:39:00.413Z"
      ],
      "ecs.version": [
        "1.8.0"
      ],
      "observer.type": [
        "apm-server"
      ],
      "observer.version": [
        "7.13.0"
      ],
      "agent.version": [
        "1.1.0"
      ],
      "parent.id": [
        "d9bc7a4fa2219c9f"
      ],
      "transaction.name": [
        "lambda_function.lambda_handler"
      ]
    },
    "highlight": {
      "cloud.provider": [
        "@kibana-highlighted-field@aws@/kibana-highlighted-field@"
      ],
      "trace.id": [
        "@kibana-highlighted-field@60b63874bfcddf1479deae000bb97df6@/kibana-highlighted-field@"
      ]
    },
    "sort": [
      1622554740413
    ]
  }
Span 1 JSON Document
{
    "_index": "apm-7.13.0-span-000001",
    "_type": "_doc",
    "_id": "ocLRx3kBE5dbnzlq3bNC",
    "_version": 1,
    "_score": null,
    "fields": {
      "labels.aws_service": [
        "s3"
      ],
      "span.name": [
        "s3"
      ],
      "labels.faas_name": [
        "aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "labels.aws_operation": [
        "ListBuckets"
      ],
      "labels.aws_request_id": [
        "HYEHAEQV9VHQRDV0"
      ],
      "trace.id": [
        "60b63874bfcddf1479deae000bb97df6"
      ],
      "span.duration.us": [
        469960
      ],
      "processor.event": [
        "span"
      ],
      "labels.aws_region": [
        "ap-southeast-2"
      ],
      "agent.name": [
        "opentelemetry/python"
      ],
      "event.outcome": [
        "success"
      ],
      "cloud.region": [
        "ap-southeast-2"
      ],
      "service.name": [
        "aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "processor.name": [
        "transaction"
      ],
      "labels.faas_version": [
        "$LATEST"
      ],
      "span.id": [
        "66d2ca36295df4d3"
      ],
      "span.subtype": [
        "http"
      ],
      "observer.version_major": [
        7
      ],
      "observer.hostname": [
        "ip-172-31-18-228"
      ],
      "span.type": [
        "external"
      ],
      "cloud.provider": [
        "aws"
      ],
      "timestamp.us": [
        1622554740826959
      ],
      "@timestamp": [
        "2021-06-01T13:39:00.826Z"
      ],
      "labels.retry_attempts": [
        0
      ],
      "ecs.version": [
        "1.8.0"
      ],
      "observer.type": [
        "apm-server"
      ],
      "observer.version": [
        "7.13.0"
      ],
      "agent.version": [
        "1.1.0"
      ],
      "parent.id": [
        "2119d8efcc1eee15"
      ]
    },
    "highlight": {
      "cloud.provider": [
        "@kibana-highlighted-field@aws@/kibana-highlighted-field@"
      ],
      "trace.id": [
        "@kibana-highlighted-field@60b63874bfcddf1479deae000bb97df6@/kibana-highlighted-field@"
      ]
    },
    "sort": [
      1622554740826
    ]
  }
Span 2 JSON Document
 {
    "_index": "apm-7.13.0-span-000001",
    "_type": "_doc",
    "_id": "oMLRx3kBE5dbnzlq3bNC",
    "_version": 1,
    "_score": null,
    "fields": {
      "span.destination.service.type": [
        "external"
      ],
      "span.name": [
        "HTTP GET"
      ],
      "destination.port": [
        80
      ],
      "labels.faas_name": [
        "aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "trace.id": [
        "60b63874bfcddf1479deae000bb97df6"
      ],
      "span.duration.us": [
        412528
      ],
      "processor.event": [
        "span"
      ],
      "agent.name": [
        "opentelemetry/python"
      ],
      "destination.address": [
        "httpbin.org"
      ],
      "event.outcome": [
        "success"
      ],
      "cloud.region": [
        "ap-southeast-2"
      ],
      "service.name": [
        "aws-otel-python-demo-function-UFypDHV2urZa"
      ],
      "processor.name": [
        "transaction"
      ],
      "labels.faas_version": [
        "$LATEST"
      ],
      "span.id": [
        "d34a9be3eb46124c"
      ],
      "span.subtype": [
        "http"
      ],
      "observer.version_major": [
        7
      ],
      "observer.hostname": [
        "ip-172-31-18-228"
      ],
      "span.type": [
        "external"
      ],
      "cloud.provider": [
        "aws"
      ],
      "timestamp.us": [
        1622554740413887
      ],
      "@timestamp": [
        "2021-06-01T13:39:00.413Z"
      ],
      "ecs.version": [
        "1.8.0"
      ],
      "observer.type": [
        "apm-server"
      ],
      "observer.version": [
        "7.13.0"
      ],
      "agent.version": [
        "1.1.0"
      ],
      "parent.id": [
        "2119d8efcc1eee15"
      ],
      "span.destination.service.name": [
        "http://httpbin.org"
      ],
      "span.destination.service.resource": [
        "httpbin.org:80"
      ]
    },
    "highlight": {
      "cloud.provider": [
        "@kibana-highlighted-field@aws@/kibana-highlighted-field@"
      ],
      "trace.id": [
        "@kibana-highlighted-field@60b63874bfcddf1479deae000bb97df6@/kibana-highlighted-field@"
      ]
    },
    "sort": [
      1622554740413
    ]
  }

cyrille-leclerc avatar Jun 15 '21 08:06 cyrille-leclerc

The spans have the same service.name. I think the UI question is why they are displayed as type http and have a different color than the parent transaction; it's not intuitive.
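As a rough check of that observation, the sketch below copies only the relevant fields from the three documents above into a small list and compares them. The `first` helper and the trimmed `hits` list are illustrative assumptions, not part of APM Server or Kibana.

```python
# Trimmed stand-ins for the three Elasticsearch hits posted above; only the
# fields needed for the comparison are copied over.
SERVICE = "aws-otel-python-demo-function-UFypDHV2urZa"
hits = [
    {"fields": {"processor.event": ["transaction"],
                "service.name": [SERVICE],
                "transaction.type": ["custom"]}},
    {"fields": {"processor.event": ["span"],
                "service.name": [SERVICE],
                "span.type": ["external"],
                "span.subtype": ["http"]}},
    {"fields": {"processor.event": ["span"],
                "service.name": [SERVICE],
                "span.type": ["external"],
                "span.subtype": ["http"]}},
]


def first(hit, field):
    """Return the first value of a field in a hit's `fields` map, or None."""
    values = hit["fields"].get(field)
    return values[0] if values else None


# Distinct service names across the trace: one entry means a single service.
service_names = {first(h, "service.name") for h in hits}

# (event kind, type) per document: this is what differs between the rows.
event_types = [
    (first(h, "processor.event"),
     first(h, "transaction.type") or first(h, "span.type"))
    for h in hits
]

print(service_names)
print(event_types)
```

All three documents carry the same service.name, so the colour difference has to come from the transaction/span type rather than from the service.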

An additional UI challenge is that the distributed trace visualisation sometimes uses a different color per span type and sometimes per service (service.name).

[Screenshots: Transaction, Span 1, Span 2]

cyrille-leclerc avatar Jun 15 '21 08:06 cyrille-leclerc

@michaelhyatt @cyrille-leclerc I think the behavior you describe here is intended to be an explicit UI feature (rather than a bug) that has been introduced in 7.13.

See this: https://github.com/elastic/kibana/pull/90424

If a transaction covers only one service, then spans are colored by span type instead of by services.

cc @dgieselaar

AlexanderWert avatar Jun 23 '21 06:06 AlexanderWert

Yes, we show different colors for transactions and span types when the trace covers one service, and different colors for service names if the trace covers more than one service. It was a bit of an experiment, if we feel it doesn't add any value we should remove it again. We're planning to make changes to coloring in https://github.com/elastic/kibana/issues/93011, would you mind adding any feedback there?
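To make the rule concrete, here is a minimal sketch of the behaviour as described in this comment. It is not the Kibana implementation: the `PALETTE` values, the item shape, and the function names are all assumptions for illustration.

```python
# Hypothetical palette; the real Kibana visualization palette is not reproduced.
PALETTE = ["blue", "green", "orange", "purple"]


def color_key(item, single_service):
    """Pick what a timeline row is colored by.

    Multi-service trace: color by service name.
    Single-service trace: color transactions and each span type separately.
    """
    if not single_service:
        return item["service"]
    return "transaction" if item["kind"] == "transaction" else item["type"]


def assign_colors(items):
    """Return one color per item, assigning palette entries in first-seen order."""
    single_service = len({item["service"] for item in items}) == 1
    colors = {}
    result = []
    for item in items:
        key = color_key(item, single_service)
        if key not in colors:
            colors[key] = PALETTE[len(colors) % len(PALETTE)]
        result.append(colors[key])
    return result


# The trace from this issue: one transaction and two external spans, one service.
single = [
    {"service": "demo-function", "kind": "transaction", "type": "custom"},
    {"service": "demo-function", "kind": "span", "type": "external"},
    {"service": "demo-function", "kind": "span", "type": "external"},
]

# A made-up two-service trace for comparison.
multi = [
    {"service": "frontend", "kind": "transaction", "type": "request"},
    {"service": "frontend", "kind": "span", "type": "external"},
    {"service": "backend", "kind": "transaction", "type": "request"},
]

print(assign_colors(single))
print(assign_colors(multi))
```

In the single-service case the two spans share a color that differs from the transaction, which matches what the reporter saw; in the multi-service case the rows are grouped by service instead.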

dgieselaar avatar Jun 23 '21 06:06 dgieselaar

Thanks @dgieselaar and @AlexanderWert .

I'm not sure whether I see more benefits or downsides. On one hand, I see the benefit when displaying the spans of a single service. On the other hand, I feel it's hard to understand that we have different color codes depending on whether we display the spans of one single service or the spans of multiple services.

@formgeist , @michaelhyatt what's your point of view?

> We're planning to make changes to coloring in elastic/kibana#93011, would you mind adding any feedback there?

I don't know what to think about this type of enhancement; it's not a pain point that came to my mind. If we had to invest in our trace analytics capabilities, other improvements would come to my mind, such as:

  • search on span attributes
  • display the traces that match with the selected criteria on the same screen rather than requiring [screenshot]

cyrille-leclerc avatar Jun 23 '21 21:06 cyrille-leclerc

IMHO, I am used to seeing different colours represent traces belonging to different components, so the change of meaning of colours was unexpected and somewhat confusing.

michaelhyatt avatar Jun 24 '21 03:06 michaelhyatt

There are a few options available. We could flip the order of the visualization palette colors when using them for span type instead, which will look different in most cases. The other option is a user interaction to switch between service and type breakdown. I personally think utilizing the different colors in context is better than the previous behavior, but it can be confusing if you're used to one or the other.

formgeist avatar Jun 24 '21 11:06 formgeist

@formgeist I'm like @michaelhyatt, my mental model is to have one color per service. Could you show an example of what you mean by "flip the order of the visualization palette colors"?

cyrille-leclerc avatar Jun 28 '21 19:06 cyrille-leclerc

@cyrille-leclerc Just for the sake of illustrating what I meant, I've created some quick mocks.

  1. We could use the reverse visualization palette for span type visualization on a single service transaction. This would certainly use colors not typically associated with traces, because we typically see traces with 3-4 services and they never reach the other end of the palette range. Additionally, we had a concept of assigning specific colors to the services in a trace first, then render them the same way for individual transactions. This might require us to extend the palette eventually.

Trace - Reverse color palette

  2. We could add an option for users to choose between coloring the timeline rows by type or service. This would be saved either in local storage, session, or in the user profile (when that's available).

Trace - Sort by option
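The two options above could be combined roughly as follows. This is only a sketch under assumptions: the palette entries, the row shape, and the `color_by` setting are hypothetical names, and persisting the choice (local storage, session, or user profile) is left out.

```python
# Hypothetical six-step palette; services start from the front of it.
PALETTE = ["c0", "c1", "c2", "c3", "c4", "c5"]


def palette_for(color_by):
    """Option 1: span types read the palette from the opposite end."""
    if color_by == "service":
        return PALETTE
    return list(reversed(PALETTE))


def row_colors(rows, color_by="service"):
    """Option 2: color timeline rows by a user-chosen key ('service' or 'type')."""
    if color_by not in ("service", "type"):
        raise ValueError("color_by must be 'service' or 'type'")
    palette = palette_for(color_by)
    seen = {}
    colors = []
    for row in rows:
        key = row[color_by]
        if key not in seen:
            # Assign the next unused palette entry, wrapping if it runs out.
            seen[key] = palette[len(seen) % len(palette)]
        colors.append(seen[key])
    return colors


rows = [
    {"service": "demo-function", "type": "transaction"},
    {"service": "demo-function", "type": "external"},
    {"service": "demo-function", "type": "external"},
]

print(row_colors(rows, color_by="service"))
print(row_colors(rows, color_by="type"))
```

With a typical trace of 3-4 services, service colors stay at the front of the palette while type colors come from the far end, so the two modes are unlikely to share colors.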

formgeist avatar Jun 30 '21 09:06 formgeist