Review the language around standard attributes (and breaking changes)

Open lmolkova opened this issue 9 months ago • 10 comments

https://github.com/open-telemetry/opentelemetry-specification/blob/c4493096788ce9517cd55f684428976ea164cbb2/specification/common/README.md?plain=1#L78-L99

It's not clear how the arguments provided lead to the conclusion that extending standard attributes is a breaking change. The discussion in https://github.com/open-telemetry/opentelemetry-specification/pull/3858 and the language used in the arguments clearly try to prevent evolution and to document the current decision, but no argument explains why such a change would be breaking.

If we want to stick to this decision in practice (i.e. as SDKs implement non-standard attributes), we need to:

  • remove the "breaking change" part
  • remove the sentence "The standard attribute definition SHOULD be used to represent attributes in data modeling unless there is a strong justification to diverge" - there is no reason for Profiling to stick to standard attributes, and there is no prior art in OTel that they would challenge
  • explain the limits of this decision such as whether it applies to:
    • OTel API which should not allow complex attributes on spans/metrics API - this does not seem like a good limitation - API should not be that opinionated - that's for SDKs and backends. Moreover, several SDKs violate it already.
    • OTel SDK - API permits complex attributes, but SDK doesn't (e.g. drops them by default) - that'd be the least problematic way to keep the limitation and would still allow the evolution without requiring API v2
    • Semantic conventions - API and SDK allow end users to do anything that makes sense for them in their backend, but OTel semconv won't define complex attributes for certain signals

Having said this, I'd like to re-litigate this decision on the following grounds:

  • it was driven by the desire to limit the problem space, https://github.com/open-telemetry/opentelemetry-specification/pull/2888#issuecomment-1904888370
  • it was based on the tracing and metrics prior art and didn't consider future problem spaces (events and correlation/consistency between signals)
  • it's limiting OTel evolution and requiring hacky designs in the logging SDK. We should not have spec language that closes the door on API evolution for the sake of closing the discussion. The bar on such changes has to be extremely high.
  • It goes against the spirit of Spec principles
    • Being user-driven: we see the demand for complex attributes on spans in semantic conventions. We already went far defining string formats for exception stacktraces, HTTP headers, and now we're considering JSON serialization #4446.
    • Being consistent: we encourage recording complex values in different ways on logs and spans
    • Being simple: we complicate development experience by introducing multiple attribute types and defining conversion between them, while also assuming that this API may not be necessary in the future #4201, #4462

Relitigating this decision does not go against the values - it can be done in a non-breaking manner.


The proposal would be to

  • have the same attributes API on all signals
  • by default SDKs would drop (or serialize to JSON) complex attributes on metrics. Users were never able to populate them, so this won't break anything existing.
  • we may allow customizing this behavior and we'll pick a default (drop or serialize). We may even allow complex attributes on metrics as opt-in. This is SDK/exporter configuration.
  • we'd recommend backends that don't support complex attributes on some signal to use the same default - drop or serialize at exporting or ingestion time.
  • we'll restrict complex attribute usage on metrics in otel semconv via tooling/policies
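
As an illustration of the "drop or serialize" default described in the list above, here is a minimal sketch. It is not taken from any OTel SDK: the names `is_standard` and `sanitize_metric_attributes` and the `policy` parameter are hypothetical, and the sketch assumes that "standard" means primitives plus homogeneous arrays of primitives, with everything else (maps, mixed or nested arrays) treated as complex.

```python
import json
from typing import Any, Dict

# Assumption: these are the primitive value types of a standard attribute.
PRIMITIVES = (str, bool, int, float)


def is_standard(value: Any) -> bool:
    """Return True for a primitive or a homogeneous array of primitives."""
    if isinstance(value, PRIMITIVES):
        return True
    if isinstance(value, (list, tuple)):
        kinds = {type(v) for v in value}
        return len(kinds) <= 1 and all(isinstance(v, PRIMITIVES) for v in value)
    return False


def sanitize_metric_attributes(attrs: Dict[str, Any], policy: str = "drop") -> Dict[str, Any]:
    """Hypothetical SDK-side default for complex attributes on metrics.

    policy="drop" discards complex values; policy="serialize" replaces
    them with a JSON string. Standard values always pass through unchanged.
    """
    if policy not in ("drop", "serialize"):
        raise ValueError(f"unknown policy: {policy}")
    out: Dict[str, Any] = {}
    for key, value in attrs.items():
        if is_standard(value):
            out[key] = value
        elif policy == "serialize":
            out[key] = json.dumps(value, sort_keys=True)
        # With policy == "drop" the complex value is simply skipped.
    return out


attrs = {"http.method": "GET", "request": {"headers": {"a": "b"}}}
dropped = sanitize_metric_attributes(attrs, policy="drop")
serialized = sanitize_metric_attributes(attrs, policy="serialize")
```

Since users could never populate complex values on metrics before, either policy leaves existing standard attributes untouched, which is the non-breaking property the proposal relies on.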

lmolkova avatar Mar 28 '25 19:03 lmolkova

From the triage session on 2025-04-07: adding this to the TC inbox, since we think it should go through the TC first to get clear guidance on whether it would be accepted in some form before discussing further

mx-psi avatar Apr 07 '25 09:04 mx-psi

We discussed this during the Go SIG meeting and would like to request that the topic of extending standard attributes be prioritized by the TC.

We’re nearing specification compliance (and stabilization) of the OTel Go Logs API and would like to understand how likely it is that standard attributes will be extended to support unification across all signals, including resources.

If such unification is likely, we would prefer to delay the stabilization of the Logs API to align with the broader direction. However, we do not want to postpone stabilization indefinitely. Our preference would be to reach a clear decision on extending the standard attributes before the end of May and to finalize the Logs API stabilization before the end of July 2025. We would love to stabilize OTel Go Logs before KubeCon NA 2025 (November).

CC @open-telemetry/go-maintainers

pellared avatar Apr 10 '25 19:04 pellared

I've added this to the Spec SIG meeting agenda again for next week and will commit to keeping it on the agenda for the following weeks until it's resolved one way or the other. The preference is to drive these decisions through the Spec SIG (with the TC actively participating in the Spec SIG meeting) instead of asking the TC to discuss on their own since the broader community cannot then participate in the discussion. Escalating to a TC vote should be a last resort if the Spec SIG stalemates. After the Spec and Log SIG meeting discussions this past week, I think we have enough progress to justify continuing the discussion for at least another week or two.

trask avatar Apr 10 '25 19:04 trask

@trask should we have a way to represent this in triage? Our rationale for TC inbox is that we absolutely need TC consensus on this before moving forward, and we want to signal that (so that people don't waste their efforts without seeking this consensus first)

mx-psi avatar Apr 11 '25 07:04 mx-psi

@trask, I haven’t seen any new information or feedback in over a year. It seems we’re revisiting the same arguments repeatedly. I believe it would be helpful to consolidate all the feedback and assess the technical implications of the proposed change, so we can move toward a well-informed decision. I’m a bit concerned that bringing this to the Spec SIG meeting could lead to a more rhetorical discussion, where authority and persuasion might outweigh technical merit, ultimately resulting in further delays without meaningful progress.

pellared avatar Apr 11 '25 07:04 pellared

A recommendation I brought up in the SIG meeting:

Allow complex attributes, but apply limits such that a flattened view is still within the configured limits of regular attributes. (Complex attributes should not be a work around for attribute limits.) This would also provide a natural support path for vendors that don't have native support for them.
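
One possible reading of this recommendation, sketched under stated assumptions: nested maps are flattened into dot-separated keys, and the attribute count limit is applied to the flattened view. The helper names `flatten` and `apply_count_limit` are hypothetical, the dot separator is an assumption, and the default of 128 mirrors the SDK's default attribute count limit.

```python
from typing import Any, Dict


def flatten(attrs: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested maps into dot-separated keys (lists are left as-is)."""
    flat: Dict[str, Any] = {}
    for key, value in attrs.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def apply_count_limit(attrs: Dict[str, Any], limit: int = 128) -> Dict[str, Any]:
    """Keep at most `limit` attributes, counted on the flattened view.

    Counting after flattening means a single complex attribute cannot be
    used to carry more leaf values than the limit allows.
    """
    flat = flatten(attrs)
    return dict(list(flat.items())[:limit])


nested = {"http": {"request": {"method": "GET"}, "status": 200}, "ok": True}
flat_view = flatten(nested)
limited = apply_count_limit(nested, limit=2)
```

A backend without native support for complex attributes could ingest `flat_view` directly, which is the "natural support path" mentioned above.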

tylerbenson avatar Apr 15 '25 15:04 tylerbenson

From today's Spec SIG meeting, we have a tentative agreement to move forward, with the next step being an OTEP which covers both

  • Why we're changing our mind (@trask to write)
  • Concrete proposal addressing implementation details (@lmolkova to write)

We will do our best to get a draft up by next week's Spec SIG meeting.

trask avatar Apr 16 '25 02:04 trask

My team would really, really like complex attributes to be supported on the tracing signal and a unified API for attributes across all signals, at least at the wire protocol level (i.e. it makes sense for metrics to disallow them and such).

"This would also provide a natural support path for vendors that don't have native support for them."

One thing that stood out to me from previous discussions I was involved in is that, from a vendor's perspective, since Logs are becoming such an important part of OpenTelemetry and Logs have complex attributes, practically speaking you have to think about how you're going to support complex attributes in your system even if they are not present in other signals.

adriangb avatar Apr 17 '25 17:04 adriangb

Do Zipkin, Jaeger, or Prometheus support complex attributes yet? These are important platforms for OTel that aren't immediately related to logging. When I was talking about having a natural support path, these are the systems I had in mind, more so than the commercial vendors.

tylerbenson avatar Apr 17 '25 18:04 tylerbenson

Yes, that makes sense. I do agree that having a way to represent complex attributes as flat attributes is useful. As you suggest, my point was more about commercial vendors.

adriangb avatar Apr 17 '25 18:04 adriangb