opentelemetry-python Generate a service.instance.id resource attribute if it is not present


[service.instance.id] MUST be unique for each instance of the same service.namespace,service.name pair (in other words service.namespace,service.name,service.instance.id triplet MUST be globally unique). The ID helps to distinguish instances of the same service that exist at the same time (e.g. instances of a horizontally scaled service) [...] If the service has no inherent unique ID that can be used as the value of this attribute it is recommended to generate a random [UUID]

At the moment, just using the default resource in our SDK violates the requirement that "service.namespace,service.name,service.instance.id triplet MUST be globally unique":

In [1]: from opentelemetry.sdk.resources import Resource

In [2]: Resource.create().attributes
Out[2]: BoundedAttributes({'telemetry.sdk.language': 'python', 'telemetry.sdk.name': 'opentelemetry', 'telemetry.sdk.version': '1.5.0', 'service.name': 'unknown_service'}, maxlen=None)

This is especially important for metrics, where the resource is part of the identity of a time series and we must not violate the single-writer principle.

1. Should we generate the service.instance.id from a UUID if it is not present? I think yes, so that we do not violate the global uniqueness.
2. Should we try to automagically populate this with a more meaningful value when it is present in a resource? E.g. container.id when present (this might be a spec question).
3. Since this is such a common use case, should we provide an easy way for the user to pass some of these key service.* attributes into the SDK if they don't want it generated? E.g. an environment variable, constructor kwarg to the Tracer/Meter provider, kwarg to the resource.
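
As a rough illustration of the first question above (generate a UUID only when service.instance.id is missing), here is a minimal sketch using only the standard library. The helper name `with_instance_id` is hypothetical and is not part of the SDK; it works on a plain attribute dict, which could then be handed to `Resource.create(...)`.

```python
import uuid

SERVICE_INSTANCE_ID = "service.instance.id"


def with_instance_id(attributes=None):
    """Return a copy of `attributes` with service.instance.id filled in if absent.

    Hypothetical helper for illustration only, not an SDK function.
    """
    merged = dict(attributes or {})
    # A user-supplied value always wins; otherwise fall back to a random
    # version 4 UUID generated in this process.
    if SERVICE_INSTANCE_ID not in merged:
        merged[SERVICE_INSTANCE_ID] = str(uuid.uuid4())
    return merged


if __name__ == "__main__":
    print(with_instance_id({"service.name": "unknown_service"}))
```

Two processes calling this with the same service.name would end up with different service.instance.id values, so the service.namespace,service.name,service.instance.id triplet no longer collides by default.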

Sep 14 '21 22:09 aabmass

I like the idea of generating one if no service.instance.id is present.

Sep 16 '21 16:09 codeboten

Action item for me: investigate what other SIGs are doing and when this was added to the spec. At the SIG meeting, we decided that option 2 would need some spec discussion, so we will skip it for now.

Sep 16 '21 16:09 aabmass

There's some spec discussion here https://github.com/open-telemetry/opentelemetry-specification/issues/1034. One of the things they mentioned is similar to option 2. Let's see if we can get some clarity in the spec.

Sep 16 '21 20:09 aabmass

@aabmass, has there been any new update on this?

Jan 06 '23 22:01 srikanthccv

Spec discussion https://github.com/open-telemetry/opentelemetry-specification/issues/3136

Feb 04 '23 20:02 srikanthccv

There have been some updates in the spec regarding this issue since the last post in the thread.

Implementations, such as SDKs, are recommended to generate a random Version 1 or Version 4 RFC 4122 UUID, but are free to use an inherent unique ID as the source of this value if stability is desirable. In that case, the ID SHOULD be used as source of a UUID Version 5 and SHOULD use the following UUID as the namespace: 4d63009a-8d0f-11ee-aad7-4c796ed8e320. [...] For applications running behind an application server (like unicorn), we do not recommend using one identifier for all processes participating in the application. Instead, it's recommended each division (e.g. a worker thread in unicorn) to have its own instance.id.

Such an identifier would be a lifesaver for automated differentiation of metric streams from services with a fork-process model.
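
To make the quoted spec text concrete, here is a minimal sketch of deriving a stable UUID Version 5 from an inherent ID with the namespace given above, using only the standard library. The function names and the way the worker PID is folded into the name (`"<inherent id>/<pid>"`) are assumptions for illustration; the spec text quoted here does not prescribe how to distinguish workers.

```python
import os
import uuid

# Namespace UUID taken from the spec text quoted above.
OTEL_INSTANCE_NAMESPACE = uuid.UUID("4d63009a-8d0f-11ee-aad7-4c796ed8e320")


def stable_instance_id(inherent_id: str) -> str:
    """Derive a UUID v5 from an inherent unique ID (e.g. a container ID)."""
    return str(uuid.uuid5(OTEL_INSTANCE_NAMESPACE, inherent_id))


def worker_instance_id(inherent_id: str, worker_pid: int) -> str:
    """Hypothetical per-worker variant for fork-process servers.

    Assumption: appending the worker PID is enough to tell apart workers
    that share the same inherent ID.
    """
    return stable_instance_id(f"{inherent_id}/{worker_pid}")


if __name__ == "__main__":
    # "example-container-id" is a placeholder value, not a real ID.
    print(worker_instance_id("example-container-id", os.getpid()))
```

The same inputs always yield the same ID, which gives the stability the spec mentions, while each forked worker still ends up with its own value.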

Jun 20 '24 12:06 rbagd
