opentelemetry-specification icon indicating copy to clipboard operation
opentelemetry-specification copied to clipboard

Resource/Entity merge logic prevents fine-grained detectors

Open tigrannajaryan opened this issue 1 year ago • 2 comments

In Go SDK currently Resource detectors are fine grained with multiple detectors detecting portions of what in fact constitutes a single entity. For example there are separate host and hostid detectors. host detects only the host.name attribute while hostid detects the host.id attribute.

Attempting to refactor these detectors in a straightforward way, such that each existing Resource detector becomes an Entity detector is unsuccessful because of how the entity merging rules are defined. Each of these detectors detects a "host" entity, and each needs to use some attributes as the Id of the entity however they naturally use different Ids. The net result of trying to use host and hostid detectors at the same time is that one of the detectors takes priority over the over because of this merging rule:

If the entity identity is different: drop the new entity d'.

So, only one of these detector's attributes are effective, the other gets dropped.

It is unclear how to overcome this problem.

One possible way is to combine these 2 detectors into just one detector of "host" type which will detect both host.name and host.id attributes. However, this means a breaking change in the SDK since these detectors are part of public API.

Another way is to introduce a new host detector which will do this new detection job while keeping the old ones for backward compatibility purposes. The downside is that existing end-user code will not automatically benefit from upgrading to a newer SDK version.

Yet another way is to rethink the merging rules so that we allow multiple detectors of the same entity type and their results are merged. This would allow to keep the existing detectors' set intact, keep the API the same and allow existing end-user code to automatically gain entity-awareness by simply upgrading to a newer SDK version. It is not clear though if a reasonable merging rule exists.

tigrannajaryan avatar Oct 18 '24 16:10 tigrannajaryan

As an example output with merging rules implemented according to OTEP (i.e. "If the entity identity is different: drop the new entity d'.") and with Host and HostId detectors:

                "EntityRefs": [
                        {
                                "Type": "host",
                                "Id": [
                                        "host.id"
                                ],
                                "Attrs": [],
                                "SchemaURL": "https://opentelemetry.io/schemas/1.24.0"
                        },
                ]

As you can see the "Attrs" are empty, basically the "Host" detector that had lower priority is completely ignored.

However if we change the rule to merge both "host" entities instead of dropping one we get this:

                "EntityRefs": [
                        {
                                "Type": "service",
                                "Id": [
                                        "service.name",
                                        "service.version"
                                ],
                                "Attrs": null,
                                "SchemaURL": "https://opentelemetry.io/schemas/1.24.0"
                        },
                        {
                                "Type": "host",
                                "Id": [
                                        "host.id"
                                ],
                                "Attrs": [
                                        "host.name"
                                ],
                                "SchemaURL": "https://opentelemetry.io/schemas/1.24.0"
                        },
                ]

Now both detector's output is merged and we get both a correct Id and an additional descriptive attribute.

tigrannajaryan avatar Oct 18 '24 16:10 tigrannajaryan

We briefly discussed this in 2025-02-27 Meeting. We see a few options going forward here:

  1. Allow Go to violate spec for this
  2. Allow 'empty id' entities where we allow detectors to provide descriptive attributes an merge. There is a prototype of this in the collector used between hostmetricsreceiver + resourcedetector(host) processor.
  3. Put a layer on top of entity detection for Go, where using just Host without HostId would NOT create an entity.

We think while (2) is flexible, it would open the door for invalid entities to escape SDKs and the Collector, and could cause severe user confusion and complexity in backend tooling.

We think (1) while allowable would be undesirable.

Instead we think we should move forward with 3.

Here's a brief strawman proposal (note: this would need more work).

type Entity struct {
  type string
  identifying attributes.Set
  descriptive attributes.Set
}

// this provides compatibility for existing fine-grained detection.
type PartialDetector interface {
  /// Entity's type this contributes to.
  EntityType() string
  /// Fills out portions of an entity.  This may only fill out descriptive attributes.
  DetectPartial(ctx context.Context, entity *Entity) error
}

Existing resource.Detector implementations would be expanded to include entity.PartialDetector.

When resolving resources, there would be a phase that takes all entity.PartialDetector out of resource and groups them by entity type. For each type, an entity is constructed and passed into DetectPartial. If the entity comes out with identifying attirbutes, it is considered complete and is returned as an entity. If it does not contain identifying attributes, then it is instead insert as 'raw' resource attributes.

This would allow Go to preserve compatibility with the existing solution will evolving into Entities.

jsuereth avatar Feb 27 '25 17:02 jsuereth