Datadog Tracer: Set Resource Name from Component Type/Name
Is your feature request related to a problem? Please describe.
Currently, the Datadog tracer sets the span resource name to be the same as the operation_name (e.g., haystack.component.run, haystack.pipeline.run, haystack.agent.run). This is because the resource parameter isn't explicitly set when creating the span, and dd-trace-py defaults the resource to the operation name (see here).
This makes it harder to distinguish between different component executions in the Datadog UI, as every component run shares the same generic resource name (haystack.component.run).
Describe the solution you'd like
For component runs (haystack.component.run), I propose setting the Datadog resource name dynamically based on the specific Haystack component being traced.
The desired format would be a combination of haystack.component.type and haystack.component.name.
For example:
- Operation name: haystack.component.run
- Resource name: BranchJoiner joiner_component_name (if haystack.component.type is BranchJoiner and haystack.component.name is joiner_component_name)
I believe this provides much more context directly in the Datadog trace view, where the resource represents the specific dynamic instance or endpoint being hit (ref).
Describe alternatives you've considered
The main alternative is the current behavior: leaving the resource name unset and letting Datadog default it to the operation name. However, this lacks the granularity needed for quick debugging and analysis in complex Haystack pipelines, although the component name would still be visible in the span metadata.
Additional context
The diff below shows how the DatadogTracer class can be modified to set the resource accordingly:
--- a/haystack/tracing/datadog.py
+++ b/haystack/tracing/datadog.py
@@ -9,11 +11,12 @@
from ddtrace.trace import Span as ddSpan
from ddtrace.trace import Tracer as ddTracer
+
+_COMPONENT_NAME_KEY = "haystack.component.name"
+_COMPONENT_TYPE_KEY = "haystack.component.type"
+_COMPONENT_RUN_OPERATION_NAME = "haystack.component.run"
@@ -22,6 +25,19 @@
         ddtrace_import.check()
         self._tracer = tracer
+
+    def _get_resource_name(self, operation_name: str, tags: Optional[Dict[str, Any]]) -> Optional[str]:
+        """
+        Get the resource name for the Datadog span.
+        """
+        if operation_name == _COMPONENT_RUN_OPERATION_NAME and tags:
+            component_type = tags.get(_COMPONENT_TYPE_KEY, '')
+            component_name = tags.get(_COMPONENT_NAME_KEY, '')
+            resource_name = f"{component_type} {component_name}".strip()
+            return resource_name if resource_name else None
+        return None
     @contextlib.contextmanager
     def trace(
@@ -29,7 +45,11 @@
     ) -> Iterator[Span]:
         """Activate and return a new span that inherits from the current active span."""
-        with self._tracer.trace(operation_name) as span:
+        # Determine the resource name based on operation and tags
+        resource_name = self._get_resource_name(operation_name, tags)
+
+        with self._tracer.trace(
+            name=operation_name,  # Use the standard operation name for the 'name' field
+            resource=resource_name,  # Use specific component info for 'resource' if available
+        ) as span:
             custom_span = DatadogSpan(span)
             if tags:
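The helper in the diff can be tried out without ddtrace installed. Below is a minimal sketch that copies its logic into a free function; the name get_resource_name and the sample tag values are illustrative only and are not part of Haystack.

```python
from typing import Any, Dict, Optional

_COMPONENT_NAME_KEY = "haystack.component.name"
_COMPONENT_TYPE_KEY = "haystack.component.type"
_COMPONENT_RUN_OPERATION_NAME = "haystack.component.run"


def get_resource_name(operation_name: str, tags: Optional[Dict[str, Any]]) -> Optional[str]:
    """Standalone copy of the proposed helper (hypothetical free-function form)."""
    if operation_name == _COMPONENT_RUN_OPERATION_NAME and tags:
        component_type = tags.get(_COMPONENT_TYPE_KEY, "")
        component_name = tags.get(_COMPONENT_NAME_KEY, "")
        # strip() drops the stray space when one of the two tags is missing
        resource_name = f"{component_type} {component_name}".strip()
        return resource_name or None
    # Any other operation, or no tags at all: leave the resource unset
    return None


# Sample tags, made up for this example
component_tags = {
    "haystack.component.type": "BranchJoiner",
    "haystack.component.name": "joiner_component_name",
}

print(get_resource_name("haystack.component.run", component_tags))
print(get_resource_name("haystack.pipeline.run", component_tags))
print(get_resource_name("haystack.component.run", None))
print(get_resource_name("haystack.component.run", {"haystack.component.name": "joiner"}))
```

When the function returns None, the resource argument stays unset, so the span should keep the current default described above (resource equal to the operation name).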
@wochinge would this be nice to have for the platform?
yes 😍 Super cool contribution!
Only nitpick:
f"{component_type} {component_name}".strip()
how about something like
f"{component_name} ({component_type})".strip()
or maybe also
f"{component_type}: {component_name}".strip()
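To compare the three candidate formats, here is a small sketch that applies each one to a complete tag pair and to the cases where one value is empty. The function names are made up for this comparison and do not appear in the PR.

```python
def space_separated(component_type: str, component_name: str) -> str:
    # Format from the original diff
    return f"{component_type} {component_name}".strip()


def name_then_type(component_type: str, component_name: str) -> str:
    # First suggested alternative
    return f"{component_name} ({component_type})".strip()


def type_colon_name(component_type: str, component_name: str) -> str:
    # Second suggested alternative
    return f"{component_type}: {component_name}".strip()


for fmt in (space_separated, name_then_type, type_colon_name):
    print(
        fmt.__name__,
        repr(fmt("BranchJoiner", "joiner")),  # both values present
        repr(fmt("BranchJoiner", "")),        # name missing
        repr(fmt("", "joiner")),              # type missing
    )
```

With both values present, all three read well. When one value is empty, only the space-separated form degrades cleanly, because strip() removes whitespace but not the leftover "()" or ":" (for example, 'joiner ()' or ': joiner'). If one of the punctuated formats is chosen, the helper would probably need an explicit branch for the missing-value cases.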
Hi, thanks for the ping on this. Should I reopen PR #9337?
Hey @lan666as, yes, that would be great! The feature would be greatly appreciated.