haystack-core-integrations

Scope tracing per pipeline via runtime parameters

Open vblagoje opened this issue 6 months ago • 5 comments

Is your feature request related to a problem? Please describe.

Currently, tracing in Haystack is globally enabled or disabled for all pipelines via the tracer configuration (e.g., Langfuse). This means that if tracing is enabled, it applies to every pipeline in the interpreter, and there is no built-in way to scope tracing to a specific pipeline or to enable/disable tracing dynamically per pipeline execution. This is an issue for users who run multiple pipelines in the same process and want to trace only a subset of them.

Describe the solution you'd like

I would like the ability to enable or disable tracing on a per-pipeline or per-run basis. Ideally, this could be achieved by:

  • Allowing a runtime parameter (e.g., tracing_enabled) to be passed to the tracing component (such as LangfuseConnector) in the pipeline.
  • Having the tracing component (and/or the DefaultSpan implementation) inspect this parameter at runtime and decide whether to emit traces for that pipeline run.
  • Standardizing this across all tracing frameworks, if possible; we are talking about Langfuse here, but the need is not specific to it.
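
As a rough illustration of the tracer-side idea in the list above, the sketch below gates a tracer on a per-run flag. Everything in it is a stand-in written for this example: `tracing_enabled` is a hypothetical ContextVar, and `RecordingTracer`, `GatedTracer`, and `run_pipeline` are toy classes and helpers, not Haystack or Langfuse APIs.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical per-run flag (an assumption of this sketch); tracing is on by default.
tracing_enabled = ContextVar("tracing_enabled", default=True)


class RecordingTracer:
    """Stand-in for a real tracing backend; it only records span names."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def trace(self, operation_name):
        self.spans.append(operation_name)
        yield operation_name


class GatedTracer:
    """Forwards to the wrapped tracer only while the per-run flag is True."""

    def __init__(self, inner):
        self.inner = inner

    @contextmanager
    def trace(self, operation_name):
        if not tracing_enabled.get():
            yield None  # nothing is emitted for this run
            return
        with self.inner.trace(operation_name) as span:
            yield span


def run_pipeline(tracer, name, tracing=True):
    """Toy pipeline run: sets the flag for this call, then traces one operation."""
    token = tracing_enabled.set(tracing)
    try:
        with tracer.trace(f"{name}.run"):
            pass
    finally:
        tracing_enabled.reset(token)  # restore the previous value


backend = RecordingTracer()
tracer = GatedTracer(backend)
run_pipeline(tracer, "indexing", tracing=False)  # not traced
run_pipeline(tracer, "query", tracing=True)      # traced
```

Only the second run reaches the backend. Wrapping the tracer rather than each integration keeps the check in one place, which is one possible way to make the behavior uniform across tracing backends.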

Describe alternatives you've considered

  • Passing a custom flag (e.g., tracing_enabled) in the invocation context and customizing the tracing component to check this flag before emitting traces. However, this requires custom code and is not supported out of the box.

Additional context

For origination, see this Discord exchange.

vblagoje May 26 '25 09:05

Instead of handling this in the tracing connector component, I think it might be nicer to figure out how to do it at the pipeline level. So we would indicate that PipelineBase._create_component_span should use the NullTracer, which is described as "a no-op implementation of the Tracer interface. This is used when tracing is disabled."

This way we wouldn't need to handle this specifically in each tracing integration.

sjrl May 26 '25 10:05

> Instead of handling this in the tracing connector component, I think it might be nicer to figure out how to do it at the pipeline level. So we would indicate that PipelineBase._create_component_span should use the NullTracer, which is described as "a no-op implementation of the Tracer interface. This is used when tracing is disabled."
>
> This way we wouldn't need to handle this specifically in each tracing integration.

Let me see if I understand you correctly, @sjrl. So in LangfuseConnector, or any other tracer, you would specify the names of the pipelines to trace? That solves tracing at the pipeline level. Do we also need it at the run-invocation level? It would be nice.

vblagoje May 27 '25 10:05

@vblagoje No, not quite. I'd add a pipeline-level run field (e.g. trace: bool = True) that, if set to False, turns off tracing for that pipeline run. That would at least let users turn tracing on and off, right?

Or are users somehow also getting two different pipeline runs combined into the same trace in Langfuse?
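
A minimal sketch of the pipeline-level idea, assuming a `trace` argument on `run` and a no-op tracer chosen before any component span is created. `MiniPipeline`, `RecordingTracer`, and `NoOpTracer` are toy stand-ins written for this example; they only mimic the shape of the names mentioned in the thread and are not Haystack's actual classes or signatures.

```python
from contextlib import contextmanager


class RecordingTracer:
    """Stand-in for a real tracing backend; records (operation, tags) pairs."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def trace(self, operation_name, tags=None):
        self.spans.append((operation_name, tags))
        yield


class NoOpTracer:
    """Same interface as the backend, but emits nothing."""

    @contextmanager
    def trace(self, operation_name, tags=None):
        yield


class MiniPipeline:
    """Toy pipeline that runs its components in insertion order."""

    def __init__(self, components, tracer):
        self.components = components  # maps component name to a callable
        self.tracer = tracer

    def _create_component_span(self, tracer, component_name):
        # The pipeline decides which tracer is used; integrations stay unchanged.
        return tracer.trace("component.run", tags={"component": component_name})

    def run(self, data, trace=True):
        # Hypothetical run-level field: trace=False swaps in the no-op tracer.
        tracer = self.tracer if trace else NoOpTracer()
        for name, component in self.components.items():
            with self._create_component_span(tracer, name):
                data = component(data)
        return data


backend = RecordingTracer()
pipe = MiniPipeline({"upper": str.upper, "exclaim": lambda s: s + "!"}, backend)

untraced = pipe.run("hi", trace=False)  # result computed, no spans recorded
spans_after_untraced = list(backend.spans)
traced = pipe.run("hi")                 # one span per component
```

Because the decision is made once per run inside the pipeline, the components and the tracing backend need no knowledge of the flag.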

sjrl May 27 '25 10:05

Ok, that makes sense

> Or are users somehow also getting two different pipeline runs combined into the same trace in Langfuse?

I don't think so. These are stored in a ContextVar, so each run should carry a different ID during execution.
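
To illustrate the ContextVar point with the standard library only, the sketch below runs two toy "pipeline runs" concurrently and checks that each sees its own ID. The names `current_trace_id` and `run_pipeline` are made up for this example and say nothing about how the Langfuse integration stores its state internally.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical variable holding the trace ID of the run in the current context.
current_trace_id = ContextVar("current_trace_id", default=None)


async def run_pipeline(name):
    """Toy run: stores an ID, yields control, then reads the ID back."""
    trace_id = f"trace-{name}"
    current_trace_id.set(trace_id)
    await asyncio.sleep(0)  # let the other run interleave here
    return trace_id, current_trace_id.get()


async def main():
    # gather() wraps each coroutine in a task, and each task gets a copy of the context.
    return await asyncio.gather(run_pipeline("a"), run_pipeline("b"))


results = asyncio.run(main())
```

Each task works on its own copy of the context, so a value set in one run is not visible in the other, and the value in the calling context stays unset.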

vblagoje May 27 '25 11:05

Hi! I would like to use a different tracer for each pipeline. I'm logging traces to different projects in my tracing service (braintrust.dev), and the project is configured in my custom Tracer. Currently, because the tracer is configured globally for all pipelines in the process, I run into concurrency problems as one tracer overrides another.

At the moment, I'm trying to find a workaround with a custom Tracer that recognizes the pipeline, but when Tracer.trace() is executed, I have no way of knowing which pipeline component is currently running.
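
One possible shape for such a workaround, sketched under the assumption that the caller can wrap each pipeline run in a small scope that records the pipeline name in a ContextVar. `current_pipeline`, `pipeline_scope`, `RoutingTracer`, and `ListTracer` are hypothetical names for this example; the `trace` method here is a simplified stand-in, not the real Tracer interface.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical: name of the pipeline running in the current context.
current_pipeline = ContextVar("current_pipeline", default=None)


class ListTracer:
    """Stand-in backend (for example, one per project); records operation names."""

    def __init__(self):
        self.operations = []

    @contextmanager
    def trace(self, operation_name, tags=None):
        self.operations.append(operation_name)
        yield


class RoutingTracer:
    """Single global tracer that delegates to the backend of the current pipeline."""

    def __init__(self, backends, fallback):
        self.backends = backends  # maps pipeline name to its backend tracer
        self.fallback = fallback  # used when no pipeline scope is active

    @contextmanager
    def trace(self, operation_name, tags=None):
        backend = self.backends.get(current_pipeline.get(), self.fallback)
        with backend.trace(operation_name, tags) as span:
            yield span


@contextmanager
def pipeline_scope(name):
    """Marks the enclosed code as belonging to the named pipeline."""
    token = current_pipeline.set(name)
    try:
        yield
    finally:
        current_pipeline.reset(token)


rag, chat, default = ListTracer(), ListTracer(), ListTracer()
router = RoutingTracer({"rag": rag, "chat": chat}, fallback=default)

with pipeline_scope("rag"):
    with router.trace("retriever.run"):
        pass
with pipeline_scope("chat"):
    with router.trace("generator.run"):
        pass
with router.trace("unscoped"):
    pass
```

With this arrangement only one tracer is registered globally, so per-pipeline tracers no longer override each other; the cost is that every run has to go through the scope helper.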

ernoaapa May 28 '25 03:05