haystack
Add (intermediate) results to pipeline_state to make it easier to accumulate results without keeping additional client-side state
@davidsbatista @julian-risch I drafted a solution. If it looks OK, I will create a PR.
Proposed Solution
Add a minimal, opt-in feature that automatically accumulates intermediate results in the pipeline's internal state:
Core Changes
- Add pipeline_state property: a read-only view of accumulated intermediate results
- Add accumulate_intermediate_results parameter: a simple boolean flag for pipeline run methods
- Automatic accumulation: when enabled, collects outputs from all components during execution
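One possible reading of "read-only view" is a mapping proxy over an internal dict. The following is a minimal sketch under that assumption. `StateHolder` is a hypothetical stand-in, not Haystack's actual `Pipeline` class; only the names `pipeline_state` and `_pipeline_state` come from the proposal.

```python
from types import MappingProxyType
from typing import Any, Dict, Mapping


class StateHolder:
    """Hypothetical stand-in for a pipeline class; not Haystack's API."""

    def __init__(self) -> None:
        # Per-instance dict, so two pipelines never share accumulated state.
        self._pipeline_state: Dict[str, Any] = {}

    @property
    def pipeline_state(self) -> Mapping[str, Any]:
        # MappingProxyType is a live, read-only view: it reflects later
        # internal updates but rejects item assignment by callers.
        return MappingProxyType(self._pipeline_state)


holder = StateHolder()
# Internal write, done here directly only for illustration.
holder._pipeline_state["comp1"] = {"answer": 42}

view = holder.pipeline_state
print(view.get("comp1"))

try:
    view["comp1"] = {}  # callers cannot assign through the view
except TypeError as err:
    print(f"write rejected: {err}")
```

The proxy only protects the top-level mapping. Nested output dicts stay mutable unless the property returns copies, which is a design choice the proposal leaves open.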
🔧 How it works
# Before (manual approach)
pipeline.run(data, include_outputs_from={"comp1", "comp2", "comp3"})
# After (automatic approach)
pipeline.run(data, accumulate_intermediate_results=True)
# Access accumulated state anytime
state = pipeline.pipeline_state
print(f"Component 'comp1' output: {state.get('comp1')}")
Implementation Details
The implementation is lightweight and consists of four changes:
- Adds _pipeline_state: Dict[str, Any] = {} to base pipeline class
- When accumulate_intermediate_results=True, automatically sets include_outputs_from to all components
- During execution, populates _pipeline_state with component outputs
- Exposes read-only access via pipeline_state property
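The four changes listed here can be sketched with a toy runner. This is an illustration under stated assumptions, not the drafted patch: `ToyPipeline`, its plain-callable components, and the linear execution order are all hypothetical. The sketch also creates the state dict in `__init__` rather than as a class-level `= {}` default, since a mutable class attribute would be shared by every pipeline instance.

```python
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping, Optional, Set

# Hypothetical: a component is a plain callable taking and returning one value.
Component = Callable[[Any], Any]


class ToyPipeline:
    """Toy linear pipeline; illustrates the proposal, not Haystack internals."""

    def __init__(self, components: Dict[str, Component]) -> None:
        self._components = components  # dicts preserve insertion order
        self._pipeline_state: Dict[str, Any] = {}  # per-instance state

    @property
    def pipeline_state(self) -> Mapping[str, Any]:
        return MappingProxyType(self._pipeline_state)

    def run(
        self,
        data: Any,
        include_outputs_from: Optional[Set[str]] = None,
        accumulate_intermediate_results: bool = False,
    ) -> Dict[str, Any]:
        include = set(include_outputs_from or ())
        if accumulate_intermediate_results:
            # The flag acts as shorthand for "include every component".
            include = set(self._components)

        results: Dict[str, Any] = {}
        value = data
        last_name: Optional[str] = None
        for name, component in self._components.items():
            value = component(value)
            last_name = name
            if name in include:
                results[name] = value
                if accumulate_intermediate_results:
                    # Assumption: state persists across runs; the proposal
                    # does not say whether it should be reset per run.
                    self._pipeline_state[name] = value
        if last_name is not None:
            results[last_name] = value  # the final output is always returned
        return results


pipeline = ToyPipeline({"comp1": str.strip, "comp2": str.upper, "comp3": len})
pipeline.run("  hello ", accumulate_intermediate_results=True)
state = pipeline.pipeline_state
print(f"Component 'comp1' output: {state.get('comp1')}")
```

Because the flag defaults to `False`, a plain `run(data)` call leaves `pipeline_state` empty, which matches the opt-in intent of the proposal.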