Add (intermediate) results to pipeline_state to make it easier to accumulate results without keeping additional client-side state

Open davidsbatista opened this issue 7 months ago • 1 comments

May 30 '25 12:05 davidsbatista

@davidsbatista @julian-risch I drafted a solution. If it is OK, I will create a PR.

Proposed Solution

Add a minimal, opt-in feature that automatically accumulates intermediate results in the pipeline's internal state:

Core Changes

Add pipeline_state property - A read-only view of accumulated intermediate results
Add accumulate_intermediate_results parameter - Simple boolean flag for pipeline run methods
Automatic accumulation - When enabled, collects outputs from all components during execution

🔧 How it works

# Before (manual approach)
pipeline.run(data, include_outputs_from={"comp1", "comp2", "comp3"})

# After (automatic approach)  
pipeline.run(data, accumulate_intermediate_results=True)

# Access accumulated state anytime
state = pipeline.pipeline_state
print(f"Component 'comp1' output: {state.get('comp1')}")
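
The "before" and "after" calls above are meant to select the same outputs when every component is listed. Here is a minimal sketch of how the flag could be resolved into the existing include_outputs_from argument. The helper name resolve_include_outputs_from is hypothetical (it is not part of Haystack), and the sketch assumes the pipeline can enumerate its component names:

```python
from typing import Iterable, Optional, Set


def resolve_include_outputs_from(
    component_names: Iterable[str],
    include_outputs_from: Optional[Set[str]] = None,
    accumulate_intermediate_results: bool = False,
) -> Set[str]:
    """Hypothetical helper: decide which components' outputs to keep."""
    requested = set(include_outputs_from or ())
    if accumulate_intermediate_results:
        # Opt-in flag: behave as if every component had been listed explicitly.
        return requested | set(component_names)
    # Flag off: keep exactly what the caller asked for (possibly nothing).
    return requested


names = ["comp1", "comp2", "comp3"]
print(sorted(resolve_include_outputs_from(names, accumulate_intermediate_results=True)))
# ['comp1', 'comp2', 'comp3']
print(sorted(resolve_include_outputs_from(names, include_outputs_from={"comp2"})))
# ['comp2']
```

Because the flag defaults to False, existing calls that pass only include_outputs_from would behave as before under this sketch.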

Implementation Details

The implementation is small and consists of four changes:

Adds _pipeline_state: Dict[str, Any] = {} to the base pipeline class
When accumulate_intermediate_results=True, automatically sets include_outputs_from to all components
During execution, populates _pipeline_state with component outputs
Exposes read-only access via pipeline_state property
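
The four points above can be sketched with a toy class. This is not Haystack's Pipeline: ToyPipeline, add_component as written here, and the sequential execution loop are illustrative assumptions, and only the names pipeline_state, _pipeline_state, and accumulate_intermediate_results come from the proposal. The sketch creates the dictionary per instance in __init__, because a class-level `= {}` default would be shared by every pipeline object. It also assumes that outputs from later runs overwrite earlier entries for the same component.

```python
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping

Component = Callable[[Dict[str, Any]], Dict[str, Any]]


class ToyPipeline:
    """Hypothetical stand-in for a pipeline; runs components in insertion order."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}
        # Per-instance state; a class-level `= {}` would be shared across instances.
        self._pipeline_state: Dict[str, Any] = {}

    def add_component(self, name: str, component: Component) -> None:
        self._components[name] = component

    @property
    def pipeline_state(self) -> Mapping[str, Any]:
        # Read-only view: callers can read keys but cannot assign or delete them.
        return MappingProxyType(self._pipeline_state)

    def run(
        self, data: Dict[str, Any], accumulate_intermediate_results: bool = False
    ) -> Dict[str, Any]:
        current = data
        for name, component in self._components.items():
            current = component(current)
            if accumulate_intermediate_results:
                # Store each component's output under its name.
                self._pipeline_state[name] = current
        return current


pipeline = ToyPipeline()
pipeline.add_component("comp1", lambda d: {"value": d["value"] + 1})
pipeline.add_component("comp2", lambda d: {"value": d["value"] * 10})

result = pipeline.run({"value": 1}, accumulate_intermediate_results=True)
print(result)                                # {'value': 20}
print(pipeline.pipeline_state.get("comp1"))  # {'value': 2}
```

Returning a MappingProxyType is one way to make the property read-only without copying the dictionary. The nested output values are still mutable, so a real implementation might need to copy them if stricter isolation is required.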

Jun 01 '25 09:06 YassinNouh21