
`_SuperComponent` always defines `run_async`, even when the underlying Pipeline is sync

Open · anakin87 opened this issue 7 months ago · 1 comment

Discovered in https://github.com/deepset-ai/haystack/pull/9420#pullrequestreview-2864343991

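import asyncio

from haystack import Document
from haystack.components.preprocessors import DocumentPreprocessor

# DocumentPreprocessor is a SuperComponent that wraps a synchronous Pipeline.
preprocessor = DocumentPreprocessor()
preprocessor.warm_up()

# run_async exists on the component, but calling it fails at runtime.
asyncio.run(preprocessor.run_async(documents=[Document(content="something")]))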

Raises: TypeError: Pipeline is not an AsyncPipeline. run_async is not supported.


_SuperComponent defines a run_async method even when the underlying pipeline is synchronous. This leads to runtime errors and can confuse users: the method appears to be available, but calling it always fails for a synchronous pipeline.
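To make the pattern concrete without depending on Haystack, here is a minimal sketch using stand-in classes. `Pipeline`, `AsyncPipeline`, `Wrapper`, and `try_run_async` below are hypothetical toy names, not Haystack's actual implementation; the sketch only assumes that the wrapper defines `run_async` unconditionally and checks the pipeline type at call time.

```python
import asyncio


class Pipeline:
    """Toy stand-in for a synchronous pipeline (hypothetical, not Haystack's class)."""

    def run(self, data):
        return {"echo": data}


class AsyncPipeline:
    """Toy stand-in for an async-capable pipeline (hypothetical)."""

    def run(self, data):
        return {"echo": data}

    async def run_async(self, data):
        return {"echo": data}


class Wrapper:
    """Toy wrapper that always defines run_async, regardless of the pipeline type."""

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def run(self, **kwargs):
        return self.pipeline.run(kwargs)

    async def run_async(self, **kwargs):
        # The method exists on every instance, but only works for AsyncPipeline.
        if not isinstance(self.pipeline, AsyncPipeline):
            raise TypeError("Pipeline is not an AsyncPipeline. run_async is not supported.")
        return await self.pipeline.run_async(kwargs)


def try_run_async(wrapper, **kwargs):
    """Return the result of run_async, or the error message if it raises TypeError."""
    try:
        return asyncio.run(wrapper.run_async(**kwargs))
    except TypeError as exc:
        return str(exc)


sync_wrapper = Wrapper(Pipeline())
print(hasattr(sync_wrapper, "run_async"))  # the method looks available
print(try_run_async(sync_wrapper, text="something"))
```

Introspection such as `hasattr` reports the method as present, so a caller cannot tell in advance that it will fail.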

anakin87 · May 23 '25 13:05

@anakin87 I see a few ways to solve this:

  • Automatically convert a Pipeline to an AsyncPipeline under the hood if a user calls run_async on a SuperComponent defined with a Pipeline. And, I guess, do the reverse if the opposite happens.
  • Consolidate our Pipeline and AsyncPipeline abstractions into one object? Not sure how easy this would be.
  • Dynamically add the correct run method through the decorator? This wouldn't cover the inheritance approach, where generic SuperComponents are created by using SuperComponent directly.
  • Leave as-is, but throw an error at runtime if a user calls run_async on a SuperComponent defined with a Pipeline.
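As a rough illustration of the "dynamically add the correct run method" idea, the sketch below attaches `run_async` per instance rather than through a decorator, which is a variation on the option listed above and not a proposal from the thread. All class names (`Pipeline`, `AsyncPipeline`, `Wrapper`) are hypothetical stand-ins, not Haystack's classes, and whether instance-level attributes fit how the framework detects async support is an open assumption.

```python
import asyncio


class Pipeline:
    """Toy stand-in for a synchronous pipeline (hypothetical)."""

    def run(self, data):
        return {"echo": data}


class AsyncPipeline:
    """Toy stand-in for an async-capable pipeline (hypothetical)."""

    def run(self, data):
        return {"echo": data}

    async def run_async(self, data):
        return {"echo": data}


class Wrapper:
    """Toy wrapper that exposes run_async only when the pipeline supports it."""

    def __init__(self, pipeline):
        self.pipeline = pipeline
        # Attach the public name on the instance only for async-capable pipelines.
        if isinstance(pipeline, AsyncPipeline):
            self.run_async = self._run_async

    def run(self, **kwargs):
        return self.pipeline.run(kwargs)

    async def _run_async(self, **kwargs):
        return await self.pipeline.run_async(kwargs)


sync_wrapper = Wrapper(Pipeline())
async_wrapper = Wrapper(AsyncPipeline())

print(hasattr(sync_wrapper, "run_async"))
print(hasattr(async_wrapper, "run_async"))
print(asyncio.run(async_wrapper.run_async(text="something")))
```

With this shape, a sync-backed wrapper fails early with an `AttributeError` on access instead of a `TypeError` inside the coroutine, and `hasattr` becomes a usable capability check. The trade-off is that anything inspecting the class, rather than the instance, will not see `run_async`.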

What do you think?

sjrl · May 28 '25 12:05