Ensure run Method Returns Dictionary with Annotated Values

Open vblagoje opened this issue 2 years ago • 4 comments

Description:

Haystack component developers sometimes forget to adhere to the expected return type of the run method in various components. Specifically, the run method is designed to return a dictionary with values that are annotated on the method.

Issue Details:

  • Expected Behavior: The run method of every component should return a dictionary whose keys are the output names declared in @component.output_types and whose values are the corresponding results.
  • Current Behavior: Users sometimes return the bare values instead of wrapping them in the dictionary that the output annotations specify.
  • Impact: The mistake only surfaces as an error at Pipeline run time. We should warn users earlier, while they are writing the code.

Example:

A correct implementation of the run method looks like this:

@component.output_types(documents=List[Document])
def run(self, ...):
    ...
    return {"documents": docs}

However, users often incorrectly implement it as:

@component.output_types(documents=List[Document])
def run(self, ...):
    ...
    return docs
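
To make the contract concrete, here is a minimal sketch of a decorator that declares the outputs and also validates what run returns. It is not Haystack's implementation: checked_output_types and the two retriever classes are hypothetical names invented for this illustration, and the check still only fires when run is called (for example in a unit test), though with an explicit message.

```python
import functools


def checked_output_types(**declared):
    """Hypothetical decorator: declare output names and validate run's return value."""

    def decorator(run):
        @functools.wraps(run)
        def wrapper(self, *args, **kwargs):
            result = run(self, *args, **kwargs)
            # The contract: run must return a dict, not a bare value.
            if not isinstance(result, dict):
                raise TypeError(
                    f"{type(self).__name__}.run must return a dict with keys "
                    f"{sorted(declared)}, got {type(result).__name__}"
                )
            # The dict keys must be exactly the declared output names.
            if set(result) != set(declared):
                raise TypeError(
                    f"{type(self).__name__}.run returned keys "
                    f"{sorted(map(str, result))}, expected {sorted(declared)}"
                )
            return result

        return wrapper

    return decorator


class GoodRetriever:
    @checked_output_types(documents=list)
    def run(self, query):
        return {"documents": [query]}


class BadRetriever:
    @checked_output_types(documents=list)
    def run(self, query):
        return [query]  # bare value: the mistake described in this issue


print(GoodRetriever().run("q"))
try:
    BadRetriever().run("q")
except TypeError as err:
    print(err)
```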

vblagoje avatar Dec 06 '23 09:12 vblagoje

Hi @vblagoje, I would like to take up this issue as my first OS contribution at Haystack, if I am allowed to!

Thanks!

tanaymeh avatar Dec 06 '23 09:12 tanaymeh

@vblagoje I went through all the files. All of them return dictionaries, but the key names vary widely: answers, replies, values, document_written, and so on.

Do you want documents as a key in all return dictionaries?

sahusiddharth avatar Dec 24 '23 16:12 sahusiddharth

We would need to work out how to check whether the output is a dict, maybe even without running the component. Ideally, we could check this when a component is initialized.
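
One possible way to check without running the component is to inspect the source of run with Python's ast module. The sketch below is a heuristic under stated assumptions, not an existing Haystack feature: check_run_returns and the sample sources are hypothetical, and only dict literals can be verified this way, so a method that builds a dict in a variable and returns that variable would be flagged as well.

```python
import ast
import textwrap


def check_run_returns(source, expected_keys, method_name="run"):
    """Heuristically inspect the return statements of `method_name` in `source`.

    Returns a list of (line_number, message) tuples; an empty list means no
    problem was detected. Returns inside nested functions are inspected too.
    """
    tree = ast.parse(textwrap.dedent(source))
    problems = []
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef) or func.name != method_name:
            continue
        for node in ast.walk(func):
            if not isinstance(node, ast.Return):
                continue
            # Anything other than a dict literal cannot be verified statically.
            if not isinstance(node.value, ast.Dict):
                problems.append((node.lineno, "return value is not a dict literal"))
                continue
            # Collect the literal string keys and compare with the declared outputs.
            keys = {
                k.value
                for k in node.value.keys
                if isinstance(k, ast.Constant) and isinstance(k.value, str)
            }
            if keys != set(expected_keys):
                problems.append(
                    (node.lineno, f"keys {sorted(keys)} do not match {sorted(expected_keys)}")
                )
    return problems


GOOD_SOURCE = '''
def run(self, query):
    docs = [query]
    return {"documents": docs}
'''

BAD_SOURCE = '''
def run(self, query):
    docs = [query]
    return docs
'''

print(check_run_returns(GOOD_SOURCE, {"documents"}))  # no problems reported
print(check_run_returns(BAD_SOURCE, {"documents"}))   # one entry for the bare return
```

At initialization time the source text could be obtained with inspect.getsource on the run method, which only works when the source file is available.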

julian-risch avatar Apr 04 '25 13:04 julian-risch

One idea that could partially help here is to enforce return type annotations on all of our functions, for example:

@component.output_types(documents=List[Document])
def run(self, ...) -> Dict[str, List[Document]]:
    ...
    return docs

Running mypy on this code would then report an incompatible return value (error code return-value). This wouldn't fully solve the issue, but it could be a helpful start.

sjrl avatar Jun 03 '25 06:06 sjrl