Ensure run Method Returns Dictionary with Annotated Values
Description:
Haystack component developers sometimes forget to adhere to the expected return type of the run method in various components. Specifically, the run method is designed to return a dictionary with values that are annotated on the method.
Issue Details:
- Expected Behavior: The run method, across different components, should return a dictionary whose keys are the output names declared in the annotations and whose values are the corresponding results.
- Current Behavior: Users sometimes return only the values, without wrapping them in a dictionary as the output annotations specify.
- Impact: This inconsistency can lead to pipeline runtime errors; users find out at Pipeline run time that they made a mistake. We should warn them earlier - at code time.
Example:
A correct implementation of the run method looks like this:
@component.output_types(documents=List[Document])
def run(self, ...):
    ...
    return {"documents": docs}
However, users often incorrectly implement it as:
@component.output_types(documents=List[Document])
def run(self, ...):
    ...
    return docs
Hi @vblagoje, I would like to take up this issue as my first OS contribution at Haystack, if I am allowed to!
Thanks!
@vblagoje I went through all the files. All of them return dictionaries, but the key names vary widely, from answers to replies to values to document_written.
Do you want documents as the key in all return dictionaries?
We would need to work out how to check that the output is a dict, maybe even without running the component. Ideally we could check this when a component is initialized.
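One possible direction for a check that does not run the component, as a rough sketch only: parse the source of run with Python's ast module and flag return statements whose value is not visibly a dict. The names suspicious_returns and _ReturnCollector are hypothetical and not part of Haystack. The heuristic is deliberately crude: it would also flag `return result` where result happens to be a dict, so it could only ever back a warning, not an error.

```python
import ast
from typing import List


class _ReturnCollector(ast.NodeVisitor):
    """Collects `return` statements of one function, skipping nested scopes."""

    def __init__(self) -> None:
        self.returns: List[ast.Return] = []

    def visit_Return(self, node: ast.Return) -> None:
        self.returns.append(node)

    def _skip(self, node: ast.AST) -> None:
        # Nested functions, lambdas and classes have their own scope,
        # so their `return` statements are not the method's returns.
        return None

    visit_FunctionDef = _skip
    visit_AsyncFunctionDef = _skip
    visit_Lambda = _skip
    visit_ClassDef = _skip


def _looks_like_dict(value) -> bool:
    """True for dict literals, dict comprehensions and dict(...) calls."""
    if isinstance(value, (ast.Dict, ast.DictComp)):
        return True
    return (
        isinstance(value, ast.Call)
        and isinstance(value.func, ast.Name)
        and value.func.id == "dict"
    )


def suspicious_returns(source: str, method_name: str = "run") -> List[int]:
    """Line numbers of returns in `method_name` that are not visibly dicts."""
    lines: List[int] = []
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == method_name:
            collector = _ReturnCollector()
            for stmt in node.body:
                collector.visit(stmt)
            lines.extend(
                r.lineno for r in collector.returns
                if not _looks_like_dict(r.value)
            )
    return sorted(lines)


GOOD = 'class A:\n    def run(self, docs):\n        return {"documents": docs}\n'
BAD = 'class B:\n    def run(self, docs):\n        return docs\n'

print(suspicious_returns(GOOD))  # []
print(suspicious_returns(BAD))   # [3]
```

To hook something like this into component initialization, the source string could presumably be obtained with inspect.getsource on the class, though that only works when the source file is available.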
One idea that could partially help here is to enforce return type annotations on all of our functions. Given:
@component.output_types(documents=List[Document])
def run(self, ...) -> Dict[str, List[Document]]:
    ...
    return docs
running mypy on this code would report a return type error (an incompatible return value). This wouldn't fully solve the issue but it could be a helpful start.
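To make the mypy suggestion concrete, here is a small self-contained file one could run mypy against. It is a sketch under assumptions: Document is a stand-in class rather than Haystack's, the @component.output_types decorator is left out so the file has no dependencies, and BrokenRetriever and FixedRetriever are hypothetical names.

```python
from typing import Dict, List


class Document:
    """Stand-in for Haystack's Document, so the file has no dependencies."""


class BrokenRetriever:
    # The return annotation mirrors what output_types(documents=List[Document])
    # promises: a dict keyed by output name.
    def run(self, query: str) -> Dict[str, List[Document]]:
        docs = [Document()]
        return docs  # mypy should flag this: a list is returned, not a dict


class FixedRetriever:
    def run(self, query: str) -> Dict[str, List[Document]]:
        docs = [Document()]
        return {"documents": docs}  # matches the annotation
```

Python itself runs both classes without complaint, which is why the mistake only shows up later in a pipeline; mypy is expected to report an incompatible return value type on the first return and nothing on the second. Note that the check depends on the return annotation being present: without it the return type is implicitly Any and mypy has nothing to compare against.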