haystack
haystack copied to clipboard
ConditionalRouter output type is not working anymore
Describe the bug
We tried to update the Haystack version to 2.3.1 due to the security update and our code stopped working. We identified that the ConditionalRouter is returning a string instead of the type assigned in output_type
Error message We are getting errors related to the fact that a string is now being returned. Here is an example of the error when running the provided code:
AttributeError: 'str' object has no attribute 'content'
Expected behavior
We expected that the output_type provided in the route would be returned
Additional context The problem is related to version 2.3.1. Our provided example works on 2.3.0
To Reproduce The following code works on haystack 2.3.0 but it does not work on haystack 2.3.1:
from haystack.components.routers import ConditionalRouter
from haystack.dataclasses import ChatMessage, ChatRole
routes = [
{
"condition": "{{streams|length > 2}}",
"output": "{{message}}",
"output_name": "long_streams_message",
"output_type": ChatMessage,
},
{
"condition": "{{streams|length <= 2}}",
"output": "{{message}}",
"output_name": "short_streams_message",
"output_type": ChatMessage,
},
]
message = ChatMessage(content='Hi', role=ChatRole('user'), name='', meta={})
print(message.content)
router = ConditionalRouter(routes)
kwargs = {"streams": [1, 2, 3], "message": message}
result = router.run(**kwargs)
print(result['long_streams_message'].content)
FAQ Check
- [x] Have you had a look at our new FAQ page?
System:
- OS: Ubuntu 22.04 LTS
- GPU/CPU: CPU
- Haystack version (commit or version number): 2.3.1
That's expected.
Because of the changes to fix the security issue we had to make this breaking change. The ConditionalRouter can only return the following Python literal types now: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None and Ellipsis.
Any other will be returned as a string.
I would suggest creating a custom Component to cover those cases where the ConditionalRouter can't be used as before.
Something like this should do the trick:
from typing import List
from haystack import component
from haystack.dataclasses import ChatMessage
@component
class Decision:
@component.output_types(short_streams_message=ChatMessage, long_streams_message=ChatMessage)
def run(self, streams: List[str], message: ChatMessage):
if len(streams) > 2:
return {"long_streams_message": message}
return {"short_streams_message": message}
I see. So the parameter output_type will not work anymore?
We did not expect to see a breaking change with a patch version change. Should we worry about those updates too?
output_type has never been necessary to convert the type, it's only used to set the correct output type for the Component. Even before this change you could put any type in there but it wouldn't guarantee that the Component will return that type when calling run. That's only necessary when calling Pipeline.connect() to validate the connection.
We decided to release this security change as a patch release even if it's a breaking change as it's a pretty critical, it could lead to remote code execution.
Perhaps we could add an option to the ConditionalRouter so that it can route inputs to the selected route's output without rendering a jinja2 template?
I also had this case, where I wanted to route a list of Documents depending on a reply generated from the LLM. I don't have to go through jinja to route these documents because I don't want to change anything about them.
So something like:
routes = [
{"condition": "{{...}}", output: "documents", output_name: "a_documents", "output_type": List[Document], "use_literal_output": True}
]
router = ConditionalRouter(routes=routes)
router.run(documents=[Document(content="hey")])
# returns the document passed in unchanged
https://github.com/deepset-ai/haystack/blob/e17d0c41926849975166e70ec46411f402c8098e/haystack/components/routers/conditional_router.py#L206
This part would need to be changed to something like:
if route.get("use_literal_output"):
output = kwargs[route["output"]]
else:
...
Would that solve your use case too @gustavo-has-stone ?
In our scenario, we were routing a variable of a custom type (similar to ChatMessage) based on the value of an attribute of this variable. I did not fully understand your example. Could you provide more details?
Hi, I ran into this problem too when upgrading to 2.3.1. I use a custom dataclass type for my pipeline which isn't working as expected anymore. Is there a suggested fix for this without creating a custom router?
@gustavo-has-stone @alexsniffin
We discussed a bit internally and decided to give the possibility to enable the previous behaviour by creating ConditionalRouter and OutputAdapter with unsafe=True.
I just created a PR to bring that change in and we'll release it on the upcoming 2.4.0 some time next week. For the time being, and if you consider it safe, you can still rely on the previous behaviour by using 2.3.0 instead of the latest 2.3.1.
I don't think that using an unsafe version is a good solution for us. We will try to find an alternative solution for our use case
Hey @silvanocerza Idk if it`s a good solution because you are publishing a version that the only way to use previews behavior is forcing unsafe. It forces people to open security breaches when need a feature.
An approach that could help us is using Haystack ChatMessage as Router output. Do you know if is it possible?
I don't think that using an unsafe version is a good solution for us. We will try to find an alternative solution for our use case
It's unsafe only if the Jinja templates used are users inputs too. If you know the templates it's completely safe to use.
Hey @silvanocerza Idk if it`s a good solution because you are publishing a version that the only way to use previews behavior is forcing unsafe. It forces people to open security breaches when need a feature.
An approach that could help us is using Haystack
ChatMessageas Router output. Do you know if is it possible?
Setting unsafe to True doesn't automatically open you to security breaches. It enables a behaviour that is considered unsafe in certain circumstances, that is when ConditionalRouter or OutputAdapter use templates that can be considered users input.
If you know what you're doing and know you're not using users inputs as Jinja template that is completely fine to use.
Let me see if I understand. If we try to route based on some controlled condition (e.g., message length), we will be fine. However, if we try to use the content of a message for conditional routing, we could be in trouble. Is that right?
Let me see if I understand. If we try to route based on some controlled condition (e.g., message length), we will be fine. However, if we try to use the content of a message for conditional routing, we could be in trouble. Is that right?
No. You are vulnerable if you let your users write their own templates from scratch.
A simple example:
from typing import List
from haystack.components.routers import ConditionalRouter
user_input = input("Write a Jinja template:")
routes = [
{
"condition": user_input,
"output": "{{streams}}",
"output_name": "enough_streams",
"output_type": List[int],
},
{
"condition": "{{streams|length <= 2}}",
"output": "{{streams}}",
"output_name": "insufficient_streams",
"output_type": List[int],
},
]
router = ConditionalRouter(routes)
...
router.run(whatever_input)
Thank you @silvanocerza . It sounds like a plan! 😄