bug: Pipeline with conditional router hangs when branch starts with prompt builder
Describe the bug
I discovered this issue when building a pipeline with a conditional router:
- one route goes to a branch starting with a prompt builder
- another goes to a branch starting with a custom component
When the branch that starts with the prompt builder is selected, there are no issues. However, when the branch with the custom component is selected, the pipeline hangs.
We investigated this with @anakin87 and we discovered the following:
- The branch with the custom component actually does run and produce a result
- However, the pipeline still hangs, apparently waiting for inputs to some component
- We also checked whether @silvanocerza's fix in #7531 resolves it, but it does not.
To Reproduce
Use this colab, create the custom component, and finally run the pipeline called "conditional_sql_pipeline". Notice that:
- when you run with a query that 'cannot be answered', it's all good, as it goes to the branch with the promptbuilder
- when you run with a query that 'can be answered', the pipeline hangs.
FAQ Check
- [x] Have you had a look at our new FAQ page?
I investigated a bit, and the referenced PR does partly fix this issue: the Pipeline no longer hangs.
There's still an issue with that Pipeline, though: the PromptBuilder in the skipped branch runs even though it shouldn't. This happens because the second PromptBuilder is still in the internal list of components to run and it has only default inputs, so it will be run in any case before the Pipeline returns.
This could be solved by making the optionality of the PromptBuilder inputs configurable. There's already been some discussion here.
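To make the mechanism above concrete, here is a minimal toy model in plain Python. It is not Haystack's actual `Pipeline.run()` logic; `ToyComponent`, `can_run`, `components_that_run`, and the input names are all hypothetical and only illustrate the rule "a component is runnable once all its mandatory inputs have arrived". Under that rule, a component whose inputs all have defaults is always runnable, even when the router never sent it anything.

```python
from dataclasses import dataclass, field


@dataclass
class ToyComponent:
    """Hypothetical stand-in for a pipeline component (not a Haystack class)."""

    name: str
    # Maps input name -> True if the input is mandatory, False if it has a default.
    inputs: dict = field(default_factory=dict)

    def can_run(self, received: set) -> bool:
        # Runnable once every mandatory input has arrived.
        # With no mandatory inputs, all() over nothing is True.
        return all(
            name in received
            for name, mandatory in self.inputs.items()
            if mandatory
        )


def components_that_run(components, received_by_component):
    """Return the names of the components a naive scheduler would execute."""
    return [
        c.name
        for c in components
        if c.can_run(received_by_component.get(c.name, set()))
    ]


# The router only fired the branch feeding "custom"; "prompt_builder" got nothing.
received = {"custom": {"query"}}

custom = ToyComponent("custom", {"query": True})
optional_pb = ToyComponent("prompt_builder", {"question": False})
required_pb = ToyComponent("prompt_builder", {"question": True})

ran_with_optional = components_that_run([custom, optional_pb], received)
ran_with_required = components_that_run([custom, required_pb], received)

print(ran_with_optional)  # ['custom', 'prompt_builder']
print(ran_with_required)  # ['custom']
```

In this sketch, marking the prompt builder's input as mandatory is enough to keep the skipped branch from running, which is the idea behind making the optionality configurable.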
@silvanocerza assigning you as there are more similar issues assigned to you already. Please close duplicates if any
I believe this issue was already resolved by PR #7553.
True. Now that you can set PromptBuilder inputs as mandatory or not, you can work around this bug.
Still, I want to see whether a fix in the Pipeline.run() logic is feasible, so I'll keep it open. This is quite nasty behaviour, and it would be a problem if any other Component triggered it in the future.
I believe this is failing for DynamicChatPromptBuilder as well. I was trying to get DynamicChatPromptBuilder to work with looped validation, but the pipeline never runs it because it expects all 4 inputs to be present. See my test code here. From some initial debugging, it seems this line fails for the prompt_builder node.
@silvanocerza can we close?
The same problem as with PromptBuilder seems to happen for DynamicChatPromptBuilder (optional inputs + loop).
Here's another discussion related to this issue: https://github.com/deepset-ai/haystack/discussions/7719
Colab nb to reproduce: https://colab.research.google.com/drive/1R28-_tOYKQnVVCNdJ2M9hLvGa--Dy3Kw?usp=sharing
@masci This is not solved. The change to PromptBuilder is just a workaround, even though it is still a sensible change for the Component.
Ideally we could fix this in Pipeline.run(), but I need some time to verify that everything works and nothing breaks when fixing this.
Here is the use case from our Discord, simplified:
```python
from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.routers import ConditionalRouter

routes = [
    {
        "condition": "{{'reisen' in sentence}}",
        "output": "German",
        "output_name": "language_1",
        "output_type": str,
    },
    {
        "condition": "{{'viajar' in sentence}}",
        "output": "Spanish",
        "output_name": "language_2",
        "output_type": str,
    },
]

router = ConditionalRouter(routes)

pipeline = Pipeline()
pipeline.add_component("router", router)
pipeline.add_component("pb", PromptBuilder(template="Ok, I know, that's {{language}}"))
pipeline.connect("router.language_2", "pb.language")

# Fires both routes although only the first should fire; the second one also runs and the output is `Ok, I know, that's `
print(pipeline.run(data={"router": {"sentence": "Wir müssen reisen"}}))

# Correctly fires only the second route; the output is `Ok, I know, that's Spanish`
print(pipeline.run(data={"router": {"sentence": "Yo tengo que viajar"}}))
```
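The truncated output `Ok, I know, that's ` in the first run is consistent with how Jinja-style templates behave by default: a variable that was never supplied renders as an empty string instead of raising. The sketch below imitates that with a tiny standard-library renderer. `render` and its `required` argument are hypothetical helpers written for illustration, not PromptBuilder's API; they only show the difference between an optional variable and one declared mandatory.

```python
import re

# Matches simple placeholders such as {{language}} or {{ language }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, required=(), **variables) -> str:
    """Toy renderer (hypothetical): missing optional variables become ''."""
    missing = [name for name in required if name not in variables]
    if missing:
        # A mandatory variable was not provided: refuse to render.
        raise ValueError(f"Missing required variables: {missing}")
    return PLACEHOLDER.sub(
        lambda match: str(variables.get(match.group(1), "")), template
    )


template = "Ok, I know, that's {{language}}"

# Variable supplied: normal output.
print(render(template, language="Spanish"))  # Ok, I know, that's Spanish

# Variable missing and optional: renders silently with an empty slot.
print(repr(render(template)))  # "Ok, I know, that's "

# Variable missing and mandatory: rendering is rejected.
try:
    render(template, required=["language"])
except ValueError as error:
    print(error)  # Missing required variables: ['language']
```

Haystack's real workaround acts one level up: as far as I understand it, a PromptBuilder whose variables are declared mandatory is simply not run by the pipeline when the router does not feed it, so treat the exception here as a stand-in for "this component should not run".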
Solved by #7799.