haystack bug: Pipeline with conditional router hangs when branch starts with prompt builder

Describe the bug

I discovered this issue when building a pipeline with a conditional router:

one route goes to a branch starting with a prompt builder
another goes to a branch starting with a custom component

When the branch that starts with the prompt builder is selected, there are no issues. However, when the branch that starts with the custom component is selected, the pipeline hangs.

We investigated this with @anakin87 and we discovered the following:

The branch with the custom component actually does run and produce a result
however it seems that the pipeline is still hanging expecting inputs for something
we also tried to see if @silvanocerza's fix in #7531 fixes it, but it does not.

To Reproduce

Use this colab, create the custom component, and finally run the pipeline called "conditional_sql_pipeline". Notice that:

when you run with a query that 'cannot be answered', it's all good, as it goes to the branch with the promptbuilder
when you run with a query that 'can be answered', the pipeline hangs.

FAQ Check

[x] Have you had a look at our new FAQ page?

System:

OS:
GPU/CPU:
Haystack version (commit or version number):
DocumentStore:
Reader:
Retriever:

Apr 11 '24 13:04 TuanaCelik

Investigated a bit and indeed the referenced PR partly fixes this issue and the Pipeline doesn't hang.

Though there's still an issue with that Pipeline, as the PromptBuilder in the skipped branch runs even though it shouldn't. This happens because the second PromptBuilder is still in the internal list of components to run and it has only default inputs. This means that it will be run in any case before the Pipeline returns. This could be solved by making the optionality of the PromptBuilder inputs configurable; there's already been some discussion here.
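To illustrate the point about default inputs, here is a minimal sketch in plain Python. It is not Haystack's actual scheduling code: `ToyComponent` and `is_ready` are hypothetical names, and the only assumption is that a component counts as runnable once every input without a default has received a value.

```python
from dataclasses import dataclass, field


@dataclass
class ToyComponent:
    """Hypothetical stand-in for a pipeline component; not a Haystack class."""

    name: str
    # Maps each input name to True if it is mandatory, False if it has a default.
    inputs: dict[str, bool] = field(default_factory=dict)


def is_ready(component: ToyComponent, received: set[str]) -> bool:
    """Assumed rule: ready when every mandatory input has received a value."""
    return all(
        name in received
        for name, mandatory in component.inputs.items()
        if mandatory
    )


# Every input has a default, so the component counts as ready even though
# the skipped branch never sent it anything.
optional_builder = ToyComponent("pb", {"question": False})

# The same input marked as mandatory: not ready until a value arrives.
strict_builder = ToyComponent("pb", {"question": True})

print(is_ready(optional_builder, received=set()))        # True
print(is_ready(strict_builder, received=set()))          # False
print(is_ready(strict_builder, received={"question"}))   # True
```

Under this assumed rule, making the input mandatory is enough to keep the builder in the skipped branch from running, which is the idea behind making optionality configurable.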

Apr 11 '24 15:04 silvanocerza

@silvanocerza assigning you as there are more similar issues assigned to you already. Please close duplicates if any

Apr 22 '24 07:04 masci

I believe this issue was already resolved by PR #7553.

Apr 29 '24 20:04 CarlosFerLo

True that, now that you can set PromptBuilder inputs as mandatory or not you can work around this bug.

Though I want to see if a fix in the Pipeline.run() logic is feasible so I'll keep it open. This is quite a nasty behaviour and if any other Component in the future causes this it's not good.

Apr 30 '24 09:04 silvanocerza

I believe this is failing for DynamicChatPromptBuilder as well. I was trying to get DynamicChatPromptBuilder to work with looped validation, but the pipeline never runs it because it expects all 4 inputs to be present. See my test code here; from some initial debugging, it seems this line fails for the prompt_builder node.

May 07 '24 13:05 shivanker

@silvanocerza can we close?

May 21 '24 08:05 masci

The same problem with PromptBuilder seems to happen for DynamicChatPromptBuilder (optional inputs + loop). Here's another discussion related to this issue: https://github.com/deepset-ai/haystack/discussions/7719 Colab notebook to reproduce: https://colab.research.google.com/drive/1R28-_tOYKQnVVCNdJ2M9hLvGa--Dy3Kw?usp=sharing

May 21 '24 17:05 bilgeyucel

@masci This is not solved. The change to PromptBuilder is just a workaround, even though it's still a sensible change for the Component. Ideally we could fix this in Pipeline.run(), but I need some time to verify that everything works and nothing breaks when fixing this.

May 24 '24 08:05 silvanocerza

Here is the use case from our Discord, simplified:

from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.routers import ConditionalRouter

routes = [
    {
        "condition": "{{'reisen' in sentence}}",
        "output": "German",
        "output_name": "langauge_1",
        "output_type": str,
    },
    {
        "condition": "{{'viajar' in sentence}}",
        "output": "Spanish",
        "output_name": "langauge_2",
        "output_type": str,
    },
]
router = ConditionalRouter(routes)

pipeline = Pipeline()
pipeline.add_component("router", router)
pipeline.add_component("pb", PromptBuilder(template="Ok, I know, that's {{langauge}}"))
pipeline.connect("router.langauge_2", "pb.langauge")

# should fire only the first route, but both fire: the PromptBuilder on the second route also runs and the output is `Ok, I know, that's `
print(pipeline.run(data={"router": {"sentence": "Wir mussen reisen"}}))

# fires only the second route correctly, output is `Ok, I know, that's Spanish`
print(pipeline.run(data={"router": {"sentence": "Yo tengo que viajar"}}))
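For comparison, the expected behaviour can be sketched in plain Python without Haystack. This is a toy model, not the library's logic: `ROUTES`, `route` and `run_toy_pipeline` are hypothetical names, the conditions are plain Python instead of Jinja, and the sketch assumes that the router emits only the first matching route and that an unconnected router output is returned as-is.

```python
# Toy versions of the two routes above: (output_name, condition, output value).
ROUTES = [
    ("langauge_1", lambda s: "reisen" in s, "German"),
    ("langauge_2", lambda s: "viajar" in s, "Spanish"),
]


def route(sentence):
    """Return {output_name: value} for the first matching route (assumed semantics)."""
    for output_name, condition, value in ROUTES:
        if condition(sentence):
            return {output_name: value}
    return {}


def run_toy_pipeline(sentence):
    """Run 'pb' only when the output it is connected to (langauge_2) was produced."""
    outputs = route(sentence)
    if "langauge_2" not in outputs:
        # 'pb' is skipped; only the unconnected router output is returned.
        return {"router": outputs}
    return {"pb": {"prompt": f"Ok, I know, that's {outputs['langauge_2']}"}}


print(run_toy_pipeline("Wir mussen reisen"))
print(run_toy_pipeline("Yo tengo que viajar"))
```

In this model the German sentence never reaches `pb`, so no half-rendered prompt is produced, while the Spanish sentence renders the full prompt.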

May 30 '24 10:05 vblagoje

Solved by #7799.

Jun 06 '24 13:06 silvanocerza