`Pipeline.draw` timeouts

Open anakin87 opened this issue 9 months ago • 2 comments

Currently, Pipeline.draw and Pipeline.show call the mermaid.ink server by default. (Users can also configure a custom Mermaid server using Docker.)

Recent problems

Pipeline.draw has been experiencing frequent timeouts. Over the past month, Mermaid servers have faced reliability issues, likely due to high traffic. See the following issues: https://github.com/jihchi/mermaid.ink/issues/491, https://github.com/jihchi/mermaid.ink/issues/498.

We recently introduced changes to pipeline drawing (#8767, #8799), but these do not appear to be the cause of the timeouts.

These failures impact users and our CI pipeline, causing integration tests to fail and slowing down development.

Affected tests

  • integration tests in haystack/test/core/pipeline/test_draw.py
  • nightly e2e tests (these have not been failing in the last few days)
  • tutorials tests

Action taken/in progress

  • Configurable timeout in Pipeline.draw #8967
  • Retry mechanism in Pipeline.draw #9045 (uncertain if this is effective for CI due to repeated calls in a short timeframe.)
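For illustration, the retry idea can be sketched in a few lines. This is not the code from #8967 or #9045: `call_with_retries`, `fetch`, `max_attempts`, `base_delay` and `sleep` are hypothetical names, and `fetch` stands in for whatever function performs the request to the Mermaid server. The exponential backoff is an assumption meant to address the concern about repeated calls in a short timeframe.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` until it succeeds or `max_attempts` is exhausted.

    Hypothetical helper: `fetch` is any zero-argument callable that raises
    TimeoutError or ConnectionError on failure. The wait doubles after each
    failed attempt (base_delay, 2 * base_delay, 4 * base_delay, ...).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts:
                # Out of attempts: let the caller see the last error.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. Even with backoff, many CI jobs retrying at the same time can still overload a struggling server, so this only reduces the problem.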

Possible next steps

  • ~Skip non-critical integration tests that frequently fail~ done in #9108
  • ~remove Pipeline.draw from e2e tests if they start to fail again~ done in #9121
  • reflect on long-term solutions (hosting our own Mermaid server, finding a Python visualization library, ...)

anakin87 avatar Mar 25 '25 09:03 anakin87

For the nightly runs of our tutorial notebooks, it could be worth updating our conversion script to skip lines containing pipeline.draw, since those runs often fail due to Mermaid timeout errors.
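A rough sketch of what such a filter could look like, assuming the conversion script produces plain Python source as a string. `strip_draw_calls` and `DRAW_CALL` are hypothetical names, not part of the existing script. Matching lines are replaced with `pass` rather than deleted, so a draw call that is the only statement in an indented block does not leave invalid code behind.

```python
import re

# Matches a `.draw(` call on any variable, e.g. `pipe.draw(...)`.
DRAW_CALL = re.compile(r"\.draw\s*\(")


def strip_draw_calls(source: str) -> str:
    """Replace every line containing a `.draw(` call with an indented `pass`.

    Limitation: a call whose arguments span several lines is not handled;
    only the line containing `.draw(` is replaced.
    """
    kept = []
    for line in source.splitlines():
        if DRAW_CALL.search(line):
            indent = line[: len(line) - len(line.lstrip())]
            kept.append(f"{indent}pass  # skipped draw call")
        else:
            kept.append(line)
    return "\n".join(kept)
```

Usage would be something like `clean = strip_draw_calls(converted_script)` before the nightly job executes the script.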

sjrl avatar Apr 04 '25 08:04 sjrl

I have also hit this issue consistently when trying to draw a pipeline via the remote server. IMO, the two major downsides of Mermaid are:

  • remote: the pipeline image is stored on the server and can be retrieved easily via its link
  • local: it requires Docker

As a user, I like the ability to switch between remote and local visualization mode with Mermaid, but I would like the option to choose whether to store the image on the remote server or not (sharing vs. data/information protection). Regarding local visualization, I think it would be really helpful to have a tool that allows visualizing pipelines directly, without requiring Docker, on any OS.

Many tools use Graphviz; would that be an option?
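To make the Graphviz idea a bit more concrete, here is a minimal sketch that turns a list of connections into DOT source using only the standard library. It does not touch Haystack internals: `edges_to_dot` is a hypothetical helper, and the input is assumed to be plain `(sender, receiver)` name pairs that someone would have to extract from the pipeline first.

```python
from typing import Iterable, Tuple


def edges_to_dot(edges: Iterable[Tuple[str, str]], name: str = "pipeline") -> str:
    """Build Graphviz DOT source for a directed graph of component names."""

    def quote(identifier: str) -> str:
        # DOT quoted IDs: escape backslashes and double quotes.
        escaped = identifier.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    lines = [f"digraph {quote(name)} {{", "  rankdir=LR;"]
    for sender, receiver in edges:
        lines.append(f"  {quote(sender)} -> {quote(receiver)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example component names are made up for illustration.
    print(edges_to_dot([("retriever", "prompt_builder"), ("prompt_builder", "llm")]))
```

The resulting text can be rendered locally with the Graphviz CLI, e.g. `dot -Tpng pipeline.dot -o pipeline.png`. That avoids Docker and the remote server, but it still requires the Graphviz binaries to be installed, so it is not entirely dependency-free.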

d-kleine avatar Apr 28 '25 14:04 d-kleine