
Standardize streaming tool call & tool result format across LLM providers

Open vblagoje opened this issue 7 months ago • 4 comments

Is your feature request related to a problem? Please describe.
When using streamed responses across various LLM providers, there is no standardized output format for tool calls and tool call results. This inconsistency makes it difficult to format and display results uniformly in user interfaces. Yet many user-facing UI tools (e.g. Claude Desktop, Cursor) nicely format these streamed tool calls and responses. Cursor is perhaps the best example because it uses the same UI widgets for tool calls and results regardless of the LLM provider.

A clear, structured format for tools is essential for a good user experience. OpenWebUI and Cursor may currently be using their own adaptation layers to normalize LLM outputs so that their UIs can render tool calls and results consistently regardless of the LLM provider. We should do something similar.

Describe the solution you'd like
I would like to see a standardized output format for tool calls and tool call results across all supported LLMs in Haystack, first and foremost for streamed responses. This format should be well-documented and easy to adapt in various UIs, ensuring that tool interactions are consistently represented regardless of the underlying LLM provider.

We should handle multiple tool calls/results per response as well. This is becoming more common nowadays.
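To make the request concrete, here is a minimal sketch of what such an adaptation layer could look like. Everything in it is an assumption for illustration: `ToolCallEvent`, `from_openai_style`, `from_anthropic_style` and `merge` are hypothetical names, not Haystack APIs, and the input dicts are simplified shapes loosely modeled on OpenAI-style and Anthropic-style streaming events rather than complete provider payloads. It normalizes both shapes into one event type and merges argument fragments per tool call index, so several tool calls in one response stay separate.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ToolCallEvent:
    """Hypothetical provider-neutral streaming event (not a Haystack class)."""
    index: int                   # which tool call within the response this belongs to
    name: Optional[str] = None   # tool name, usually only present on the first event
    arguments_delta: str = ""    # fragment of the JSON-encoded arguments


def from_openai_style(delta: dict) -> List[ToolCallEvent]:
    """Normalize a simplified OpenAI-style delta dict (assumed shape)."""
    events = []
    for tc in delta.get("tool_calls") or []:
        fn = tc.get("function", {})
        events.append(
            ToolCallEvent(
                index=tc["index"],
                name=fn.get("name"),
                arguments_delta=fn.get("arguments") or "",
            )
        )
    return events


def from_anthropic_style(event: dict) -> List[ToolCallEvent]:
    """Normalize a simplified Anthropic-style stream event dict (assumed shape)."""
    if event["type"] == "content_block_start":
        block = event["content_block"]
        if block["type"] == "tool_use":
            return [ToolCallEvent(index=event["index"], name=block["name"])]
    if event["type"] == "content_block_delta":
        delta = event["delta"]
        if delta["type"] == "input_json_delta":
            return [ToolCallEvent(index=event["index"], arguments_delta=delta["partial_json"])]
    return []  # text deltas and other event types are ignored in this sketch


def merge(events: Iterable[ToolCallEvent]) -> Dict[int, dict]:
    """Accumulate fragments into one entry per tool call, keyed by index."""
    calls: Dict[int, dict] = {}
    for ev in events:
        call = calls.setdefault(ev.index, {"name": None, "arguments": ""})
        if ev.name:
            call["name"] = ev.name
        call["arguments"] += ev.arguments_delta
    return calls
```

With this in place, a UI would only need to understand the merged structure, whichever provider produced the stream.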

Describe alternatives you've considered

  • None

Additional context

  • Screenshots from Cursor and Claude Desktop are attached to illustrate the desired output format and user experience.
  • Before implementing this feature, we should research how different UI applications handle tool call formatting and results, and gather best practices to inform our design.

vblagoje avatar May 23 '25 08:05 vblagoje

@vblagoje I don't think this fully addresses this issue, but I'm actively working on standardizing our StreamingChunk dataclass to include fields like tool_call, tool_call_result, index and start in this PR https://github.com/deepset-ai/haystack/pull/9424. The PR also:

  • Updates all ChatGenerators in Haystack main to use the new format
  • Updates print_streaming_chunk to use this new format (example shown in PR description)

Once this is merged, I'll open issues to update each of our ChatGenerators to use these new fields.

This expanded version of StreamingChunk should contain enough information in a standardized format to prettify it as we like.
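As a rough illustration of the kind of prettifying this would enable, the sketch below renders a stream of chunks into labeled lines. The `Chunk` and `ToolCallPart` dataclasses are stand-ins that only borrow the field names mentioned above (`tool_call`, `tool_call_result`, `index`, `start`); they are not the actual StreamingChunk definition, whose exact types may differ, and `render` is a hypothetical helper. It also assumes that chunks belonging to one index arrive contiguously.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToolCallPart:
    """Hypothetical fragment of a streamed tool call."""
    name: Optional[str] = None
    arguments: str = ""


@dataclass
class Chunk:
    """Stand-in for a streaming chunk; NOT the Haystack StreamingChunk class."""
    content: str = ""
    tool_call: Optional[ToolCallPart] = None
    tool_call_result: Optional[str] = None
    index: int = 0       # position of the content block this chunk belongs to
    start: bool = False  # True on the first chunk of a new content block


def render(chunks: List[Chunk]) -> str:
    """Turn a chunk stream into one labeled line per content block."""
    out: List[str] = []
    for chunk in chunks:
        if chunk.tool_call is not None:
            if chunk.start:
                # A new tool call begins: emit a header carrying the tool name.
                out.append(f"\n[tool call #{chunk.index}] {chunk.tool_call.name}: ")
            out.append(chunk.tool_call.arguments)
        elif chunk.tool_call_result is not None:
            out.append(f"\n[tool result #{chunk.index}] {chunk.tool_call_result}")
        else:
            if chunk.start:
                out.append("\n[assistant] ")
            out.append(chunk.content)
    return "".join(out).lstrip("\n")
```

A real UI would replace the bracketed labels with widgets, but the branching on `tool_call`, `tool_call_result` and `start` would likely look similar.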

sjrl avatar May 28 '25 11:05 sjrl

Amazing! Looking forward to this @sjrl

vblagoje avatar May 28 '25 14:05 vblagoje

@sjrl and @julian-risch, any chance we can give this one P2? Imagine having a standardized streaming tool format here so that tool invocations could be nicely rendered in OpenWebUI and other user-facing UIs; it would significantly improve UX. Its dependency, https://github.com/deepset-ai/haystack/pull/9424, should be integrated very soon, enabling this one to be completed as well.

vblagoje avatar May 30 '25 14:05 vblagoje

As discussed offline, we'll focus on one or two LLM providers (ChatGenerators) before we roll out any standardization across more LLM providers. StreamingChunk dataclass improvements have been released as part of Haystack version 2.15.0.

julian-risch avatar Jul 01 '25 08:07 julian-risch