Support multiple tool calls in `render` function
Firstly, I would like to apologize for proposing such a significant change without first discussing its API in an issue, and inquiring if contributions of this nature would be welcome. My initial intention was to make a minor adjustment to enable support for multiple tool calls. However, one thing led to another, and here we are.
The motivation behind this change was an issue in the current implementation: when the LLM decides to send more than one tool call, each subsequent call would overwrite the UI of the previous tool calls. Additionally, the text UI would be overwritten by the tool UI unless specifically handled by storing the last text content and inserting it into the tool UI.
With the proposed compose renderer (defaulting to a React Fragment), we aim to support composing all three types of LLM output:
- Text
- A function call
- Multiple tool calls
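To illustrate the overwrite problem and the Fragment-like default compose behavior in plain TypeScript (no React): the names `renderLastOnly`, `composeAll`, and the `RenderedPart` shape below are purely illustrative, not the actual PR API.

```typescript
// Each rendered part is modeled as a plain string for illustration.
type RenderedPart = { kind: 'text' | 'tool'; node: string };

// Without composition, each new part replaces the previous UI (the bug
// described above: later tool calls overwrite earlier ones and the text).
function renderLastOnly(parts: RenderedPart[]): string {
  return parts.length ? parts[parts.length - 1].node : '';
}

// With a compose step (defaulting to something Fragment-like), all parts
// are kept side by side instead of overwriting each other.
function composeAll(parts: RenderedPart[]): string {
  return parts.map((p) => p.node).join('');
}

const parts: RenderedPart[] = [
  { kind: 'text', node: '<p>Here is the weather:</p>' },
  { kind: 'tool', node: '<Weather city="Berlin" />' },
  { kind: 'tool', node: '<Weather city="Paris" />' },
];

console.log(renderLastOnly(parts)); // only the last tool card survives
console.log(composeAll(parts));     // text plus both tool cards
```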
I am fully open to completely revising the API based on your feedback, or even discarding this approach if you think it is unnecessary or if you have other plans for the SDK. After all, it should be considered that advanced use cases like this might be better off utilizing the low-level APIs, as demonstrated in examples/next-ai-rsc/app/action.tsx, rather than using render.
PS: The diff may appear larger than it actually is because I also added a basic example app that replicates the example from the Generative UI docs, which can also be used to test multiple tool calls with `render`.
Disclaimer: This text was edited by an LLM.
When the LLM decides to send more than one tool call, the latter would overwrite the UI of the previous tool calls. Additionally, the text UI would be overwritten by the tool UI unless specifically handled by storing the last text content and inserting it into the tool UI. [...] It should be considered that advanced use cases like this might be better off utilizing the low-level APIs, as demonstrated in
examples/next-ai-rsc/app/action.tsx, rather than using render.
I changed my mind on this being an "advanced use case", because any user that follows the render example from the Generative UI docs is likely to stumble over these two issues. So I would argue this should just work out of the box, which it would with the provided default compose renderer.
Love this, thank you! I think we want this PR to be in, but I'm currently a bit concerned about the current text() function being called multiple times, which causes a Suspense-related reconciliation issue (still looking into it). Hence I had #1195 (but I don't want to actually land it if we can get rid of that issue).
I'm currently a bit concerned about the current text() function being called multiple times, which causes a Suspense-related reconciliation issue (still looking into it). Hence I had #1195 (but I don't want to actually land it if we can get rid of that issue).
@shuding Just out of curiosity, how does this issue manifest itself?
I think now I understand what you're trying to solve with #1195. It's basically the same issue that's reported in #1257, where update(node) ➞ done() leads to the node being rendered in the suspense fallback and children. And with Flight currently not being able to deduplicate these elements (only detriplicate), this leads to a remount in the client. Hm, tricky...
On a related note, with https://github.com/facebook/react/pull/28847 having landed, and https://github.com/facebook/react/pull/28849 landing soon, I wonder how the two different async generator concepts (one representing a list, the other representing a value over time) won't confuse users of the AI SDK. 🤔
Yes @unstubbable that's totally what I've been thinking about (sorry I missed your previous message). React will reconcile the node even if the same node is passed to both the fallback and the children of Suspense (<Suspense fallback={node}>{node}</Suspense>). This makes render() unable to stream any client component without remounting it.
In terms of UX, it's fine for tool calls but for text() it becomes tricky. Imagine that if you do:
```tsx
render({
  // ...
  text({ content }) {
    return <Message>{content}</Message>
  }
})
```
Then there's no way to keep any internal state of <Message>. And with compose() that issue becomes unavoidable as well.
With low level APIs like createStreamableValue, the issue can be avoided:
```tsx
const streamableValue = createStreamableValue('')

render({
  // ...
  text({ content }) {
    streamableValue.update(content)
  }
})

return <Message content={streamableValue.value} />
```
Hence I still think that the API, or maybe the abstraction, of render() isn't super great.
The short-term move is to make these "render" functions less like React's "render", like renaming the render method of tools to execute or maybe generate.
the other representing a value over time
Could you point me to the place of this? I briefly checked the React PRs and they all seem to represent a list. Thanks!
@shuding Sorry, I wasn't clear about that. With "two different async generator concepts" I meant one being the React (Async) Generator Server Components, and the other being the tools render (async) generator function in ai/rsc.
In React, this would render a list of items:
```tsx
async function* ItemsServerComponent() {
  yield <p>A</p>;
  await Promise.resolve();
  yield <p>B</p>;
  await Promise.resolve();
  yield <p>C</p>;
}

function ListClientComponent({ children }) {
  // Unwrap async iterables here with React.use(), until natively supported in React.
}

<ListClientComponent>
  <ItemsServerComponent />
</ListClientComponent>
```
Result:
```html
<p>A</p>
<p>B</p>
<p>C</p>
```
Whereas in ai/rsc, this would render first a spinner, and afterwards the final component, and not both in a list:
```tsx
{
  render: async function* ({ city }) {
    yield <div>🌀</div>;
    const { temperature } = await getWeatherInfo(city);
    // update AI state ...
    return <div>{temperature}</div>;
  }
}
```
Intermediate result:
```html
<div>🌀</div>
```
Final result:
```html
<div>24°C</div>
```
So at first glance, the two look very similar, but they behave totally differently.
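The difference can be shown without React at all: both concepts are async generators, but one is consumed as a growing list while the other is consumed as a value that replaces its predecessor. A minimal plain-TypeScript sketch (the helper names are illustrative):

```typescript
// A simple async generator yielding three values, standing in for either
// a Server Component or a tool's render generator.
async function* steps() {
  yield 'A';
  yield 'B';
  yield 'C';
}

// "List" semantics (React's Async Generator Server Components):
// every yielded value is kept and rendered as a sibling.
async function collectAll(gen: AsyncGenerator<string>): Promise<string[]> {
  const items: string[] = [];
  for await (const item of gen) items.push(item);
  return items;
}

// "Value over time" semantics (the tools render generator in ai/rsc):
// each yielded value replaces the previous one; only the last survives.
async function latestOnly(gen: AsyncGenerator<string>): Promise<string | undefined> {
  let latest: string | undefined;
  for await (const item of gen) latest = item;
  return latest;
}

collectAll(steps()).then(console.log); // ['A', 'B', 'C']
latestOnly(steps()).then(console.log); // 'C'
```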
The short-term move is to make these "render" functions less like React's "render", like renaming the render method of tools to execute or maybe generate.
With this approach, though, you might be able to solve the aforementioned conflict of concepts.
@unstubbable Yes I think we're on the same page :)
We now recommend using `streamUI` instead of `render`. Would this apply to `streamUI` as well?
Yes, it would, but I was holding off on re-implementing it in streamUI because of the aforementioned reconciliation issue. I'm fine with closing this PR for now. We can create a new one if/when it makes sense.