.Net Agents: Should InvokeAsync for ChatCompletionAgent return an IAsyncEnumerable?

Open madsbolaris opened this issue 1 year ago • 5 comments

Once we have the streaming version of agents, shouldn't ChatCompletionAgent just return a list of chat messages?

madsbolaris avatar Jun 18 '24 14:06 madsbolaris

This decision point is not necessarily linked to streaming support.

crickman avatar Jun 18 '24 22:06 crickman

Curious to know whether the streaming version of agents will provide a new API call for streaming, or whether the whole agent will be streaming. This relates to how SK services work today, where the same service provides two APIs: one for streaming (IAsyncEnumerable) and one for non-streaming scenarios.

graph LR
Agent --> InvokeAsync
Agent --> InvokeStreamingAsync

OR

graph LR
Agent --> InvokeAsync_
StreamingAgent --> InvokeAsync 

rogerbarreto avatar Jun 19 '24 06:06 rogerbarreto

@RogerBarreto - Likely the former. Do you have thoughts / suggestions?

crickman avatar Jun 19 '24 16:06 crickman

@matthewbolanos - Signature modification also indicated for https://github.com/microsoft/semantic-kernel/issues/6813.

crickman avatar Jun 19 '24 16:06 crickman

@matthewbolanos - For the OpenAI Assistant case, a run can produce multiple messages, which are sometimes separated by tool calls that might have latency (e.g. code-interpreter). IAsyncEnumerable enables earlier messages to be accessed while the run proceeds. Returning the messages synchronously may increase the perceived latency.

This consideration is likely also applicable to chat-completion scenarios but would require reconsideration of IChatCompletionService.
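
A minimal sketch of the latency point, not the actual Semantic Kernel API: the local functions `InvokeAsync` and `InvokeAsListAsync` below are hypothetical stand-ins, messages are plain strings, and the tool-call latency is simulated with `Task.Delay`. With `IAsyncEnumerable`, the first message can be consumed before the slow step finishes; with a list, the caller sees nothing until the whole run is done.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical stand-in for an agent run: two messages separated by a slow tool call.
static async IAsyncEnumerable<string> InvokeAsync()
{
    yield return "message 1";
    await Task.Delay(500); // simulated tool latency (assumption, e.g. code-interpreter)
    yield return "message 2";
}

// Hypothetical list-returning alternative: buffers every message until the run ends.
static async Task<IReadOnlyList<string>> InvokeAsListAsync()
{
    var messages = new List<string>();
    await foreach (var m in InvokeAsync())
    {
        messages.Add(m);
    }
    return messages;
}

// Streaming consumption: each message is printed as soon as it is yielded.
var clock = Stopwatch.StartNew();
await foreach (var message in InvokeAsync())
{
    Console.WriteLine($"[stream {clock.ElapsedMilliseconds} ms] {message}");
}

// List consumption: both messages only become available after the delay.
clock.Restart();
foreach (var message in await InvokeAsListAsync())
{
    Console.WriteLine($"[list {clock.ElapsedMilliseconds} ms] {message}");
}
```

In the streaming loop the first line should print almost immediately and the second after the simulated delay, while in the list loop both lines print only after the delay has elapsed.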

crickman avatar Jun 19 '24 17:06 crickman

Already is!

crickman avatar Aug 05 '24 19:08 crickman