.Net Agents: Should InvokeAsync for ChatCompletionAgent return an IAsyncEnumerable?
Once we have the streaming version of agents, shouldn't ChatCompletionAgent just return a list of chat messages?
This decision point is not necessarily linked to streaming support.
Curious to know whether the streaming version of agents will provide a new API call for streaming, or whether the whole agent will be streaming. This question relates to what we do in SK services, where we provide two APIs for the streaming (IAsyncEnumerable) and non-streaming scenarios in the same service.
graph LR
Agent --> InvokeAsync
Agent --> InvokeStreamingAsync
OR
graph LR
Agent --> InvokeAsync_
StreamingAgent --> InvokeAsync
@RogerBarreto - Likely the former. Do you have thoughts / suggestions?
@matthewbolanos - A signature modification is also indicated for https://github.com/microsoft/semantic-kernel/issues/6813.
@matthewbolanos - For the OpenAI Assistant case, a run can produce multiple messages, which are sometimes separated by tool calls that might have latency (e.g. code-interpreter). IAsyncEnumerable enables earlier messages to be accessed while the run proceeds. Returning the messages in a synchronous fashion may increase the perceived latency.
This consideration is likely also applicable to chat-completion scenarios but would require reconsideration of IChatCompletionService.
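To make the latency argument above concrete, here is a minimal sketch. It does not use the Semantic Kernel API: `InvokeAsync` and `InvokeBufferedAsync` are hypothetical stand-ins, messages are plain strings, and the 500 ms delay is an assumed placeholder for a slow tool call. It only illustrates that an `IAsyncEnumerable` lets the caller see the first message before the run completes, while a buffered list surfaces everything at the end.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Program
{
    // Hypothetical stand-in for an assistant run: two messages separated by a slow tool call.
    public static async IAsyncEnumerable<string> InvokeAsync()
    {
        yield return "Running the analysis now.";
        await Task.Delay(500); // assumed tool latency (e.g. code-interpreter)
        yield return "Analysis complete.";
    }

    // Buffered alternative: nothing is returned until the whole run has finished.
    public static async Task<IReadOnlyList<string>> InvokeBufferedAsync()
    {
        var messages = new List<string>();
        await foreach (string message in InvokeAsync())
        {
            messages.Add(message);
        }
        return messages;
    }

    public static async Task Main()
    {
        // Streaming consumption: each message is printed as soon as it is yielded.
        var clock = Stopwatch.StartNew();
        await foreach (string message in InvokeAsync())
        {
            Console.WriteLine($"[enumerable {clock.ElapsedMilliseconds,4} ms] {message}");
        }

        // Buffered consumption: both messages become available only after the delay.
        clock.Restart();
        foreach (string message in await InvokeBufferedAsync())
        {
            Console.WriteLine($"[buffered   {clock.ElapsedMilliseconds,4} ms] {message}");
        }
    }
}
```

In the enumerable case the first message is printed before the simulated tool call finishes; in the buffered case both messages appear only after it completes. Total run time is the same either way, so the difference is in perceived latency only.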
Already is!