azure-sdk-for-net
Azure OpenAI: Add support for faster, streaming Completions responses for ChatGPT-style use
This change introduces support for streaming Completions to the Azure OpenAI C# client library. Streaming lets Completions responses deliver incremental result data, which enables latency-sensitive, ChatGPT-style use cases where waiting many seconds for a full response to finish is prohibitive.
OpenAI's public documentation for REST/etc. covers this as a boolean `stream` parameter on the Completions request payload: https://platform.openai.com/docs/api-reference/completions/create
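For orientation, here is a minimal sketch of a request body that sets that flag. The class and method names are hypothetical and exist only for this illustration; `prompt` and `max_tokens` are ordinary Completions request fields, and the client library builds its own payload internally.

```csharp
using System.Text.Json;

// Hypothetical helper, not part of the client library.
public static class CompletionsRequestSketch
{
    // Serializes a Completions request body that opts in to streaming.
    // "stream = true" asks the service to send server-sent events
    // instead of a single JSON response.
    public static string BuildStreamingRequestBody(string prompt, int maxTokens)
    {
        var payload = new { prompt, max_tokens = maxTokens, stream = true };
        return JsonSerializer.Serialize(payload);
    }
}
```

Calling `CompletionsRequestSketch.BuildStreamingRequestBody("Hello", 16)` yields a JSON object whose `stream` property is `true`.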
In the client library, this is projected via supplementary methods and objects:
- `OpenAIClient` gets a new pair of methods, `GetCompletionsStreaming` and `GetCompletionsStreamingAsync`, that parallel their non-streaming counterparts but provide a new `StreamingCompletions` type as the response payload.
- The `StreamingCompletions` response value will be created and usable as soon as the response stream is available and provides a `GetChoicesStreaming()` method to begin asynchronous enumeration of `StreamingChoice` objects; these are populated as new server-sent event messages arrive with new choice data (as delineated by the 'index' field).
- `StreamingChoice`, in turn, has a `GetTextStreaming()` method that will asynchronously enumerate the streamed text components (typically each token in Completions.Choice) as they become available.
- Common concepts between streaming and non-streaming Completions/Choices (ID, Created, Logprobs, etc.) are reimplemented/exposed from the base, deserialized SSE payloads, while inapplicable/unavailable concepts (like Usage) are omitted.
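The nested enumeration described in the list above can be sketched as follows. The `Sketch*` types are hypothetical stand-ins that mirror only the shape of `StreamingCompletions` and `StreamingChoice` so the example runs without a service connection; the real types' signatures may differ.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in for StreamingChoice; not the library's type.
public sealed class SketchStreamingChoice
{
    private readonly string[] _fragments;
    public int Index { get; }

    public SketchStreamingChoice(int index, params string[] fragments)
    {
        Index = index;
        _fragments = fragments;
    }

    // Yields text fragments one at a time, as a streamed choice would.
    public async IAsyncEnumerable<string> GetTextStreaming()
    {
        foreach (string fragment in _fragments)
        {
            await Task.Yield(); // stand-in for waiting on the next SSE message
            yield return fragment;
        }
    }
}

// Hypothetical stand-in for StreamingCompletions; not the library's type.
public sealed class SketchStreamingCompletions
{
    private readonly SketchStreamingChoice[] _choices;

    public SketchStreamingCompletions(params SketchStreamingChoice[] choices)
    {
        _choices = choices;
    }

    public async IAsyncEnumerable<SketchStreamingChoice> GetChoicesStreaming()
    {
        foreach (SketchStreamingChoice choice in _choices)
        {
            await Task.Yield(); // stand-in for a new 'index' appearing in the stream
            yield return choice;
        }
    }
}

public static class StreamingConsumerSketch
{
    // Outer loop enumerates choices; inner loop enumerates each choice's text.
    public static async Task<Dictionary<int, string>> CollectAsync(
        SketchStreamingCompletions completions)
    {
        var textByChoice = new Dictionary<int, string>();
        await foreach (SketchStreamingChoice choice in completions.GetChoicesStreaming())
        {
            var text = new StringBuilder();
            await foreach (string fragment in choice.GetTextStreaming())
            {
                text.Append(fragment); // a UI would render each fragment here
            }
            textByChoice[choice.Index] = text.ToString();
        }
        return textByChoice;
    }
}
```

In a latency-sensitive caller, the inner loop is where each fragment would be written to the screen as it arrives rather than accumulated.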
In implementation, this is accomplished by disabling HttpMessage buffering in the Completions request and then forking a fire-and-forget Task (owned by StreamingCompletions) to pump the response content stream and update collections/enumerators.
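A rough sketch of that pump pattern, not the code in this PR: a background task reads `data:` lines from the unbuffered response stream and publishes them through a `System.Threading.Channels` channel. The use of a channel, the `SsePumpSketch` name, and the simplified payload (only `index` and `text` per choice) are assumptions for illustration; the `data: [DONE]` terminator follows OpenAI's documented streaming format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical illustration of a fire-and-forget stream pump.
public static class SsePumpSketch
{
    // Starts a background task that reads SSE lines from the response stream
    // and publishes (choice index, text) pairs to a channel.
    public static ChannelReader<(int Index, string Text)> StartPump(Stream responseStream)
    {
        var channel = Channel.CreateUnbounded<(int Index, string Text)>();

        // Fire-and-forget: the task is never awaited. Consumers observe its
        // completion or failure through the channel.
        _ = Task.Run(async () =>
        {
            try
            {
                using var reader = new StreamReader(responseStream);
                while (true)
                {
                    var line = await reader.ReadLineAsync();
                    if (line == null) break; // stream closed
                    if (!line.StartsWith("data:", StringComparison.Ordinal)) continue;

                    string data = line.Substring("data:".Length).Trim();
                    if (data == "[DONE]") break; // end-of-stream sentinel

                    using JsonDocument doc = JsonDocument.Parse(data);
                    foreach (JsonElement choice in
                             doc.RootElement.GetProperty("choices").EnumerateArray())
                    {
                        int index = choice.GetProperty("index").GetInt32();
                        string text = choice.GetProperty("text").GetString() ?? string.Empty;
                        await channel.Writer.WriteAsync((index, text));
                    }
                }
                channel.Writer.Complete();
            }
            catch (Exception ex)
            {
                channel.Writer.Complete(ex); // surface pump failures to the consumer
            }
        });

        return channel.Reader;
    }

    // Consumes the channel and concatenates text per choice index.
    public static async Task<Dictionary<int, string>> DrainAsync(
        ChannelReader<(int Index, string Text)> reader)
    {
        var textByChoice = new Dictionary<int, string>();
        await foreach (var (index, text) in reader.ReadAllAsync())
        {
            textByChoice[index] =
                textByChoice.TryGetValue(index, out var soFar) ? soFar + text : text;
        }
        return textByChoice;
    }
}
```

Completing the channel with the caught exception is one way to keep a fire-and-forget task from failing silently: the error is rethrown to whoever is enumerating the reader.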
API change check
APIView has identified API level changes in this PR and created following API reviews.