Support ChatCompletion streaming over Completions
The OpenAI/AzureOpenAI TextGeneration classes use the legacy Completions API (deprecated by OpenAI; see the deprecation announcement and the Azure model compatibility notes). This is a particular problem with the Azure OpenAI Service, where the Completions API requires deploying a gpt-35-turbo model of version 0301, which is itself deprecated. Due to quota limits on standard accounts, you are unlikely to be able to deploy that model alongside the embeddings model and a 0613 version of gpt-35-turbo for the application to use.
Moving Semantic Memory off the Completions API to ChatCompletions would unblock usage in Azure OpenAI applications and ensure that applications aren't caught by the upcoming deprecations.
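To illustrate the shape of the change: the ChatCompletions API can serve the same streamed-text role as the legacy Completions API by wrapping the plain prompt in a user message and yielding the `delta` content of each streamed chunk. The sketch below is illustrative only, not kernel-memory code; `stream_text`, `ChatChunk`, and `fake_chat_stream` are hypothetical names, and a real implementation would call the ChatCompletions endpoint with `stream=True` instead of the fake stream used here.

```python
# Sketch: adapting ChatCompletions streaming to a text-generation interface.
# All names here are illustrative, not from kernel-memory.
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class ChatChunk:
    """Minimal stand-in for one streamed ChatCompletion chunk's delta content."""
    delta: Optional[str]


def stream_text(prompt: str,
                chat_stream: Callable[[str], Iterable[ChatChunk]]) -> Iterator[str]:
    """Wrap a plain prompt as a chat message and yield text deltas,
    matching the token-stream shape the legacy Completions API produced.

    A real implementation would send [{"role": "user", "content": prompt}]
    to the ChatCompletions endpoint with stream=True.
    """
    for chunk in chat_stream(prompt):
        if chunk.delta:  # the final chunk typically carries no content
            yield chunk.delta


def fake_chat_stream(prompt: str) -> Iterable[ChatChunk]:
    """Fake stream standing in for the API call, for demonstration only."""
    for piece in ["Hello", ", ", "world", None]:
        yield ChatChunk(delta=piece)


print("".join(stream_text("say hi", fake_chat_stream)))  # prints "Hello, world"
```

Because callers only see a stream of text fragments, swapping the underlying endpoint from Completions to ChatCompletions this way would not change the TextGeneration interface they depend on.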