chat-with-your-data-solution-accelerator

Investigate adding streaming to `/api/conversation/custom` endpoint

Open · cecheta opened this issue 1 year ago • 0 comments

Motivation

At present, there is an env var `AZURE_OPENAI_STREAM` that controls streaming when using the `byod` endpoint, but it is not used at all by the `custom` endpoint.

Investigate the feasibility of implementing streamed responses on the `custom` endpoint. This is probably non-trivial because:

  • If the initial LLM response is a function call, we would not want to stream it to the client; instead, we should stream the response produced after executing the function call (see the sketch after this list)
  • A number of post-response steps run after the response is generated, such as calling content safety and storing the result in the index
    • We probably should not stream the response to the client before content safety has been applied; explore the built-in Azure OpenAI and Content Safety integration: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cpython-new
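
A minimal sketch of the buffering logic described above, assuming the openai Python SDK (v1.x) against Azure OpenAI. `run_function`, the endpoint/key values, and the deployment name are placeholders for illustration, not the accelerator's actual code:

```python
from openai import AzureOpenAI

# Placeholder configuration; in the accelerator these would come from env vars
client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-02-01",
)


def stream_final_answer(messages, tools, deployment):
    """Yield text chunks, but buffer (never stream) an initial function call."""
    stream = client.chat.completions.create(
        model=deployment, messages=messages, tools=tools, stream=True
    )
    call_id, name, arguments = None, "", ""
    for chunk in stream:
        if not chunk.choices:
            continue  # e.g. annotation chunks with no choices
        delta = chunk.choices[0].delta
        if delta.tool_calls:
            # Function-call deltas arrive incrementally: accumulate them
            # instead of streaming them to the client
            call = delta.tool_calls[0]
            call_id = call_id or call.id
            name += call.function.name or ""
            arguments += call.function.arguments or ""
        elif delta.content:
            # Plain text answer: safe to stream straight through
            yield delta.content
    if call_id:
        # The first response was a function call, so execute it and stream
        # the follow-up completion instead (run_function is a placeholder)
        result = run_function(name, arguments)
        follow_up = client.chat.completions.create(
            model=deployment,
            stream=True,
            messages=messages
            + [
                {
                    "role": "assistant",
                    "tool_calls": [
                        {
                            "id": call_id,
                            "type": "function",
                            "function": {"name": name, "arguments": arguments},
                        }
                    ],
                },
                {"role": "tool", "tool_call_id": call_id, "content": result},
            ],
        )
        for chunk in follow_up:
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content
```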

How would you feel if this feature request was implemented?


Requirements

A list of requirements for this feature to be considered delivered

  • Stream the response back to the client from the `/custom` endpoint
  • Ensure content safety is applied to the streamed response (see the sketch after this list)
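
As a sketch of how the endpoint itself could stream while keeping content safety in the loop, assuming a Flask-style route; the NDJSON chunk shape, `check_content_safety`, and `store_in_index` are illustrative assumptions, not the accelerator's existing API (`stream_final_answer` is the generator sketched earlier):

```python
import json

from flask import Flask, Response, request

app = Flask(__name__)


@app.route("/api/conversation/custom", methods=["POST"])
def conversation_custom():
    messages = request.json["messages"]

    def generate():
        answer = ""
        for text in stream_final_answer(messages, tools=[], deployment="<deployment>"):
            # Hypothetical helper: check each chunk before it reaches the
            # client. Per-chunk checks lose cross-chunk context, which is
            # one reason the built-in Azure OpenAI content filtering linked
            # above may be a better fit.
            if not check_content_safety(text):
                yield json.dumps({"error": "Content filtered"}) + "\n"
                return
            answer += text
            yield json.dumps(
                {"choices": [{"messages": [{"role": "assistant", "content": text}]}]}
            ) + "\n"
        # Remaining post-response steps (e.g. storing the result in the
        # index) can run here, once the stream has completed
        store_in_index(messages, answer)  # placeholder

    return Response(generate(), mimetype="application/x-ndjson")
```

Whether to gate each chunk, buffer the whole answer until a final check passes, or delegate to the built-in integration is exactly the trade-off this issue calls out.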

Tasks

To be filled in by the engineer picking up the issue

cecheta · Apr 17 '24 12:04