Azure OpenAI binding: Enable support for multiple endpoints

Open stuartleeks opened this issue 1 year ago • 1 comments

Describe the feature

The current Azure OpenAI binding takes in configuration for connecting to a single Azure OpenAI endpoint.

In a number of scenarios it is useful to be able to work with several Azure OpenAI endpoints.

Scenario 1 - fail-over

For high-volume usage, customers may purchase Provisioned Throughput Units (PTUs). PTU capacity isn't always sufficient for peak load, so a customer might want to send each request to the PTU endpoint first and then re-send it to a Pay-As-You-Go (PAYG) endpoint if the PTU endpoint returns a 429 response.
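To make the fail-over behaviour concrete, here is a minimal Go sketch. It is not the binding's code: `sendFunc` and `sendWithFailover` are hypothetical names, and the actual Azure OpenAI call is replaced by a stub that only returns a status code. It also assumes that only a 429 triggers fail-over; any other response is handed back to the caller as-is.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// sendFunc is a hypothetical stand-in for a request to one Azure OpenAI
// endpoint. It returns the HTTP status code of the response.
type sendFunc func(endpoint string) (status int, err error)

// sendWithFailover tries the endpoints in the order given and moves on to the
// next one only when the current endpoint answers 429 Too Many Requests.
// It returns the endpoint whose response was accepted.
func sendWithFailover(endpoints []string, send sendFunc) (string, error) {
	if len(endpoints) == 0 {
		return "", errors.New("no endpoints configured")
	}
	for _, ep := range endpoints {
		status, err := send(ep)
		if err != nil {
			return "", fmt.Errorf("request to %s failed: %w", ep, err)
		}
		if status == http.StatusTooManyRequests {
			continue // throttled: try the next endpoint in the list
		}
		return ep, nil
	}
	return "", errors.New("all endpoints returned 429")
}

func main() {
	// Simulated backends: the PTU endpoint is throttled, the PAYG one is not.
	send := func(endpoint string) (int, error) {
		if endpoint == "ptu" {
			return http.StatusTooManyRequests, nil
		}
		return http.StatusOK, nil
	}
	served, err := sendWithFailover([]string{"ptu", "payg"}, send)
	fmt.Println(served, err)
}
```

Listing the PTU endpoint first and the PAYG endpoint second gives the "PTU first, PAYG on 429" ordering described above.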

Scenario 2 - round-robin

The limits for Azure OpenAI are per-region, so customers may set up multiple PAYG endpoints across regions and want to distribute requests between them.

Proposal

Sometimes customers with either of the above requirements set up a gateway in front of the Azure OpenAI endpoints and have it handle the load distribution, but in other cases they come back to the application code to add these capabilities as usage scales up.

The proposal is to update the Azure OpenAI binding to allow multiple endpoints to be configured along with a distribution mode (failover or round-robin).

Release Note

RELEASE NOTE: ADD Enable multiple endpoints to be configured in Azure OpenAI binding.

stuartleeks, Mar 18 '24 09:03

@ItalyPaleAle since our OpenAI binding component is not stable, I don't think this is a P1 for the project.

Instead, updating the component to the latest SDK and making the component stable should be P1s first. Then this item could follow.

Did you mean to add it to milestone v1.14 by the way?

berndverst, Mar 27 '24 00:03