Http stack improvements
Motivation and Context
When working with HTTP REST API endpoints it’s very handy to keep all HTTP REST related/low-level code in so called “Service Client” classes that hide/wrap/encapsulate the codes that translates native language method calls to HTTP requests, serializes method parameters into the request payload and returns deserialized response. It’s also convenient to have Manager/Service classes that have business/higher-level logic and delegate communication with the HTTP REST API to the service client classes. Having the logic in Service Client and Service classes have a few benefits:
- Single responsibility – easy to read, maintain and test.
- Relatively easy to swap Service Client classes by native one if required.
Another good practice when working with HTTP REST API, when using IHttpClientFactory is not an option, is to have a few long-lived HttpClient instances, one per host, that are reused throughout the life of the application.
Lastly, it’s always good to maintain low coupling between classes where it’s possible by leveraging factory pattern for removing instantiation of actual implementation classes from code.
So, the changes are required to onboard points mentioned above to address the following problems:
- *TextCompletion, *TextEmbeddings backend classes have many responsibilities – caching, url building/validation, serialization/deserialization, business logic.
- Creation of big number of HttpClient instances instead of reusing of a few ones.
- Kernel and some other code are tightly coupled to *TextCompletion, *TextEmbeddings backend classes.
The problems can manifest themselves in the following scenarios:
- Feature development and bug fixing: read, maintain, reuse and test classes having many responsibilities is usually much harder than the ones having clear separation of concerns.
- Importing thousands of semantic functions: each semantic function requires an instance of a backend service class. Each backend service class creates an instance of HttpClient. So, potential big bang of HttpClient instances is possible in case of importing many semantic functions.
Description
This PR introduces the following changes:
-
The AzureOpenAIServiceClient and OpenAIServiceClient classes are added to encapsulate HTTP specific logic of accessing Azure OpenAI and OpenAI HTTP REST APIs.
-
The backend service classes *TextCompletion, *TextEmbeddings are modified to use the service client classes.
-
The BackendServiceFactory class is added to encapsulate instantiation of actual implementation of the backend service classes based on provided configuration.
-
The Kernel and MemoryConfiguration classes are modified to use the factory.
-
A new HttpClientFactory class is added to encapsulate HttpClient creation and disposal. Its instance lives inside the Kernal class and the Kernal is responsible for disposing all the HttpClient created through the factory. Having Kernal responsible for creation and disposal of the client instances simplifies the other code that don’t have to deal with their disposal since it did not create the clients.
-
The factory also caches one instance of the HttpClient class per HTTP REST API host. That allows to have only a few of them and reuse them for all requests for those hosts.
Contribution Checklist
- [x] The code builds clean without any errors or warnings
- [x] The PR follows SK Contribution Guidelines (https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
- [x] The code follows the .NET coding conventions (https://learn.microsoft.com/dotnet/csharp/fundamentals/coding-style/coding-conventions) verified with
dotnet format - [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone :smile:
This change appears to add approximately 1K LOC without introducing new functionality. Is this the right approach?
In addition, a couple of notes:
- The change adds custom pooling for HttpClients which contradicts guidelines and may introduce issues due to captured underlying handlers with stale DNS entries and "dead" connections: https://learn.microsoft.com/en-us/dotnet/fundamentals/networking/http/httpclient-guidelines#recommended-use
- Service factories can be replaced with standard
Microsoft.Extensions.DependencyInjectionpackage in order to not have to maintain extra code
This change appears to add approximately 1K LOC without introducing new functionality. Is this the right approach?
The larges part of the change is code refactoring - changing code internally without changing its external behavior. Refactoring usually is done in order to improve non-functional aspects like - structure, readability, maintainability, etc. So, if the goal is to make the mentioned improvements through the refactoring, then the answer to the question is - yes, it's right approach. In addition, a couple of notes:
The change adds custom pooling for HttpClients which contradicts guidelines and may introduce issues due to captured underlying handlers with stale DNS entries and "dead" connections: https://learn.microsoft.com/en-us/dotnet/fundamentals/networking/http/httpclient-guidelines#recommended-use
Yes, it's an issue we are aware of. This change doesn't introduce it, it exists in original code and will be addressed. The plan is to do it iteratively and not to increase the size this, already big, PR.
Service factories can be replaced with standard Microsoft.Extensions.DependencyInjection package in order to not have to maintain extra code
If I understood the idea right, it presumes using IOC as a service locator that will allow classes, the service locator injected to, to resolve required instance dynamically instead of depending on a factory? If that's the case, then it might be done later if team decides that an additional dependency on service locator is better than maintaining the extra code.