Python: Refactor completion integration tests
Motivation and Context
We currently have two integration test modules for testing our completion services: one for chat completion and one for text completion.
These modules cover many use cases, including image inputs, tool call outputs, and more, all in the same two files. This makes them hard to maintain, especially as we add more connectors.
We also only test the output contents of the services, not their types. Content-only assertions make the tests flaky because the models don't always return exactly what we expect. Asserting on the types of the contents makes the tests more robust.
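A minimal sketch of the idea (the import path and item layout are based on the SK Python content types and may vary between versions; this is not code from the PR):

```python
# Sketch: assert on content types and non-emptiness rather than exact wording.
from semantic_kernel.contents import ChatMessageContent, TextContent

def assert_text_response(response: ChatMessageContent) -> None:
    # Model wording varies run to run, so we only check that the response
    # is of the expected type and carries non-empty text.
    assert isinstance(response, ChatMessageContent)
    assert any(isinstance(item, TextContent) for item in response.items)
    assert str(response).strip(), "service returned empty content"
```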
Description
- Create a base class to enforce the structure of the completion test modules (see the sketch after this list).
- Create two new test modules for image content and function calling contents.
- Simplify the original text completion and chat completion tests to cover only the simplest scenarios.
- Stop checking whether the services return specific words; a test passes as long as the service returns something non-empty.
- Fix function-calling bugs in the Azure AI Inference and Google connectors.
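A minimal sketch of the base-class pattern described above (class and method names are hypothetical, not the ones used in the PR):

```python
# Hypothetical sketch of a base class that enforces a common structure
# across completion integration test modules; names are illustrative only.
from abc import ABC, abstractmethod
from typing import Any

class CompletionTestBase(ABC):
    """Every completion test module implements this interface."""

    @abstractmethod
    def services(self) -> dict[str, Any]:
        """Return the completion services under test, keyed by service id."""

    @abstractmethod
    async def get_response(self, service: Any, prompt: str) -> Any:
        """Invoke the service and return its response content."""

    def evaluate(self, response: Any, expected_type: type) -> None:
        # Shared assertion: check the content type and non-emptiness
        # instead of matching specific words.
        assert isinstance(response, expected_type)
        assert str(response).strip(), "service returned empty content"
```

Concrete modules (chat, text, image content, function calling) would then subclass this and supply only the service-specific pieces.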
Contribution Checklist
- [x] The code builds clean without any errors or warnings
- [x] The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone :smile:
Python Unit Test Overview
| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 2244 | 1 :zzz: | 0 :x: | 0 :fire: | 56.382s :stopwatch: |