Python: Refactor completion integration tests
Motivation and Context
We currently have two integration test modules for testing our completion services: one for chat completion and one for text completion.
These modules cover many use cases, including image inputs, tool call outputs, and more, all in the same two files. This makes them hard to maintain, especially as we add more connectors.
We also only test the output contents of the services, not their types. Content-only assertions make the tests flaky because the models don't always return exactly what we expect. Asserting on the types of the contents makes the tests more robust.
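A minimal sketch of the idea (the import path and item layout are based on the SK Python content types and may vary between versions; this is not code from the PR):

```python
# Sketch: assert on content types and non-emptiness rather than exact wording.
from semantic_kernel.contents import ChatMessageContent, TextContent

def assert_text_response(response: ChatMessageContent) -> None:
    # Model wording varies run to run, so we only check that the response
    # is of the expected type and carries non-empty text.
    assert isinstance(response, ChatMessageContent)
    assert any(isinstance(item, TextContent) for item in response.items)
    assert str(response).strip(), "service returned empty content"
```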
Description
- Create a base class to enforce the structure of the completion test modules (see the sketch after this list).
- Create two new test modules for image content and function calling contents.
- Simplify the original text completion and chat completion tests to cover only the simplest scenarios.
- Stop checking whether the services return specific words; a test passes as long as the service returns something non-empty.
- Fix function-calling bugs in the Azure AI Inference and Google connectors.
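A minimal sketch of the base-class pattern described above (class and method names are hypothetical, not the ones used in the PR):

```python
# Hypothetical sketch of a base class that enforces a common structure
# across completion integration test modules; names are illustrative only.
from abc import ABC, abstractmethod
from typing import Any

class CompletionTestBase(ABC):
    """Every completion test module implements this interface."""

    @abstractmethod
    def services(self) -> dict[str, Any]:
        """Return the completion services under test, keyed by service id."""

    @abstractmethod
    async def get_response(self, service: Any, prompt: str) -> Any:
        """Invoke the service and return its response content."""

    def evaluate(self, response: Any, expected_type: type) -> None:
        # Shared assertion: check the content type and non-emptiness
        # instead of matching specific words.
        assert isinstance(response, expected_type)
        assert str(response).strip(), "service returned empty content"
```

Concrete modules (chat, text, image content, function calling) would then subclass this and supply only the service-specific pieces.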
Contribution Checklist
- [x] The code builds clean without any errors or warnings
- [x] The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone :smile:
Python Unit Test Overview
| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 2244 | 1 :zzz: | 0 :x: | 0 :fire: | 56.382s :stopwatch: |