semantic-kernel
Python: #6761 Onnx Connector
Motivation and Context
- Why is this change required? To enable ONNX models with Semantic Kernel; issue #6761 in the backlog requested an ONNX connector.
- What problem does it solve? It solves the problem that Semantic Kernel is not yet integrated with the ONNX GenAI runtime.
- What scenario does it contribute to? Using a connector other than Hugging Face, OpenAI, or Azure OpenAI. Users who want to use ONNX can now integrate it easily.
- If it fixes an open issue, please link to the issue here. #6761
Description
The changes are my own design, based on the existing connectors; I tried to stay as close as possible to their structure. For the integration, I installed the onnxruntime-genai Python package in the repository.
I added the following classes:
- OnnxCompletionBase --> responsible for controlling inference
- OnnxTextCompletion --> inherits from OnnxCompletionBase
  - Supports text completion with and without images
  - Ready for multimodal inference
  - Ready for text-only inference
  - Supports all models on onnxruntime-genai
- OnnxChatCompletion --> inherits from OnnxCompletionBase
  - Supports chat completion with and without images
  - The user needs to provide the corresponding chat template to use this class
  - Ready for multimodal inference
  - Ready for text-only inference
  - Supports all models on onnxruntime-genai
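To make the hierarchy above concrete, here is a minimal, self-contained sketch. The class names mirror the PR; the constructor parameters, method names, and template format are illustrative assumptions, not the connector's actual API, and the generation step is stubbed out rather than calling onnxruntime-genai.

```python
# Illustrative sketch only: the real classes live in the Semantic Kernel Onnx
# connector and wrap onnxruntime-genai. Method bodies here are stand-ins that
# show the inheritance layout described in the PR.

class OnnxCompletionBase:
    """Controls inference against an onnxruntime-genai model."""

    def __init__(self, model_path: str) -> None:
        self.model_path = model_path

    def _generate(self, prompt: str) -> str:
        # The real implementation drives onnxruntime-genai token generation.
        return f"<generated for: {prompt}>"


class OnnxTextCompletion(OnnxCompletionBase):
    """Text (and multimodal) completion for any onnxruntime-genai model."""

    def get_text_content(self, prompt: str) -> str:
        return self._generate(prompt)


class OnnxChatCompletion(OnnxCompletionBase):
    """Chat completion; the caller supplies the model's chat template."""

    def __init__(self, model_path: str, template: str) -> None:
        super().__init__(model_path)
        self.template = template  # e.g. a Phi-3 style template (assumed shape)

    def get_chat_content(self, messages: list[dict[str, str]]) -> str:
        # Render every message through the user-provided chat template.
        prompt = "".join(
            self.template.format(role=m["role"], content=m["content"])
            for m in messages
        )
        return self._generate(prompt)
```

The key design point from the PR is visible here: both concrete classes delegate inference to the shared base, and only `OnnxChatCompletion` needs a template, because chat history must be flattened into a model-specific prompt string.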
What is integrated so far:
- [x] OnnxCompletionBase Class
- [x] OnnxChatCompletionBase Class with Dynamic Template Support
- [x] OnnxTextCompletionBase Class
- [x] Sample Multimodal Inference with Phi3-Vision
- [x] Sample of OnnxChatCompletions with Phi3
- [x] Sample of OnnxTextCompletions with Phi3
- [x] Integration Tests
- [x] Unit Tests
Some Notes
Contribution Checklist
- [x] The code builds clean without any errors or warnings
- [x] The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone :smile:
Regarding our offline conversation on the prompt template: is using a prompt template to parse the chat history into some format overkill? A prompt template can do much more than substituting arguments. Is it possible to override the _prepare_chat_history_for_request method to get what Onnx wants?
As discussed with @eavanvalkenburg, I introduced hardcoded templates in the onnx_utils.py file and integrated applying the template into _prepare_chat_history_for_request.
If people want to introduce custom templates, they can override _prepare_chat_history_for_request.
Also, the conversion of the output string into the SK objects can now be overridden, since it is separated into _create_chat_message_content and _create_streaming_chat_message_content.
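A hypothetical sketch of the template-application step described above, in plain Python. The template string and function name are illustrative assumptions, not the actual contents of onnx_utils.py; the point is that a hardcoded per-model template renders the chat history into one prompt string, and a subclass wanting a custom format only needs to override this one step.

```python
# Illustrative per-model template, in the spirit of the hardcoded templates
# in onnx_utils.py (the real template strings may differ).
PHI3_TEMPLATE = "<|{role}|>\n{content}<|end|>\n"

def apply_template(chat_history: list[dict[str, str]], template: str) -> str:
    """Render a chat history into a single prompt string for the model."""
    prompt = "".join(
        template.format(role=msg["role"], content=msg["content"])
        for msg in chat_history
    )
    # Append the assistant header so the model knows to respond next.
    return prompt + "<|assistant|>\n"

history = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello!"},
]
print(apply_template(history, PHI3_TEMPLATE))
```

This also shows why splitting out _create_chat_message_content is useful: prompt rendering (shown here) and output parsing are independent steps, so each can be overridden without touching the other.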
A couple of typos we'll need to fix:
- Warning: "interogate" should be "interrogate".
- Warning: "choosen" should be "chosen".
Also, this is failing on the macOS unit tests:

```
Using CPython 3.10.11 interpreter at: /Library/Frameworks/Python.framework/Versions/3.10/bin/python
Creating virtual environment at: .venv
Resolved 288 packages in 4.62s
error: distribution onnxruntime-genai==0.4.0 @ registry+https://pypi.org/simple can't be
installed because it doesn't have a source distribution or wheel for the current platform
```
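One common way to keep installs green when a package ships no wheel for a platform is a PEP 508 environment marker on the dependency, as in the requirements-style line below. This is only a sketch of that general technique, not necessarily the fix the PR adopted:

```
# Skip onnxruntime-genai on macOS, where 0.4.0 ships no wheel or sdist
onnxruntime-genai==0.4.0; platform_system != "Darwin"
```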
Thanks for the contribution! Will approve once the unit tests pass.
Python Test Coverage Report
Python Unit Test Overview
| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 2560 | 4 :zzz: | 0 :x: | 0 :fire: | 1m 5s :stopwatch: |