Support content array in completions
Pull Request Description
- The current version raises an unmarshal error for OpenAI requests that use the array-based content format below:
```json
"messages": [
  {"role": "user", "content": [{"type": "text", "text": "who are you?"}]}
]
```
This primarily surfaces when integrating with LiteLLM. Fixed the util to handle array-based content in messages.
- Fixed a streaming response handling issue: when handling SSE (Server-Sent Events) streaming responses, the code was not accumulating partial chunks before parsing.
Related Issues
Resolves: #[Insert issue number(s)]
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
- [Bug]: Corrections to existing functionality
- [CI]: Changes to build process or CI pipeline
- [Docs]: Updates or additions to documentation
- [API]: Modifications to aibrix's API or interface
- [CLI]: Changes or additions to the Command Line Interface
- [Misc]: For changes not covered above (use sparingly)
Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
- [ ] PR title includes appropriate prefix(es)
- [ ] Changes are clearly explained in the PR description
- [ ] New and existing tests pass successfully
- [ ] Code adheres to project style and best practices
- [ ] Documentation updated to reflect changes (if applicable)
- [ ] Thorough testing completed, no regressions introduced
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.
Thank you for the PR!
A few comments:
- In PR https://github.com/vllm-project/aibrix/pull/1145, the openai-golang version was bumped, and unmarshal support for request messages is now available (previously, unmarshaling the request body would yield a nil value for messages, so a jsonMap and a custom message struct were used).
Now that messages are accessible, can you please refactor getCompletionMessage to build content from them, something like this.
Another thing: the CompletionMessage used in this PR is part of the response. Please see the other screenshot for using request objects to build the request message.
- For streaming, accumulation of partial chunks will be done at the client. The gateway will stream chunks to the client as separate response objects (as streamed from the vLLM engine).
- The nil check on response headers is not needed, as it is handled internally by the CRD object.
Check this PR: https://github.com/vllm-project/aibrix/pull/1160
This PR is merged, which should address your requirement. Can you rebase on master for the other changes, or, better, close this PR and open a new one?
Thanks for the update; the other PR seems to have resolved the issue. Closing this one.