refactor(plugins/compat-oai): use ChatCompletionAccumulator for streaming
- Simplified generateStream by using openai-go's ChatCompletionAccumulator instead of accumulating chunks by hand (see the sketches after this list)
- Removed the manual tool-call accumulation logic (currentToolCall, toolCallCollects)
- Added a convertChatCompletionToModelResponse helper so streaming and non-streaming responses share one conversion path
- Added support for detailed token usage fields:
  - ThoughtsTokens (reasoning tokens)
  - CachedContentTokens (cached prompt tokens)
  - audio and prediction token counts in the custom field
- Added support for refusal messages and system fingerprint metadata
- Refactored generateComplete to reuse convertChatCompletionToModelResponse
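A minimal sketch of the new streaming path, assuming the openai-go streaming API and a genkit-style chunk callback. convertChatCompletionToModelResponse is the helper added in this PR (sketched after this block); the package name, signature, and callback handling here are illustrative, not the exact plugin code:

```go
package compat_oai

import (
	"context"

	"github.com/firebase/genkit/go/ai"
	"github.com/openai/openai-go"
)

// Sketch only: generateStream lets ChatCompletionAccumulator collect
// text, tool calls, and refusals instead of tracking them by hand.
func generateStream(
	ctx context.Context,
	client *openai.Client,
	params openai.ChatCompletionNewParams,
	cb func(context.Context, *ai.ModelResponseChunk) error,
) (*ai.ModelResponse, error) {
	stream := client.Chat.Completions.NewStreaming(ctx, params)
	defer stream.Close()

	var acc openai.ChatCompletionAccumulator
	for stream.Next() {
		chunk := stream.Current()
		acc.AddChunk(chunk)

		// Forward incremental text to the caller as it arrives.
		if cb != nil && len(chunk.Choices) > 0 && chunk.Choices[0].Delta.Content != "" {
			err := cb(ctx, &ai.ModelResponseChunk{
				Content: []*ai.Part{ai.NewTextPart(chunk.Choices[0].Delta.Content)},
			})
			if err != nil {
				return nil, err
			}
		}
	}
	if err := stream.Err(); err != nil {
		return nil, err
	}

	// The accumulator now behaves like a complete ChatCompletion (message,
	// tool calls, refusal, usage), so the streaming and non-streaming
	// paths can share one conversion helper.
	return convertChatCompletionToModelResponse(&acc.ChatCompletion)
}
```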
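And a sketch of the usage mapping inside convertChatCompletionToModelResponse. The openai-go CompletionTokensDetails and PromptTokensDetails fields are the upstream names; the genkit-side field names (ThoughtsTokens, CachedContentTokens, the custom map) follow the bullets above and may differ slightly from the actual code, and message/tool-call conversion is elided:

```go
// Sketch of the shared conversion helper; only the usage and metadata
// mapping is shown. Genkit field names follow the PR description and
// are not guaranteed to match the merged code exactly.
func convertChatCompletionToModelResponse(c *openai.ChatCompletion) (*ai.ModelResponse, error) {
	usage := &ai.GenerationUsage{
		InputTokens:  int(c.Usage.PromptTokens),
		OutputTokens: int(c.Usage.CompletionTokens),
		TotalTokens:  int(c.Usage.TotalTokens),
		// Detailed token counts newly surfaced by this refactor.
		ThoughtsTokens:      int(c.Usage.CompletionTokensDetails.ReasoningTokens),
		CachedContentTokens: int(c.Usage.PromptTokensDetails.CachedTokens),
	}

	resp := &ai.ModelResponse{
		Usage: usage,
		Custom: map[string]any{
			// Audio and prediction token counts have no dedicated genkit
			// usage fields, so they ride along in the custom metadata,
			// together with the system fingerprint.
			"audioTokens":              c.Usage.CompletionTokensDetails.AudioTokens,
			"acceptedPredictionTokens": c.Usage.CompletionTokensDetails.AcceptedPredictionTokens,
			"rejectedPredictionTokens": c.Usage.CompletionTokensDetails.RejectedPredictionTokens,
			"systemFingerprint":        c.SystemFingerprint,
		},
	}
	// ... convert c.Choices[0].Message (content, tool calls, refusal)
	// into resp.Message here ...
	return resp, nil
}
```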
Checklist (if applicable):
- [ ] PR title is following https://www.conventionalcommits.org/en/v1.0.0/
- [ ] Tested (manually, unit tested, etc.)
- [ ] Docs updated (updated docs or a docs bug required)
@hugoaguirre @apascal07 Please review when you get a chance.
Hi Eric,
Thank you for this PR. I have given it a quick look and will do a more thorough review once I've investigated how, if at all, the changes to the tools API interact with our in-progress implementation of multi-part tool responses, which I know you are eager to see supported. We'd like to roll multiple API changes into one release. Thanks for your patience!
@apascal07 OK. Thank you very much for taking the time to handle this.
Some updates: your idea to support parallel and sequential tools is a good one, but we want to take it a step further and not assume that all parallel tools must run first (or vice versa). Instead, we'd run stages of all-parallel or all-sequential tool calls (parallel, sequential, parallel, etc.), based on which tool requests the model returns and whether each tool is marked as parallel. We're discussing the details of the design now and will implement it shortly after. Very roughly, and purely as an illustration (none of these names are final), the staging could look like the sketch below.
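```go
// Illustrative only: group the tool requests from one model turn into
// stages, preserving the order the model returned them. Consecutive
// parallel-capable tools run together in one stage; each non-parallel
// tool becomes its own stage. Names are placeholders, not a final API.
type toolRequest struct {
	Name     string
	Parallel bool // whether the tool is marked safe to run in parallel
}

func planStages(requests []toolRequest) [][]toolRequest {
	var stages [][]toolRequest
	for _, req := range requests {
		n := len(stages)
		// Extend the current stage only if both it and the new request
		// are parallel; otherwise start a new stage.
		if req.Parallel && n > 0 && stages[n-1][0].Parallel {
			stages[n-1] = append(stages[n-1], req)
			continue
		}
		stages = append(stages, []toolRequest{req})
	}
	return stages
}
```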
OK, I'll close this PR and resubmit it as a new one: refactor(plugins/compat-oai): use ChatCompletionAccumulator for streaming.