genkit refactor(plugins/compat-oai): use ChatCompletionAccumulator for strea…

- Simplified generateStream by using openai-go's ChatCompletionAccumulator
- Removed manual tool call accumulation logic (currentToolCall, toolCallCollects)
- Created convertChatCompletionToModelResponse helper for unified response conversion
- Added support for detailed token usage fields:
  - ThoughtsTokens (reasoning tokens)
  - CachedContentTokens (cached tokens)
  - Audio, prediction tokens in custom field
- Added support for refusal messages and system fingerprint metadata
- Refactored generateComplete to reuse convertChatCompletionToModelResponse
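
For reviewers unfamiliar with what the removed manual accumulation logic had to do, here is a minimal, self-contained sketch of merging streamed tool-call fragments by index. It is illustrative only: `toolCallDelta`, `toolCall`, and `accumulator` are hypothetical stand-ins, not the openai-go types and not the code in this PR. The assumption is that each streamed fragment carries an index plus optional ID, name, and a piece of the JSON arguments.

```go
package main

import "fmt"

// toolCallDelta is a hypothetical stand-in for one streamed tool-call fragment.
type toolCallDelta struct {
	Index     int    // position of the tool call this fragment belongs to
	ID        string // usually present only on the first fragment
	Name      string // usually present only on the first fragment
	Arguments string // a piece of the JSON arguments string
}

// toolCall is the fully assembled tool call.
type toolCall struct {
	ID        string
	Name      string
	Arguments string
}

// accumulator merges fragments into complete tool calls, keyed by index.
type accumulator struct {
	calls []toolCall
}

// add folds one fragment into the accumulated state.
func (a *accumulator) add(d toolCallDelta) {
	if d.Index < 0 {
		return // ignore malformed fragments
	}
	// Grow the slice until the fragment's index is addressable.
	for len(a.calls) <= d.Index {
		a.calls = append(a.calls, toolCall{})
	}
	c := &a.calls[d.Index]
	if d.ID != "" {
		c.ID = d.ID
	}
	if d.Name != "" {
		c.Name = d.Name
	}
	// Argument fragments arrive in order and are concatenated.
	c.Arguments += d.Arguments
}

func main() {
	var acc accumulator
	acc.add(toolCallDelta{Index: 0, ID: "call_1", Name: "getWeather", Arguments: `{"city":`})
	acc.add(toolCallDelta{Index: 0, Arguments: `"Paris"}`})
	for _, c := range acc.calls {
		fmt.Println(c.ID, c.Name, c.Arguments)
	}
}
```

Delegating this bookkeeping to the library's accumulator means the plugin no longer has to track partially built tool calls itself.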

Checklist (if applicable):

[ ] PR title is following https://www.conventionalcommits.org/en/v1.0.0/
[ ] Tested (manually, unit tested, etc.)
[ ] Docs updated (updated docs or a docs bug required)

Nov 19 '25 09:11 eric642

@hugoaguirre @apascal07 please review.

Nov 20 '25 03:11 eric642

Hi Eric,

Thank you for this PR. I have given it a quick look and will do a more thorough review once I have investigated how, if at all, the changes to the tools API will interact with our in-progress implementation of multi-part tool responses, which I know you are eager to see supported. We'd like to roll multiple changes to the API into one release. Thanks for your patience!

Nov 25 '25 18:11 apascal07

@apascal07 OK. Thank you very much for your patience in handling this.

Nov 26 '25 01:11 eric642

Some updates: your idea to support parallel and sequential tools is a good one, but we want to take it a step further. Rather than assuming that all parallel tools must go first (or vice versa), we would run alternating stages (all parallel, then all sequential, then all parallel, and so on) based on which tool requests the model returns and whether each tool is marked as parallel. We're discussing the details of the design now and will implement it shortly after.
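
To make the staging idea concrete, here is one possible reading of it as a rough sketch, not the final design. It assumes tool requests are processed in the order the model returned them and that consecutive requests with the same marking form one stage. `toolRequest`, `stage`, `planStages`, and `runStages` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// toolRequest is a hypothetical tool request with its tool's parallel marking.
type toolRequest struct {
	Name     string
	Parallel bool
}

// stage is a run of consecutive requests that share the same marking.
type stage struct {
	Parallel bool
	Requests []toolRequest
}

// planStages groups requests into alternating stages, preserving model order.
func planStages(reqs []toolRequest) []stage {
	var stages []stage
	for _, r := range reqs {
		n := len(stages)
		if n == 0 || stages[n-1].Parallel != r.Parallel {
			stages = append(stages, stage{Parallel: r.Parallel})
			n++
		}
		stages[n-1].Requests = append(stages[n-1].Requests, r)
	}
	return stages
}

// runStages executes stages one after another. Requests inside a parallel
// stage run concurrently; requests inside a sequential stage run in order.
// Results are returned in the original request order.
func runStages(stages []stage, run func(toolRequest) string) []string {
	var out []string
	for _, s := range stages {
		res := make([]string, len(s.Requests))
		if s.Parallel {
			var wg sync.WaitGroup
			for i, r := range s.Requests {
				wg.Add(1)
				go func(i int, r toolRequest) {
					defer wg.Done()
					res[i] = run(r) // each goroutine writes its own slot
				}(i, r)
			}
			wg.Wait()
		} else {
			for i, r := range s.Requests {
				res[i] = run(r)
			}
		}
		out = append(out, res...)
	}
	return out
}

func main() {
	reqs := []toolRequest{
		{"search", true}, {"fetch", true}, {"writeFile", false}, {"summarize", true},
	}
	stages := planStages(reqs)
	results := runStages(stages, func(r toolRequest) string { return "done:" + r.Name })
	fmt.Println(len(stages), results)
}
```

Whether stages should follow model order like this, or be regrouped some other way, is one of the details still under discussion.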

Dec 01 '25 16:12 apascal07

OK, I'll close this PR and open a new one: refactor(plugins/compat-oai): use ChatCompletionAccumulator for streaming

Dec 02 '25 01:12 eric642