exo icon indicating copy to clipboard operation
exo copied to clipboard

Feat: Tool calling + thinking tokens

Open frenchtoasters opened this issue 1 month ago • 1 comments

Motivation

This change is to enable tool calling within chats and produce thinking tokens.

Changes

  • Added new API field tests for chat completion in the master component to verify thinking, tool_calls, and content handling in streamed completions.
  • Added multiple unit tests for parsing logic in the worker component under test_parsing, covering chunk parsing with tools, reasoning, and minimax prefix logic.
  • Added parser configuration tests for the runner component.
  • Included a previously missed test that validates edge cases in chat completion message construction.
  • Overall improvement of test coverage and validation for the tool calling and parsing features introduced earlier.

Why It Works

This adds the parser system which will enable it to actually parse the <tool>args</tool> tokens, or ones similar, that are produced when a user tries to invoke a tool. They are parsed using the new https://github.com/exo-explore/exo/compare/main...frenchtoasters:tool_calling?expand=1#diff-411f9e51dde01e45ea0b31b98247a38037bdd293eeb5a1b8dee35e266bfdbda7R212 function and returned as structured tool calls based on the model type. We are also now creating thinking tokens in the ChatCompletionMessage to be parsed as well.

Test Plan

Manual Testing

Hardware: 2 M3 Mac Studio nodes Using MCPHub and CodeCompanion I was able to invoke tools image further manual testing of a wider range of models and chats could be beneficial

Automated Testing

Tests were added in this commit 54afa9e3267ff953ddd93abebb4cf81cfcb3c618 which cover the changes that were added here.

frenchtoasters avatar Jan 18 '26 15:01 frenchtoasters

Thanks for this PR! However, I do have several concerns about landing something like this.

I'll take a closer look another time, but at first glance: Why this solution over the example in mlx lm? Do we want this code inside EXO - should it be implemented as a proxy?

Ah that is a very good point I was not aware there was parser logic in the mlx lm, I could revise things to use the upstream parser logic from there which is probably a better solution long term.

The main thing I would like to see is the feature enabled when exo is stood up, without anything other than exo needing to be managed on the host. The exact implementation details are not something that really matter to me.

frenchtoasters avatar Jan 18 '26 19:01 frenchtoasters

Heads up @rltakashige is also working on this so we may close one in favour of the other

Evanev7 avatar Jan 19 '26 09:01 Evanev7

Heads up @rltakashige is also working on this so we may close one in favour of the other

No issues there, I just pushed

Thanks for this PR! However, I do have several concerns about landing something like this.

I'll take a closer look another time, but at first glance: Why this solution over the example in mlx lm? Do we want this code inside EXO - should it be implemented as a proxy?

@rltakashige just pushed a change that moves away from standing up the parsers in exo in favor of upstream plus found the pre-commit checks and got those passing.

frenchtoasters avatar Jan 19 '26 20:01 frenchtoasters