
feat(mistral): allow Mistral transforms for local/custom models

graelo opened this pull request 1 month ago • 6 comments

Thanks for this wonderful project!

This PR addresses an issue when running Mistral-family models (codestral, devstral, etc.) through custom providers like local vLLM/sglang/llama.cpp or ollama setups.

The problem I have

Mistral models have specific requirements:

  • Tool call IDs must be exactly 9 alphanumeric characters
  • A tool message cannot be directly followed by a user message (needs an assistant message in between)

The existing transform logic only kicks in when providerID === "mistral" or the model name contains mistral. That's a smart default, but it means custom providers serving codestral/devstral fail with cryptic API errors, typically (for me) right after tool calls:

Unexpected role 'user' after role 'tool'
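
To make the required fixes concrete, here is a minimal sketch of the two normalizations (simplified message type and hypothetical helper names, not the actual opencode code):

    // Minimal sketch of the two Mistral-specific fixes; the message shape is
    // simplified and these helpers are hypothetical, not opencode's implementation.
    type Msg = { role: "user" | "assistant" | "tool"; content: string; toolCallId?: string }

    // Mistral expects tool call IDs to be exactly 9 alphanumeric characters.
    function normalizeToolCallId(id: string): string {
      const alnum = id.replace(/[^a-zA-Z0-9]/g, "")
      return alnum.padEnd(9, "0").slice(0, 9)
    }

    // Mistral rejects a user message directly after a tool message, so insert
    // an empty assistant message in between and normalize any tool call IDs.
    function fixMessageSequence(msgs: Msg[]): Msg[] {
      const out: Msg[] = []
      for (const msg of msgs) {
        const prev = out[out.length - 1]
        if (prev?.role === "tool" && msg.role === "user") {
          out.push({ role: "assistant", content: "" })
        }
        out.push(msg.toolCallId ? { ...msg, toolCallId: normalizeToolCallId(msg.toolCallId) } : msg)
      }
      return out
    }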

The fix I suggest

Two changes:

  1. Added a transforms config option at the provider level, so you can explicitly opt-in:

    {
      "provider": {
        "my-local-llm": {
          "api": "http://localhost:8080/v1",
          "npm": "@ai-sdk/openai-compatible",
          "transforms": "mistral",
          "models": { ... }
        }
      }
    }
    
  2. Extended pattern matching to also catch codestral, devstral, ministral and pixtral model names automatically.

The detection now works as: explicit config > providerID match > model name pattern.

Also updated the DeepSeek detection to follow the same pattern for consistency (transforms: "deepseek" is now supported too).
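
As a sketch, the precedence could look like this (all names here are hypothetical, not opencode's actual internals):

    // Sketch of the detection precedence: explicit config > providerID match > model name pattern.
    // All names are hypothetical illustrations of the idea, not the real implementation.
    const MISTRAL_PATTERN = /mistral|codestral|devstral|ministral|pixtral/i

    function needsMistralTransforms(opts: {
      transforms?: string // explicit provider-level config, e.g. "mistral"
      providerID: string  // e.g. "mistral" or "my-local-llm"
      modelID: string     // e.g. "cpatonn/Devstral-Small-2507-AWQ-4bit"
    }): boolean {
      if (opts.transforms !== undefined) return opts.transforms === "mistral"
      if (opts.providerID === "mistral") return true
      return MISTRAL_PATTERN.test(opts.modelID)
    }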

Testing

Added tests covering the following (one case is sketched after this list):

  • Config-based transform activation
  • Pattern matching for each model variant
  • Tool call ID normalization
  • Message sequence fixing
  • Non-Mistral providers unaffected
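
The tool call ID case, for instance, looks roughly like this (bun:test, with the hypothetical helpers from the sketch above rather than the PR's actual test file):

    // Representative test sketch (bun:test); the import path and helpers are the
    // hypothetical ones sketched earlier, not the PR's actual test file.
    import { describe, expect, test } from "bun:test"
    import { fixMessageSequence, normalizeToolCallId } from "./mistral-transforms" // hypothetical module

    describe("mistral transforms", () => {
      test("tool call IDs are normalized to 9 alphanumeric characters", () => {
        const id = normalizeToolCallId("call_abc-123-def")
        expect(id).toMatch(/^[a-zA-Z0-9]{9}$/)
      })

      test("an assistant message is inserted between tool and user messages", () => {
        const fixed = fixMessageSequence([
          { role: "tool", content: "result", toolCallId: "call_abc-123" },
          { role: "user", content: "next question" },
        ])
        expect(fixed.map((m) => m.role)).toEqual(["tool", "assistant", "user"])
      })
    })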

Happy to adjust the approach if there's a better way to handle this!

graelo avatar Dec 04 '25 10:12 graelo

Can you share your opencode.json? So you are using a mistral model but mistral isn't in the model name at all?

I don't know, I think it may be cleaner to just rename the model in your config rather than introduce new config variations that complicate things further?

Maybe I'm missing something.

rekram1-node avatar Dec 04 '25 17:12 rekram1-node

Hi, thanks for your kind reply!

True, I actually don't always have "mistral" in the model name, and most of the time that's fixable on the server side. 100%.

However, it took me some time to realize the model name has to contain "mistral". So yes, that's easily fixable on the server side, once you know there's a mechanism and the conditions under which it applies ;).

What motivates my PR is to let the user control the application of that critical correction mechanism. It's currently applied without the user being involved at all.

This silent intervention is truly helpful for most users, no doubt, but if you happen to dig a little deeper, say fine-tuning or swapping models from different sources/quants, it can add to the trickiness of the situation. See my funny rant below.

  • Take mistralai/ministral-3-14b-instruct-2512 and the quantized unsloth/Ministral-3-14B-Instruct-2512-GGUF equivalent (no "mistral" in the name! same goes for devstral, etc.). Unsloth changes the tokenizer and the chat template, and they straighten the message order in their template (when they don't mess it up; nothing's perfect). So for their quant, opencode's mechanism is not applied, yet tool calling works!

  • You then pick a different quant that does not hack the chat template, and then it breaks, for instance cpatonn/Devstral-Small-2507-AWQ-4bit.

  • If you rename the original model after fine-tuning, it won't work again.

Debugging this kind of situation is tricky, at best. You end up attributing the issues to

  • the vLLM tool call parser (which, as it happens, has had actual Mistral tool calling issues for the past year),

  • or to a streaming issue,

  • or to the original model itself,

  • or to the quantization technique, or the quant level relative to the model size at that context window fill ratio,

  • or maybe you should just try sglang for this, or llama.cpp, etc. 😆

I'm sure you know very well what I'm talking about. There are so many ways these things can break. It's fascinating, but sometimes just a bit of control really helps to stabilize things and debug.

I'm not happy that my PR changes the config schema: it's very visible, whereas you had tried to make this elegant and silent. Do you think another config approach would be better, somewhere other than in the model provider?

At the end of the day, I'll understand if you feel this is too much added complexity for no real benefit: the user still must know about this parameter 🤷‍♂️

graelo avatar Dec 04 '25 21:12 graelo

For your use case it sounds like it makes more sense as a plugin... I wonder if we should add a hook here:

export function message(msgs: ModelMessage[], model: Provider.Model) {
  msgs = normalizeMessages(msgs, model)
  if (model.providerID === "anthropic" || model.api.id.includes("anthropic") || model.api.id.includes("claude")) {
    msgs = applyCaching(msgs, model.providerID)
  }

  return msgs
}
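
Something like this, maybe (purely a sketch; none of these plugin names exist yet):

    // Purely illustrative: a hypothetical hook that lets plugins rewrite messages
    // before they are sent to the provider. None of these names exist today.
    // Hypothetical registry populated by loaded plugins.
    declare const pluginMessageTransforms: Array<(msgs: ModelMessage[], model: Provider.Model) => ModelMessage[]>

    export function message(msgs: ModelMessage[], model: Provider.Model) {
      msgs = normalizeMessages(msgs, model)
      if (model.providerID === "anthropic" || model.api.id.includes("anthropic") || model.api.id.includes("claude")) {
        msgs = applyCaching(msgs, model.providerID)
      }
      // Each plugin-registered transform gets the full message list and the model.
      for (const transform of pluginMessageTransforms) {
        msgs = transform(msgs, model)
      }
      return msgs
    }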

rekram1-node avatar Dec 04 '25 22:12 rekram1-node

What do you think?

rekram1-node avatar Dec 04 '25 22:12 rekram1-node

Brilliant!

graelo avatar Dec 04 '25 22:12 graelo

Only thing is, normalizeMessages has them in the AI SDK format, and we'd need the plugin to expose our own format.

rekram1-node avatar Dec 04 '25 22:12 rekram1-node

Thanks, I'll try to wrap my head around plugins and suggest something. I think I'll close this PR for now. WDYT?

graelo avatar Dec 06 '25 22:12 graelo

someone started working on a plugin hook for this actually: https://github.com/sst/opencode/pull/4910

I've been discussing it with them on Discord.

rekram1-node avatar Dec 06 '25 22:12 rekram1-node

Thanks @rekram1-node! I'm happy to close this one. Cheers

graelo avatar Dec 06 '25 23:12 graelo