language_models: Add tool use support for Mistral models
Closes https://github.com/zed-industries/zed/issues/29855
Implement tool use handling in Mistral provider, including mapping tool call events and updating request construction. Add support for tool_choice and parallel_tool_calls in Mistral API requests.
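As a rough illustration of the request-construction part, here is a minimal, dependency-free sketch of how the two optional fields might be decided. The names `ToolChoice` and `tool_fields` are hypothetical and not taken from this PR, and the real provider presumably serializes a request struct rather than building key/value pairs by hand. The assumption shown is that `tool_choice` and `parallel_tool_calls` are sent only when the request actually carries tools.

```rust
/// Values for `tool_choice`. Assumption: this is a subset of the string
/// values the Mistral chat API accepts; it is not copied from the PR.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToolChoice {
    Auto,
    Any,
    None,
}

impl ToolChoice {
    /// Wire representation of the variant.
    pub fn as_str(self) -> &'static str {
        match self {
            ToolChoice::Auto => "auto",
            ToolChoice::Any => "any",
            ToolChoice::None => "none",
        }
    }
}

/// Returns the tool-related request fields as (key, JSON value) pairs.
/// When the request has no tools, nothing is returned so that both
/// fields are omitted from the request body entirely.
pub fn tool_fields(
    has_tools: bool,
    choice: ToolChoice,
    parallel_tool_calls: bool,
) -> Vec<(&'static str, String)> {
    if !has_tools {
        return Vec::new();
    }
    vec![
        // JSON string value, so it is wrapped in quotes.
        ("tool_choice", format!("\"{}\"", choice.as_str())),
        // JSON boolean value, so it is emitted bare.
        ("parallel_tool_calls", parallel_tool_calls.to_string()),
    ]
}
```

Omitting the fields when there are no tools keeps plain chat requests identical to what the provider sent before tool support existed.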
This works fine with all the existing models. I didn't touch anything else, but in the future, fetching models through Mistral's models API and deducing tool call support, parallel tool call support, etc. should be driven by the model data in the API response.
Tasks:
- [x] Add tool call support
- [x] Auto-fetch models using the Mistral API.
- [x] Add tests for Mistral crates.
- [x] Fix Mistral configurations for LLM providers.
Release Notes:
- agent: Add tool call support for existing Mistral models
I was able to test this. It worked to some extent, but I was able to trigger the following error:
Failed to connect to Mistral API: 400 Bad Request {"object":"error","message":"Unexpected role 'user' after role 'tool'","type":"invalid_request_error","param":null,"code":null}
Hey @notpeter, thanks for testing. Can you tell me which model you tried this with? I am unable to reproduce this at the moment. I will see if I can reproduce it with that specific model.
It was with codestral-latest. I haven't been able to come up with a consistent repro. :(
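One way to narrow down a repro might be to check the outgoing message sequence before it is sent. The sketch below is only an assumption-based debugging aid: `Role` and `first_user_after_tool` are hypothetical names, and the only rule encoded is the one the error message itself states, namely that a `user` message directly after a `tool` message is rejected.

```rust
/// Message roles in a chat request (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
    Tool,
}

/// Returns the index of the first `User` message that directly follows a
/// `Tool` message, or `None` if the sequence contains no such pair.
pub fn first_user_after_tool(roles: &[Role]) -> Option<usize> {
    roles
        .windows(2)
        .position(|pair| pair[0] == Role::Tool && pair[1] == Role::User)
        // `position` gives the index of the `Tool` message; the offending
        // `User` message is the one right after it.
        .map(|i| i + 1)
}
```

Logging the result of such a check just before the request goes out would show whether the 400 correlates with a tool result that is not followed by an assistant message.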
@notpeter I think the latest mistral-medium should be a lot more reliable. I have had very good results with it in other agentic tools such as Cline or Aider.dev.
One thing that I do not love is that /models seems to return a very long list of available models (150 for me). Most people are probably not interested in using most of them, so I think it just adds a lot of unnecessary noise.
Is there any way that we can de-dup these? Or go back to the original approach and not fetch them at all?
I tried de-duping (filtering out duplicate ids and non-chat models, then de-duping by name) but still ended up with a lot of unnecessary models that I'm not interested in using:
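For reference, the filtering described above could look roughly like the following. This is a minimal sketch rather than the code that was actually tried: `ModelInfo`, its fields, and `dedup_models` are hypothetical names, and the real /models response carries more data than this.

```rust
use std::collections::HashSet;

/// Simplified stand-in for one entry of the /models response
/// (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct ModelInfo {
    pub id: String,
    pub name: String,
    pub supports_chat: bool,
}

/// Keeps only chat models, then drops entries whose id or name has
/// already been seen. The first occurrence wins and order is preserved.
pub fn dedup_models(models: Vec<ModelInfo>) -> Vec<ModelInfo> {
    let mut seen_ids = HashSet::new();
    let mut seen_names = HashSet::new();
    models
        .into_iter()
        .filter(|m| m.supports_chat)
        // `HashSet::insert` returns false when the value was already present.
        .filter(|m| seen_ids.insert(m.id.clone()))
        .filter(|m| seen_names.insert(m.name.clone()))
        .collect()
}
```

Even with all three filters applied, every distinct chat model still comes through, which matches the observation that the list stays long.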
https://github.com/user-attachments/assets/0e3b1b02-0275-4242-9ce7-d48def37a72b
Hey @bennetbo, got it. I guess in my case the list was shorter, so I didn't see this issue. In that case I can revert the model fetch and keep a hardcoded list of models in this PR. We can probably discuss a better approach for exposing all models as part of a separate PR, wdyt?
Sounds good 👍🏻
Maybe only -latest models?
Hey @bennetbo, I have removed the code changes for fetching models dynamically. I have also added the language_models.mistral.available_models[n].supports_tools setting so that users can define their own models and use them with tools. I have attached a demo of custom model usage. It's ready for review now; let me know if there are any other changes you want me to make.
https://github.com/user-attachments/assets/a7de2687-f9bc-4a1e-8284-f402944e9b9d
Tested it today myself, but it needs 'reminding' to use tools, be it mistral-small-latest or codestral-latest.
Outside of that - works quite well.
EDIT: After hitting the rate limit, it broke a little. After that, it needs a few seconds and then works fine.
The current best Mistral agentic model is actually mistral-medium-latest.