language_models: Add tool use support for Mistral models

Open imumesh18 opened this issue 6 months ago • 9 comments

Closes https://github.com/zed-industries/zed/issues/29855

Implement tool use handling in Mistral provider, including mapping tool call events and updating request construction. Add support for tool_choice and parallel_tool_calls in Mistral API requests.
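As a rough illustration of the request-construction change described above, here is a minimal sketch using only the standard library. The `ChatRequest`, `ToolChoice`, and `build_request` names are hypothetical and are not taken from the actual diff. The string values for `tool_choice` are an assumption about the Mistral API, and a real implementation would serialize the request to JSON.

```rust
/// Hypothetical mirror of the `tool_choice` request field.
/// The string values below are assumed, not taken from this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolChoice {
    Auto,
    Any,
    None,
}

impl ToolChoice {
    /// Wire representation of the variant (assumed values).
    pub fn as_str(self) -> &'static str {
        match self {
            ToolChoice::Auto => "auto",
            ToolChoice::Any => "any",
            ToolChoice::None => "none",
        }
    }
}

/// Hypothetical subset of a chat request; only the tool-related
/// fields mentioned in the description are modeled.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChatRequest {
    pub model: String,
    pub tool_names: Vec<String>,
    pub tool_choice: Option<ToolChoice>,
    pub parallel_tool_calls: Option<bool>,
}

/// Builds a request, setting the tool fields only when tools are present.
pub fn build_request(model: &str, tool_names: &[&str], supports_parallel: bool) -> ChatRequest {
    let has_tools = !tool_names.is_empty();
    ChatRequest {
        model: model.to_string(),
        tool_names: tool_names.iter().map(|name| name.to_string()).collect(),
        // Leave both fields unset when the request carries no tools.
        tool_choice: has_tools.then_some(ToolChoice::Auto),
        parallel_tool_calls: has_tools.then_some(supports_parallel),
    }
}
```

Omitting the fields for tool-free requests is a design choice of this sketch, not something the description specifies.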

This works fine with all the existing models. I didn't touch anything else, but in the future, models should be fetched from the Mistral models API, and tool call support, parallel tool call support, etc. should be deduced from the model data in the API response.

Screenshot 2025-05-06 at 4 52 37 PM

Tasks:

  • [x] Add tool call support
  • [x] Auto Fetch models using mistral api
  • [x] Add tests for mistral crates.
  • [x] Fix mistral configurations for llm providers.

Release Notes:

  • agent: Add tool call support for existing mistral models

imumesh18 avatar May 06 '25 11:05 imumesh18

I was able to test this. It partly worked, but I was able to trigger the following error:

Failed to connect to Mistral API: 400 Bad Request {"object":"error","message":"Unexpected role 'user' after role 'tool'","type":"invalid_request_error","param":null,"code":null}
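The error message suggests the API rejected a `user` message that came directly after a `tool` message. A minimal sketch of a check for that ordering, which could help when hunting for a repro, is below. The `Role` enum and `first_user_after_tool` function are hypothetical, and the ordering rule is inferred only from the error text above, not from the API documentation.

```rust
/// Hypothetical message roles, mirroring the role names in the error text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    System,
    User,
    Assistant,
    Tool,
}

/// Returns the index of the first `User` message that directly follows
/// a `Tool` message, or `None` if the sequence has no such pair.
pub fn first_user_after_tool(roles: &[Role]) -> Option<usize> {
    roles
        .windows(2)
        .position(|pair| pair[0] == Role::Tool && pair[1] == Role::User)
        // `position` yields the index of the `Tool` entry; report the `User` one.
        .map(|tool_index| tool_index + 1)
}
```

Running this over the role sequence of a request before sending it would show whether the conversation history contains the pattern the API complained about.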

Screenshot 2025-05-06 at 14 00 53

mistral-tool.md

notpeter avatar May 06 '25 18:05 notpeter

Hey @notpeter, thanks for testing. Can you tell me which model you tried this with? I am unable to reproduce this at the moment. I will see if I can reproduce it with that specific model.

imumesh18 avatar May 06 '25 19:05 imumesh18

It was with codestral-latest. I haven't been able to come up with a consistent repro. :(

notpeter avatar May 06 '25 19:05 notpeter

@notpeter I think the latest mistral-medium should be a lot more reliable. I have had very good results with other agentic tools such as Cline or Aider.dev.

vlebert avatar May 14 '25 11:05 vlebert

One thing that I do not love is that /models seems to return a very long list of available models (150 for me). Most people are probably not interested in using most of them, so I think it just adds a lot of unnecessary noise. Is there any way that we can de-dup these? Or go back to the original approach and not fetch them at all?

I tried de-duping (filtering out duplicate ids and non-chat models, then de-duping by name) but still end up with a lot of unnecessary models that I'm not interested in using:

https://github.com/user-attachments/assets/0e3b1b02-0275-4242-9ce7-d48def37a72b
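For reference, the de-duping described above (drop non-chat models, duplicate ids, then duplicate names) could look roughly like the sketch below. The `ModelInfo` struct and its field names are hypothetical stand-ins for whatever the /models response provides, and keeping the first entry seen for each name is an assumed policy.

```rust
use std::collections::HashSet;

/// Hypothetical subset of one entry from the model listing.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModelInfo {
    pub id: String,
    pub name: String,
    pub supports_chat: bool,
}

/// Keeps chat-capable models only, then drops entries whose id or name
/// has already been seen. The first occurrence wins; input order is kept.
pub fn dedup_models(models: Vec<ModelInfo>) -> Vec<ModelInfo> {
    let mut seen_ids = HashSet::new();
    let mut seen_names = HashSet::new();
    models
        .into_iter()
        .filter(|model| model.supports_chat)
        // `insert` returns false when the value was already present.
        .filter(|model| seen_ids.insert(model.id.clone()))
        .filter(|model| seen_names.insert(model.name.clone()))
        .collect()
}
```

Even with this filter, every distinct name survives, which matches the observation that the list stays long.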

bennetbo avatar May 15 '25 08:05 bennetbo

Hey @bennetbo, got it. I guess in my case the list was shorter, so I didn't see this issue. In that case I can revert the model fetch and use hardcoded models in this PR. We can probably discuss a better approach for exposing all models as part of a separate PR, wdyt?

imumesh18 avatar May 15 '25 09:05 imumesh18

Sounds good 👍🏻

bennetbo avatar May 15 '25 09:05 bennetbo

Maybe only the -latest models?
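A minimal sketch of that idea, assuming model ids are plain strings and that the suffix alone is a good enough signal. The `only_latest` helper is hypothetical.

```rust
/// Keeps only the ids that end in `-latest`, preserving input order.
pub fn only_latest<'a>(ids: &[&'a str]) -> Vec<&'a str> {
    ids.iter()
        .copied()
        .filter(|id| id.ends_with("-latest"))
        .collect()
}
```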

vlebert avatar May 15 '25 09:05 vlebert

Hey @bennetbo, I have removed the code changes for fetching models dynamically. I have also added language_models.mistral.available_models[n].supports_tools to the settings so that users can define their own models and use them with tools. I have attached a demo of custom model usage. It's ready for review now; let me know if there are any other changes you want me to make.

https://github.com/user-attachments/assets/a7de2687-f9bc-4a1e-8284-f402944e9b9d
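To illustrate how a per-model `supports_tools` setting might be consumed, here is a minimal sketch. Only the `supports_tools` key is named in this thread. The `AvailableModel` struct, its other fields, the `tool_capable` helper, and the default of `false` for a missing value are all assumptions made for illustration.

```rust
/// Hypothetical shape of one entry under
/// `language_models.mistral.available_models`.
/// Only `supports_tools` is named in this thread.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AvailableModel {
    pub name: String,
    pub max_tokens: usize,
    pub supports_tools: Option<bool>,
}

impl AvailableModel {
    /// Treats a missing setting as "no tool support" (an assumed default).
    pub fn supports_tools(&self) -> bool {
        self.supports_tools.unwrap_or(false)
    }
}

/// Returns the names of the user-defined models that may be offered tools.
pub fn tool_capable(models: &[AvailableModel]) -> Vec<&str> {
    models
        .iter()
        .filter(|model| model.supports_tools())
        .map(|model| model.name.as_str())
        .collect()
}
```

Defaulting to `false` means a custom model never receives tool definitions unless the user opts in explicitly.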

imumesh18 avatar May 15 '25 11:05 imumesh18

I tested it myself today. It needs 'reminding' to use tools, with both mistral-small-latest and codestral-latest. Other than that, it works quite well.

EDIT: After hitting the rate limit, it broke a little. After that, it needs a few seconds and then works fine. pr-tool-call-issues

KhazAkar avatar May 17 '25 15:05 KhazAkar

The current best Mistral agentic model is actually mistral medium latest.

vlebert avatar May 17 '25 16:05 vlebert