Hosted multimodal models from Open Router currently don't work on Open Interpreter
Describe the bug
I am trying to use multimodal models from OpenRouter (Claude, LLaVA, ChatGPT, Gemini, ...), but it seems the current implementation does not support them, or at least I have not figured out how.
The settings documentation and the guide on using hosted models from OpenRouter (https://docs.openinterpreter.com/language-models/hosted-models/openrouter) say to specify the model like this in the profile YAML: `model: "openrouter/anthropic/claude-3-haiku"`
That works fine if I want to use a non-multimodal model.
If I want to use a multimodal model, I also need to specify the OpenRouter API base: `api_base: https://openrouter.ai/api/v1/chat/completions`
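For reference, a minimal sketch of the relevant part of my openrouter.yaml (everything else omitted):

```yaml
# openrouter.yaml — only the two fields discussed here
model: "openrouter/anthropic/claude-3-haiku"
api_base: https://openrouter.ai/api/v1/chat/completions
```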
If I do that, Open Interpreter automatically puts "openai/" in front of the model name.
When I start Open Interpreter with `interpreter --profile openrouter.yaml`, I get the message: `Model set to openai/openrouter/anthropic/claude-3-haiku`
I think the issue is in `start_terminal_interface`, lines 401-410: `interpreter.llm.model = "openai/" + interpreter.llm.model`
It can be fixed by adding the condition `and "openrouter" not in interpreter.llm.api_base.lower()` to the check that guards that assignment.
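A rough sketch of what I mean — I am guessing at the exact shape of the surrounding check, so the other conditions here are illustrative, not the actual code:

```python
# Sketch of the guard in start_terminal_interface (around lines 401-410).
# Only the last condition is the proposed fix; the others are guesses at
# what the real check looks like.
if (
    interpreter.llm.api_base  # a custom API base was configured
    and not interpreter.llm.model.startswith("openai/")  # illustrative guess
    and "openrouter" not in interpreter.llm.api_base.lower()  # proposed fix
):
    interpreter.llm.model = "openai/" + interpreter.llm.model
```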
I haven't done a lot of coding recently or contributed to open-source projects. I'm happy to open a pull request if that's not overkill for such a small thing.
If I do open a pull request, I could also try to update the documentation on using Open Interpreter with multimodal models from OpenRouter, since the current instructions don't work for them.
Thanks for your awesome project :)
Reproduce
The following settings in the profile .yaml:

```yaml
api_base: https://openrouter.ai/api/v1/chat/completions
model: "openrouter/anthropic/claude-3-haiku"
```

result in: `Model set to openai/openrouter/anthropic/claude-3-haiku`
Expected behavior
Should result in:
`Model set to openrouter/anthropic/claude-3-haiku`
Screenshots
No response
Open Interpreter version
0.2.5
Python version
3.11.9
Operating System name and version
Ubuntu 22
Additional context
No response
Here is the OpenRouter documentation on multimodal requests: https://openrouter.ai/docs#images-_-multimodal-requests
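For anyone else who lands here, this is roughly what a multimodal request to OpenRouter looks like per those docs (a sketch using the OpenAI-compatible chat/completions schema; the model name and image URL are placeholders, and this is not code from Open Interpreter itself):

```python
# Sketch of a direct multimodal request to OpenRouter.
# Assumes OPENROUTER_API_KEY is set in the environment.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        # No "openrouter/" prefix when calling the API directly
        "model": "anthropic/claude-3-haiku",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What is in this image?"},
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/photo.jpg"}},
                ],
            }
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])
```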
Sorry, my mistake ... LiteLLM already implements this. I should have just omitted the `api_base: https://openrouter.ai/api/v1/chat/completions` line.
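So, as far as I can tell, the working profile is just this (assuming LiteLLM resolves the OpenRouter endpoint from the "openrouter/" prefix):

```yaml
# openrouter.yaml — no api_base needed; LiteLLM routes via the prefix
model: "openrouter/anthropic/claude-3-haiku"
```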