[Feature] Allow configuring multiple LLMs
While I use qwen2.5:70b as my daily driver, I use other models for specific tasks (e.g., coding). Thus, it would be great to allow multiple LLMs to be configured. We will also have to add an option that lets the user change the model at runtime.
@arsaboo what backend do you use to run your inference? Ollama? Tabby? Ooba? Kobold? Or you don't run inference locally but use something like OpenRouter?
@momokrono I'm using Ollama
Please add multiple models support! Not a priority, but please implement it later :)
@arsaboo Hi, I recently implemented basic model selection; you can find it in the corresponding branch.
Note that it only works for the Ollama provider (it automatically loads all available models) and only if text is selected.
@theJayTea do you think we should enable it for the no-text-selected interaction too?
BTW, I'm thinking about how to also add the feature for the OpenAI provider, but I fear it will end up as yet another field inside the config.json file, since pretty much every inference server can be added there... For the Gemini one, though, we can just pre-fill a list of the Gemini models we support and let the user pick one from it.
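For reference, the "loads all available models" part for Ollama boils down to a single call to its /api/tags endpoint; a simplified sketch of the idea (not the exact code in the branch):

```python
import requests

def list_ollama_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of the models installed in the local Ollama instance."""
    # Ollama's GET /api/tags returns {"models": [{"name": ...}, ...]}.
    response = requests.get(f"{base_url}/api/tags", timeout=5)
    response.raise_for_status()
    return [model["name"] for model in response.json().get("models", [])]
```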
Hi @momokrono! I just checked it out, and it looks really cool—awesome work!
Here's my two cents on bolstering this:
Quick model selection is a great feature, but at the same time, it has a more niche appeal and could risk cluttering the minimalist pop-up UI.
For the Ollama provider: It's a lovely implementation you've done, and the feature really suits this provider, as it appeals to users who love customising things and running local models (and when running local LLMs, there's always a greater trade-off between model strengths, VRAM use, & latency, so a quick choice is nice). Within the Ollama provider settings, would you be able to add a checkbox for "Show quick model selection" to show or hide that quick selector? This way, people who don't want a more cluttered UI can keep it clean.
For the Gemini provider: IMO, I don't really think it's necessary to implement it here, as it's the most mainstream/simple choice, where the free models are already quick and leading in intelligence, so no one would benefit from constant switching. Adding a checkbox/option to this provider would also slightly clutter the onboarding experience. But let me know what you think!
For the OpenAI provider: As people use this for local models, it'd be nice to have the option here. The OpenAI API supports a "list models" call (a rough sketch of it follows below), but I'm not sure if most other/local inference servers support it. As for maintaining a list manually (if I got you right): since there are thousands of combinations of inference servers and their models, that would be too painstaking and inevitably incomplete. Welp!
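For reference, the call itself is just a GET on /v1/models; with the official client it's roughly this (base_url and api_key are placeholders, and whether a given local server actually implements the endpoint is the open question):

```python
from openai import OpenAI

def list_openai_models(base_url: str, api_key: str = "sk-placeholder") -> list[str]:
    """Ask an OpenAI-compatible server which model IDs it serves."""
    client = OpenAI(base_url=base_url, api_key=api_key)
    # Maps to GET /v1/models; many, but not all, local inference servers support it.
    return [model.id for model in client.models.list()]
```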
This is a really sweet feature that hundreds would find essential—thank you, @momokrono!
@theJayTea I agree with you on all points, and in fact I created a separate branch rather than adding the feature to dev, so I could experiment and lay down a proof of concept rather than a definitive implementation.
And yes, I think the checkbox would be great, but I'm not sure how to implement it: we could either place a checkbox inside the popup menu -- more like an icon for an expansion panel that shows the model list, really -- or a checkbox inside the settings to enable the selector. The latter would probably be less intrusive and save an extra click for those who want the option, but the first option would keep the UI clean 99% of the time while still letting you pick a custom model the one time you need it.
Another possibility is to load the model list for both the Ollama and OpenAI providers from within the settings window, recycling the component, and switch the model there when I need to; by default my implementation falls back to the one defined in the config file if I don't specify a new one (roughly the logic sketched below).
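In pseudo-code, the fallback is just "use the dropdown's choice if one was made, otherwise read the default from config.json"; the key name here is made up, not the real config layout:

```python
import json

def resolve_model(config_path: str, selected_model: str | None) -> str:
    """Prefer the model picked in the dropdown; otherwise fall back to config.json."""
    if selected_model:
        return selected_model
    with open(config_path, encoding="utf-8") as f:
        # "model" is a hypothetical key; the actual config layout may differ.
        return json.load(f)["model"]
```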
Or, since you are the UI/UX wizard, you can show me an even better way to do it, since I suck at it... ._.
@theJayTea I've added a checkbox in the settings window at the bottom of the provider's options to enable/disable the dropdown.
I've added support for the OpenAI-compatible "list models" call too.
The GUI, however, widens quite a lot if the model name is a long one: for example, if I select the model Qwen2.5-Coder-32B-Instruct-128K-GGUF:Q5_K_M, the pop-up window gets huge to fit the string in its entirety, so we need to address that.
Hi @momokrono! I'm sorry for the late reply; something came up and I couldn't check GitHub.
I trust your UX choices at the same level I do mine haha :)
And splendid! Could a simple way to fix that be a wrapper function/lambda that truncates the string if it's >x characters when we read it for that pop-up drop-down? I'm not sure if the Qt drop-down has a built-in length limit parameter.
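Something along these lines is what I had in mind, assuming we populate the dropdown ourselves (the 30-character cutoff is arbitrary, and keeping the full name as item data means the request still gets the exact model string):

```python
from PySide6.QtWidgets import QComboBox

def truncate(name: str, max_chars: int = 30) -> str:
    """Shorten overly long model names for display only."""
    return name if len(name) <= max_chars else name[: max_chars - 1] + "…"

def populate_model_combo(combo: QComboBox, models: list[str]) -> None:
    combo.clear()
    for name in models:
        # Show the truncated name, but keep the full one as item data.
        combo.addItem(truncate(name), name)

# When building the request, read the untruncated name back:
# selected_model = combo.currentData()
```

If I remember right, QComboBox also has setSizeAdjustPolicy() and setMinimumContentsLength(), which might cap the widget width without touching the strings, but I haven't tested that here.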
Really cool stuff! :D