Ability to use a different model for tab completion
Validations
- [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that requests the same enhancement
Problem
For question answering, RAG, editing, code generation, etc., I want to use the biggest, slowest model I can fit on my machine, as that will have the highest accuracy. When I'm asking questions or using /edit, I expect that to take some time. A good model for this might be DeepSeek 33B. This should be an instruction-tuned model.
However, for tab completion I want it to be fast, even at the cost of accuracy, like DeepSeek 1B. Furthermore, I don't want it to be instruction-tuned, as the next-token pretraining objective is perfect for tab completion.
Solution
In config.json, we should be able to specify which model we want for tab completion. Furthermore, the codebase should be able to handle sending tab-completion requests to one model and all other requests to another model.
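For illustration, a minimal sketch of what such a config.json could look like; the model names are just examples following the ones mentioned above, and the exact fields depend on your provider:

```json
{
  "models": [
    {
      "title": "DeepSeek 33B (chat/edit)",
      "provider": "ollama",
      "model": "deepseek-coder:33b-instruct"
    }
  ],
  "tabAutocompleteModel": {
    "title": "DeepSeek 1B (autocomplete)",
    "provider": "ollama",
    "model": "deepseek-coder:1.3b-base"
  }
}
```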
@jbohnslav This is already possible! Check out the docs here for setting up a custom tab autocomplete model, and here for setting up a chat/quick edit model.
Let me know if I can help at all with the setup :)
@sestinj: Right now, is it only possible with Ollama? I tried with LM Studio but wasn't able to succeed. Maybe you can help with the setup if LM Studio is supported? What I configured in config.json and tried with different endpoints is (below the models section):
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "lmstudio",
"model": "Phi2",
"apiBase": "http://localhost:1234/v1/models"
},
@JosefLaumer No need to change the API base, but if you wanted to, it should be http://localhost:1234/v1 (we default to this). I believe this would solve your problem:
"tabAutocompleteModel": {
"title": "Tab Autocomplete Model",
"provider": "lmstudio",
"model": "Phi2"
},
I have a similar problem: I run the text-generation-webui API, and it just tells me that openai doesn't have Mistral-7B as a model. When I add text-gen-webui as the provider, it tells me it's unknown.
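For reference, text-gen-webui is not a built-in provider name. One workaround that is sometimes suggested is to use the openai provider with an apiBase override. A sketch, assuming text-generation-webui's OpenAI-compatible API extension is enabled on its default port 5000, and that the model name matches what the webui reports:

```json
"tabAutocompleteModel": {
  "title": "Mistral via text-generation-webui",
  "provider": "openai",
  "model": "Mistral-7B",
  "apiBase": "http://localhost:5000/v1"
}
```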
I'm using a remote machine to run Ollama models and am trying to use the same model for chat and tab autocomplete.
"models": [
{
"title": "Ollama Remote",
"model": "codestral:22b",
"completionOptions": {
"keepAlive": 3000000,
},
"apiBase": "http://192.168.1.131:11434",
"provider": "ollama"
}
],
"tabAutocompleteModel": {
"title": "Ollama Remote",
"provider": "ollama",
"apiBase": "http://192.168.1.131:11434",
"model": "codestral:22b",
"completionOptions": {
"keepAlive": 3000000,
}
},
When I open chat, the model is loaded, but when I hop into the editor, the tab autocomplete model is loaded and the previous one is unloaded. This takes a long time. How can I use the same model for chat and autocomplete without Ollama reloading it?
@craftpip Given your config here, it doesn't seem like the unloading/reloading should happen, but if anything, differing values of the keepAlive parameter might be causing this. It's just the first thing that comes to mind, but I would try setting the same keepAlive value on your chat model.
This is now possible (in the latest VS Code pre-release) by using an array for tabAutocompleteModel and then clicking the "Continue" button in the status bar.
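A sketch of the array form, reusing the models from earlier in this thread (the titles are illustrative; the status bar button then lets you switch between the entries):

```json
"tabAutocompleteModel": [
  {
    "title": "Codestral Remote",
    "provider": "ollama",
    "model": "codestral:22b",
    "apiBase": "http://192.168.1.131:11434"
  },
  {
    "title": "Phi2 Local",
    "provider": "lmstudio",
    "model": "Phi2"
  }
]
```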