Add context window size to Custom Model
Is your feature request related to a problem? Please describe.
The current documentation in https://github.com/logancyang/obsidian-copilot/blob/master/local_copilot.md instructs Ollama users to run `ollama run <model>` and `/set parameter num_ctx <value>` to change a model's context window size. This is IMHO way too cumbersome and hardly necessary, as the desired context size can be specified in each API request: the Ollama server will (re)initialize the model with new parameter values (including `num_ctx`) when it detects a change from their default or previous values.
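For illustration, here is a minimal sketch of such a per-request override. This is not the plugin's actual code; the `POST /api/chat` endpoint and `options.num_ctx` come from the Ollama API docs linked in the next section, while the model name and local URL are assumptions:

```ts
// Sketch only: per-request context window override against a local Ollama server.
async function chatWithContextWindow(): Promise<void> {
  const response = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3", // assumed example model
      messages: [{ role: "user", content: "Summarize my note." }],
      stream: false,
      // The server (re)loads the model with this context size when it changes.
      options: { num_ctx: 8192 },
    }),
  });
  const data = await response.json();
  console.log(data.message.content);
}
```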
Describe the solution you'd like
- Add a `Context window size` field in the Ollama section of the plugin settings where users can fill in their desired context window size.
- If the field is non-empty, send its value as a key/value pair `"num_ctx": <value>` in the `options` dict of every chat API request (remember to send it in every request, or else the server will revert to the default model parameter settings for any parameters left out; see the sketch after this list). See https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion (Advanced parameters) and https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values.
- If the field is left empty, don't send `num_ctx` in the chat API request's `options` dict; the server will use the model default in that case.
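A minimal sketch of that conditional logic, assuming a hypothetical settings field for the context window (the names here are illustrative, not existing plugin code):

```ts
// Sketch only: include num_ctx only when the user filled in the field,
// so an empty field falls back to the model's own default.
interface OllamaChatOptions {
  num_ctx?: number;
}

function buildOllamaOptions(contextWindowField: string): OllamaChatOptions {
  const options: OllamaChatOptions = {};
  const trimmed = contextWindowField.trim();
  if (trimmed !== "") {
    const value = Number.parseInt(trimmed, 10);
    if (Number.isFinite(value) && value > 0) {
      options.num_ctx = value; // must be sent with every chat request
    }
  }
  return options; // empty object -> server uses the model default
}
```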
Describe alternatives you've considered
Explain to users how they can create their own Ollama models derived from base models using `ollama create <model>` with a Modelfile containing the desired changes. This can even be done programmatically, instead of running each and every model in turn with `ollama run <model>` to set parameters.
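For example, a sketch of that alternative (the model names are illustrative; the `PARAMETER` syntax comes from the Modelfile docs linked above):

```sh
# Sketch only: derive a model with a larger default context window.
cat > Modelfile <<'EOF'
FROM llama3
PARAMETER num_ctx 8192
EOF
ollama create llama3-8k -f Modelfile
ollama run llama3-8k   # the derived model now loads with an 8192-token context
```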
Additional context
This was my request to ollama some time ago, and it seems it's possible now.
One question: since this `num_ctx` setting would be sent with all requests, and each model has a different max context length, a value bigger than the max will fail silently. If that's something people can live with, sure, why not.
Is there any progress on this front? This is such an annoying problem for the end user, and it kinda prevents the offline workflow from working properly if the notes are just a bit longer than a page or so.
@Emt-lin @zeroliu We could add a new field to Custom Model for the built-in models in the constant, and add an optional input for users to set a model's context length when they add new custom models. After that, there can be other logic like truncation or a warning in the UI when the input/context is too long, before a user sends the request. What do you think?
Ollama-specific logic can be implemented after this change.
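A rough sketch of what that could look like (the `CustomModel` shape, the `context` field name, and the 4-characters-per-token estimate are all assumptions for illustration, not the plugin's actual types):

```ts
// Sketch only: optional per-model context length plus a pre-send check.
interface CustomModel {
  name: string;
  provider: string;
  context?: number; // optional max context length in tokens, set by the user
}

// Warn when the estimated prompt size exceeds the model's declared context
// length. A ~4 chars/token heuristic stands in for a real tokenizer here.
function checkContextBeforeSend(model: CustomModel, prompt: string): string | null {
  if (model.context == null) return null; // unknown limit: nothing to check
  const estimatedTokens = Math.ceil(prompt.length / 4);
  if (estimatedTokens > model.context) {
    return `Input (~${estimatedTokens} tokens) may exceed ${model.name}'s context length of ${model.context} tokens.`;
  }
  return null;
}
```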