
Add context window size to Custom Model


Is your feature request related to a problem? Please describe. The current documentation at https://github.com/logancyang/obsidian-copilot/blob/master/local_copilot.md instructs Ollama users to run ollama run <model> and /set parameter num_ctx <value> to change the context window size for Ollama models. This is, IMHO, far too cumbersome and hardly necessary, since the desired context size can be specified in each API request: the Ollama server will (re)initialize the model with new parameter values (including num_ctx) whenever it detects a change from their default or previous values.
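For illustration, here is a minimal sketch of such a request against the documented /api/chat endpoint; the model name and num_ctx value are placeholders:

```typescript
// Minimal sketch: request a chat completion from a local Ollama server
// with a per-request context window. Model name and num_ctx are placeholders.
async function chatWithContextWindow(): Promise<void> {
  const response = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3",
      messages: [{ role: "user", content: "Summarize my note." }],
      stream: false,
      // Per-request model parameters; Ollama reloads the model if
      // num_ctx differs from the value it is currently running with.
      options: { num_ctx: 8192 },
    }),
  });
  const data = await response.json();
  console.log(data.message.content);
}
```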

Describe the solution you'd like

  • Add a Context window size field in the Ollama section of the plugin settings where users can fill in their desired context window size.
  • If the field is non-empty, send its value as the key/value pair "num_ctx": <value> in the options dict of every chat API request (it must be sent in every request, or the server will revert to the model's default settings for any parameters left out; see the sketch after this list). See: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion (Advanced parameters) and https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values .
  • If the field is left empty, don't send num_ctx in the chat API request's options dict; the server will then use the model's default.
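A sketch of how the plugin could build the request body, assuming a hypothetical contextWindowSize setting (the name is invented for illustration; the plugin's actual settings shape may differ):

```typescript
// Hypothetical settings shape for illustration only.
interface OllamaSettings {
  contextWindowSize?: number; // empty settings field => undefined
}

function buildChatRequestBody(
  settings: OllamaSettings,
  model: string,
  messages: { role: string; content: string }[]
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, messages };
  // Only send num_ctx when the user filled in the field; otherwise
  // omit options entirely so the server uses the model's default.
  if (settings.contextWindowSize !== undefined) {
    body.options = { num_ctx: settings.contextWindowSize };
  }
  return body;
}
```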

Describe alternatives you've considered Explain to users how they can create their own Ollama models, derived from base models, using ollama create <model> with a Modelfile containing the desired changes. This can even be done programmatically (a sketch follows), rather than running each and every model in turn with ollama run <model> to set parameters.
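At the time of this issue, Ollama's /api/create endpoint accepted a Modelfile as a string; a sketch of the programmatic route (base model and derived name are placeholders, and the payload shape may differ across Ollama versions):

```typescript
// Programmatically derive a model with a larger context window.
// Payload shape follows Ollama's /api/create docs circa 2024
// (name + modelfile string); newer Ollama versions may differ.
async function createDerivedModel(): Promise<void> {
  const modelfile = "FROM llama3\nPARAMETER num_ctx 8192";
  await fetch("http://localhost:11434/api/create", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name: "llama3-8k", modelfile }),
  });
}
```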

Additional context

ackalker · Apr 15 '24

I made a similar request to Ollama some time ago, and it seems this is possible now.

One concern: once num_ctx is set in Copilot settings it will be sent with every request, and since each model has a different maximum context length, a value larger than the model's maximum will fail silently. If that's something people can live with, sure, why not.

logancyang · Apr 20 '24

Is there any progress on this front? This is such an annoying problem for the end user, and it effectively prevents the offline workflow from working properly when notes are just a bit longer than a page or so.

user82622 · Mar 26 '25

@Emt-lin @zeroliu We could add a new field to Custom Model (for the built-in models defined in the constants), plus an optional input so users can set a model's context length when adding new custom models. After that, we can add further logic such as truncation, or a warning in the UI when the input/context is too long, before the user sends the request (rough sketch below). What do you think?

Ollama-specific logic can be implemented after this change.
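A rough sketch of what that could look like; all names here (the CustomModel fields, estimateTokens, contextWarning) are invented for illustration and do not reflect obsidian-copilot's actual code:

```typescript
// Illustrative sketch only; field and function names are hypothetical.
interface CustomModel {
  name: string;
  provider: string;
  contextLength?: number; // optional user-supplied context window
}

// Crude token estimate (~4 characters per token), good enough for a warning.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Returns a warning message to surface in the UI before sending,
// or null if the input fits within the model's context window.
function contextWarning(model: CustomModel, input: string): string | null {
  if (model.contextLength === undefined) return null;
  const tokens = estimateTokens(input);
  return tokens > model.contextLength
    ? `Input (~${tokens} tokens) exceeds ${model.name}'s context window of ${model.contextLength} tokens.`
    : null;
}
```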

logancyang · Mar 26 '25