feat: Increase context length setting for new models
Now that models such as this one https://ollama.com/library/dolphin-llama3:256k have context lengths up to 256k, can the context length slider be adjusted so it doesn't max out at 16k? Also, would it be possible to make context length settable per model? Is there a downside to just using the highest context value even for models that don't support it?
Is there a downside to just using the highest context value even for models that don't support it?
Short answer: yes, it's bad to use a bigger context value than the model was trained for. The model starts to generate nonsense. There are techniques to reach longer context windows by playing with the position indices, but these are generally things done on the model side rather than by setting the value to something unsupported. The model you're referring to, for example, likely already uses these tricks.
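To illustrate what "playing with the position indices" means: one such trick is position interpolation, where positions beyond the trained window are squeezed back into it by a constant scale factor. This is only a toy sketch of the idea (the context sizes are assumed for the example; real implementations scale the RoPE rotations inside the model):

```python
# Sketch of position interpolation: compress out-of-range positions
# back into the window the model was actually trained on.
trained_ctx = 8192   # assumed training-time context window
target_ctx = 32768   # desired (extended) context window
scale = trained_ctx / target_ctx

# Sample a few token positions across the extended window
positions = [0, 8192, 16384, 24576]
scaled = [p * scale for p in positions]
print(scaled)  # every scaled position stays below trained_ctx
```

The point is that without such a remapping (or training at the longer length), the model sees position values it never learned, which is why simply raising the setting produces nonsense.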
Went ahead and made a PR with this change (#1843); just waiting on approval.
Looks like another model just dropped: https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k The context limit needs to be increased to 1048576. (Or maybe add an option for users to set it as high as they want, to avoid constantly increasing it?)
This setting might be fundamentally flawed to start with. It is inherently model-specific: setting it too high consumes more memory and makes the model output nonsense, because the model wasn't trained for it. Having a single global value might not be the correct thing to do at all.
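On the memory point: the KV cache grows linearly with the context length, so a global "as high as possible" setting gets expensive fast. A rough back-of-the-envelope sketch (the layer/head shape below is an assumed Llama-3-8B-like configuration; actual usage depends on the runtime and any cache quantization):

```python
# Rough KV-cache size estimate: keys + values for every layer,
# one entry per token position, fp16 (2 bytes) per element.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

# Assumed Llama-3-8B-like shape: 32 layers, 8 KV heads, head_dim 128
for ctx in (8_192, 262_144, 1_048_576):
    gib = kv_cache_bytes(32, 8, 128, ctx) / 1024**3
    print(f"{ctx:>9} tokens -> ~{gib:.0f} GiB KV cache")
```

Under these assumptions, 8k of context costs about 1 GiB of cache, 256k about 32 GiB, and 1M about 128 GiB, which is why defaulting every model to the maximum isn't workable.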
https://github.com/ggerganov/llama.cpp/discussions/3111
Maybe adding all the parameters as options in Modelfiles, instead of a global setting, could serve as a case-by-case solution? The only issue I see is not being able to adjust models that don't have a Modelfile.
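For what it's worth, Ollama Modelfiles already support per-model parameters via `PARAMETER`, so a per-model context length could look something like this (base model and value here are just an example):

```
FROM dolphin-llama3:256k
PARAMETER num_ctx 262144
```

Building a model from this with `ollama create` would bake the context window into that model, sidestepping the global slider for models that have a Modelfile.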
So when using an external API, these settings would not take effect?
So when using an external API, these settings would not take effect?
Did you get an answer for that? Is there an option to set the context window per model for models connected via API?
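At least for Ollama's own HTTP API, runtime parameters can be passed per request through the `options` field, so a client could set the context window per call rather than relying on a global setting. A minimal sketch of such a request payload (model name and values are assumptions; the server is assumed at Ollama's default port):

```python
# Sketch: per-request context window via the Ollama HTTP API's
# "options" field, which carries llama.cpp-backed parameters like num_ctx.
import json

payload = {
    "model": "dolphin-llama3:256k",       # assumed model tag
    "prompt": "Summarize this document.",
    "options": {"num_ctx": 262144},        # 256k-token context for this call
}
print(json.dumps(payload, indent=2))
# To use: POST this JSON to http://localhost:11434/api/generate
```

Whether a third-party external API honors such options is a different question and depends entirely on that provider.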