# Bug Report: Context Loss in mistral-nemo Model

## Description
I am using the mistral-nemo model in Ollama, which is supposed to have a 128k context window. However, I am observing context loss with this model, and with the llama3 model as well.
## Steps to Reproduce
- Run the command `ollama run mistral-nemo`.
- Use the following prompt:
'{id: 0, name: "Alexander City"}'
'{id: 1, name: "Andalusia"}'
'{id: 2, name: "Anniston"}'
'{id: 3, name: "Athens"}'
'{id: 4, name: "Atmore"}'
'{id: 5, name: "Auburn"}'
'{id: 6, name: "Bessemer"}'
'{id: 7, name: "Birmingham"}'
'{id: 8, name: "Chickasaw"}'
'{id: 9, name: "Clanton"}'
'{id: 10, name: "Cullman"}'
'{id: 11, name: "Decatur"}'
'{id: 12, name: "Demopolis"}'
'{id: 13, name: "Dothan"}'
'{id: 14, name: "Enterprise"}'
'{id: 15, name: "Eufaula"}'
'{id: 16, name: "Florence"}'
'{id: 17, name: "Fort Payne"}'
'{id: 18, name: "Gadsden"}'
'{id: 19, name: "Greenville"}'
'{id: 20, name: "Guntersville"}'
'{id: 21, name: "Huntsville"}'
'{id: 22, name: "Jasper"}'
'{id: 23, name: "Marion"}'
'{id: 24, name: "Mobile"}'
'{id: 25, name: "Montgomery"}'
'{id: 26, name: "Opelika"}'
'{id: 27, name: "Ozark"}'
'{id: 28, name: "Phenix City"}'
'{id: 29, name: "Prichard"}'
'{id: 30, name: "Scottsboro"}'
'{id: 31, name: "Selma"}'
'{id: 32, name: "Sheffield"}'
'{id: 33, name: "Sylacauga"}'
'{id: 34, name: "Talladega"}'
'{id: 35, name: "Troy"}'
'{id: 36, name: "Tuscaloosa"}'
'{id: 37, name: "Tuscumbia"}'
'{id: 38, name: "Tuskegee"}'
'{id: 39, name: "Alaska"}'
'{id: 40, name: "Anchorage"}'
'{id: 41, name: "Cordova"}'
'{id: 42, name: "Fairbanks"}'
'{id: 43, name: "Haines"}'
'{id: 44, name: "Homer"}'
'{id: 45, name: "Juneau"}'
'{id: 46, name: "Ketchikan"}'
'{id: 47, name: "Kodiak"}'
'{id: 48, name: "Kotzebue"}'
'{id: 49, name: "Nome"}'
'{id: 50, name: "Palmer"}'
'{id: 51, name: "Seward"}'
'{id: 52, name: "Sitka"}'
'{id: 53, name: "Skagway"}'
'{id: 54, name: "Valdez"}'
'{id: 55, name: "Arizona"}'
'{id: 56, name: "Ajo"}'
'{id: 57, name: "Avondale"}'
'{id: 58, name: "Bisbee"}'
'{id: 59, name: "Casa Grande"}'
'{id: 60, name: "Chandler"}'
'{id: 61, name: "Clifton"}'
'{id: 62, name: "Douglas"}'
'{id: 63, name: "Flagstaff"}'
'{id: 64, name: "Florence"}'
'{id: 65, name: "Gila Bend"}'
'{id: 66, name: "Glendale"}'
'{id: 67, name: "Globe"}'
'{id: 68, name: "Kingman"}'
'{id: 69, name: "Lake Havasu City"}'
'{id: 70, name: "Mesa"}'
'{id: 71, name: "Nogales"}'
'{id: 72, name: "Oraibi"}'
'{id: 73, name: "Phoenix"}'
'{id: 74, name: "Prescott"}'
'{id: 75, name: "Scottsdale"}'
'{id: 76, name: "Sierra Vista"}'
'{id: 77, name: "Tempe"}'
'{id: 78, name: "Tombstone"}'
'{id: 79, name: "Tucson"}'
'{id: 80, name: "Walpi"}'
'{id: 81, name: "Window Rock"}'
'{id: 82, name: "Winslow"}'
'{id: 83, name: "Yuma"}'
'{id: 84, name: "Arkansas"}'
'{id: 85, name: "Arkadelphia"}'
'{id: 86, name: "Arkansas Post"}'
'{id: 87, name: "Batesville"}'
'{id: 88, name: "Benton"}'
'{id: 89, name: "Blytheville"}'
'{id: 90, name: "Camden"}'
'{id: 91, name: "Conway"}'
'{id: 92, name: "Crossett"}'
'{id: 93, name: "El Dorado"}'
'{id: 94, name: "Fayetteville"}'
'{id: 95, name: "Forrest City"}'
'{id: 96, name: "Fort Smith"}'
'{id: 97, name: "Harrison"}'
'{id: 98, name: "Helena"}'
'{id: 99, name: "Hope"}'
'{id: 100, name: "Hot Springs"}'
'{id: 101, name: "Jacksonville"}'
'{id: 102, name: "Jonesboro"}'
'{id: 103, name: "Little Rock"}'
'{id: 104, name: "Magnolia"}'
'{id: 105, name: "Morrilton"}'
'{id: 106, name: "Newport"}'
'{id: 107, name: "North Little Rock"}'
'{id: 108, name: "Osceola"}'
'{id: 109, name: "Pine Bluff"}'
'{id: 110, name: "Rogers"}'
'{id: 111, name: "Searcy"}'
'{id: 112, name: "Stuttgart"}'
'{id: 113, name: "Van Buren"}'
'{id: 114, name: "West Memphis"}'
'{id: 115, name: "California"}'
'{id: 116, name: "Alameda"}'
'{id: 117, name: "Alhambra"}'
'{id: 118, name: "Anaheim"}'
'{id: 119, name: "Antioch"}'
'{id: 120, name: "Arcadia"}'
'{id: 121, name: "Bakersfield"}'
'{id: 122, name: "Barstow"}'
'{id: 123, name: "Belmont"}'
'{id: 124, name: "Berkeley"}'
'{id: 125, name: "Beverly Hills"}'
'{id: 126, name: "Brea"}'
'{id: 127, name: "Buena Park"}'
'{id: 128, name: "Burbank"}'
'{id: 129, name: "Calexico"}'
'{id: 130, name: "Calistoga"}'
'{id: 131, name: "Carlsbad"}'
'{id: 132, name: "Carmel"}'
'{id: 133, name: "Chico"}'
'{id: 134, name: "Chula Vista"}'
'{id: 135, name: "Claremont"}'
'{id: 136, name: "Compton"}'
'{id: 137, name: "Concord"}'
'{id: 138, name: "Corona"}'
'{id: 139, name: "Coronado"}'
'{id: 140, name: "Costa Mesa"}'
'{id: 141, name: "Culver City"}'
'{id: 142, name: "Daly City"}'
'{id: 143, name: "Davis"}'
'{id: 144, name: "Downey"}'
'{id: 145, name: "El Centro"}'
'{id: 146, name: "El Cerrito"}'
'{id: 147, name: "El Monte"}'
'{id: 148, name: "Escondido"}'
'{id: 149, name: "Eureka"}'
'{id: 150, name: "Fairfield"}'
- Ask: "What's the city with id 1?"
The response is:
The city with id 1 is not included in the provided list. The smallest id number present is for Arkadelphia, which has an id of 85.
This indicates that the model is losing context since Andalusia should be the correct answer.
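As a side note, a long repro prompt like the one above does not have to be pasted by hand; it can be generated from a short script. This is a minimal sketch in Python, with the city list truncated to the first few entries shown above for brevity:

```python
# Build a repro prompt of '{id: N, name: "..."}' lines, as in the steps above.
# Only a few entries are included here; extend the list to reproduce the full test.
cities = ["Alexander City", "Andalusia", "Anniston", "Athens", "Atmore"]

prompt = "\n".join(f"'{{id: {i}, name: \"{name}\"}}'" for i, name in enumerate(cities))
prompt += "\n\nWhat's the city with id 1?"

print(prompt)
```

The output can be piped into `ollama run mistral-nemo` to repeat the experiment at different prompt lengths.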
## Temporary Solution
I found that setting the `num_ctx` parameter to 120000 before the prompt resolves the issue. For example:

```
/set parameter num_ctx 120000
```
Then, when repeating the experiment, I receive the correct response:

> The city with id 1 is "Andalusia".
## Additional Information
Running `ollama show mistral-nemo` provides the following details:

```
  Model
    arch                llama
    parameters          12.2B
    quantization        Q4_0
    context length      1.024e+06
    embedding length    5120

  Parameters
    stop    "[INST]"
    stop    "[/INST]"

  License
    Apache License
```
## API Usage
When using the API directly with curl, the problem persists even if I use:

```json
"options": {"num_ctx": 120000}
```

To resolve this, I need to pass:

```json
{"role": "user", "content": "/set parameter num_ctx 120000"}
```

within `messages` during the request.
## Conclusion

There is a bug in Ollama where the mistral-nemo (and llama3) models lose context unless the `num_ctx` parameter is explicitly set within the prompt.
- OS: Linux
- GPU: Nvidia
- CPU: Intel
- Ollama version: 0.2.8
As background, Ollama API calls (and CLI prompts via `ollama run`) that don't specify a context length use a default of 2048 tokens. If you need more context, you either need to specify that in the API call or change the default. `options = {"num_ctx": 120000}` is the canonical way to do that in an API call; if it's not working, that's a bug, and it would be great if you could send server logs demonstrating the issue.

Adding `{"role": "user", "content": "/set parameter num_ctx 120000"}` to an API call will not change the context window, because `/set parameter` is interpreted by the CLI (`ollama run`), which is not involved in inference during an API call.
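To illustrate the canonical way, this sketch builds the JSON body for Ollama's `/api/chat` endpoint with `num_ctx` in the top-level `options` object. It uses only the standard library and does not send the request, so it can be inspected offline:

```python
import json

# Canonical way to enlarge the context window for a single API call:
# pass num_ctx in the top-level "options" object, not inside a message.
body = {
    "model": "mistral-nemo",
    "messages": [{"role": "user", "content": "What's the city with id 1?"}],
    "options": {"num_ctx": 120000},  # applies to this request only
}

# This is the payload you would POST to http://localhost:11434/api/chat,
# e.g. curl http://localhost:11434/api/chat -d "$payload"
payload = json.dumps(body)
print(payload)
```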
Note that the OpenAI compatibility endpoints (`localhost:11434/v1/`) follow the OpenAI API standard in that they don't allow setting the size of the context window. If you need a larger context window and want to use the OpenAI endpoints, you need to set `PARAMETER num_ctx` as detailed below.
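For instance, a request body for the OpenAI-compatible endpoint (`POST http://localhost:11434/v1/chat/completions`) only carries standard OpenAI fields and has nowhere to put `num_ctx`, so the larger context has to be baked into the model itself. The model name `mistral-nemo-120k` in this sketch assumes the derived model created in the steps that follow:

```json
{
  "model": "mistral-nemo-120k",
  "messages": [
    {"role": "user", "content": "What's the city with id 1?"}
  ]
}
```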
To change the default context window of a model, and so do away with having to add options to an API call, you need to create a new model. This is easy to do and, if you follow the instructions, does not require additional disk space.
First, get a copy of the current modelfile:

```
ollama show --modelfile mistral-nemo > Modelfile
```
Now edit the file and do two things:

- Where it says `FROM` followed by a long file path, change the file path to the name of the model. For example, `FROM /root/.ollama/models/blobs/sha256-b559938ab7a0392fc9ea9675b82280f2a15669ec3e0e0fc491c9cb0a7681cf94` becomes `FROM mistral-nemo`.
- Create a new line in the file and add:

```
PARAMETER num_ctx 120000
```
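Put together, the edited Modelfile might look something like this. This is a sketch: keep any TEMPLATE, PARAMETER, and LICENSE lines from your actual `ollama show --modelfile` output, and only change the `FROM` line and add the `num_ctx` parameter:

```
FROM mistral-nemo
PARAMETER num_ctx 120000
PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
```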
Create the new model:

```
ollama create -f Modelfile mistral-nemo-120k
```
Verify the change in the default context length by checking `num_ctx` under Parameters:
```
$ ollama show mistral-nemo-120k
  Model
    arch                llama
    parameters          12.2B
    quantization        Q4_0
    context length      1.024e+06
    embedding length    5120

  Parameters
    num_ctx    120000
    stop       "[INST]"
    stop       "[/INST]"
```
This should solve your context problem; if it persists, please follow up with server logs.
> As background, ollama API calls (and CLI prompts via `ollama run`) that don't specify a context length, use a context length of 2048 tokens.
This should be mentioned somewhere in the documentation, if it isn't already.
https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size
thanks @rick-github :) :+1:
> ollama create -f Modelfile mistral-nemo-120k

This doesn't seem to work when I try it with llama3.
What exactly doesn't work?
@rick-github thanks for the awesome explanation! Will close this for now. @rodolfo-nobrega @anubhav-agrawal-mu-sigma feel free to follow up if things still aren't working.
https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-specify-the-context-window-size

The FAQ needs to be updated with your example; it took a lot of time to understand and configure the solution because it is NOT mentioned in the FAQ: https://github.com/ollama/ollama/issues/5965#issuecomment-2252354726
I agree, I have a WIP to update various bits of documentation but haven't pushed a PR yet.
It should be noted that changing the context size of a model via the Modelfile is mentioned, but in the OpenAI doc, not the FAQ.