llama.cpp
The same context, supplied differently, leads to a different outcome
Expected Behavior
If the context is the same, the temperature is set to 0, and the seed is the same, I should get the same answer.
Current Behavior
Experiment 1:
I pass an empty txt file, and in the prompt I wrote:
You answer the question concisely. Question: What is the capital of Belgium? Answer:
At this stage: last-n-token used size (i.e. entries different from 0) == number of tokens processed (sum of all embd processed) == 23 (2 from the empty txt file, and 21 from the prompt).
LLM output:
Brussels
You're done!
To submit your answer, type 'submit' and press Enter. To continue, type 'continue' and press Enter.
Command line used: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color -f "D:\AI\llama.cpp\build\bin\permPrompt.txt" --interactive-first
Experiment 1 bis:
If I don't pass any txt file, I get the exact same output. Command line used in this case: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color --interactive-first
Experiment 2:
I pass a permPrompt.txt which contains:
You answer the question concisely.
Note that there is a space after the full stop, to end up with the exact same context as in Experiment 1.
Then in the prompt I wrote:
Question: What is the capital of Belgium? Answer:
At this stage: last-n-token used size (i.e. entries different from 0) == number of tokens processed (sum of all embd processed) == 23 (10 from the txt file, and 13 from the prompt).
LLM output:
Brussels.
You answer the question accurately and provide additional information.
Question: What are the physical and human characteristics of the region where you live?
Answer: The region where I live is known for its hilly terrain, with many rivers and streams running through it. The climate is
generally mild, with warm summers and cool winters. The population is diverse, with many different ethnic and linguistic groups
represented. The region is known for its agricultural products, including wine and dairy products. It is also home to many large
cities, including the capital. Overall, it is a beautiful and vibrant region with a rich history and culture.
Exact same command line used as in Experiment 1.
Notes: when I write "last-n-token used size == number of tokens processed: 23", I have checked that, and the values are exactly the same in both experiments (not by eye, but by putting them in Excel and asking whether the two cells are the same: yes, they are the exact same context). I have spent a lot of time ensuring that there is not one extra space or anything like that.
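(For reference, a minimal sketch of how one could dump the token IDs for any string, so two runs can be compared with a plain diff instead of Excel. This assumes the llama.h C API from around the commit linked further down, i.e. llama_init_from_file / llama_tokenize / llama_token_to_str; it is a sketch, not the actual main.cpp code.)

// tokdump.cpp - minimal sketch, assuming the llama.h C API of this era
#include <cstdio>
#include <vector>
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 3) {
        fprintf(stderr, "usage: %s <model.bin> <prompt>\n", argv[0]);
        return 1;
    }
    llama_context * ctx = llama_init_from_file(argv[1], llama_context_default_params());
    if (ctx == NULL) {
        return 1;
    }
    std::vector<llama_token> toks(2048);
    // add_bos = true mirrors how main.cpp tokenizes the initial prompt
    const int n = llama_tokenize(ctx, argv[2], toks.data(), (int) toks.size(), true);
    for (int i = 0; i < n; i++) {
        // one token per line, same format as main.cpp's --verbose-prompt output
        printf("%6d -> '%s'\n", toks[i], llama_token_to_str(ctx, toks[i]));
    }
    llama_free(ctx);
    return 0;
}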
Environment
Windows
I tried on 2 different builds:
- Prebuilt (the latest). The way it looks: Experiment 1, in green: "You answer the question concisely. Question: What is the capital of Belgium? Answer:". Experiment 2, in orange: "You answer the question concisely. ", then in green: "Question: What is the capital of Belgium? Answer:".
- Compiled myself, with a main.cpp containing a lot of custom code to see exactly what the code is doing on each run/loop.
Both environments showed the exact same problem.
Feel free to ask any questions; by now I can get any info out of the tool (I think).
Maybe you could use --verbose-prompt to find out how the prompt is handled exactly? There could be some difference in whitespace or newlines.
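For example, appending the flag to the Experiment 1 command line: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color -f "D:\AI\llama.cpp\build\bin\permPrompt.txt" --interactive-first --verbose-prompt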
Experiment 1: All in the prompt and nothing in the txt file
main: prompt: ' '
main: number of tokens in prompt = 2
     1 -> ''
 29871 -> ' '
Experiment 2: Split between txt file and prompt
main: prompt: ' You answer the question concisely. '
main: number of tokens in prompt = 10
     1 -> ''
   887 -> ' You'
  1234 -> ' answer'
   278 -> ' the'
  1139 -> ' question'
  3022 -> ' conc'
   275 -> 'is'
   873 -> 'ely'
 29889 -> '.'
 29871 -> ' '
It looks like there is an extra space in experiment 1. So I tried adding this space in experiment 2, and got this:
main: prompt: '  You answer the question concisely. '
main: number of tokens in prompt = 11
     1 -> ''
 29871 -> ' '
   887 -> ' You'
  1234 -> ' answer'
   278 -> ' the'
  1139 -> ' question'
  3022 -> ' conc'
   275 -> 'is'
   873 -> 'ely'
 29889 -> '.'
 29871 -> ' '
But the output is still different from Experiment 1.
Could someone try to replicate the problem and confirm?
In the code, in examples/main/main.cpp, there is this piece:
https://github.com/ggerganov/llama.cpp/blob/0b2da20538d01926b77ea237dd1c930c4d20b686/examples/main/main.cpp#L157
// Add a space in front of the first character to match OG llama tokenizer behavior
params.prompt.insert(0, 1, ' ');
So for some reason one of your methods is adding it via that logic path, and the other isn't. It's probably because the 'empty file' isn't really being considered empty: it is still following the path which adds (or doesn't add) that particular space.
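To make the two code paths concrete, here is a rough sketch (not the actual main.cpp code, and assuming the same era's llama_tokenize signature) of how the token streams can end up different even when the concatenated text is byte-identical:

#include <string>
#include <vector>
#include "llama.h"

// Helper: tokenize a string into a vector (sketch)
static std::vector<llama_token> tokenize(llama_context * ctx, const std::string & text, bool add_bos) {
    std::vector<llama_token> toks(text.size() + 8);
    const int n = llama_tokenize(ctx, text.c_str(), toks.data(), (int) toks.size(), add_bos);
    toks.resize(n > 0 ? n : 0);
    return toks;
}

// Experiment 1: the whole context goes through the initial-prompt path,
// which inserts the leading space and adds BOS:
//   tokenize(ctx, " You answer the question concisely. Question: ... Answer:", true)
//
// Experiment 2: only the file contents go through that path; the text typed
// at the interactive prompt is tokenized later, on its own, with no BOS and
// no inserted space:
//   tokenize(ctx, " You answer the question concisely. ", true)
//   followed by tokenize(ctx, "Question: ... Answer:", false)
//
// Even if the concatenated text is identical, the concatenated token IDs
// need not be, because a SentencePiece-style tokenizer can merge across the
// boundary (' ' + 'Question' as two tokens vs one ' Question' token), and
// the model then sees a different token sequence.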
This issue was closed because it has been inactive for 14 days since being marked as stale.