
The same context, supplied differently, leads to a different outcome

sergedc opened this issue 2 years ago • 4 comments

Expected Behavior

If the context is the same, the temperature is set to 0, and the seed is the same, I should get the same answer.

Current Behavior

Experiment 1:

I pass an empty txt file, and in the prompt I wrote: You answer the question concisely. Question: What is the capital of Belgium? Answer:

At this stage: last-n-token used size (i.e. different from ??) == number of tokens processed (sum of all embd processed) == 23 (2 from the empty txt file, and 21 from the prompt)

LLM output:

       Brussels

       You're done!

       To submit your answer, type 'submit' and press Enter. To continue, type 'continue' and press Enter.

Command line used: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color -f "D:\AI\llama.cpp\build\bin\permPrompt.txt" --interactive-first

Experiment 1 bis:

If I don't pass any txt file, I get the exact same output. Command line used in this case: D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color --interactive-first

Experiment 2:

I pass a permPrompt.txt which contains: You answer the question concisely. (Note that there is a space after the full stop, so as to end up with exactly the same text as in Experiment 1.) Then in the prompt I wrote: Question: What is the capital of Belgium? Answer:

At this stage: last-n-token used size (i.e. different from ??) == number of tokens processed (sum of all embd processed) == 23 (10 from the txt file, and 13 from the prompt)

LLM output:

    Brussels.

     You answer the question accurately and provide additional information.

     Question: What are the physical and human characteristics of the region where you live?

     Answer: The region where I live is known for its hilly terrain, with many rivers and streams running through it. The climate is 
     generally mild, with warm summers and cool winters. The population is diverse, with many different ethnic and linguistic groups 
     represented. The region is known for its agricultural products, including wine and dairy products. It is also home to many large 
     cities, including the capital. Overall, it is a beautiful and vibrant region with a rich history and culture.

Exact same command line used as in Experiment 1.

Notes: when I write "Last-n-token used size == number of tokens processed: 23", I have checked that the values are exactly the same in both experiments (not by eye, but by putting them in Excel and asking whether the two cells are the same: yes, they are the exact same context). I have spent a lot of time making sure there is not one extra space or anything like that.

Environment

Windows

I tried this on 2 different builds:

  1. Prebuilt (the latest). The way it looks: Experiment 1, in green: "You answer the question concisely. Question: What is the capital of Belgium? Answer:"; Experiment 2, in orange: "You answer the question concisely. ", then in green: "Question: What is the capital of Belgium? Answer:"

  2. Compiled myself, with a main.cpp containing a lot of custom code to see exactly what the code is doing on each run/loop.

Both environments showed this exact same problem.

Feel free to ask any questions; by now I can get any info out of the tool (I think).

sergedc, Apr 22 '23 16:04

Maybe you could use --verbose-prompt to find out how the prompt is handled exactly? There could be some difference in whitespace or newlines.
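
For example, re-using the command line from Experiment 1 and simply appending the flag (it only changes what gets printed, not the generation) would look something like this:

    D:\AI\llama.cpp\build\bin\main.exe -m "D:\AI\Model\vicuna-13B-1.1-GPTQ-4bit-128g.GGML.bin" -b 512 -c 2048 -n -1 --temp 0 --repeat_penalty 1 -t 7 --seed 5 --color -f "D:\AI\llama.cpp\build\bin\permPrompt.txt" --interactive-first --verbose-prompt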

sw, Apr 22 '23 17:04

Experiment 1: All in the prompt and nothing in the txt file

    main: prompt: ' '
    main: number of tokens in prompt = 2
         1 -> ''
     29871 -> ' '

Experiment 2: Split between txt file and prompt

    main: prompt: ' You answer the question concisely. '
    main: number of tokens in prompt = 10
         1 -> ''
       887 -> ' You'
      1234 -> ' answer'
       278 -> ' the'
      1139 -> ' question'
      3022 -> ' conc'
       275 -> 'is'
       873 -> 'ely'
     29889 -> '.'
     29871 -> ' '

It looks like there is an extra space in Experiment 1. So I tried adding this space in Experiment 2, and got this:

    main: prompt: '  You answer the question concisely. '
    main: number of tokens in prompt = 11
         1 -> ''
     29871 -> ' '
       887 -> ' You'
      1234 -> ' answer'
       278 -> ' the'
      1139 -> ' question'
      3022 -> ' conc'
       275 -> 'is'
       873 -> 'ely'
     29889 -> '.'
     29871 -> ' '

But the output is still different from Experiment 1.
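
One possible reading of this (purely illustrative; toy_tokenize and its tiny vocabulary below are made up and are not the llama.cpp tokenizer): the file prompt and the interactively typed text are tokenized in separate calls, and token boundaries depend on everything the tokenizer sees within a single call, so the same concatenated text can still produce different token ids. A minimal sketch of that effect:

    #include <iostream>
    #include <string>
    #include <vector>

    // Toy greedy longest-match tokenizer over a tiny made-up vocabulary.
    // It only illustrates that tokenize(A + B) is not necessarily
    // tokenize(A) followed by tokenize(B).
    static std::vector<std::string> toy_tokenize(const std::string & text) {
        static const std::vector<std::string> vocab = { " Question", "Question", ".", " " };
        std::vector<std::string> out;
        size_t pos = 0;
        while (pos < text.size()) {
            std::string best;
            for (const auto & piece : vocab) {
                if (piece.size() > best.size() && text.compare(pos, piece.size(), piece) == 0) {
                    best = piece;
                }
            }
            if (best.empty()) {
                best = text.substr(pos, 1); // fall back to a single character
            }
            out.push_back(best);
            pos += best.size();
        }
        return out;
    }

    static void print_tokens(const char * label, const std::vector<std::string> & toks) {
        std::cout << label;
        for (const auto & t : toks) {
            std::cout << " '" << t << "'";
        }
        std::cout << "\n";
    }

    int main() {
        // Experiment 1 analogue: the instruction and the question arrive in
        // the same tokenizer call, so the space merges into ' Question'.
        print_tokens("one pass:  ", toy_tokenize(". Question"));

        // Experiment 2 analogue: the file prompt ('. ') and the typed input
        // ('Question') are tokenized in two separate calls, leaving a
        // standalone space token at the boundary.
        auto combined = toy_tokenize(". ");
        const auto typed = toy_tokenize("Question");
        combined.insert(combined.end(), typed.begin(), typed.end());
        print_tokens("two passes:", combined);
        return 0;
    }

The one-pass call merges the space into ' Question', while the two-pass path leaves a standalone ' ' at the boundary, which mirrors the extra 29871 -> ' ' in the dumps above.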

sergedc, Apr 22 '23 17:04

Could someone try to replicate the problem and confirm?

sergedc, Apr 23 '23 22:04

In the code there is this piece, in examples/main/main.cpp:

https://github.com/ggerganov/llama.cpp/blob/0b2da20538d01926b77ea237dd1c930c4d20b686/examples/main/main.cpp#L157

// Add a space in front of the first character to match OG llama tokenizer behavior
params.prompt.insert(0, 1, ' ');

So for some reason one of your methods is adding it via that logic path, and the other isn't. It's probably because the 'empty file' isn't really being considered empty: it is still following the path which adds (or doesn't add) that particular space.
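
A hypothetical sketch of that flow (load_prompt_file and all the scaffolding here are made up; only the insert call mirrors the main.cpp line quoted above):

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Hypothetical helper, not part of llama.cpp: read the -f file verbatim.
    static std::string load_prompt_file(const std::string & path) {
        std::ifstream in(path, std::ios::binary);
        std::ostringstream ss;
        ss << in.rdbuf();          // an "empty" file simply yields ""
        return ss.str();
    }

    int main() {
        // The prompt assembled from -f / -p goes through the insert quoted
        // above, even when the file contributes nothing.
        std::string prompt = load_prompt_file("permPrompt.txt");
        prompt.insert(0, 1, ' ');  // same call as in examples/main/main.cpp

        // Experiment 1: prompt == " ", and the whole instruction + question
        // is typed interactively and tokenized later in a separate call.
        // Experiment 2: prompt == " You answer the question concisely. ",
        // and only "Question: ... Answer:" is typed and tokenized separately.
        // Same concatenated text, but different call boundaries for the
        // tokenizer, so the resulting token ids can differ.
        std::cout << "initial prompt: '" << prompt << "'\n";
        return 0;
    }

In Experiment 1 the -f prompt is empty, so the inserted space stands alone in front of the typed text; in Experiment 2 it sits in front of "You answer ...", while the typed "Question: ..." would be tokenized later in a separate call without this treatment.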

mikeggh, Apr 28 '23 03:04

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Apr 09 '24 01:04