BitNet
Only the example works, everything else is gibberish
I am using the standard example:
python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:" -n 6 -temp 0
This gives the expected output. Changing "Where is Mary?" to "Where is John?" also gives the right output. But changing -n from 6 to 26 already gives a wrong output:
Answer: John is in the bedroom.- Mary went to the garden. John went to the bedroom. Where is John?Answer: John
When asking "How long is an average airplane?" with -n 26, it outputs:
Answer: The average flight time of an airplane is 2 hours and 15 minutes.What is the average flight time of a 737
So it seems the first answer was only correct by luck. Is this caused by the quantization, by the underlying model itself, or by something else?
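For anyone who wants to reproduce the runs above, here is a minimal sketch that builds the same command for each prompt and -n value. It only uses the flags from the command in this report (-m, -p, -n, -temp). The names build_command, run_cases, STORY and CASES are hypothetical helpers, not part of BitNet, and the airplane prompt is assumed to be the question alone followed by "\nAnswer:".

```python
import subprocess
import sys

# Model path copied from the command in this report.
MODEL = "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf"

# Story text copied verbatim from the standard example (including "the the the").
STORY = (
    "Daniel went back to the the the garden. Mary travelled to the kitchen. "
    "Sandra journeyed to the kitchen. Sandra went to the hallway. "
    "John went to the bedroom. Mary went back to the garden. "
)

# The shell command passes a literal backslash followed by "n" inside the
# double quotes, so the same two characters are kept here.
SUFFIX = "\\nAnswer:"

# (prompt, n) pairs described in this report.
CASES = [
    (STORY + "Where is Mary?" + SUFFIX, 6),
    (STORY + "Where is John?" + SUFFIX, 6),
    (STORY + "Where is John?" + SUFFIX, 26),
    # Assumption: the question was asked without the story in front of it.
    ("How long is an average airplane?" + SUFFIX, 26),
]


def build_command(prompt, n_predict):
    """Return the run_inference.py argument list for one prompt and token budget."""
    return [
        sys.executable, "run_inference.py",
        "-m", MODEL,
        "-p", prompt,
        "-n", str(n_predict),
        "-temp", "0",
    ]


def run_cases():
    """Run every case in order; expects to be started from the BitNet repo root."""
    for prompt, n_predict in CASES:
        print(f"--- n = {n_predict}: {prompt[-40:]!r}")
        subprocess.run(build_command(prompt, n_predict), check=False)
```

Calling run_cases() from the repository root should run the four cases one after another, which makes it easier to compare the outputs for -n 6 and -n 26 side by side.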