llama.cpp
Add parameter to ignore end of text token
Adds the --ignore-eos switch, which prevents generation of the end-of-text (eos) token. This can be useful for avoiding unexpected terminations in interactive mode and for forcing the model to generate longer output.
This is implemented by setting the logit of the eos token to zero, which seems to work well enough, but I am not sure whether it has any unwanted side effects.