llama.cpp

Model's reply is incomplete with nlen set to 64

Open AnswerZhao opened this issue 11 months ago • 2 comments

Why is nlen set to 64 in Llm.kt in the llama.android project? This setting limits the length of the model's reply, so the current reply is cut off. When I change the nlen value, I find it hard to pick a good one. Why doesn't the EOS token stop generation, so that the reply length depends only on nlen? Thank you for your help.

AnswerZhao · Mar 17 '24 06:03

You can easily modify the example to check for the EOS token and stop.

ggerganov · Mar 17 '24 17:03
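
For readers following along, the suggestion above (stop on EOS, and treat nlen only as an upper bound) might look roughly like the sketch below. This is a minimal, standalone illustration and not the code from the llama.android example: `Token`, `generate`, `sample_next`, and `eos_id` are hypothetical names, and the sampler is a stand-in for whatever produces the next token in the real loop.

```cpp
#include <functional>
#include <vector>

// Hypothetical token type; a stand-in, not llama.cpp's actual API.
using Token = int;

// Collects tokens until the sampler returns eos_id or n_len tokens
// have been produced, whichever comes first. n_len acts only as a
// safety cap; a normal reply ends at the EOS check.
std::vector<Token> generate(const std::function<Token()>& sample_next,
                            Token eos_id, int n_len) {
    std::vector<Token> out;
    while (static_cast<int>(out.size()) < n_len) {
        Token tok = sample_next();
        if (tok == eos_id) {
            break;  // the model signalled the end of its reply
        }
        out.push_back(tok);
    }
    return out;
}
```

With a loop of this shape, nlen can be set generously, since a reply that ends with EOS stops early regardless of the cap. If the EOS check never fires, one thing worth verifying is that the ID being compared really is the EOS token of the loaded model.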

@ggerganov Yep, I checked for the EOS token, but it does not work. I am digging into it now.

AnswerZhao · Mar 20 '24 09:03

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] · May 05 '24 01:05