llama.cpp
read chat prompts from a template file
This is a simple improvement to the chat-13B example which reads the prompts from a text file instead of inlining them in the script. I've used this with Vicuna 13B with excellent results.
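For a sense of the mechanism, here is a minimal sketch of a script along these lines, assuming the template files use [[USER_NAME]] / [[AI_NAME]] placeholders; the default paths and flags below are illustrative, not necessarily what this PR contains:

#!/bin/bash
# Defaults can be overridden from the environment, as in the test
# invocations further down.
MODEL="${MODEL:-./models/13B/ggml-model-q4_0.bin}"
USER_NAME="${USER_NAME:-USER}"
AI_NAME="${AI_NAME:-ChatLLaMa}"
PROMPT_TEMPLATE="${PROMPT_TEMPLATE:-./prompts/chat.txt}"

# Fill the name placeholders from the template into a temporary
# prompt file that main can read with -f.
PROMPT_FILE="$(mktemp -t llamacpp_prompt.XXXXXXX.txt)"
sed -e "s/\[\[USER_NAME\]\]/$USER_NAME/g" \
    -e "s/\[\[AI_NAME\]\]/$AI_NAME/g" \
    "$PROMPT_TEMPLATE" > "$PROMPT_FILE"

# Interactive chat: return control to the user whenever the model
# emits the user's turn marker.
./main -m "$MODEL" -f "$PROMPT_FILE" -i -n 256 --color \
    --reverse-prompt "${USER_NAME}:"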
I think the nice thing about this example is that it basically creates a chatbot out of the standard LLaMA model. If you have the official models, there's nothing else to download or figure out to have a back-and-forth chat.
I probably prefer a single-file example people can edit that also works with a base model. That said, I'm not against breaking out the prompt and passing it in separately, but it's going to be a bit of work keeping the same chat in three different formats, and then maybe adding more formats as new models come along.
If we did want to support prompts for different chat models, we might want to just have the chat script also convert regular chat-style conversations to alpaca/vicuna-1.0 style prompting (with something like sed 's/^\(\[\[.*\]\]:\)/### \1/' prompt.txt), and then prepend the "infinite length chat" stuff when using a standard LLaMA model, or somehow mark extended chat that can be cut out for Vicuna or GPT-like models.
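To make that concrete, roughly what such a conversion would do (the sample template line is invented for illustration):

sed 's/^\(\[\[.*\]\]:\)/### \1/' prompt.txt
# turns a chat-style template line such as
#   [[USER_NAME]]: Hello, how are you today?
# into the alpaca/vicuna-1.0 style
#   ### [[USER_NAME]]: Hello, how are you today?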
Edit: Keep in mind there's also a chat-13B.bat which gets left behind from these changes and could particularly benefit from prompt separation.
This has now been rebased onto master.
N.B. occasionally with this patch, encountering the reverse prompt does not halt text generation. I believe this is triggered by the trailing whitespace that I've added to the reverse prompt.
Is this expected? If so, I can roll back the whitespace addition. It is a slightly better user experience to include the whitespace at the end, but not if it causes the aforementioned issue.
I tested this PR with the following invocations:
MODEL=./models/ggml-vicuna-13b-1.1-q4_0.bin ./examples/chat-13B.sh
MODEL=./models/ggml-vicuna-13b-1.1-q4_0.bin USER_NAME=USER AI_NAME=VICUNA PROMPT_TEMPLATE=./prompts/chat-with-vicuna-v1.txt ./examples/chat-13B.sh
Yes, this is expected. There should be no whitespace at the end. See this comment for more information: https://github.com/ggerganov/llama.cpp/pull/1297#issuecomment-1533314364
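For illustration, the difference amounts to something like the following (a sketch only; the exact flags in the script may differ). The --in-prefix flag can supply the space before the user's typed input, so dropping the trailing whitespace from the reverse prompt doesn't have to cost the nicer user experience:

# Trailing space in the reverse prompt - can fail to halt generation
# (see the linked comment above):
./main ... --reverse-prompt "${USER_NAME}: "

# No trailing whitespace - the space before the user's input is
# supplied via --in-prefix instead:
./main ... --reverse-prompt "${USER_NAME}:" --in-prefix ' '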
@ggerganov thank you, I've fixed the whitespace in the reverse prompt.
I think this is ready to merge now.