
Modifying chat.cpp to implement a ChatGPT-like conversational context memory

Open MalekWahidi opened this issue 1 year ago • 8 comments

Is there any trivial way the code in chat.cpp could be tweaked to prepend each submitted prompt with the most recent couple of prompts and answers (or would this make the input sequence too large for the model)? That way alpaca could have some contextual understanding of the previous interactions in the conversation and reply accordingly, as in ChatGPT. I would have liked to try it myself, but I don't have enough expertise in C++ to attempt this. Any ideas or references?

MalekWahidi avatar Mar 29 '23 15:03 MalekWahidi

If needed, I posted in this related issue a prompt template that handles the conversational context. And about C++, I can't help unfortunately.

SamuelTallet avatar Mar 29 '23 21:03 SamuelTallet

I can't tell. Are you saying something needs to be written in the chat.exe app to handle the context? I didn't know if you were brainstorming or if this was something already implemented.

betolley avatar Apr 02 '23 23:04 betolley

I'm confused by the original request. Are you saying it needs prior context? If so, I'm shocked, because I've been carrying on contiguous conversations in 30B with no issue wrt conversation context. Is it chatgpt level? Nope. But it's not missing.

Terristen avatar Apr 03 '23 02:04 Terristen

I'm confused by the original request. Are you saying it needs prior context? If so, I'm shocked, because I've been carrying on contiguous conversations in 30B with no issue wrt conversation context. Is it chatgpt level? Nope. But it's not missing.

Really? My first few chats with the 7B model revealed a lack of conversational context awareness. Maybe I should just experiment more with it.

MalekWahidi avatar Apr 03 '23 16:04 MalekWahidi

You can't really compare the conversational performance of 7B to 30B. 7B just isn't very good: it can answer some questions but doesn't have the capability to hold context well. 30B, with the same chat.cpp, is considerably better, but still not great. The issues you're having are probably more about the model than the C++ code in the project. That's what I'm trying to say. Also, from personal experience, 13B is actually worse than 7B... just a fair warning.

Terristen avatar Apr 03 '23 16:04 Terristen

@Terristen Which CLI args do you use, please?

SamuelTallet avatar Apr 03 '23 16:04 SamuelTallet

@Terristen Which CLI args do you use, please?

I wish I could tell you right now, but I'm mid-build on a new bigger/badder machine, so I don't have access to my latest .bat file. I definitely run -t 20 and turn the temperature up to 1.35, if I recall. As for the others, I've not done a lot of tuning, but I think I dial the repeat-last window up to something higher than default. My ongoing issue is running out of memory after about 25 prompts, though up to that point it's pretty cogent.

Terristen avatar Apr 03 '23 21:04 Terristen

I'm using LLaMA 65B and it remembers our conversation perfectly. LLaMA 7B is too simple to remember the conversation. 65B is really good...

My parameters are not fancy ...

main -m models/65B/ggml-model-q4_0.bin -t 16 -n 2048 --keep 48 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt

[Screenshot of an example conversation, 2023-04-23]

alpaca.cpp is outdated anyway... it was merged into llama.cpp.

mirek190 avatar Apr 23 '23 21:04 mirek190