woheller69
I have tried several models and do not get garbage. I am on llama-cpp-python 0.2.74, updated yesterday.
Trying to save messages using `self.llama_cpp_agent.chat_history.message_store.save_to_json("msg.txt")` gives `TypeError: Object of type Roles is not JSON serializable`.
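For what it's worth, this looks like the standard behaviour of Python's `json` module when it hits an `Enum`. A minimal sketch of a workaround, assuming `Roles` is an `Enum` (I have not checked the actual class in the library), would be a custom encoder:

```python
import json
from enum import Enum

class Roles(Enum):
    # stand-in for the library's Roles enum (assumption)
    user = "user"
    assistant = "assistant"

class EnumEncoder(json.JSONEncoder):
    """Serialize Enum members via their value so json.dump does not choke."""
    def default(self, obj):
        if isinstance(obj, Enum):
            return obj.value
        return super().default(obj)

messages = [{"role": Roles.assistant, "content": "Hello"}]
print(json.dumps(messages, cls=EnumEncoder))  # role becomes the plain string "assistant"
```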
Saving messages now works, but when using it I find that adding a message does not work anymore. When interrupting inference manually, see #47, I am adding the partial message...
I found I can add it with `self.llama_cpp_agent.chat_history.get_message_store().add_assistant_message(self.model_reply)`. But will it be used in the follow-up conversation then?
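To make the interrupt case concrete, here is roughly what I do (only a sketch; `token_stream` and `stop_requested` are placeholders for whatever streaming iterator and cancel flag the app uses, only the `add_assistant_message` call is taken from above):

```python
def stream_with_interrupt(agent, token_stream, stop_requested):
    """Accumulate streamed chunks; on interrupt, store the partial reply.

    `agent` is assumed to be a LlamaCppAgent instance; `token_stream` and
    `stop_requested` stand in for the real streaming iterator / cancel flag.
    """
    reply = ""
    for chunk in token_stream:
        if stop_requested():
            break
        reply += chunk
    # The call quoted above: add the (possibly partial) answer to the store.
    agent.chat_history.get_message_store().add_assistant_message(reply)
    return reply
```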
Another thing: the ```prompt_suffix``` works nicely, but it is not stored as part of the assistant's message. I think it should be. E.g. using "Sure thing!" as prompt_suffix...
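Until that changes, a possible workaround (just a sketch, reusing the message store call from above; the names are placeholders) is to prepend the suffix yourself before storing the reply:

```python
def store_reply_with_suffix(agent, prompt_suffix, model_reply):
    """Sketch: store prompt_suffix together with the generated text,
    so the saved assistant message matches what the model actually 'said'."""
    full_reply = prompt_suffix + model_reply
    agent.chat_history.get_message_store().add_assistant_message(full_reply)
    return full_reply
```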
It seems that the model always needs to evaluate its own previous answer as part of the prompt. In the following examples my own new prompt was quite short every...
Maybe related to this? https://github.com/abetlen/llama-cpp-python/issues/893#issuecomment-1868070256 My guess is that the chat template differs from that used in the model response (maybe just a \n or whatever) and it therefore...
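To illustrate the guess: if the prompt rebuilt for the next turn differs from what was actually sent last time, even by a single \n, the shared token prefix ends at that point and everything after it has to be evaluated again. A toy sketch of the idea (plain Python, no library calls):

```python
def common_prefix_len(cached_tokens, new_prompt_tokens):
    """Length of the shared prefix that a KV cache could reuse."""
    n = 0
    for a, b in zip(cached_tokens, new_prompt_tokens):
        if a != b:
            break
        n += 1
    return n

# Toy example: the rebuilt history uses "\n\n" where the model produced "\n".
cached  = ["<start>", "Hello", "\n",   "World"]
rebuilt = ["<start>", "Hello", "\n\n", "World"]
print(common_prefix_len(cached, rebuilt))  # 2 -> everything after token 2 is re-evaluated
```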
I provided fixes for the chat templates in #73. But it seems the model's answer, e.g. in the case of GEMMA_2, contains the right number of "\n\n". Are you stripping these...
It is not about a keyword. If a long text is generated and it goes in the wrong direction, I want to stop it without losing the context by killing the...
I need this for a local model, just in case this makes a difference.