godot-llm
LLM shuts down when the context is full, instead of clearing the context to 0 or to n_keep and continuing to run
Hello :)
When the context gets full, the LLM shuts down instead of clearing the context to 0 or to n_keep so that it keeps running. This can be seen by checking the VRAM used by the system (it stops using any VRAM at all) and by checking "gd_llama.is_running()" and "gd_llama.is_waiting_input()", which both return false.
I tried multiple combinations of the following parameters, and none of them made a difference: gd_llama.n_keep, gd_llama.instruct, gd_llama.interactive, gd_llama.context_size, gd_llama.n_predict.
The existence of the n_keep parameter suggests that keeping the LLM running when the context gets full should already be implemented, so I suppose this is a bug.
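To make the expected behavior concrete, here is a minimal Python sketch of what clearing the context to n_keep could look like. It is not godot-llm code: the function name `shift_context` is hypothetical, and the policy of dropping the oldest half of the non-kept tokens is an assumption made only for illustration.

```python
from typing import List


def shift_context(tokens: List[int], context_size: int, n_keep: int) -> List[int]:
    """Hypothetical helper: free room in a full context window.

    Keeps the first n_keep tokens (for example the initial prompt) and drops
    the oldest half of the remaining tokens. The "drop half" policy is an
    assumption for this sketch, not confirmed godot-llm behavior.
    """
    if len(tokens) < context_size:
        return tokens  # still room in the context, nothing to do

    # Clamp n_keep so that at least one token can always be discarded.
    n_keep = max(0, min(n_keep, context_size - 1))
    n_left = len(tokens) - n_keep
    n_discard = max(1, n_left // 2)

    # Keep the protected prefix, skip the discarded span, keep the newest tokens.
    return tokens[:n_keep] + tokens[n_keep + n_discard:]


window = list(range(8))  # a context of 8 tokens that is now full
print(shift_context(window, context_size=8, n_keep=2))  # [0, 1, 5, 6, 7]
print(shift_context(window, context_size=8, n_keep=0))  # [4, 5, 6, 7]
```

With n_keep set to 0, nothing from the start is preserved, but generation could still continue with the most recent tokens instead of the LLM shutting down.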
Thank you very much.