Added Detection against self responses
The models like writing "### Instruction:" from time to time. with this modification their generation gets interrupted as soon as they type ### effectively stopping them from generating their own prompts. Because this kills the llama process this also saves resources and isn't just a visual change.
Here is an example response before this change:

Here is the same prompt and the (disabled) debug output from docker showing the changes in action:
If https://github.com/nsarrazin/serge/pull/143 is merged in, then your generate.py changes won't be valid
@DVSProductions Can you fix the merge conflicts?
I've resolved the merge conflicts but the Docker push seems to fail with a 403? don't know that to do about that
@DVSProductions Seems like another bug, for some reason it fails with external branches
@DVSProductions I will merge this soon, trying to fix a last little bug with the CI :-)
@gaby How are things going?
@fishscene This branch has changes that are not compatible with the migration to Redis.
It's up to @DVSProductions to fix the merge conflicts
Thanks for the update, mate!
@DVSProductions Do you want to fix the conflicts or close this?