Bruce MacDonald
Thanks for the script in the report. I've reproduced this and found what's causing the issue; working on getting to the root cause now.
We have a mitigation in for the next release that disables prompt caching: #2018. I'll follow up on why prompt caching causes this in #2023. Thanks to everyone for the reports.
Behavior here will be improved by #2221; working on getting that unblocked now.
Hi @johnlarkin1, thanks for sharing the logs. You can see the issue here: ` 5985.11 / 5461.34), warning: current allocated size is greater than the recommended max working set size`.... Ollama is trying to allocate more memory (5985.11) than the recommended max working set size (5461.34), so the model doesn't fit in the memory available to your GPU.
No worries, I'm going to think of a way to communicate these errors better. It's not obvious right now.
@yeshwanth1312 this is in the planning phase now, but it isn't confirmed for a release yet. I'll update you when it is confirmed.
There is a lot of overlap between this PR and #2885; you might want to build this change from that branch.
Hi @CtrlAiDel, this is a pretty common problem. The issue is the prompt; try adding a system message telling the model to respond in JSON.

```json
{
  "model": "llama2",
  "stream": false,
  "format": "json",
  ...
```
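For reference, a complete request could look something like the sketch below (the system wording is just an example, and I'm assuming the default local server on port 11434):

```sh
# POST to the local Ollama server's /api/generate endpoint.
# "format": "json" constrains the output to valid JSON, and the
# system message nudges the model toward the shape you want.
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "stream": false,
  "format": "json",
  "system": "Respond only with valid JSON.",
  "prompt": "List three colors as a JSON array under the key \"colors\"."
}'
```

It's also important to instruct the model to use JSON in the prompt or system message itself; with `format` alone, the model may generate large amounts of whitespace.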
Hi @AI-Guru, is `model.q5_k_m.gguf` in the same directory as your Modelfile, and are you on the latest version of Ollama? I just tested this out and it ran for me....
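For anyone else hitting this, here's the layout I'd expect to work (a sketch; `mymodel` is just an example name):

```sh
# Keep the Modelfile and the weights in the same directory and run
# the commands from there:
#   ./Modelfile          containing the line:  FROM ./model.q5_k_m.gguf
#   ./model.q5_k_m.gguf
#
# Build the model from the Modelfile, then run it:
ollama create mymodel -f Modelfile
ollama run mymodel
```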
What error do you see? Is the host accessible on that port?
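One quick way to check, assuming the default Ollama port 11434 (adjust the host and port to your setup):

```sh
# The root endpoint replies "Ollama is running" when the server is
# reachable; replace <host> with your server's address.
curl -v http://<host>:11434/
```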