truncle

1 comment from truncle

> Have you tried adjusting the `-b` prompt-processing batch size? I believe IPEX-LLM's llama.cpp build defaults it to 4096, which is rather memory-intensive; the large default allows for faster prompt processing...
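As a minimal sketch of the suggestion above: llama.cpp's `-b` (`--batch-size`) flag controls how many tokens are processed per batch during prompt ingestion, so lowering it trades prompt-processing speed for lower peak memory use. The model path and prompt below are placeholders, and the exact binary name may differ between llama.cpp builds.

```shell
# Hypothetical invocation: lower the prompt-processing batch size
# (e.g. from a 4096 default) to reduce memory pressure, at some
# cost in prompt-processing throughput.
./llama-cli -m /path/to/model.gguf -b 512 -p "Your prompt here"
```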