Pascal
My proxy is just an intermediate OpenAI-compatible client: it doesn’t change or reinterpret tool-calling logic, it simply forwards requests and SSE chunks. I’m keeping it for now, but I’m...
Special characters (including Unicode and emoji) pass correctly in tool calls. But it doesn't like the \u2013 escape!
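For reference, `\u2013` is the JSON escape for the en dash (U+2013). The sketch below does not reproduce the problem reported above; it only shows, with Python's standard `json` module, that the escaped form and the literal character decode to the same string, so a failure on one but not the other points at whatever sits between encoder and decoder. The `arguments` payload is made up for illustration.

```python
import json

EN_DASH = "\u2013"  # U+2013 EN DASH

# Hypothetical tool-call arguments, invented for this example.
arguments = {"query": f"pages 10{EN_DASH}20 🙂"}

# Default json.dumps escapes every non-ASCII character as \uXXXX.
escaped = json.dumps(arguments)

# ensure_ascii=False keeps the raw characters in the output instead.
literal = json.dumps(arguments, ensure_ascii=False)

# Both encodings must decode back to the same Python object.
same = json.loads(escaped) == json.loads(literal) == arguments
print(escaped)
print(literal)
print(same)
```

Comparing the raw bytes a client sends against what the server logs for each form is one way to narrow down where the escape gets mishandled.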
Turns out the entire Granite 4 detection issue was just missing this tiny condition block! Unbelievable. I’ll test it thoroughly and add it to the validation suite! ``` (root|~/llama.cpp.pascal)...
https://huggingface.co/ibm-granite/granite-4.0-h-small?chat_template=default Granite actually uses plain XML-style tags (...) instead of the Llama-style . If we don’t properly distinguish the two formats, the Granite condition will end up matching Hermes as...
Yeah, I'm aware the chat format detection is keyword-based in template, not driven by the message fragments 🙂 What I'm trying to find now is a reproducible test that clearly...
No patch: ``` srv params_from_: Chat format: Hermes 2 Pro ``` ``` curl https://www.serveurperso.com/ia/v1/chat/completions \ -H "Content-Type: application/json" \ -H "Accept: text/event-stream" \ -d '{ "model": "MoE-Granite-4.0-h-small-32B", "stream": true,...
It’s a bit tricky to prove, since the responses look almost identical: the only clear evidence I have is the template detection line and the change in prompt length:...
> [@ServeurpersoCom](https://github.com/ServeurpersoCom) don't u think this could be added to [#16335](https://github.com/ggml-org/llama.cpp/pull/16335)? Yes, the model selector could definitely evolve into a more complete system, but as it stands it would need...
> > @pwilkin any chance to buy you a coffee?(Paterson etc.) so community able to donate for your efforts. Thank you! > > Added a buymeacoffee link to my profile...
Huge respect for grinding through all the quirks of Qwen3-Next integration. It’s amazing to see real output showing up already!