Results 185 comments of Pascal

Toolcall debug on SvelteUI with your #16932 + #16618 :) Custom JSON : ``` { "tools": [ { "type": "function", "function": { "name": "simple_addition_tool", "description": "A dummy calculator tool used...

> @ServeurpersoCom The problem is that I added some code that makes it fall back to llama.cpp’s original parser when there are no tools, so the new parser is never...

I just realized this, and it seems strange: shouldn’t --reasoning-format none completely bypass any parsing logic instead of still going through it? It’s meant to be the raw passthrough mode...

> @ServeurpersoCom My understanding of `--reasoning-format none` is that it simply places the reasoning content directly into the chat messages, while still keeping tool calls properly parsed and handled. >...

> As an active user of llama.cpp and a developer building products around it, I don't care how hacky the template parser is. If I can't call tools properly with...

I'm interested in the server mode because I use sd.cpp to create img2img video, and I need to reload the model each time https://github.com/user-attachments/assets/05d974bf-af68-4397-9d98-d02f539d044b

I’ve experimented with splitting the current server.cpp monolith into smaller, well-scoped units: core.cpp -> core logic, model execution, scheduling, slots, etc. core.hpp -> public interfaces and shared structures http.cpp ->...

> A first iteration could be to move `cpp-httplib` into a new `common/http.cpp` (instead of having only `http.h`) and add a minimal abstraction for both downloading and server functionality ?...

My tool-calling works fine on MoE-Granite-4.0-h-small-32B, but since I currently use a custom proxy that handles the parsing myself, I’d much rather understand and leverage the native tool-calling logic in...

The current Granite codepath in llama.cpp is incomplete: template detection fails and falls back to the Hermes 2 Pro parser. When this happens, the runtime applies Hermes-style parsing rules to...
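To illustrate the kind of fallback described above, here is a minimal Python sketch. It is not llama.cpp's actual code. The marker table, the `detect_format` and `parse_hermes` helpers, and the `<|other_call|>` tag are all hypothetical names made up for the example. The only assumption taken from the Hermes 2 Pro convention is that tool calls are wrapped in `<tool_call>...</tool_call>` tags.

```python
import json
import re

# Hypothetical marker table: placeholder strings, NOT llama.cpp's real detection rules.
KNOWN_FORMATS = {
    "format_a": "<|format_a_tools|>",
}


def detect_format(template: str) -> str:
    """Return the first format whose marker appears in the chat template."""
    for name, marker in KNOWN_FORMATS.items():
        if marker in template:
            return name
    # Nothing matched: fall back to the Hermes-style parser.
    return "hermes_2_pro"


# Hermes-style tool calls: a JSON object wrapped in <tool_call> tags.
HERMES_CALL = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def parse_hermes(output: str):
    """Split model output into (plain content, list of parsed tool calls)."""
    calls = [json.loads(raw) for raw in HERMES_CALL.findall(output)]
    content = HERMES_CALL.sub("", output).strip()
    return content, calls


# An unrecognized template silently ends up on the fallback parser.
fmt = detect_format("some template without any known marker")

# A call emitted in a different (hypothetical) syntax is not recognized:
# it comes back as ordinary text, with no tool calls extracted.
foreign = '<|other_call|>[{"name": "add", "arguments": {"a": 1, "b": 2}}]'
foreign_content, foreign_calls = parse_hermes(foreign)

# The same call in Hermes-style syntax is extracted as expected.
native = '<tool_call>{"name": "add", "arguments": {"a": 1, "b": 2}}</tool_call>'
native_content, native_calls = parse_hermes(native)
```

In this sketch the mismatch does not raise an error: the foreign-format call simply leaks through as message content, which is why a silent fallback of this kind can be hard to notice.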