Pascal comments

Results: 185 comments by Pascal

Feature request: Graphical GGUF viewer

I built this as an exercise. Now I’ll be able to add it to my llama.cpp dev server page, so when I click on a .gguf, it’ll launch the backend...

Feature request: Graphical GGUF viewer

Next, I’ll try to visualize the quantization blocks, those little per-group slices (like 32×N tiles), and maybe add some filters to highlight scale or residual patterns. Later, I’ll make it...

Feature request: Graphical GGUF viewer

Built this in one day, as KISS as possible: pure C++/GGML + vanilla JS. I still need to transfer FP32 weights in binary instead of JSON to squeeze out more...
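To illustrate the JSON-versus-binary point, here is a rough sketch in Python, assuming the weights are a flat list of FP32 values. The function names are hypothetical and not taken from the viewer's code; the actual implementation is C++ and vanilla JS.

```python
import json
import struct


def encode_json(weights):
    """Serialize the weights as a JSON array of numbers (text)."""
    return json.dumps(weights).encode("utf-8")


def encode_binary(weights):
    """Pack the weights as little-endian FP32, 4 bytes per value."""
    return struct.pack(f"<{len(weights)}f", *weights)


def decode_binary(payload):
    """Inverse of encode_binary: raw bytes back to a list of floats."""
    count = len(payload) // 4
    return list(struct.unpack(f"<{count}f", payload))


# Illustrative data only, not real model weights.
weights = [0.1234567 * i for i in range(1024)]
json_size = len(encode_json(weights))
binary_size = len(encode_binary(weights))
```

The binary payload is always exactly 4 bytes per weight, while the JSON text needs one character per digit plus separators, so it is typically several times larger.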

Feature request: Graphical GGUF viewer

> [@ServeurpersoCom](https://github.com/ServeurpersoCom) Very cool! Showing the weight values when hovering on the pixels would be useful.

Sure, on it :) OBS doesn’t capture Firefox tooltips, so when I tried to...

Feature request: Graphical GGUF viewer

I'm noticing some strange artifacts on certain slices of specific models. They look like repeated patterns along one axis, which could either be mathematically expected or a quantization glitch. When hovering...

Feature request: Graphical GGUF viewer

For sure we can run the tiny backend on a HF Space. I just need to optimize the communication between the frontend and/or the streaming layer, avoiding resending data that’s...

Feature request: Graphical GGUF viewer

If we rely only on the GGML public API, the tooltip can safely decode any block using ggml_get_type_traits(type)->to_float, which gives us the FP32 values directly. That works fine and remains...
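For a sense of what a `to_float`-style conversion does for one block, here is a hand-written sketch in Python. It assumes a Q8_0-style layout (one FP16 scale followed by 32 signed 8-bit quants) and is only an illustration of the decoding step, not the GGML call itself; the helper names are hypothetical.

```python
import struct

BLOCK_SIZE = 32               # values per block (assumed Q8_0-style layout)
BLOCK_BYTES = 2 + BLOCK_SIZE  # FP16 scale + 32 int8 quants


def dequantize_block(block):
    """Decode one block to FP32-like Python floats: value = scale * quant."""
    if len(block) != BLOCK_BYTES:
        raise ValueError(f"expected {BLOCK_BYTES} bytes, got {len(block)}")
    # "<e32b" = little-endian half-precision scale, then 32 signed bytes.
    scale, *quants = struct.unpack(f"<e{BLOCK_SIZE}b", block)
    return [scale * q for q in quants]


def make_block(scale, quants):
    """Build a test block; a stand-in for data read from a .gguf tensor."""
    return struct.pack(f"<e{BLOCK_SIZE}b", scale, *quants)


example = dequantize_block(make_block(0.5, list(range(-16, 16))))
```

Other quantization types use different block layouts, which is the reason to go through the type traits rather than hard-coding a decoder like this one per format.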

Webui dynamic config

You're absolutely right — this is the core issue I ran into as well. The current behavior of always sending the full WebUI config overrides any server-side defaults, even when...
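One way to picture the problem, as a minimal sketch and not the approach taken in the PR: if the client sends only the settings the user actually changed, the server-side defaults still apply to everything else. The setting names and helper functions below are hypothetical.

```python
def client_overrides(webui_defaults, user_settings):
    """Keep only the settings that differ from the WebUI's built-in defaults."""
    return {
        key: value
        for key, value in user_settings.items()
        if key not in webui_defaults or webui_defaults[key] != value
    }


def effective_config(server_defaults, overrides):
    """Server defaults apply unless the client explicitly overrides a key."""
    return {**server_defaults, **overrides}


# Hypothetical values for illustration.
webui_defaults = {"temperature": 0.8, "top_k": 40}
user_settings = {"temperature": 0.8, "top_k": 20}   # user changed only top_k
server_defaults = {"temperature": 0.2, "top_k": 40}

overrides = client_overrides(webui_defaults, user_settings)
effective = effective_config(server_defaults, overrides)
```

Here the server's `temperature` of 0.2 survives because the user never touched it, whereas sending the full `user_settings` would have replaced it with 0.8.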

common: Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo)

I’ve reverted my previous PR (reasoning-format-minimax-m2) and merged PR #16932 into my testing-branch16 for isolated testing. I’m running llama-swap with the new XML tool-call parser to check MiniMax-M2 compatibility without...

common: Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo)

> Oh! It seems you’re using non-streaming mode. I can now reproduce your issue with `stream: false`.
>
> Let me dig into what’s happening…

Yes, exactly: it works correctly...