Jaap Buurman

Results: 101 comments by Jaap Buurman

I mean I am more than happy to close it, as it doesn't really impact me since I will run with flash attention if possible. But isn't this still a...

It's also happening for me on a 7900XTX running on ROCm. I have also tried -ngl 0 (i.e., CPU only) and FA enabled/disabled, but all with the same result. Interestingly, the...

Example: ![Image](https://github.com/user-attachments/assets/5769d3a2-f1d6-487e-b84d-ae2b05704d66)

Experiencing the same with the 32b model

I was able to solve the issue by increasing the `num_ctx` parameter. Apparently when the context size is exceeded, the model starts spitting out stuff that looks like training data,...
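
For reference, a minimal sketch of how one might raise `num_ctx` through Ollama's standard HTTP API; the model name, prompt, and context value below are placeholders, not taken from the thread:

```python
import requests

# Sketch: bump num_ctx so long prompts don't overflow the default
# context window, which is what triggers the garbage output described
# above. Model name and num_ctx value here are illustrative only.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5:32b",        # placeholder model
        "prompt": "Hello, how are you?",
        "stream": False,
        "options": {"num_ctx": 8192},  # raise the context size
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["response"])
```

In an interactive `ollama run` session, the same parameter can be set with `/set parameter num_ctx 8192`.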

Ollama pre-release 0.4.0 is available here: https://github.com/ollama/ollama/releases/tag/v0.4.0-rc3 The thing that caught my eye was the following statement: which includes improved vision model caching, model reliability, caching and **stop token detection**...

I am getting the same error with this command: `ollama run hf.co/bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF:IQ4_NL` The IQ4_NL quant does exist in the repo, though, and is a valid, standard quant option: https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF...
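
As a quick sanity check that the quant file really is present in the repo, a short sketch using `huggingface_hub` (a tool not mentioned in the thread) would be:

```python
from huggingface_hub import list_repo_files

# Sketch: list the files in the repo and check that an IQ4_NL quant is
# among them; the repo id is taken from the ollama command above.
files = list_repo_files("bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF")
iq4_nl = [f for f in files if "IQ4_NL" in f]
print(iq4_nl)  # non-empty => the quant exists, so the pull error is on ollama's side
```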

Not sure, is it? I have opened an issue about it here: https://github.com/ollama/ollama/issues/7365 Someone else is having the same issue. IQ4_NL is a quant that should be supported, but it...

Actually, it might not be related to MoE models, but to gpt-oss-120b specifically (either the model architecture or its special quant). If I run Qwen3-30B-A3B q8_0 I get the following...

> [@Mushoz](https://github.com/Mushoz) could you check if [#15363](https://github.com/ggml-org/llama.cpp/pull/15363) give you better speed? I see the same negative scaling at batch sizes 2 & 3, and overall performance is ever so slightly lower...