Lissanro Rayen

Results: 35 comments by Lissanro Rayen

There are already quantized versions that fit 16GB cards (and 24GB cards):
https://huggingface.co/azaneko/HiDream-I1-Full-nf4 (Full version recommends 50 steps)
https://huggingface.co/azaneko/HiDream-I1-Dev-nf4 (28 steps)
https://huggingface.co/azaneko/HiDream-I1-Fast-nf4 (16 steps)
Full, Dev and Fast versions have...
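For reference, a minimal sketch of how one of these nf4 checkpoints might be loaded; it assumes the repo works with diffusers' generic DiffusionPipeline (not confirmed by the comment), and only the model ids and step counts are taken from the links above.

```python
# Sketch: loading an nf4 HiDream checkpoint via diffusers (assumed loading path).
# Adjust the pipeline class/arguments to whatever the model card actually specifies.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "azaneko/HiDream-I1-Fast-nf4",   # Fast variant: ~16 steps recommended
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,          # in case the repo ships custom pipeline code
)
pipe.to("cuda")

image = pipe(
    "a watercolor painting of a lighthouse at dusk",
    num_inference_steps=16,          # 50 for Full, 28 for Dev, 16 for Fast
).images[0]
image.save("hidream_fast_nf4.png")
```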

What helped me was to fill in all the fields, even ones irrelevant to my setup, such as OpenAI Compatible API Key (just put one character in there) and Model ID (I...
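The same placeholder trick applies when talking to an OpenAI-compatible endpoint from code: the client usually rejects an empty API key, while a local backend typically never checks its value. A minimal sketch with the openai Python client; the base URL and model name are placeholders, not taken from the comment.

```python
# Sketch: pointing the openai client at a local OpenAI-compatible server.
# base_url and model are placeholders; api_key just needs to be non-empty.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # placeholder local endpoint
    api_key="x",                          # dummy value, ignored by most local backends
)

reply = client.chat.completions.create(
    model="local-model",                  # placeholder model id
    messages=[{"role": "user", "content": "Hello!"}],
)
print(reply.choices[0].message.content)
```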

I hit exactly the same error when trying to run https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-IQ4_XS. To make sure I had everything up to date, I tried ./update_wizard_linux.sh, but it did not fix it.

I tested with llama.cpp directly (without text-generation-webui), and it worked without the error. Hopefully, this issue can be fixed in text-generation-webui, but until then using llama.cpp for the DeepSeek model...
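The comment above ran the llama.cpp CLI directly; as a different route that also bypasses text-generation-webui while staying in Python, here is a rough sketch using the llama-cpp-python binding (same backend, not the method described in the comment). The GGUF path, context size and GPU offload values are placeholders.

```python
# Sketch: running the DeepSeek GGUF through llama-cpp-python instead of
# text-generation-webui. Path and sizes below are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-V2.5-IQ4_XS.gguf",  # placeholder path to the downloaded GGUF
    n_ctx=8192,          # context window; raise if memory allows
    n_gpu_layers=-1,     # offload all layers to GPU if they fit
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what a GGUF file is."}]
)
print(out["choices"][0]["message"]["content"])
```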

Sure, here it is (I ran out of space on my 8 TB NVMe, so I ended up converting on a slow HDD, which took a while; it just finished). I...

I successfully converted using the new @ubergarm recipe as described at https://huggingface.co/ubergarm/Kimi-K2-Thinking-GGUF in the Q4_X section. It works, but I encountered another issue (mostly broken tool calling); at first I thought...

@ikawrakow I have tried it, but tool calling inside the think block still fails. It just prints XML and then usually fails to generate any actual response, most likely because...

Kimi K2 and Kimi K2 Thinking are completely different models. According to https://huggingface.co/moonshotai/Kimi-K2-Thinking, it can emit multiple tool calls during thinking: > Deep Thinking & Tool Orchestration: End-to-end trained to...

Interesting, but they are very different in practice. For example, Kimi K2 0905 has always been very reliable for me in Roo Code; I have used it daily since its release. As...

My main hope is that it will just be different, the way Kimi K2 is different from DeepSeek V3, for example. While V3 often gave output similar to R1 0528 (especially...