
Results: 2 issues by DeeH0pe

Hello, I don't have "issues" so far; I just wanted to ask if there's an option to add these models into DeepCache? Best regards

Has anyone else noticed a speed boost (while using more VRAM) with these command-line args? Model: GGUF Q4, flags: `--use-zluda --attention-quad --all-in-fp32`. I'm around 3 s/it faster than with bfloat16. ![grafik](https://github.com/user-attachments/assets/a7df89cc-4d28-4b8e-94bd-3a0bc4f263ec)
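
For anyone wanting to reproduce this, a minimal launch sketch. The launcher script name (`webui.sh`) is an assumption and may differ per install; the three flags are the ones quoted in the post above.

```shell
# Launcher name (webui.sh) is an assumption -- substitute your install's entry point.
# --use-zluda       route CUDA calls through ZLUDA (AMD GPUs)
# --attention-quad  use the sub-quadratic ("quad") attention implementation
# --all-in-fp32     run everything in fp32 (more VRAM; faster here per the post)
./webui.sh --use-zluda --attention-quad --all-in-fp32
```

The fp32 flag trading VRAM for speed would be consistent with the reported behavior: avoiding bfloat16 casts can help on hardware/translation layers without fast bf16 paths.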