DeeH0pe
Hello, I don't have "issues" so far. I just wanted to ask: is there an option to add these models into DeepCache? Best regards
Has anyone else noticed a speed boost (at the cost of more VRAM) with these command-line args? Model: GGUF Q4, args: `--use-zluda --attention-quad --all-in-fp32`. I'm around 3 s/it faster than with bfloat16.