stable-diffusion.cpp
optimize ggml_ext_chunk
Commit 0835e5c22727981947eda0f6cfaf16b96b3aed25 broke SD1.5:
| master-408 | 0835e5c |
|---|---|
@wbruna, oh, you're right, I was only looking at the speed.
Testing each version on SD1.5: compared with 59ebdf0, #1079 is almost as fast on Vulkan (about 1.5% slower) and around 9% slower on ROCm. The ggml_ext_chunk suggested above is ~3-4% slower on both backends:
| version | vulkan | rocm |
|---|---|---|
| 59ebdf0 | 2.65s/it | 2.34s/it |
| 347710f (and current master) | 3.65s/it | 3.44s/it |
| ggml_ext_chunk above | 2.75s/it | 2.41s/it |
| #1079 | 2.69s/it | 2.54s/it |
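
For anyone who wants to double-check the percentages, here is a rough sketch that recomputes the slowdown of each version relative to 59ebdf0 from the s/it numbers in the table. The names `timings`, `BASELINE`, and `slowdown_pct` are made up for this example and are not part of stable-diffusion.cpp; the only inputs are the table values.

```python
# Seconds per iteration copied from the table above (lower is better).
timings = {
    "59ebdf0": {"vulkan": 2.65, "rocm": 2.34},
    "347710f (and current master)": {"vulkan": 3.65, "rocm": 3.44},
    "ggml_ext_chunk above": {"vulkan": 2.75, "rocm": 2.41},
    "#1079": {"vulkan": 2.69, "rocm": 2.54},
}

# Hypothetical choice for this sketch: treat 59ebdf0 as the reference point.
BASELINE = "59ebdf0"


def slowdown_pct(version, backend):
    """Percent increase in s/it for `version` relative to the baseline commit."""
    base = timings[BASELINE][backend]
    return (timings[version][backend] / base - 1.0) * 100.0


if __name__ == "__main__":
    for version in timings:
        row = ", ".join(
            f"{backend}: {slowdown_pct(version, backend):+.1f}%"
            for backend in ("vulkan", "rocm")
        )
        print(f"{version:30s} {row}")
```

Since these are single s/it readings, differences of a percent or two are probably within run-to-run noise.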
