Wagner Bruna comments

Results: 84 comments by Wagner Bruna

[Feature] Apple STARFlow and STARFlow-V

Nice. Calling that license "open source" is a bit of a stretch, though 🙂

[Feature] NPU support for backend apart from the existing cpu, vulkan, opencl etc.

I suggest you open a request on the [llama.cpp](https://github.com/ggml-org/llama.cpp) project, since ggml (our backend library) is developed primarily there.

[Bug] Z-Image render time

> Sometime, it goes up to 50s/it. And if launch the exact same command, it go back to 8 to 9s/it

Are you low on VRAM? Memory pressure could cause...

[Feature Request] (Long Term): Detailed documentation of stable-diffusion.cpp and its features

I can confirm that disabling the line @Green-Sky pointed to results in bad (pure noise) images, unless `--guidance` is kept _very_ low (under 0.001 or so). And looks like it's...

[Feature Request] (Long Term): Detailed documentation of stable-diffusion.cpp and its features

Most of the information about `cfg-scale` and negatives is not specific to Chroma; perhaps a better place would be a 'general guidelines' file or section? The `cfg-scale` info applies to any...

[Feature Request] (Long Term): Detailed documentation of stable-diffusion.cpp and its features

Forgot to mention: for Chroma, `diffusion-fa` doesn't work without `chroma-disable-dit-mask`. I don't know if there's a _backend_ that always requires `chroma-disable-dit-mask` (plain Vulkan works fine for me; ROCm needs `clip-on-cpu`,...

[Feature Request] (Long Term): Detailed documentation of stable-diffusion.cpp and its features

> Let me see if I understand this... `heun`, as an "n-step second-order" sampler type at `--steps 8`, takes the same amount of time to process as a "first step"...

[Bug] Performance regression

It happens on my card, too; both with Vulkan and ROCm. Average per-step cost on a cfg 6, 20 step gen; the models only have f16 weights. SDXL 1024x1024: |...

optimize ggml_ext_chunk

0835e5c22727981947eda0f6cfaf16b96b3aed25 broke sd1.5:

| master-408 | 0835e5c |
|---|---|
| | |

optimize ggml_ext_chunk

Testing each version on SD1.5: when compared with 59ebdf0, #1079 seems almost as fast on Vulkan, and around 9% slower on ROCm. The `ggml_ext_chunk` suggested above is ~3-4% slower on...