consistencydecoder

Significant performance problems, see profiler screenshot

Open felix-red-panda opened this issue 1 year ago • 3 comments

[Profiler screenshot: image_2023-11-06_23-07-38]

Running the consistency decoder takes several seconds, and most of that time is spent in a stalled state. Reducing the number of diffusion steps gives no meaningful speedup. Running the code example from the README, the default SD1.5 decoder is ~100x faster.

I'm on PyTorch 2.0.1 on Linux kernel 6.1 with an RTX 3060.
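For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not taken from the repo: `time_call` and its parameters are hypothetical names used only for illustration. The optional `sync` hook is there because CUDA kernels launch asynchronously, so a wall-clock measurement is only meaningful if the device is synchronized before and after the call.

```python
import statistics
import time


def time_call(fn, warmup=1, repeats=5, sync=None):
    """Return the median wall-clock seconds of fn() over `repeats` runs.

    `sync` is an optional zero-argument callable that blocks until pending
    device work has finished. On a CUDA device this would typically be
    torch.cuda.synchronize (assumption: PyTorch is installed).
    """

    def run_once():
        if sync is not None:
            sync()  # make sure earlier work is not counted in this run
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the work launched by fn() to finish
        return time.perf_counter() - start

    # Warm-up runs absorb one-off costs such as lazy initialization.
    for _ in range(warmup):
        run_once()

    return statistics.median(run_once() for _ in range(repeats))


# Hypothetical usage on a GPU (not run here; `decode` stands for whatever
# zero-argument callable wraps the decoder call you want to measure):
#   import torch
#   seconds = time_call(decode, warmup=2, repeats=10, sync=torch.cuda.synchronize)
```

Timing both decoders with the same helper and the same latent should make the gap described above easy to confirm or rule out on other hardware.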

felix-red-panda, Nov 06 '23 22:11