Leverage DeepSpeed for much faster inference
https://github.com/microsoft/DeepSpeed
DeepSpeed delivers a considerable reduction in inference time on both single-GPU and multi-GPU setups. It is the state of the art in inference optimization and is very easy to use: it takes only a few lines of configuration!
Moreover, it is exposed via Hugging Face Accelerate: https://github.com/CompVis/stable-diffusion/issues/180#issuecomment-1288059199
This would need to be optional, as DeepSpeed does not work properly on Windows.
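For illustration, a minimal (untested) sketch of what the integration could look like. `deepspeed.init_inference` is DeepSpeed's actual inference entry point, but the model and generation parameters here are just placeholders:

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; any supported Hugging Face model works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The "few lines of configuration": wrap the model with DeepSpeed's
# inference engine. Kernel injection replaces supported layers with
# DeepSpeed's fused CUDA kernels and moves the model to the GPU.
model = deepspeed.init_inference(
    model,
    mp_size=1,                       # tensor-parallel degree (1 = single GPU)
    dtype=torch.half,
    replace_with_kernel_inject=True,
)

inputs = tokenizer("DeepSpeed makes inference", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```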
Note: DeepSpeed-MII would be better: https://github.com/microsoft/DeepSpeed-MII
See the txt2img benchmark for Stable Diffusion (v1?): https://github.com/microsoft/DeepSpeed-MII/tree/main/examples/benchmark/txt2img
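For reference, a rough sketch of what the txt2img path looks like with the 2022-era MII API (`mii.deploy` / `mii.mii_query_handle`). Exact config keys and defaults may differ between MII versions, and gated Stable Diffusion weights may additionally require an HF auth token:

```python
import mii

# Stand up a local MII deployment of Stable Diffusion (model name assumed).
mii.deploy(
    task="text-to-image",
    model="CompVis/stable-diffusion-v1-4",
    deployment_name="sd_deploy",
    mii_config={"dtype": "fp16"},
)

# Query the deployment, then tear it down.
generator = mii.mii_query_handle("sd_deploy")
result = generator.query({"query": "a corgi wearing a top hat"})
mii.terminate("sd_deploy")
```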
Hey @dmarx, would I be able to try adding this enhancement? I haven't really contributed to open-source projects before, but I would love to get involved and start building something.
Has anyone tried doing this locally? Any guides or pointers would be appreciated. I don't want to use Hugging Face.
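As a rough starting point for a fully local run against the CompVis repo code (untested sketch; the paths are placeholders, and whether DeepSpeed's kernel injection actually fuses anything in the CompVis UNet, rather than being a no-op, is an assumption to verify):

```python
import torch
import deepspeed
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config

# Load the model from a local config + checkpoint, no Hugging Face involved.
config = OmegaConf.load("configs/stable-diffusion/v1-inference.yaml")
model = instantiate_from_config(config.model)
ckpt = torch.load("models/ldm/stable-diffusion-v1/model.ckpt", map_location="cpu")
model.load_state_dict(ckpt["state_dict"], strict=False)
model = model.half().cuda().eval()

# The denoising UNet dominates sampling time, so wrap just that module.
model.model.diffusion_model = deepspeed.init_inference(
    model.model.diffusion_model,
    mp_size=1,
    dtype=torch.half,
    replace_with_kernel_inject=True,
)

# ...then run the usual sampling loop (e.g. scripts/txt2img.py) unchanged.
```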