stable-diffusion-webui [Performance 4/6] Precompute is_sdxl

[Performance 4/6] Precompute is_sdxl_inpaint flag

Open huchenlei opened this issue 1 year ago • 2 comments

Description

According to https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/716#discussioncomment-9348247 , the check for whether the model is an SDXL inpaint model calls state_dict on every sampling step. state_dict is a very expensive function that costs ~40ms per call. This overhead applies to all inference regardless of model type, which is wasteful.

This PR precomputes the is_sdxl_inpaint flag so that we do not call state_dict on every sampling step.
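To illustrate the idea, here is a minimal sketch of moving an expensive check out of a per-step loop. It is not the code in this PR: ToyModel, detect_sdxl_inpaint, sample_slow, sample_fast, and the dictionary key are hypothetical names made up for the example, and the "9 input channels means inpaint" rule is an assumption used only to give the check something to test.

```python
class ToyModel:
    """Hypothetical stand-in for a model wrapper; not the webui class."""

    def __init__(self, in_channels):
        self.in_channels = in_channels
        self.state_dict_calls = 0  # counts how often the expensive call runs

    def state_dict(self):
        # Stand-in for the expensive call. A real state_dict walks every
        # parameter of the model; here we only record that it was invoked.
        self.state_dict_calls += 1
        return {"input_blocks.0.0.weight_shape": (320, self.in_channels, 3, 3)}


def detect_sdxl_inpaint(model):
    # Assumption for illustration: an inpaint model takes 9 input channels.
    shape = model.state_dict()["input_blocks.0.0.weight_shape"]
    return shape[1] == 9


def sample_slow(model, steps):
    # Before: the check (and so state_dict) runs once per sampling step.
    return [detect_sdxl_inpaint(model) for _ in range(steps)]


def sample_fast(model, steps):
    # After: compute the flag once, then reuse it on every step.
    is_sdxl_inpaint = detect_sdxl_inpaint(model)
    return [is_sdxl_inpaint for _ in range(steps)]


if __name__ == "__main__":
    slow_model, fast_model = ToyModel(9), ToyModel(9)
    sample_slow(slow_model, 20)
    sample_fast(fast_model, 20)
    print(slow_model.state_dict_calls, fast_model.state_dict_calls)
```

In the sketch, the per-step version calls state_dict once for each of the 20 steps, while the precomputed version calls it once in total. At roughly 40ms per call, that difference is the overhead described above.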

Original PR that introduced this change: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14390

Screenshots/videos:

Checklist:

[x] I have read contributing wiki page
[x] I have performed a self-review of my own code
[x] My code follows the style guidelines
[x] My code passes tests

May 15 '24 20:05 huchenlei

Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad faster than Forge, but it takes a bit more time to start generating.)

May 16 '24 04:05 Panchovix

There are 2 more PRs to come, but they are not as straightforward, so they might take longer to prepare. I am also merging all the performance fixes into https://github.com/huchenlei/stable-diffusion-webui/tree/all_perf so you don't need to apply these PRs one by one.

May 16 '24 14:05 huchenlei
