stable-diffusion-webui
[Performance 4/6] Precompute is_sdxl_inpaint flag
Description
According to https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/716#discussioncomment-9348247 , the check for whether the model is an SDXL inpaint model calls state_dict on every sampling step. state_dict is a very expensive function that costs ~40ms per call. This overhead applies to all inference regardless of model type, which is dumb.
This PR precomputes the is_sdxl_inpaint flag once so that we do not call state_dict on every sampling step.
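A minimal sketch of the idea, not the actual diff: the class `FakeUNet`, the key `input_blocks.0.0.weight_shape`, and the helpers `detect_inpaint`, `load_model`, `sample_slow` and `sample_fast` are all hypothetical names made up for illustration. It also assumes the inpaint check looks for 9 input channels on the first conv layer. The point is only that the expensive lookup moves from the per-step path to a one-time step at load.

```python
class FakeUNet:
    """Hypothetical stand-in for a diffusion model; not the webui's real class."""

    def __init__(self, in_channels):
        self.in_channels = in_channels
        self.state_dict_calls = 0  # counts how often the expensive call runs

    def state_dict(self):
        # Stand-in for the expensive call; a real model walks every parameter.
        self.state_dict_calls += 1
        return {"input_blocks.0.0.weight_shape": (320, self.in_channels, 3, 3)}


def detect_inpaint(model):
    # Assumption: an inpaint UNet takes 9 input channels
    # (4 latent + 4 masked-image latent + 1 mask).
    shape = model.state_dict().get("input_blocks.0.0.weight_shape")
    return shape is not None and shape[1] == 9


def sample_slow(model, steps):
    # Before: the check (and therefore state_dict) runs on every sampling step.
    return [detect_inpaint(model) for _ in range(steps)]


def load_model(in_channels):
    # After: compute the flag once, when the model is loaded.
    model = FakeUNet(in_channels)
    model.is_sdxl_inpaint = detect_inpaint(model)
    return model


def sample_fast(model, steps):
    # Each step only reads a cached boolean attribute.
    return [model.is_sdxl_inpaint for _ in range(steps)]
```

With 20 steps, `sample_slow` triggers 20 `state_dict` calls, while `load_model` followed by `sample_fast` triggers one.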
Original PR that introduced this change: https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14390
Screenshots/videos:
Checklist:
- [x] I have read contributing wiki page
- [x] I have performed a self-review of my own code
- [x] My code follows the style guidelines
- [x] My code passes tests
Just wanted to comment that all these performance PRs are amazing! I get pretty similar speeds vs Forge on an RTX 4090. (It seems that A1111 with these PRs actually generates a tad bit faster vs Forge, but the former takes a bit more time to start generating)
There are 2 more PRs to come, but they are not as straightforward, so they might take longer to prepare. I am also merging all the performance fixes into https://github.com/huchenlei/stable-diffusion-webui/tree/all_perf so you don't need to apply these PRs one by one.