stable-diffusion-webui
[Performance 3/6] Disable nan check by default
Description
According to https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/716#discussioncomment-9349044, the NaN check has ~20 ms/it of overhead. That overhead is large enough that the option should only be used for debugging purposes.
Screenshots/videos:
Checklist:
- [x] I have read contributing wiki page
- [x] I have performed a self-review of my own code
- [x] My code follows the style guidelines
- [x] My code passes tests
Can the NaN check be enabled only for the VAE?
The NaN check is not great, but disabling it has a lot of implications. For example, the VAE fallback will no longer work.
Long term, it may be desirable to load the VAE as bfloat16 instead.
As it is now, this will break the automatic switch to a full-precision VAE.
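To make the dependency concrete, here is a minimal sketch of why a precision fallback stops working once the NaN check is off. It is not the webui implementation: `NanDetected`, `check_for_nans`, `fake_decode`, and `decode_with_fallback` are hypothetical names, and plain Python lists stand in for tensors.

```python
import math


class NanDetected(Exception):
    """Hypothetical stand-in for the error raised by a NaN check."""


def check_for_nans(values, enabled=True):
    # Raise only when the check is enabled and at least one value is NaN.
    if enabled and any(math.isnan(v) for v in values):
        raise NanDetected("NaN in decoder output")


def fake_decode(latents, precision):
    # Toy decoder (assumption): pretend half precision overflows on large
    # inputs and produces NaNs, while full precision does not.
    if precision == "half" and max(abs(v) for v in latents) > 100.0:
        return [float("nan")] * len(latents)
    return [v * 0.5 for v in latents]


def decode_with_fallback(latents, nan_check_enabled=True):
    # Try the cheap path first; retry in full precision only if the
    # NaN check reports a problem.
    try:
        out = fake_decode(latents, "half")
        check_for_nans(out, nan_check_enabled)
        return out, "half"
    except NanDetected:
        return fake_decode(latents, "full"), "full"


fixed, used = decode_with_fallback([1.0, 200.0])
broken, used_unchecked = decode_with_fallback([1.0, 200.0], nan_check_enabled=False)
```

With the check enabled, the second attempt runs in full precision and returns usable values. With it disabled, nothing signals the failure, so the NaN output is returned as if it were valid.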
Can we maybe just get the needed performance improvement by checking a single element instead of the whole tensor? Since there are batch norms, a single value becoming NaN dooms the whole tensor to become all NaNs.
I pushed 547778b10f25def4e040b81942a2b23295567de3 to dev with this change.
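The reasoning behind the single-element check can be illustrated with a toy example. This is only a sketch, not the code from the commit: `normalize`, `has_nan_full`, and `has_nan_first` are hypothetical helpers, and a plain Python list stands in for a tensor passing through a normalization layer.

```python
import math


def normalize(values):
    # Subtract the mean and divide by the standard deviation, roughly what
    # a normalization layer does across its inputs.
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + 1e-5)
    return [(v - mean) / std for v in values]


def has_nan_full(values):
    # Inspects every element.
    return any(math.isnan(v) for v in values)


def has_nan_first(values):
    # Inspects only the first element.
    return math.isnan(values[0])


clean = normalize([0.5, -1.0, 2.0, 0.25])
poisoned = normalize([0.5, float("nan"), 2.0, 0.25])
```

Because the mean and variance both depend on every input, one NaN input turns every output into NaN, so `has_nan_first(poisoned)` agrees with `has_nan_full(poisoned)`. The shortcut relies on such a layer sitting between the source of the NaN and the point where the check runs. Without one, a first-element check could miss a NaN elsewhere in the tensor.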
Also what tool is being used here for those performance visualizations? I'd like that too.
Edit: it's torch's profiler, visualized in Chrome: https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html
> Can we maybe just get the needed performance improvement by checking a single element instead of the whole tensor? Since there are batch norms, a single value becoming NaN dooms the whole tensor to become all NaNs.
> I pushed 547778b to dev with this change.
I think checking only a single element is a better way to handle this. Thanks for doing that!
It seems that doesn't help: there are still large delays caused by checking even a single item.
But I changed the nan checking to only happen once after all steps are done in 6214aa7d2a84aa2a12962706579a2dba3470fb51, so this is not an issue.
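The difference between checking on every step and checking once at the end can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `run_sampler`, `contains_nan`, `toy_step`, and `bad_step` are hypothetical, and plain Python lists stand in for latents.

```python
import math


def contains_nan(values):
    return any(math.isnan(v) for v in values)


def run_sampler(latent, steps, step_fn, check_each_step):
    # Returns the final latent and the number of NaN checks performed.
    checks = 0
    for i in range(steps):
        latent = step_fn(latent, i)
        if check_each_step:
            checks += 1
            if contains_nan(latent):
                raise ValueError(f"NaN after step {i}")
    if not check_each_step:
        # Single check once all steps are done.
        checks += 1
        if contains_nan(latent):
            raise ValueError("NaN in final latent")
    return latent, checks


def toy_step(latent, i):
    # Toy denoising step (assumption): scale every value down a little.
    return [v * 0.9 for v in latent]


def bad_step(latent, i):
    # Same as toy_step, but injects a NaN at step 3.
    out = toy_step(latent, i)
    if i == 3:
        out[0] = float("nan")
    return out


_, per_step_checks = run_sampler([1.0, 2.0], 20, toy_step, check_each_step=True)
_, final_checks = run_sampler([1.0, 2.0], 20, toy_step, check_each_step=False)
```

In this toy setup the per-step variant performs 20 checks and the final-only variant performs one. A NaN introduced mid-run survives the remaining arithmetic, so the final check still catches it. The trade-off is that the error is reported later and no longer identifies the step where the NaN first appeared.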