stable-diffusion-webui

[Performance 3/6] Disable nan check by default

Open huchenlei opened this issue 1 year ago • 3 comments

Description

According to https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/716#discussioncomment-9349044 , the NaN check has ~20ms/it overhead. The overhead is large enough that the option should only be used for debugging purposes.

Screenshots/videos:

image


huchenlei avatar May 15 '24 20:05 huchenlei

Can the NaN check be enabled only for the VAE?

SLAPaper avatar May 17 '24 15:05 SLAPaper

The NaN check is not great, but disabling it has a lot of implications. For example, the VAE fallback will no longer work.

wfjsw avatar May 18 '24 16:05 wfjsw

In the long term, it may be desirable to load the VAE as bfloat16 instead.
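For context on why bfloat16 could help: float16 has a much smaller representable range than bfloat16, so large values overflow to inf, and inf can turn into NaN downstream. A minimal sketch, assuming PyTorch on CPU; the value 70000.0 is a made-up activation chosen only because it exceeds the float16 maximum, and nothing here is taken from the webui code.

```python
import torch

# float16 has a narrow exponent range; bfloat16 keeps float32's exponent
# range and gives up mantissa precision instead.
fp16_max = torch.finfo(torch.float16).max
bf16_max = torch.finfo(torch.bfloat16).max

# Hypothetical large activation, chosen only to exceed fp16_max.
activation = torch.tensor([70000.0], dtype=torch.float32)

as_fp16 = activation.to(torch.float16)   # overflows to inf
as_bf16 = activation.to(torch.bfloat16)  # stays finite, but coarsely rounded

# inf - inf is NaN, which is one way an overflow becomes NaN later on.
# The arithmetic is done in float32 to keep the example portable.
fp16_has_nan = bool(torch.isnan(as_fp16.float() - as_fp16.float()).any())
bf16_has_nan = bool(torch.isnan(as_bf16.float() - as_bf16.float()).any())

print(fp16_max, bf16_max, fp16_has_nan, bf16_has_nan)
```

The trade-off is precision: bfloat16 stores fewer mantissa bits than float16, so values are rounded more coarsely.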

wfjsw avatar May 18 '24 20:05 wfjsw

As it is now, this will break the automatic switch to a full-precision VAE.
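To illustrate the dependency: a fallback to full precision needs some signal that the reduced-precision decode went wrong, and the NaN check is that signal. Below is a minimal sketch of the idea, not the webui implementation. `decode_with_fallback` and `toy_decode` are hypothetical names, and the toy decoder is just a stand-in that overflows in float16.

```python
import torch

def has_nan(x: torch.Tensor) -> bool:
    return bool(torch.isnan(x).any())

def decode_with_fallback(decode, latent: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: try float16 first, retry in float32 on NaN."""
    out = decode(latent.to(torch.float16))
    if has_nan(out):
        # Without a NaN check there would be nothing to trigger this retry.
        out = decode(latent.to(torch.float32))
    return out

def toy_decode(z: torch.Tensor) -> torch.Tensor:
    # Stand-in for a VAE decoder (assumption, not real decoder math).
    # Round-trip through z's dtype so float16 overflow shows up as inf.
    scaled = (z.float() * 70000.0).to(z.dtype).float()
    # inf - inf is NaN; for finite values this is simply 1.0.
    return scaled - scaled + 1.0

latent = torch.ones(1, 4, 2, 2)
image = decode_with_fallback(toy_decode, latent)
print(has_nan(toy_decode(latent.half())), has_nan(image))
```

If the check is disabled entirely, the first branch never fires and the NaN output is returned as-is.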

AUTOMATIC1111 avatar Jun 08 '24 07:06 AUTOMATIC1111

Can we maybe just get the needed performance improvement by checking a single element instead of the whole tensor? Since there are batch norms, single values becoming NaN dooms the whole tensor to become all NaNs.

I pushed 547778b10f25def4e040b81942a2b23295567de3 to dev with this change.
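For illustration, here is a minimal sketch of the two kinds of check, assuming PyTorch. It is not the code from that commit; the function names are made up, and the global normalization is only a stand-in for the norm layers mentioned above.

```python
import torch

def any_nan_full(x: torch.Tensor) -> bool:
    # Scans every element of the tensor.
    return bool(torch.isnan(x).any())

def first_element_is_nan(x: torch.Tensor) -> bool:
    # Looks only at element [0, 0, ..., 0].
    # bool() still copies the result to the host, as in the full check.
    return bool(torch.isnan(x[(0,) * x.ndim]))

x = torch.arange(96, dtype=torch.float32).reshape(2, 3, 4, 4)
x[1, 2, 3, 3] = float("nan")  # a single bad value, not at index 0

# Before any normalization, only the full scan sees the NaN.
print(any_nan_full(x), first_element_is_nan(x))

# Stand-in for a norm layer: the mean and std are computed over values
# that include the NaN, so every output element becomes NaN.
normalized = (x - x.mean()) / x.std()
print(first_element_is_nan(normalized))
```

The single-element check relies on the NaN having already spread through a normalization; a lone NaN that has not yet passed through one can be missed.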

AUTOMATIC1111 avatar Jun 08 '24 09:06 AUTOMATIC1111

Also what tool is being used here for those performance visualizations? I'd like that too.

Edit: it's torch's profiler, visualized in Chrome: https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html
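A minimal sketch of producing such a trace, assuming PyTorch with CPU-only profiling. The `step` function is a made-up workload, not the webui sampling loop, and the output path is arbitrary.

```python
import os
import tempfile

import torch
from torch.profiler import ProfilerActivity, profile

def step(x: torch.Tensor) -> torch.Tensor:
    # Stand-in workload followed by a full-tensor NaN check.
    y = x @ x
    bool(torch.isnan(y).any())
    return y

x = torch.randn(64, 64)
trace_path = os.path.join(tempfile.gettempdir(), "nan_check_trace.json")

# Add ProfilerActivity.CUDA to the list when profiling on a GPU.
with profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(3):
        step(x)

# Summary table in the terminal, plus a trace file for the timeline view.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
prof.export_chrome_trace(trace_path)
```

The exported JSON file can be loaded in Chrome's `chrome://tracing` page to get the timeline view.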

AUTOMATIC1111 avatar Jun 08 '24 09:06 AUTOMATIC1111

> Can we maybe just get the needed performance improvement by checking a single element instead of the whole tensor? Since there are batch norms, single values becoming NaN dooms the whole tensor to become all NaNs.
>
> I pushed 547778b to dev with this change.

I think checking only a single element is a better way to handle this. Thanks for doing that!

huchenlei avatar Jun 08 '24 14:06 huchenlei

chrome_bsdXm8iifO

Seems like that doesn't help: there are still those large delays, caused by checking even a single item.

AUTOMATIC1111 avatar Jun 09 '24 13:06 AUTOMATIC1111

But I changed the NaN check to happen only once, after all steps are done, in 6214aa7d2a84aa2a12962706579a2dba3470fb51, so this is no longer an issue.
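Roughly, the idea looks like the sketch below. This is an illustration under assumptions, not the code from that commit: `sample`, `denoise_step`, and `NanFoundError` are hypothetical names, and the toy steps only scale the latent.

```python
import torch

class NanFoundError(Exception):
    """Hypothetical error type for this sketch."""

def has_nan(x: torch.Tensor) -> bool:
    return bool(torch.isnan(x).any())

def sample(denoise_step, latent: torch.Tensor, steps: int) -> torch.Tensor:
    for i in range(steps):
        # No per-step check, so no per-step host readback.
        latent = denoise_step(latent, i)
    # A NaN produced in an earlier step survives later arithmetic,
    # so one check on the final result still catches it.
    if has_nan(latent):
        raise NanFoundError("NaN in sampler output")
    return latent

def good_step(z, i):
    return z * 0.9

def bad_step(z, i):
    # Injects NaN at step 1 only; later steps keep it NaN.
    return z * float("nan") if i == 1 else z * 0.9

start = torch.ones(1, 4, 2, 2)
result = sample(good_step, start, steps=5)

try:
    sample(bad_step, start, steps=5)
    caught = False
except NanFoundError:
    caught = True
print(has_nan(result), caught)
```

The cost is that a failure is only reported after all steps have run, instead of at the step where it first appeared.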

AUTOMATIC1111 avatar Jun 09 '24 15:06 AUTOMATIC1111