
[Feature Request]: Add Facebook's Token Merging feature for faster inference time

Open Yardanico opened this issue 3 years ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

Implement ToMe (https://github.com/facebookresearch/ToMe), which speeds up image inference.

Proposed workflow

Maybe a CLI option? From what I read it decreases accuracy by a bit, so some people won't want to have it enabled.
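
One way the opt-in could look, as a minimal sketch only: an `argparse` flag that leaves token merging off unless the user asks for it. The flag name `--token-merging-ratio` and the helpers `build_parser` and `tome_enabled` are hypothetical and are not existing webui options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flag name; the default of 0.0 keeps token merging disabled.
    parser = argparse.ArgumentParser(description="ToMe opt-in sketch")
    parser.add_argument(
        "--token-merging-ratio",
        type=float,
        default=0.0,
        help="fraction of tokens to merge; 0 disables merging",
    )
    return parser


def tome_enabled(args: argparse.Namespace) -> bool:
    # Merging is applied only when the user requests a positive ratio.
    return args.token_merging_ratio > 0.0


default_args = build_parser().parse_args([])
opted_in = build_parser().parse_args(["--token-merging-ratio", "0.5"])
print(tome_enabled(default_args), tome_enabled(opted_in))
```

Defaulting to 0 means people who care about accuracy get unchanged output without passing anything.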

Additional information

See code in https://github.com/facebookresearch/ToMe/pull/7 and https://github.com/Birch-san/stable-diffusion/

Yardanico avatar Nov 06 '22 01:11 Yardanico

From my results and those of others, it seems to speed up inference by ~20-25% at 512x512, which is quite significant, and it allows for generation of much bigger images.

ghost avatar Nov 08 '22 02:11 ghost

I did an extremely quick and dirty patch of the webui and stable-diffusion repos with code from https://github.com/Birch-san/stable-diffusion to check ToMe. I also hard-coded doggettx's attention into the code because I'm not good enough to figure out how to add this properly, so for anyone using xformers, xformers will be faster than this patch.

https://gist.github.com/Yardanico/081e7e23ea1d51dd70f1a75a6df8b876 if you want to try.

I'm getting a 25% speed increase on my RX6700XT, from 6it/s to 7.5it/s, and I can also generate at higher resolutions while still being faster.

There is some accuracy loss though, but it largely depends on the prompt.
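
For anyone comparing numbers: the 6it/s to 7.5it/s figures above can be read two ways. As a quick arithmetic sketch (the function names are just illustrative), a 25% gain in iterations per second corresponds to about 20% less time per step:

```python
def speedup_percent(before_its: float, after_its: float) -> float:
    # Relative throughput gain in percent.
    return (after_its / before_its - 1.0) * 100.0


def step_time_saved_percent(before_its: float, after_its: float) -> float:
    # Time per iteration is the inverse of iterations per second.
    return (1.0 - before_its / after_its) * 100.0


gain = speedup_percent(6.0, 7.5)
saved = step_time_saved_percent(6.0, 7.5)
print(f"throughput gain: {gain:.1f}%, time per step saved: {saved:.1f}%")
```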

ghost avatar Nov 08 '22 09:11 ghost

Very interesting work, hope we can enjoy this feature soon (with xformers if possible).

Note that ToMe is drafting Stable Diffusion support (with examples and code "coming soon"); see also https://github.com/facebookresearch/ToMe/issues/4

ultranity avatar Nov 10 '22 09:11 ultranity

Update: there is already a ToMe implementation for Stable Diffusion: https://github.com/dbolya/tomesd It looks straightforward to support, since applying it to a model takes only one line of code:

import tomesd

# Patch a Stable Diffusion model with ToMe for SD using a 50% merging ratio.
# Using the default options is recommended for the highest quality; tune ratio to suit your needs.
tomesd.apply_patch(model, ratio=0.5)

# However, if you want to tinker around with the settings, we expose several options.
# See docstring and paper for details. Note: you can patch the same model multiple times.
tomesd.apply_patch(model, ratio=0.9, sx=4, sy=4, max_downsample=2) # Extreme merging, expect diminishing returns
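
To give a feel for what `ratio` means in token counts, here is a rough sketch. It assumes Stable Diffusion v1's 8x latent downsampling, one token per latent pixel at the highest-resolution U-Net level, and that `ratio` is simply the fraction of tokens merged away. It ignores any per-layer limits tomesd itself may apply, and `tokens_after_merge` is an illustrative helper, not part of tomesd.

```python
def tokens_after_merge(width: int, height: int, ratio: float, vae_factor: int = 8) -> tuple:
    # Assumption: one token per latent pixel at the highest-resolution level.
    total = (width // vae_factor) * (height // vae_factor)
    # Assumption: `ratio` is the fraction of tokens merged away.
    merged = int(total * ratio)
    return total, total - merged


for ratio in (0.5, 0.9):
    total, remaining = tokens_after_merge(512, 512, ratio)
    print(f"ratio={ratio}: {total} tokens -> {remaining} tokens")
```

Under these assumptions a 512x512 image starts at 4096 tokens, and a ratio of 0.5 leaves 2048 of them for attention to process.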

Update: There is already a PR working on this, see below

Update again: I implemented an extension to use ToMe (https://github.com/SLAPaper/a1111-sd-webui-tome), but it seems to give only a ~13% speedup when using batch size 8

SLAPaper avatar Apr 01 '23 19:04 SLAPaper

I'll try to find a proper place to add the line of code myself, and I'll edit the comment if I make some progress

Working on this in AUTOMATIC1111/stable-diffusion-webui#9256

papuSpartan avatar Apr 01 '23 19:04 papuSpartan