stable-diffusion-webui-forge Cache IPAdapter instances to avoid expensive KV extraction on every generation

Hi all, I'm not deeply familiar with the code I've touched here, so I'm very open to feedback on this PR.

Description

Currently apply_ipadapter reconstructs an IPAdapter instance on every invocation, even when reusing the same model. This is an expensive operation primarily due to the call to To_KV(), and to a lesser degree due to the calls to init_proj*().

This PR caches the IPAdapter instance, keyed on the IPAdapter model filename, when running with --always-high-vram.
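As a rough illustration of the idea (not the actual diff), the caching could be sketched as below. The names `_ipadapter_cache`, `get_ipadapter`, `build` and `always_high_vram` are hypothetical stand-ins; in the real code the expensive step is constructing the IPAdapter, and the flag comes from the --always-high-vram option.

```python
from typing import Any, Callable, Dict

# Hypothetical module-level cache: model filename -> constructed adapter object.
_ipadapter_cache: Dict[str, Any] = {}


def get_ipadapter(model_filename: str,
                  build: Callable[[], Any],
                  always_high_vram: bool) -> Any:
    """Return an adapter for model_filename, building it only on a cache miss.

    `build` stands in for the expensive construction step (assumed here to be
    the IPAdapter constructor, including the KV extraction).
    """
    if not always_high_vram:
        # Without the high-VRAM flag, keep the old behaviour: rebuild every time.
        return build()
    if model_filename not in _ipadapter_cache:
        _ipadapter_cache[model_filename] = build()
    return _ipadapter_cache[model_filename]
```

Gating the cache on the high-VRAM flag means the extra memory held by cached instances is only retained when the user has opted in to keeping models resident.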

Why?

This has a significant performance impact on Deforum, which I'm currently porting to Forge (see https://github.com/lllyasviel/stable-diffusion-webui-forge/issues/96).

Without this change, Forge is slower than A1111 on a simple Deforum run with IPAdapter enabled, despite having higher it/s. With this change, it is substantially faster than A1111.

The attached 120 frame Deforum settings file runs as follows on my 3090 / i5-4590 @ 3.30GHz:

A1111: 3 min. 32.1 sec.
Forge without this change: 4 min. 28.5 sec.
Forge with this change: 1 min. 47.5 sec.

deforum_settings.txt

Outside of Deforum, this also benefits runs with batch size>1 or any repeated gens using IPAdapter (shaves a few seconds off the initialisation time before you see the it/s gauge).

Screenshots/videos:

n/a

Checklist:

[x] I have read contributing wiki page
[x] I have performed a self-review of my own code
[x] My code follows the style guidelines
[ ] My code passes tests (not passing on clean checkout in my env)

Feb 20 '24 03:02 rewbs

I think it's working. Now when I use IPAdapters (in this example I'm using InstantID, so I get 2 IPAdapters) and generate images over and over with the same model, generation starts without much delay. Here are my logs:

2024-02-22 00:14:15,746 - ControlNet - INFO - Using preprocessor: InsightFace (InstantID)
2024-02-22 00:14:15,746 - ControlNet - INFO - preprocessor resolution = 1024
2024-02-22 00:14:15,915 - ControlNet - INFO - Current ControlNet IPAdapterPatcher: D:\stable-diffusion-webui-forge\models\ControlNet\ip-adapter_instant_id_sdxl.bin
2024-02-22 00:14:15,915 - ControlNet - INFO - ControlNet Input Mode: InputMode.SIMPLE
2024-02-22 00:14:15,919 - ControlNet - INFO - Using preprocessor: instant_id_face_keypoints
2024-02-22 00:14:15,920 - ControlNet - INFO - preprocessor resolution = 1024
D:\stable-diffusion-webui-forge\venv\lib\site-packages\insightface\utils\transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
Automatic Memory Management: 0 Modules in 0.00 seconds.
2024-02-22 00:14:16,621 - ControlNet - INFO - Current ControlNet ControlNetPatcher: D:\stable-diffusion-webui-forge\models\ControlNet\control_instant_id_sdxl.safetensors
2024-02-22 00:14:18,030 - ControlNet - INFO - IPAdapter: Using cached layers for ip-adapter_instant_id_sdxl.bin.
2024-02-22 00:14:18,056 - ControlNet - INFO - ControlNet Method InsightFace (InstantID) patched.
2024-02-22 00:14:18,160 - ControlNet - INFO - ControlNet Method instant_id_face_keypoints patched.
To load target model SDXL
To load target model ControlNet
Begin to load 2 models
unload clone 3
unload clone 2
Moving model(s) has taken 0.14 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.78it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.75it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.85it/s]

I still have those unload clone things though, dunno if that was supposed to be "fixed" by your PR or it's a normal thing.

Do you also intend to add a feature to unload the previous checkpoint when you switch models? That's the biggest weakness of --always-gpu, and could've been fixed with your own flag. https://github.com/lllyasviel/stable-diffusion-webui-forge/issues/266#issuecomment-1957089948

Feb 21 '24 23:02 BadisG

Thanks for trying this out!!

I still have those unload clone things though, dunno if that was supposed to be "fixed" by your PR or it's a normal thing.

Those are unrelated and are not expected to be changed by this PR.

Do you also intend to add a feature to unload the previous checkpoint when you switch models? That's the biggest weakness of --always-gpu, and could've been fixed with your own flag.

Not sure I fully understand: this change isn't about checkpoints, it's about caching some of the data derived from the ipadapter models. Perhaps unloading previous checkpoints is a separate concern we can tackle under a different PR.

Feb 23 '24 22:02 rewbs

will take a look soon

Feb 24 '24 06:02 lllyasviel

Hi, we are going to close PRs opened before Forge's recent major revision. If we missed some important PRs, please consider reopening them (if that is not already on our todo list).

Aug 01 '24 19:08 lllyasviel
