stable-diffusion-webui
Use Spandrel for upscaling and face restoration architectures
Description
This PR yeets most of the copy-pasted or otherwise vendored model architectures in favor of just using Spandrel.
- LDSR is not converted; it doesn't exist in Spandrel.
- There's still some more cleanup that could be done – there are multiple implementations of tiled inference right now, for one, and the model loading/downloading/... code is kind of a mess (should continue where I left off with #10823), but I'll hold off on that for this PR.
- A follow-up PR would add support for HAT models in about 42 lines of code. (Got a POC already.)
Screenshots/videos:
No visual changes. This seems to Work On My Machine but it'd be lovely if someone else tried this out too.
Checklist:
- [x] I have read the contributing wiki page
- [x] I have performed a self-review of my own code
- [x] My code follows the style guidelines
- [ ] My code passes tests
Oh yeah, I've been using this PR for a couple days now; it works.
@gel-crabs Thanks for trying it out! I (force-)pushed this branch to update spandrel to a newer version, as well as add experimental support for HAT upscalers, if you want to try that out. (You'll need to bring your own models and put them in `models/HAT/`.)
It works! Admittedly it has issues with deepcache, where it adds black splotches to the image during hires fix, but it's otherwise working.
I tried to hack in support for DAT as well by copying `hat_model.py` and replacing HAT with DAT, but it just made the image go full black.
Edit: It actually has nothing to do with deepcache, or any extensions at all. I'm going to try testing with different models.
I tried with a different 4x HAT upscaler and it gives full black images, so the HAT support doesn't seem to be working correctly.
I'm generally not pumped about adding new dependencies, but this removes a lot of code we just copy pasted, so that seems nice.
Some questions:
- what's with `__init__.py`?
- what's with the commented code in `webui.py`?
- for tests, on a new machine (which is always the case for github servers), it looks to me that it will download the model. Maybe those tests could be disabled by default? Also, since you're not actually checking any changes in faces, we could reuse the existing `img2img_basic.png` instead of adding a new pic.
- what happens when you put a checkpoint in a wrong dir? Say, an ESRGAN checkpoint into the swinir dir. Or a codeformer model into the ESRGAN dir?
- did you test all models you converted to use spandrel?
> I'm generally not pumped about adding new dependencies, but this removes a lot of code we just copy pasted, so that seems nice.

I think this actually leads to fewer dependencies in total (I'll run the numbers later). The Spandrel folks seem nice and responsive too. :)
> what's with `__init__.py`?

Autogenerated by PyCharm when refactoring code. Will yeet, my bad.
> what's with the commented code in `webui.py`?

Also accidentally added to this PR (since I was tired of having a gazillion WebUI tabs get auto-opened), my bad. Will yeet.
> for tests, on a new machine (which is always the case for github servers), it looks to me that it will download the model.

I can also add an `actions/cache` action so we cache the `models/` directory (like Spandrel's tests do).
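For reference, a minimal sketch of such a cache step; the `path` and cache `key` here are assumptions for illustration, not the actual workflow config:

```yaml
# Hypothetical GitHub Actions step caching downloaded models between CI runs.
- name: Cache models
  uses: actions/cache@v3
  with:
    path: models
    # Invalidate the cache when the dependency list changes; the key
    # file used here is a placeholder for this sketch.
    key: upscaler-models-${{ hashFiles('requirements.txt') }}
```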
> Also since you're not actually checking any changes in faces, we could reuse the existing `img2img_basic.png` instead of adding a new pic.

Since we use `facexlib` to detect faces and only act on the face patches, using an image that doesn't have any faces will not exercise the code that would actually run the Spandrel model 😁
I'll add a simple "output image was different" check!
> what happens when you put a checkpoint in a wrong dir? Say, an ESRGAN checkpoint into the swinir dir. Or a codeformer model into the ESRGAN dir?

Good question – since Spandrel auto-detects the model architecture from the checkpoint, it'd happily load it, and maybe fail with a parameter error down the line when we try to call the architecture with kwargs it doesn't accept. I can add `isinstance` checks to verify we loaded the correct model (and warn and fail if not) instead of just blindly forging ahead.
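A sketch of what such a guard could look like. The `architecture.name` attribute mirrors the kind of metadata Spandrel's model descriptors expose, but the helper itself (and the attribute shape used here) is hypothetical:

```python
from types import SimpleNamespace


def check_architecture(descriptor, expected: str) -> None:
    """Fail loudly if the auto-detected architecture doesn't match the
    one implied by the directory the checkpoint was found in."""
    actual = descriptor.architecture.name
    if actual != expected:
        raise ValueError(
            f"Expected a {expected} model, but the checkpoint is {actual}. "
            "Is it in the wrong models/ subdirectory?"
        )


# Demo with a stand-in descriptor (no Spandrel needed for the sketch):
esrgan = SimpleNamespace(architecture=SimpleNamespace(name="ESRGAN"))
check_architecture(esrgan, "ESRGAN")  # passes silently
```

The point is just to turn a confusing shape/kwarg error deep in inference into an immediate, readable message at load time.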
> did you test all models you converted to use spandrel?

I did, on my machine (Macbook).
Looks like SwinIR x2 is not working now. I get this with any model:

```
File "...\modules\images.py", line 286, in resize_image
    res = resize(im, width, height)
File "...\modules\images.py", line 278, in resize
    im = upscaler.scaler.upscale(im, scale, upscaler.data_path)
File "...\modules\upscaler.py", line 65, in upscale
    img = self.do_upscale(img, selected_model)
File "...\extensions-builtin\SwinIR\scripts\swinir_model.py", line 48, in do_upscale
    img = upscaler_utils.upscale_2(
File "...\modules\upscaler_utils.py", line 181, in upscale_2
    output = tiled_upscale_2(
File "...\modules\upscaler_utils.py", line 149, in tiled_upscale_2
    ].add_(out_patch)
RuntimeError: The size of tensor a (2560) must match the size of tensor b (1280) at non-singleton dimension 3
```
@wcde Thanks, I'll take a peek – what's your SwinIR tile size and overlap setting, and the size of the image you're trying to upscale?
The scale is hard-coded to 4 in the code. It should be something like this:

```python
img = upscaler_utils.upscale_2(
    img,
    model,
    tile_size=shared.opts.SWIN_tile,
    tile_overlap=shared.opts.SWIN_tile_overlap,
    scale=model.scale,
    desc="SwinIR",
)
```
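The shape mismatch in the traceback is exactly this: the output canvas slice is sized for the hard-coded scale, while a 2x model emits patches at its real scale. A tiny arithmetic sketch (the tile size of 640 is an assumption, chosen because it reproduces the reported 2560-vs-1280 numbers):

```python
def out_width(tile_size: int, scale: int) -> int:
    """Width of the upscaled patch a model produces for one input tile."""
    return tile_size * scale


tile = 640          # hypothetical tile size that reproduces the reported error
assumed_scale = 4   # what the code hard-codes
model_scale = 2     # what a 2x SwinIR model actually does

canvas_patch = out_width(tile, assumed_scale)  # what the canvas slice expects
model_patch = out_width(tile, model_scale)     # what .add_(out_patch) receives
print(canvas_patch, model_patch)  # 2560 vs 1280 -> the RuntimeError above
```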
Second problem: the model is loaded with dtype `devices.dtype`, but in `upscale_2` the input is cast to fp32:

```python
tensor = pil_image_to_torch_bgr(img).float()
```

Which gives:

```
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same
```
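One possible fix, as a sketch (not necessarily what the PR ends up doing): cast the input to whatever dtype and device the loaded model's parameters actually use, instead of unconditionally calling `.float()`. The helper name is hypothetical:

```python
import torch


def cast_to_model_dtype(tensor: torch.Tensor, model: torch.nn.Module) -> torch.Tensor:
    """Match the input's dtype/device to the model's weights, so a model
    loaded in half precision never receives a float32 input (the
    'Input type (float) and bias type (c10::Half)' error above)."""
    param = next(model.parameters())
    return tensor.to(device=param.device, dtype=param.dtype)


# e.g. a half-precision module:
model = torch.nn.Conv2d(3, 3, 3).half()
x = torch.rand(1, 3, 8, 8)         # float32, as the image-to-tensor path produces
x = cast_to_model_dtype(x, model)  # now float16, matching the weights
```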
@wcde In fairness, `scale` has always been hard-coded to 4 unless I overlooked something:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/cf2772fab0af5573da775e7437e6acdca424f26e/extensions-builtin/SwinIR/scripts/swinir_model.py#L63
I'll take a look at the `half` issue, thanks for pointing it out.
I guess this will happen with a lot of extensions after updating. Maybe it should be mentioned in the changelog?