
support for SDXL

Open wzq728 opened this issue 2 years ago • 5 comments

Thanks for your nice work! I want to know whether ToMe supports SDXL, and if it does, how to use it.

wzq728 avatar Jul 25 '23 09:07 wzq728

I haven't looked into it. How does SDXL differ from normal SD? If it's similar, there's probably a way to get it to work.

dbolya avatar Aug 08 '23 09:08 dbolya

I haven't done any detailed tests, but wrapping a huggingface pipeline.unet seems to work without crashing for training and inference, and it produces images that look OK. Prompt: "A bustling Parisian café scene in the 1920s. Jazz musicians, flapper girls, and intellectuals in conversation. Oil painting, canvas and oil paints. Warm, dimly lit ambiance."

This is with r=0.5 at 672x672.
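
For anyone who wants to try the same thing, the wrapping described above amounts to roughly the following sketch. It assumes the diffusers StableDiffusionXLPipeline and the stabilityai/stable-diffusion-xl-base-1.0 checkpoint; the prompt and resolution just mirror this comment, nothing here is an official recipe.

import torch
import tomesd
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base pipeline in fp16 on the GPU.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# Patch only the UNet, as described above; ratio=0.5 matches the r=0.5 mentioned.
tomesd.apply_patch(pipe.unet, ratio=0.5)

image = pipe(
    "A bustling Parisian café scene in the 1920s",
    width=672, height=672,
).images[0]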

theAdamColton avatar Sep 20 '23 00:09 theAdamColton

Does it speed it up? I think the default behavior of the diffusers implementation is to do nothing when wrapping the wrong thing, so it might not actually be doing anything.
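
One quick way to check whether the patch actually took effect is to look for the blocks tomesd swaps in. This is only a sketch that relies on an internal detail (the dynamically created ToMeBlock class name), not a public API:

import tomesd
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
tomesd.apply_patch(pipe, ratio=0.5)

# Count the transformer blocks whose class was replaced by tomesd.
tome_blocks = [name for name, m in pipe.unet.named_modules() if "ToMe" in type(m).__name__]
print(f"{len(tome_blocks)} ToMe blocks installed")  # 0 would mean the patch was a no-op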

dbolya avatar Sep 20 '23 02:09 dbolya

import time

import torch
import tomesd
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base pipeline in fp16 on the GPU.
pipeline = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

batch_size = 4
resolution = 896
trials = 2
prompt = (
    "Laundromat Stories: Inside a laundromat on a rainy day. People load clothes into "
    "washing machines and read magazines while waiting. Charcoal drawing, chiaroscuro, "
    "dramatic lighting from overhead fluorescents."
)

# Baseline: SDXL without tomesd.
tt = 0
for _ in range(trials):
    st = time.time()
    pipeline(prompt=prompt, num_inference_steps=20, num_images_per_prompt=batch_size,
             width=resolution, height=resolution)
    tt += time.time() - st
print("SDXL no tomesd: avg time", tt / trials)

# Patch the same pipeline with ToMe and benchmark again.
pipeline = tomesd.apply_patch(pipeline, ratio=0.75, max_downsample=4)

tt = 0
for _ in range(trials):
    st = time.time()
    pipeline(prompt=prompt, num_inference_steps=20, num_images_per_prompt=batch_size,
             width=resolution, height=resolution)
    tt += time.time() - st
print("SDXL w/ tomesd: avg time", tt / trials)

I get around a 12% speedup on a 3090: 18.9267s without tomesd vs 16.891s with it.
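
One caveat on the timing: the first call includes CUDA warm-up, and the comparison reuses the same pipeline object, so a warm-up run plus explicit synchronization gives a cleaner A/B. A sketch, reusing the pipeline and prompt variables from the script above and tomesd's remove_patch to make sure ToMe is off for the baseline:

import time
import torch
import tomesd

def bench(pipe, prompt, trials=3, **kw):
    # Warm-up call so CUDA init / kernel autotuning doesn't skew the first timing.
    pipe(prompt=prompt, **kw)
    torch.cuda.synchronize()
    times = []
    for _ in range(trials):
        st = time.time()
        pipe(prompt=prompt, **kw)
        torch.cuda.synchronize()
        times.append(time.time() - st)
    return sum(times) / len(times)

kw = dict(num_inference_steps=20, num_images_per_prompt=4, width=896, height=896)

tomesd.remove_patch(pipeline)  # make sure ToMe is off for the baseline
print("SDXL no tomesd:", bench(pipeline, prompt, **kw))

tomesd.apply_patch(pipeline, ratio=0.75, max_downsample=4)
print("SDXL w/ tomesd:", bench(pipeline, prompt, **kw))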

theAdamColton avatar Sep 21 '23 16:09 theAdamColton