
dynamic batch size failed

[Open] HydrogenQAQ opened this issue 11 months ago · 5 comments

Describe the bug


Hello, when I use different batch sizes for T2I inference, I get an error like the following:

(screenshot of the error message; text not captured)

Your environment

OS

OneDiff git commit id

0.12.1.dev0

OneFlow version info


0.9.1.dev20240124+cu121

How To Reproduce

Steps to reproduce the behavior (code or script):

import argparse
import time

from diffusers import (
    EulerAncestralDiscreteScheduler,
    StableDiffusionPipeline,
)
from diffusers.utils import load_image
from tqdm import tqdm  # needed for the progress loop below
import numpy as np
import torch

import cv2
from PIL import Image

parser = argparse.ArgumentParser()
parser.add_argument("--base", type=str, default="/root/models/stable-diffusion-v1-5/")

parser.add_argument("--prompt", type=str, default=None)
parser.add_argument("--height", type=int, default=512)
parser.add_argument("--width", type=int, default=512)
parser.add_argument("--n_steps", type=int, default=20)
parser.add_argument("--seed", type=int, default=300)
parser.add_argument("--warmup", type=int, default=1)
parser.add_argument("--run", type=int, default=10)
parser.add_argument(
    "--use_onediff", type=(lambda x: str(x).lower() in ["true", "1", "yes"]), default=True
)

args = parser.parse_args()

print("build pipeline")
pipe = StableDiffusionPipeline.from_pretrained(args.base, torch_dtype=torch.float16)
pipe.safety_checker = None
pipe.requires_safety_checker = False
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

# onediff
if args.use_onediff:
    from onediff.infer_compiler import oneflow_compile

    pipe.unet = oneflow_compile(pipe.unet)

    # pipe.vae.encoder = oneflow_compile(pipe.vae.encoder)
    pipe.vae.decoder = oneflow_compile(pipe.vae.decoder)

    # pipe.controlnet = oneflow_compile(pipe.controlnet)

# generate image
generator = torch.manual_seed(args.seed)
costs = []  # per-iteration end-to-end latency in seconds

for i in tqdm(range(args.run), desc="Pipe processing", unit="i"):
    if i == 0:
        args.prompt = ["111",
                       "222",
                       ]
    else:
        args.prompt = ["111",
                       "222",
                       "333",
                       "444",
                       ]
    start_t = time.time()
    run_image = pipe(
        args.prompt,
        height=args.height,
        width=args.width,
        num_inference_steps=args.n_steps,
        generator=generator,
        guidance_scale=7.5,
    ).images[0]
    torch.cuda.synchronize()
    end_t = time.time()
    costs.append(end_t - start_t)
    print(f"e2e {i} ) elapsed: {end_t - start_t} s")

The complete error message

See the screenshot above.

Additional context


HydrogenQAQ — Mar 20, 2024

Thanks, got this. Dynamic batch size is not well supported for now; we will look into it next week.
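Until that is fixed, one possible workaround (an untested sketch; FIXED_BATCH and run_fixed_batch are made-up names, not onediff API) is to pad each prompt list to a constant batch size so the compiled graph only ever sees one input shape:

# Untested sketch: pad every request to a fixed batch size so the compiled
# UNet always sees the same shape, then discard the padded outputs.
FIXED_BATCH = 4  # assumed upper bound on the real batch sizes

def run_fixed_batch(pipe, prompts, **kwargs):
    n = len(prompts)
    padded = prompts + [prompts[-1]] * (FIXED_BATCH - n)  # filler prompts
    images = pipe(padded, **kwargs).images
    return images[:n]  # keep only the outputs for the real prompts

This trades some wasted compute on the filler prompts for avoiding a recompile on every shape change.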

strint — Mar 22, 2024

@HydrogenQAQ Sorry, I cannot reproduce the error with your script. Could you tell us the version of diffusers in your environment? Alternatively, you could update oneflow and onediff, then rerun your script to check whether the error still occurs.
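For reference, a quick way to print the relevant versions (assuming the standard __version__ attributes that diffusers and oneflow expose):

import diffusers
import oneflow

print("diffusers:", diffusers.__version__)
print("oneflow:", oneflow.__version__)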

clackhan — Mar 25, 2024

OK, I'll try it, thanks.

HydrogenQAQ — Mar 28, 2024

To be clear, what I mean is that a dynamic batch size causes recompilation.
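If per-shape recompilation is indeed the cause, one mitigation (a sketch only; whether the compiler caches one graph per input shape depends on the onediff/oneflow version) is to warm up every batch size you plan to use before the timed loop:

# Warm up once per expected batch size so compilation happens
# outside the measured loop; assumes compiled graphs are cached per shape.
for batch_size in (2, 4):  # the two batch sizes used in the script above
    pipe(["warmup prompt"] * batch_size, height=args.height, width=args.width,
         num_inference_steps=1, guidance_scale=7.5)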

HydrogenQAQ — Mar 28, 2024

@HydrogenQAQ Does the problem still exist?

If so, we need an example to reproduce this.

strint — Apr 15, 2024