Depth-Anything ControlNet result very bad.

Open ShenZheng2000 opened this issue 8 months ago • 2 comments

Here is the inference script I used for ControlNet image-to-image translation. Note that I have already downloaded your config.json and diffusion_pytorch_model.safetensors and put them into a local directory named controlnet.

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import torch

base_model_path = "runwayml/stable-diffusion-v1-5"
controlnet_path = "controlnet"

controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    base_model_path, controlnet=controlnet, torch_dtype=torch.float16
)

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# xformers attention is left disabled for now because torch<2.0 is installed here;
# uncomment the next line only if xformers is available.
# pipe.enable_xformers_memory_efficient_attention()
# memory optimization: keep idle models on the CPU and move each to the GPU only when needed.
pipe.enable_model_cpu_offload()

control_image = load_image("bdd100k/images/100k/train_day/0a0a0b1a-7c39d841.jpg")
# prompt = "turn this into a night driving scene"
prompt = "day to night"

# generate image
generator = torch.manual_seed(0)
image = pipe(
    prompt, num_inference_steps=20, generator=generator, image=control_image
).images[0]
image.save("./output.png")

However, the output image is very poor (screenshot below).
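One thing that may be worth checking: the script above passes the RGB photo returned by load_image directly as image. If this ControlNet checkpoint is depth-conditioned (an assumption, the issue does not confirm it), it would normally expect a depth map as the control image instead. Below is a minimal sketch of turning a raw depth array into a 3-channel control image. The helper name depth_to_control_image is hypothetical, and the depth estimation step itself is not shown.

```python
import numpy as np


def depth_to_control_image(depth: np.ndarray) -> np.ndarray:
    """Scale a raw HxW depth map to uint8 and repeat it over 3 channels.

    Hypothetical helper for illustration; it is not part of diffusers
    or of the Depth-Anything repository.
    """
    depth = np.asarray(depth, dtype=np.float32)
    if depth.ndim != 2:
        raise ValueError("expected a single-channel HxW depth map")

    lo, hi = float(depth.min()), float(depth.max())
    # Guard against a constant map, where hi == lo would divide by zero.
    scale = hi - lo if hi > lo else 1.0

    norm = (depth - lo) / scale                      # values now in [0, 1]
    gray = np.round(norm * 255.0).astype(np.uint8)   # 8-bit grayscale
    return np.stack([gray] * 3, axis=-1)             # HxWx3 array
```

The resulting array could be wrapped with PIL.Image.fromarray and passed as image= in place of the raw photo. The depth values would have to come from a depth estimator run on the same input photo, which this sketch does not cover.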

Jun 20 '24 03:06 ShenZheng2000