
Support AnimateDiffSDXL and ControlNets

Open Sundragon1993 opened this issue 1 year ago • 4 comments

What does this PR do?

Integrate an AnimateDiff SDXL + ControlNet pipeline by combining StableDiffusionXLControlNetAdapterPipeline and AnimateDiffSDXLPipeline.

The code is executable, even though the output isn't as good as expected.


Sundragon1993 avatar Jun 24 '24 03:06 Sundragon1993

[GIF: animation_sdxl_control]

Sundragon1993 avatar Jun 24 '24 03:06 Sundragon1993

Could you provide an example that doesn't involve a human subject, please?

sayakpaul avatar Jun 24 '24 08:06 sayakpaul

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

What about a gecko at 1024x1024 resolution?

[GIF: animation_gecko_1024]

Sundragon1993 avatar Jun 24 '24 08:06 Sundragon1993

@Sundragon1993 I think you have to run make style && make quality on the PR and push up the changes.

DN6 avatar Jul 05 '24 07:07 DN6

@Sundragon1993 I have found that beta_schedule="linear", timestep_spacing="linspace", steps_offset=1 sometimes works poorly for the AnimateDiff SDXL ControlNet pipeline. Those overrides only seem to be needed for AnimateDiff SD1.5 and plain AnimateDiff SDXL. Could you report examples after removing those lines and just using the default DDIMScheduler? I'll also need you to rewrite the example code in the PR/community README into something that works out of the box.
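For context, the difference between "linspace" and the scheduler's default "leading" spacing comes down to which of the 1000 training timesteps get sampled at inference time. Below is a simplified, dependency-light sketch of that selection logic; it mirrors what diffusers schedulers do in set_timesteps but is not the library code itself (function name is ours, edge cases omitted):

```python
import numpy as np

def select_timesteps(num_inference_steps: int,
                     num_train_timesteps: int = 1000,
                     spacing: str = "leading",
                     steps_offset: int = 0) -> np.ndarray:
    """Pick denoising timesteps in descending order, scheduler-style."""
    if spacing == "linspace":
        # Spread steps evenly across the full [0, 999] range.
        t = np.linspace(0, num_train_timesteps - 1, num_inference_steps)
        return t.round()[::-1].astype(np.int64)
    elif spacing == "leading":
        # Stride up from 0, then shift everything by steps_offset.
        ratio = num_train_timesteps // num_inference_steps
        t = (np.arange(num_inference_steps) * ratio)[::-1]
        return t.astype(np.int64) + steps_offset
    raise ValueError(f"unknown spacing: {spacing}")

print(select_timesteps(25, spacing="linspace")[:3])                 # starts at 999
print(select_timesteps(25, spacing="leading", steps_offset=1)[:3])  # starts at 961
```

Note that with "leading" spacing the first sampled timestep never reaches 999, which may be one reason these settings interact badly with motion modules trained against a particular schedule.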

I've been using the following code for testing:

Code
import requests
from io import BytesIO

import imageio
import torch
from controlnet_aux.processor import Processor
from diffusers import ControlNetModel, DiffusionPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
from PIL import Image


cache_dir = "/raid/aryan/"

# Load the motion adapter
adapter = MotionAdapter.from_pretrained(
    "a-r-r-o-w/animatediff-motion-adapter-sdxl-beta",
    torch_dtype=torch.float16,
    cache_dir=cache_dir,
)
controlnet1 = ControlNetModel.from_pretrained(
    "thibaud/controlnet-openpose-sdxl-1.0",
    torch_dtype=torch.float16,
    cache_dir=cache_dir,
).to("cuda:1")
controlnet2 = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16,
    cache_dir=cache_dir,
).to("cuda:1")

model_id = "SG161222/RealVisXL_V4.0"
# model_id = "stabilityai/stable-diffusion-xl-base-1.0"
scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    cache_dir=cache_dir,
)
pipe = DiffusionPipeline.from_pretrained(
    model_id,
    # controlnet=[controlnet1, controlnet2],
    controlnet=controlnet2,
    motion_adapter=adapter,
    scheduler=scheduler,
    torch_dtype=torch.float16,
    variant="fp16",
    custom_pipeline="./examples/community/animatediff_controlnet_sdxl.py",
    cache_dir=cache_dir,
).to("cuda:1")

# Enable memory savings
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()


def load_video(file_path: str):
    images = []

    if file_path.startswith(('http://', 'https://')):
        # If the file_path is a URL
        response = requests.get(file_path)
        response.raise_for_status()
        content = BytesIO(response.content)
        vid = imageio.get_reader(content)
    else:
        # Assuming it's a local file path
        vid = imageio.get_reader(file_path)

    for frame in vid:
        pil_image = Image.fromarray(frame)
        images.append(pil_image)

    return images

# video = load_video("dance.gif")
video = load_video("teddy.gif")[:16]

# p1 = Processor("openpose_full")
# cn1 = [p1(frame) for frame in video]

p2 = Processor("canny")
cn2 = [p2(frame) for frame in video]

# Generate the output
output = pipe(
    prompt="an adorable bear playing a guitar in space, realistic, high quality",
    negative_prompt="bad quality, worst quality, jpeg artifacts, ugly",
    num_inference_steps=25,
    guidance_scale=8,
    # width=768,
    # height=1152,
    width=1536,
    height=960,
    num_frames=16,
    conditioning_frames=cn2,
    # conditioning_frames=[cn1, cn2],
)

# Extract frames and export to GIF
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
[Input: teddy.gif]
[Output generated with SDXL base 1.0]
[Output generated with RealVisXL 4.0]
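As an aside, export_to_gif only needs a list of PIL frames, so if the diffusers helper is unavailable, a minimal Pillow-only equivalent can stand in (a sketch with a hypothetical name, not the library's implementation):

```python
from PIL import Image

def frames_to_gif(frames, path: str, fps: int = 8) -> None:
    """Write a list of PIL.Image frames to an animated GIF."""
    frames[0].save(
        path,
        save_all=True,
        append_images=frames[1:],
        duration=int(1000 / fps),  # per-frame duration in milliseconds
        loop=0,                    # loop forever
    )

# Example with dummy frames:
frames = [Image.new("RGB", (64, 64), (i * 15, 0, 0)) for i in range(16)]
frames_to_gif(frames, "animation.gif")
```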

I don't know what's causing the colors to be off, but I don't have the bandwidth to debug it for the time being. Could you take a look and verify that the pipeline is in sync with your local changes? The PR looks mostly good, but it would be nice if you could clean up the comments (not a blocker, though, as long as the forward pass works for community pipelines).

a-r-r-o-w avatar Jul 10 '24 22:07 a-r-r-o-w

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 14 '24 15:09 github-actions[bot]

Not sure what the status of this PR is?

vladmandic avatar Sep 14 '24 15:09 vladmandic


Pinging @DN6 and @a-r-r-o-w!

sayakpaul avatar Dec 12 '24 04:12 sayakpaul
