TextToVideoSDPipeline outputs blank video
Describe the bug
I am encountering an issue when using TextToVideoSDPipeline: it generates blank videos when I run the model on Replicate. The model is running on A100 (80GB) hardware.
Reproduction
import torch
from pathlib import Path

from diffusers import TextToVideoSDPipeline, DPMSolverMultistepScheduler
from diffusers.utils import export_to_video

pipe = TextToVideoSDPipeline.from_pretrained(
    "cerspense/zeroscope_v2_576w",
    torch_dtype=torch.float32,
)
pipe.enable_sequential_cpu_offload()

# memory optimization
pipe.unet.enable_forward_chunking(chunk_size=1, dim=1)
pipe.enable_vae_slicing()

pipe.scheduler = DPMSolverMultistepScheduler.from_config(self.pipe.scheduler.config)

video_frames = pipe(
    prompt="astronaut riding a horse on mars, beautiful, 8k, perfect, award winning, national geographic",
    negative_prompt="very blue, dust, noisy, washed out, ugly, distorted, broken",
    num_frames=24,
    num_inference_steps=25,
    guidance_scale=12.5,
    width=576,
    height=320,
).frames[0]

video_path = export_to_video(video_frames, fps=24)
return Path(video_path)
Logs
No response
System Info
Cog yaml file:

build:
  gpu: true
  cuda: "12.1"
  python_version: "3.11.1"
  system_packages:
    - "libgl1-mesa-glx"
  python_packages:
    - "diffusers==0.27.0"
    - "torch==2.2.1"
    - "ftfy==6.1.1"
    - "scipy==1.9.3"
    - "transformers==4.38.1"
    - "accelerate==0.27.2"
    - "huggingface-hub==0.20.3"
    - "numpy==1.25.1"
    - "opencv-python"
Who can help?
No response
Hi @rafationgson,
What do you mean exactly by blank videos, only black frames?
Why are you using .enable_sequential_cpu_offload()? Your GPU seems relatively decent. Do you have other processes that need to consume GPU VRAM? Could you replace it with .enable_model_cpu_offload() or .to('cuda')?
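For reference, the two alternatives look roughly like this (a minimal sketch assuming the same zeroscope checkpoint; the dtype here is illustrative):

import torch
from diffusers import TextToVideoSDPipeline

pipe = TextToVideoSDPipeline.from_pretrained(
    "cerspense/zeroscope_v2_576w",
    torch_dtype=torch.float16,  # illustrative; float32 also works with more memory
)

# Option 1: offload whole sub-models to CPU between forward passes
# (much faster than enable_sequential_cpu_offload, still saves VRAM).
pipe.enable_model_cpu_offload()

# Option 2: keep the entire pipeline on the GPU (fastest, highest VRAM use).
# pipe.to("cuda")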
Hi @standardAI, yes, black frames. Actually, there are no other processes that need to consume GPU VRAM; I thought I needed to enable the offload.
I replaced it with .to('cuda') and used torch.float16, but I am still encountering black frames.
What about .to('cuda') and torch.float32 without offloading?
Tried this too and issue still persists.
How exactly are you displaying the video, via which function? What is your OpenCV version? Also, self.pipe is not another unrelated pipeline, right?
I am trying to display the video via the predict function of the Predictor class of my custom Cog model on Replicate. Yes, self.pipe is the same pipeline; I forgot to remove the self. prefix when I put the reproduction code above. As for my OpenCV version, I believe it is the latest (4.9.0.80), since I didn't specify one.
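(For context, a Cog predictor of this kind is typically structured roughly as follows; the names and defaults below are illustrative, not the actual model code. The pipeline is created once in setup() and stored as self.pipe, which is how the self. prefix leaked into the reproduction snippet.)

import torch
from cog import BasePredictor, Input, Path
from diffusers import TextToVideoSDPipeline, DPMSolverMultistepScheduler
from diffusers.utils import export_to_video


class Predictor(BasePredictor):
    def setup(self):
        # Load the pipeline once when the container starts.
        self.pipe = TextToVideoSDPipeline.from_pretrained(
            "cerspense/zeroscope_v2_576w",
            torch_dtype=torch.float16,
        ).to("cuda")
        self.pipe.scheduler = DPMSolverMultistepScheduler.from_config(
            self.pipe.scheduler.config
        )

    def predict(self, prompt: str = Input(description="Text prompt")) -> Path:
        frames = self.pipe(prompt=prompt, num_frames=24).frames[0]
        # Replicate renders the returned video file in the web UI.
        return Path(export_to_video(frames, fps=24))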
I can't reproduce this issue: https://colab.research.google.com/drive/1Ul37s8OefIJ-RkpyNkPAqOc7P0gzRL78?usp=sharing
@rafationgson can you check to see if the individual frames of the video are blank as well?
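For example, something like this (a minimal sketch; video_frames here refers to the list returned by the pipeline before export, and the threshold is arbitrary):

import numpy as np

frames = np.asarray(video_frames, dtype=np.float32)
print("shape:", frames.shape, "min:", frames.min(), "max:", frames.max())

# Fully black output usually shows up as (near-)zero pixel values in every
# frame, or as NaN/inf values that get clipped to black during export.
if not np.isfinite(frames).all():
    print("frames contain NaN/inf values")
elif frames.max() - frames.min() < 1e-3:
    print("frames are effectively blank")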
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@rafationgson is this still an issue?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.