SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation

Shenggan Cheng¹, Yuanxin Wei², Lansong Diao³, Yong Liu¹, Bujiao Chen³, Lianghua Huang³,
Yu Liu³, Wenyuan Yu³, Jiangsu Du², Wei Lin³†, Yang You¹†

¹National University of Singapore, ²Sun Yat-sen University, ³Alibaba Group

(† Corresponding authors.)

Introduction

SRDiffusion is a novel video diffusion inference framework that reduces computation costs through Sketching-Rendering Cooperation between large and small models:

The large model handles high-noise steps, preserving semantic and motion fidelity (Sketching).
The small model processes low-noise steps to refine visual details (Rendering).

SRDiffusion achieves up to 3× speedup on Wan and 2× speedup on CogVideoX with minimal quality degradation, and it is complementary to other optimization techniques.
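
The hand-off between the two models can be illustrated with a toy sampling loop. This is only a minimal sketch under stated assumptions, not the code shipped in this repository: the switching rule (relative change between consecutive large-model predictions falling below a threshold), the helper names `relative_change`, `cooperative_sample`, and `estimated_speedup`, and the `cost_ratio` parameter are all hypothetical, and the latent update is a placeholder rather than a real scheduler step.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Denoiser = Callable[[Vector, int], Vector]  # (latent, step) -> prediction


def relative_change(prev: Sequence[float], curr: Sequence[float]) -> float:
    """L1 change between two predictions, relative to the L1 norm of the previous one."""
    num = sum(abs(c - p) for p, c in zip(prev, curr))
    den = sum(abs(p) for p in prev) or 1.0  # avoid division by zero
    return num / den


def cooperative_sample(
    latent: Vector,
    large: Denoiser,
    small: Denoiser,
    num_steps: int,
    threshold: float = 0.01,
) -> Tuple[Vector, int]:
    """Run `large` on the early (high-noise) steps, then hand off to `small`.

    Assumed rule: switch once two consecutive large-model predictions differ
    by less than `threshold`. Returns the final latent and the number of
    steps executed by the large model.
    """
    prev_pred = None
    use_small = False
    switch_step = num_steps  # stays at num_steps if the hand-off never happens
    for step in range(num_steps):
        model = small if use_small else large
        pred = model(latent, step)
        # Placeholder update; a real pipeline would call its scheduler here.
        latent = [x - p for x, p in zip(latent, pred)]
        if not use_small:
            if prev_pred is not None and relative_change(prev_pred, pred) < threshold:
                use_small = True
                switch_step = step + 1
            prev_pred = pred
    return latent, switch_step


def estimated_speedup(num_steps: int, switch_step: int, cost_ratio: float) -> float:
    """Idealized speedup over running `large` for every step.

    `cost_ratio` is the per-step cost of `small` divided by that of `large`.
    Model loading, offloading, and text encoding are ignored.
    """
    return num_steps / (switch_step + (num_steps - switch_step) * cost_ratio)


# Toy stand-ins for the two models: the large model's prediction settles
# over time, and the small model leaves the latent unchanged.
def toy_large(latent: Vector, step: int) -> Vector:
    return [0.1 * (1.0 + 2.0 ** -step) for _ in latent]


def toy_small(latent: Vector, step: int) -> Vector:
    return [0.0 for _ in latent]


final, switch_step = cooperative_sample([1.0] * 4, toy_large, toy_small, num_steps=50)
print("large-model steps:", switch_step)
print("idealized speedup:", estimated_speedup(50, switch_step, cost_ratio=0.1))
```

Under this assumed rule, a lower threshold keeps the large model active for more steps (closer to the large-model-only result, less speedup), while a higher threshold hands off earlier.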

SRDiffusion for Wan 2.1

Follow the Wan2.1 setup guide to set up the environment and download the Wan2.1-T2V-14B and Wan2.1-T2V-1.3B models. The command below expects them in ./Wan2.1-T2V-14B and ./Wan2.1-T2V-1.3B.

Run the following command to generate a video:

python ./wan/srd_generate.py --task t2v-14B --size 832*480 \
    --ckpt_dir ./Wan2.1-T2V-14B --rendering_ckpt_dir ./Wan2.1-T2V-1.3B \
    --offload_model True --t5_cpu \
    --self_diff_threshold 0.01 \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

SRDiffusion for CogVideoX

Follow the CogVideo setup guide to prepare the environment.

Run the following command to generate a video:

python cli_demo_srd.py --seed 42 --num_frames 49 --fps 8 \
    --self_diff_threshold 0.01 \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

Acknowledgement

This repository is built upon Wan2.1 and CogVideoX. We sincerely thank the contributors of these projects for their excellent work.

Contributing

For contribution guidelines, please refer to CONTRIBUTING.md.

Contributors

SRDiffusion is developed by Alibaba Group and NUS HPC-AI Lab. This work is supported by Alibaba Innovative Research (AIR).

License

SRDiffusion is licensed under the Apache License (Version 2.0). See the LICENSE file for more details. This project also includes third-party test cases released under other open-source licenses. Please refer to the NOTICE file for more information.

Citation

If you find SRDiffusion useful, please consider starring the project ⭐ and citing it with the following BibTeX entry:

@misc{cheng2025srdiffusion,
      title={SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation}, 
      author={Shenggan Cheng and Yuanxin Wei and Lansong Diao and Yong Liu and Bujiao Chen and Lianghua Huang and Yu Liu and Wenyuan Yu and Jiangsu Du and Wei Lin and Yang You},
      year={2025},
      eprint={2505.19151},
      archivePrefix={arXiv},
      primaryClass={cs.GR},
      url={https://arxiv.org/abs/2505.19151}, 
}
