Text2Video-Zero
[ICCV 2023 Oral] Text-to-Image Diffusion Models are Zero-Shot Video Generators
This repository is the official implementation of Text2Video-Zero.
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
Our method, Text2Video-Zero, enables zero-shot video generation using (i) a textual prompt (see rows 1, 2), (ii) a prompt combined with guidance from poses or edges (see lower right), and (iii) Video Instruct-Pix2Pix, i.e., instruction-guided video editing (see lower left).
Results are temporally consistent and closely follow the guidance and textual prompts.
News
- [03/23/2023] Paper Text2Video-Zero released!
- [03/25/2023] The first version of our Hugging Face demo (zero-shot text-to-video generation, Video Instruct-Pix2Pix) released!
Code
Will be released soon!
Results
Text-To-Video
| ![]() | ![]() | ![]() | ![]() |
| "A cat is running on the grass" | "A panda is playing guitar on times square" | "A man is running in the snow" | "An astronaut is skiing down the hill" |
| ![]() | ![]() | ![]() | ![]() |
| "A panda surfing on a wakeboard" | "A bear dancing on times square" | "A man is riding a bicycle in the sunshine" | "A horse galloping on a street" |
| ![]() | ![]() | ![]() | ![]() |
| "A tiger walking alone down the street" | "A panda surfing on a wakeboard" | "A horse galloping on a street" | "A cute cat running in a beautiful meadow" |
| ![]() | ![]() | ![]() | ![]() |
| "A horse galloping on a street" | "A panda walking alone down the street" | "A dog is walking down the street" | "An astronaut is waving his hands on the moon" |
Text-To-Video with Pose Guidance
| ![]() ![]() | ![]() ![]() | ![]() ![]() | ![]() ![]() |
| "A bear dancing on the concrete" | "An alien dancing under a flying saucer" | "A panda dancing in Antarctica" | "An astronaut dancing in the outer space" |
Text-To-Video with Edge Guidance
| ![]() ![]() | ![]() ![]() | ![]() ![]() | ![]() ![]() |
| "White butterfly" | "Beautiful girl" | "A jellyfish" | "beautiful girl halloween style" |
| ![]() ![]() | ![]() ![]() | ![]() ![]() | ![]() ![]() |
| "Wild fox is walking" | "Oil painting of a beautiful girl close-up" | "A santa claus" | "A deer" |
Text-To-Video with Edge Guidance and Dreambooth specialization
| ![]() ![]() | ![]() ![]() | ![]() ![]() | ![]() ![]() |
| "anime style" | "arcane style" | "gta-5 man" | "avatar style" |
Video Instruct Pix2Pix
| ![]() ![]() | ![]() ![]() | ![]() ![]() |
| "Replace man with chimpanzee" | "Make it Van Gogh Starry Night style" | "Make it Picasso style" |
| ![]() ![]() | ![]() ![]() | ![]() ![]() |
| "Make it Expressionism style" | "Make it night" | "Make it autumn" |
BibTeX
If you use our work in your research, please cite our publication:
@article{text2video-zero,
title={Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators},
author={Khachatryan, Levon and Movsisyan, Andranik and Tadevosyan, Vahram and Henschel, Roberto and Wang, Zhangyang and Navasardyan, Shant and Shi, Humphrey},
journal={arXiv preprint arXiv:2303.13439},
year={2023}
}