CVPR23_LFDM
The PyTorch implementation of our CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models".
!!! Check out our new CVPR 2024 paper designed for text-conditioned image-to-video generation
LFDM
The PyTorch implementation of our CVPR 2023 paper Conditional Image-to-Video Generation with Latent Flow Diffusion Models.
Updates
- [Updated on 07/08/2023] Added multi-GPU training code for the MHAD dataset.
- [Updated on 05/12/2023] Released a testing demo for the NATOPS dataset.
- [Updated on 03/31/2023] Added instructions for training an LFDM on the NATOPS dataset.
- [Updated on 03/27/2023] Added instructions for training an LFDM on the MHAD dataset.
- [Updated on 03/27/2023] Released a testing demo for the MHAD dataset.
- [Updated on 03/26/2023] Added instructions for training an LFDM on the MUG dataset.
- [Updated on 03/26/2023] Our paper is now available on arXiv.
- [Updated on 03/20/2023] Released a testing demo for the MUG dataset.
Example Videos
None of the subjects in the following videos were seen during training.
Some generated video results on the MUG dataset.
Some generated video results on the MHAD dataset.
Some generated video results on the NATOPS dataset.
Results of applying the LFDM trained on MUG to the FaceForensics dataset.
Pretrained Models
Dataset | Model | Frame Sampling | Link (Google Drive) |
---|---|---|---|
MUG | LFAE | - | https://drive.google.com/file/d/1dRn1wl5TUaZJiiDpIQADt1JJ0_q36MVG/view?usp=share_link |
MUG | DM | very_random | https://drive.google.com/file/d/1lPVIT_cXXeOVogKLhD9fAT4k1Brd_HHn/view?usp=share_link |
MHAD | LFAE | - | https://drive.google.com/file/d/1AVtpKbzqsXdIK-_vHUuQQIGx6Wa5PxS0/view?usp=share_link |
MHAD | DM | random | https://drive.google.com/file/d/1BoFPQAeOuHE5wt7h-chhYAO-dU0B1p2y/view?usp=share_link |
NATOPS | LFAE | - | https://drive.google.com/file/d/10iyzoYqSwzQ3fZgb6oh3Uay-P7k2A12s/view?usp=share_link |
NATOPS | DM | random | https://drive.google.com/file/d/1lSLSzS_KyGvJ7dW3l5hLJLR9k2k8LoU3/view?usp=share_link |
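Scripted downloads usually need the Google Drive file ID rather than the full share link. The following is a minimal sketch that extracts the ID from the links in the table above; the helper name `drive_file_id` and the `SHARE_LINKS` dictionary are illustrative and not part of this repository.

```python
import re
from typing import Dict, Optional, Tuple

# Share links copied verbatim from the table above, keyed by (dataset, model).
SHARE_LINKS: Dict[Tuple[str, str], str] = {
    ("MUG", "LFAE"): "https://drive.google.com/file/d/1dRn1wl5TUaZJiiDpIQADt1JJ0_q36MVG/view?usp=share_link",
    ("MUG", "DM"): "https://drive.google.com/file/d/1lPVIT_cXXeOVogKLhD9fAT4k1Brd_HHn/view?usp=share_link",
    ("MHAD", "LFAE"): "https://drive.google.com/file/d/1AVtpKbzqsXdIK-_vHUuQQIGx6Wa5PxS0/view?usp=share_link",
    ("MHAD", "DM"): "https://drive.google.com/file/d/1BoFPQAeOuHE5wt7h-chhYAO-dU0B1p2y/view?usp=share_link",
    ("NATOPS", "LFAE"): "https://drive.google.com/file/d/10iyzoYqSwzQ3fZgb6oh3Uay-P7k2A12s/view?usp=share_link",
    ("NATOPS", "DM"): "https://drive.google.com/file/d/1lSLSzS_KyGvJ7dW3l5hLJLR9k2k8LoU3/view?usp=share_link",
}

# The file ID is the path segment that follows "/file/d/".
_FILE_ID = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def drive_file_id(share_link: str) -> Optional[str]:
    """Return the Drive file ID embedded in a share link, or None if absent."""
    match = _FILE_ID.search(share_link)
    return match.group(1) if match else None


if __name__ == "__main__":
    for (dataset, model), link in SHARE_LINKS.items():
        print(dataset, model, drive_file_id(link))
```

The extracted ID can then be passed to whichever Drive download tool you prefer.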
Demo
MUG Dataset
- Install the required dependencies. We use Python 3.7.10 and PyTorch 1.12.1, among other packages.
- Run `python -u demo/demo_mug.py` to generate the example videos. Set the paths in the code files and in the config file `config/mug128.yaml` if needed. The pretrained models for the MUG dataset have been released.
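Before running a demo, it can help to confirm that the local environment is close to the versions listed above. This is a minimal sketch, not part of the repository; the function names are illustrative, and it only compares major and minor versions, on the assumption that patch-level differences are harmless.

```python
import sys
from typing import List, Optional, Sequence

EXPECTED_PYTHON = (3, 7)   # this README lists Python 3.7.10
EXPECTED_TORCH = "1.12"    # this README lists PyTorch 1.12.1


def installed_torch_version() -> Optional[str]:
    """Return the installed PyTorch version, or None if it cannot be imported."""
    try:
        import torch
    except ImportError:
        return None
    return str(torch.__version__)


def version_warnings(python_version: Sequence[int],
                     torch_version: Optional[str]) -> List[str]:
    """List the mismatches between the given versions and the expected ones."""
    warnings = []
    if tuple(python_version[:2]) != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        warnings.append(f"Python {found} found, README uses 3.7")
    if torch_version is None:
        warnings.append("PyTorch is not installed")
    elif not torch_version.startswith(EXPECTED_TORCH + "."):
        warnings.append(f"PyTorch {torch_version} found, README uses 1.12.1")
    return warnings


if __name__ == "__main__":
    problems = version_warnings(sys.version_info, installed_torch_version())
    for line in problems or ["Environment matches the README versions."]:
        print(line)
```

A mismatch does not necessarily mean the demo will fail; the check only flags differences from the setup the authors report.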
MHAD Dataset
- Install the required dependencies. We use Python 3.7.10 and PyTorch 1.12.1, among other packages.
- Run `python -u demo/demo_mhad.py` to generate the example videos. Set the paths in the code files and in the config file `config/mhad128.yaml` if needed. The pretrained models for the MHAD dataset have been released.
NATOPS Dataset
- Install the required dependencies. We use Python 3.7.10 and PyTorch 1.12.1, among other packages.
- Run `python -u demo/demo_natops.py` to generate the example videos. Set the paths in the code files and in the config file `config/natops128.yaml` if needed. The pretrained models for the NATOPS dataset have been released.
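Each demo depends on one script and one config file. The sketch below checks that both exist for every dataset before anything is launched. The file names come from the instructions above; the helper names and the `repo_root` argument are illustrative assumptions.

```python
from pathlib import Path
from typing import Dict, List

DATASETS = ("mug", "mhad", "natops")


def demo_files(dataset: str) -> List[str]:
    """Relative paths of the demo script and config file for one dataset."""
    return [f"demo/demo_{dataset}.py", f"config/{dataset}128.yaml"]


def missing_demo_files(repo_root: str) -> Dict[str, List[str]]:
    """Map each dataset to the demo files that are absent under repo_root."""
    root = Path(repo_root)
    missing: Dict[str, List[str]] = {}
    for dataset in DATASETS:
        absent = [rel for rel in demo_files(dataset) if not (root / rel).is_file()]
        if absent:
            missing[dataset] = absent
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for dataset, files in missing_demo_files(".").items():
        print(f"{dataset}: missing {', '.join(files)}")
```

This only verifies that the files are present; the checkpoint paths inside the scripts and config files still need to be set by hand.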
Training LFDM
Training an LFDM involves two stages: (1) train a latent flow autoencoder (LFAE) in an unsupervised fashion; to accelerate training, we initialize the LFAE with the pretrained models provided by MRAA, which are available in their GitHub repository; (2) train a diffusion model (DM) on the latent space of the LFAE.
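Because the DM is trained on the latent space of the LFAE, the stages have to run in order. The following sketch builds the command sequence for one dataset, using the script names from the per-dataset instructions in this README. The wrapper functions are illustrative and not part of the repository, and the paths and config files still need to be set as described for each dataset.

```python
import subprocess
from typing import List


def training_commands(dataset: str, evaluate: bool = True) -> List[List[str]]:
    """Return the training commands for `dataset` in the order they should run."""
    commands = [["python", "-u", f"LFAE/run_{dataset}.py"]]  # stage 1: LFAE
    if evaluate:
        # Optional: measure LFAE self-reconstruction before training the DM.
        commands.append(["python", "-u", f"LFAE/test_flowautoenc_{dataset}.py"])
    commands.append(["python", "-u", f"DM/train_video_flow_diffusion_{dataset}.py"])  # stage 2: DM
    if evaluate:
        # Optional: test the generation performance of the trained DM.
        commands.append(["python", "-u", f"DM/test_video_flow_diffusion_{dataset}.py"])
    return commands


def run_all(dataset: str, evaluate: bool = True) -> None:
    """Run every stage in sequence, stopping at the first failure."""
    for command in training_commands(dataset, evaluate):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in training_commands("mug"):
        print(" ".join(command))
```

Calling `run_all("mug")` from the repository root would launch the full sequence; printing the commands first is a cheap way to review them.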
MUG Dataset
- Download the MUG dataset from its website.
- Install the required dependencies. We use Python 3.7.10 and PyTorch 1.12.1, among other packages.
- Split the train/test set. You may use the same split as ours, which can be found in `preprocessing/preprocess_MUG.py`.
- Run `python -u LFAE/run_mug.py` to train the LFAE. Set the paths and the config file `config/mug128.yaml` if needed.
- Once the LFAE is trained, you may measure its self-reconstruction performance by running `python -u LFAE/test_flowautoenc_mug.py`.
- Run `python -u DM/train_video_flow_diffusion_mug.py` to train the DM. Set the paths and the config file `config/mug128.yaml` if needed.
- Once the DM is trained, you may test its generation performance by running `python -u DM/test_video_flow_diffusion_mug.py`.
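If you prefer your own split, one option is to split by subject so that test subjects never appear in training, in line with the unseen-subject setting of the example videos. The sketch below is not the authors' split (that one is in `preprocessing/preprocess_MUG.py`); the function name, `test_fraction`, and `seed` are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_by_subject(videos_by_subject: Dict[str, Sequence[str]],
                     test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Split videos into train/test lists with no subject in both."""
    subjects = sorted(videos_by_subject)      # sort first for reproducibility
    random.Random(seed).shuffle(subjects)     # seeded, so the split is stable
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train: List[str] = []
    test: List[str] = []
    for subject in sorted(videos_by_subject):
        target = test if subject in test_subjects else train
        target.extend(videos_by_subject[subject])
    return train, test


if __name__ == "__main__":
    # Toy input: subject ID -> video names (placeholders, not real MUG files).
    toy = {f"s{i}": [f"s{i}_a", f"s{i}_b"] for i in range(5)}
    train_videos, test_videos = split_by_subject(toy)
    print(len(train_videos), "train videos,", len(test_videos), "test videos")
```

To reproduce the reported results, use the authors' split instead.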
MHAD Dataset
- Download the MHAD dataset from its website.
- Install the required dependencies. We use Python 3.7.10 and PyTorch 1.12.1, among other packages.
- Crop the video frames and split the train/test set. You may use the same cropping method and split as ours, which can be found in `preprocessing/preprocess_MHAD.py`.
- Run `python -u LFAE/run_mhad.py` to train the LFAE. Set the paths and the config file `config/mhad128.yaml` if needed.
- Once the LFAE is trained, you may measure its self-reconstruction performance by running `python -u LFAE/test_flowautoenc_mhad.py`.
- Run `python -u DM/train_video_flow_diffusion_mhad.py` to train the DM. Set the paths and the config file `config/mhad128.yaml` if needed.
- Once the DM is trained, you may test its generation performance by running `python -u DM/test_video_flow_diffusion_mhad.py`.
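As an illustration of the cropping step, the sketch below computes the largest centered square inside a frame. This is only an assumed cropping strategy, chosen because it is simple to state; the authors' actual method is the one in `preprocessing/preprocess_MHAD.py` and may crop a different region. The function name is hypothetical.

```python
from typing import Tuple


def center_square_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


if __name__ == "__main__":
    # Example with a 640x480 frame (an illustrative size).
    print(center_square_crop_box(640, 480))
```

The tuple order follows the (left, upper, right, lower) box convention used by Pillow's `Image.crop`, so the result can be passed to it directly if you process frames with Pillow.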
NATOPS Dataset
- Download the NATOPS dataset from its website.
- Install the required dependencies. We use Python 3.7.10 and PyTorch 1.12.1, among other packages.
- Segment the videos and split the train/test set. You may use the same segmenting method and split as ours, which can be found in `preprocessing/preprocess_NATOPS.py`.
- Run `python -u LFAE/run_natops.py` to train the LFAE. Set the paths and the config file `config/natops128.yaml` if needed.
- Once the LFAE is trained, you may measure its self-reconstruction performance by running `python -u LFAE/test_flowautoenc_natops.py`.
- Run `python -u DM/train_video_flow_diffusion_natops.py` to train the DM. Set the paths and the config file `config/natops128.yaml` if needed.
- Once the DM is trained, you may test its generation performance by running `python -u DM/test_video_flow_diffusion_natops.py`.
Citing LFDM
If you find our approaches useful in your research, please consider citing:
@inproceedings{ni2023conditional,
title={Conditional Image-to-Video Generation with Latent Flow Diffusion Models},
author={Ni, Haomiao and Shi, Changhao and Li, Kai and Huang, Sharon X and Min, Martin Renqiang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={18444--18455},
year={2023}
}
For questions about the code, please feel free to open an issue or contact me: [email protected]
Acknowledgement
Part of our code was borrowed from MRAA, VDM, and LDM. We thank the authors of these repositories for their valuable implementations.