PVDM

Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023).

8 PVDM issues

Q1: Latent diffusion uses a VAE, so why did you change the structure to an autoencoder? Is it because of poor VAE performance? Q2: Why design a bottleneck structure here? https://github.com/sihyun-yu/PVDM/blob/17699659148423469c0d1ccdca5e466933b943e1/models/autoencoder/autoencoder_vit.py#L180C1-L190C34

The repo has a few hardcoded things that make it difficult to use with a different setting, such as a different resolution or number of timesteps. I think I managed the resolution problem also...

There may be an issue with memory not being reclaimed in first_stage_train, resulting in gradual memory growth. ![image](https://user-images.githubusercontent.com/21325945/229014533-411e69d6-166c-4231-891d-b55479bea018.png)

such as: Normalize, BasicTransformerBlock, convert_module_to_f16, etc.

Excellent work! :) But I hit a bug. When I run the first_stage code on multiple GPUs, it hangs at this line. I find the issue...

Excuse me, how can we run inference with the checkpoints?

Regarding training the autoencoder: is it really possible to train it with a batch size of 7-8? As you mentioned in the paper, how can we train the autoencoder with...

Thanks for open-sourcing the great work. However, I tried training the VAE on the SkyTimelapse dataset for 150K steps, but the R-FVD only reached 66.79, while the reported number in the...