Open-Sora

Questions on the choice of VAE

Open Red-Fairy opened this issue 11 months ago • 3 comments

Dear Authors,

Thanks for your great work!

I've just read your report and have some questions about the choice of VAE. You mentioned that VideoGPT yields poor performance, and that you chose a 2D VAE because state-of-the-art 3D VAEs like MAGVIT-v1/v2 are not open-sourced.

My question is: have you tried other 3D VAE variants, such as TATS (Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer)?

Thanks in advance!

Red-Fairy, Mar 17 '24 16:03