Open-Sora
Questions on the choice of VAE
Dear Authors,
Thanks for your great work!
I've just read your report and have some questions about the choice of VAE. You mentioned that the VideoGPT VAE performs poorly and that state-of-the-art 3D VAEs such as MAGVIT-v1/v2 are not open-sourced, so you chose a 2D VAE.
My question is: have you tried any other 3D VAE variants, such as TATS (Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer)?
Thanks in advance!