Add Latte: Latent Diffusion Transformer for Video Generation
Model/Pipeline/Scheduler description
Latte is a text-to-video diffusion transformer (similar in spirit to Sora) that builds on and improves over the DiT and PixArt-alpha text-to-image models.
The implementation is already written on top of diffusers (see latte_t2v.py), so adding it here should be a straightforward task.
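For context, here is a minimal sketch of what the integrated pipeline could look like from the user side, assuming the model ends up loadable through the generic `DiffusionPipeline.from_pretrained` API. The checkpoint id (taken from the Hugging Face link below), the call signature, and the `.frames` output attribute are assumptions, not the final API:

```python
import torch
from diffusers import DiffusionPipeline

# Assumed checkpoint id (from the Hugging Face link below); the final repo id may differ.
pipe = DiffusionPipeline.from_pretrained("maxin-cn/Latte", torch_dtype=torch.float16)
pipe.to("cuda")

# Prompt-to-video call; parameter names mirror other diffusers video pipelines
# and are an assumption until the Latte pipeline is actually merged.
output = pipe(
    prompt="a dog wearing sunglasses riding a skateboard",
    num_inference_steps=50,
)
video_frames = output.frames  # expected: a sequence of decoded video frames
```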
Open source status
- [X] The model implementation is available.
- [X] The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
- Official repo: https://github.com/Vchitect/Latte
- Model on Hugging Face: https://huggingface.co/maxin-cn/Latte
- Paper: https://arxiv.org/abs/2401.03048v1
- Project page: https://maxin-cn.github.io/latte_project/
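Until the pipeline lands in diffusers, one way to inspect the released weights is to pull the Hub repo locally with `huggingface_hub.snapshot_download`. Only the repo id is taken from the links above; nothing is assumed about the file layout inside the repo:

```python
from huggingface_hub import snapshot_download

# Download the released LatteT2V checkpoint referenced above for local inspection.
local_dir = snapshot_download(repo_id="maxin-cn/Latte")
print(f"Checkpoint files downloaded to: {local_dir}")
```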
Thanks for bringing this to our attention. But as far as I understand from here, the current model suffers from producing watermarked videos. Maybe let's wait until they release an unwatermarked version? Cc: @DN6
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@sayakpaul Hi, I am the first author of Latte, and we have released the unwatermarked version of the LatteT2V model. We want to integrate Latte into the diffusers library; what should I do? The pre-trained LatteT2V models are here and the code is here.
Ccing @DN6 into this thread for further comments. I am happy to have the model integrated :)
Thanks for integrating Latte, and for your awesome work, maxin!