Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Hi, we contribute the first dataset featuring 1.67 million unique text-to-video prompts and 6.69 million videos generated by 4 different state-of-the-art diffusion models. We hope it can help your Open-Sora...
Running VAEs and CLIP/T5 embedders is time-consuming, and this cost grows quickly when multiple training runs are repeated. As we keep these parts frozen and train only the diffusion...
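One way to avoid recomputing frozen-encoder outputs is to cache them on disk. The following is a minimal sketch of that idea, not code from this repository: `cached_encode`, `encode_fn`, and the cache layout are hypothetical names chosen for illustration, and `encode_fn` stands in for any frozen VAE or CLIP/T5 embedder.

```python
import hashlib
import pickle
from pathlib import Path


def cached_encode(encode_fn, item: str, cache_dir: Path):
    """Return encode_fn(item), reusing a result stored on disk if present.

    encode_fn is a placeholder for a frozen encoder (hypothetical here);
    because its weights never change, its output for a given input can be
    computed once and reused across training runs.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache file by a hash of the input so file names stay short.
    key = hashlib.sha256(item.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = encode_fn(item)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result
```

The cache is only valid while the encoder stays frozen; if its weights or preprocessing change, the cache directory would need to be cleared.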
When I try to train the VQVAE on my own data, I find that the VQVAE training loss is only a reconstruction loss https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/fdc786bc8e52d6386fb32c833eba0b4db286ca7b/opensora/models/ae/videobase/vqvae/trainer_vqvae.py#L11-L19, without a codebook loss like the one in VideoGPT: https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/fdc786bc8e52d6386fb32c833eba0b4db286ca7b/opensora/models/ae/videobase/vqvae/videogpt/videogpt/vqvae.py#L64-L69 Are...
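For reference, the standard VQ-VAE objective combines a reconstruction term with a codebook term and a commitment term. The form below is the general textbook objective, not a transcription of either linked file. The symbols are assumptions introduced here: $x$ is the input, $\hat{x}$ the reconstruction, $z_e(x)$ the encoder output, $e$ the nearest codebook vector, $\operatorname{sg}[\cdot]$ the stop-gradient operator, and $\beta$ the commitment weight.

```latex
\mathcal{L}
  = \underbrace{\lVert x - \hat{x} \rVert_2^2}_{\text{reconstruction}}
  + \underbrace{\lVert \operatorname{sg}[z_e(x)] - e \rVert_2^2}_{\text{codebook}}
  + \beta \, \underbrace{\lVert z_e(x) - \operatorname{sg}[e] \rVert_2^2}_{\text{commitment}}
```

When the codebook is updated by an exponential moving average instead of by gradient descent, the codebook term is dropped from the loss and only the reconstruction and commitment terms remain.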
# Updates
- Group files with paths;
- Add CI Docker image support;
- Update requirements.txt to latest;
- GitHub workflow support;

Note: You need to set DOCKER_USERNAME and DOCKER_ACCESS_TOKEN in...
This repo, [Fast Training of Diffusion Models with Masked Transformers](https://github.com/Anima-Lab/MaskDiT), suggests using a masked transformer architecture for faster DiT training. They claim that:
> Experiments on ImageNet-256x256 and ImageNet-512x512 show that...
Hello, I noticed you mentioned that the latest code could support training with a latent size of 225x90x90, which seems quite large. However, I couldn't find the corresponding training script...
# Changed
* Add trainer_videobase.py, dataset_videobase.py for unifying trainer and dataset of vqvae and causal vqvae.
* Add train_ae.py for unifying training vqvae and causal vqvae.
* Fix some name...
I need to download the data listed at https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/main/docs/Data.md. Could you share the files that need to be downloaded on Quark Netdisk (夸克网盘)?