Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model). We hope the open-source community will contribute to it.
Add NaViT support in `opensora/models/diffusion/dit/NaViT.py`
Hi, this project uses a VQVAE to compress video into a small latent space, and the latent embedding dim is `512` or `256`. But LDMs usually use a very small embedding dim...
`from opensora.utils.dataset_utils import is_image_file`: I can't find a function named `is_image_file`.
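
For anyone hitting the same import error, a minimal stand-in could look like the sketch below. It assumes the helper only checks the filename extension. The extension list and the behavior are guesses for illustration, not the repository's implementation.

```python
import os

# Assumption: this extension list is illustrative and not taken from the repository.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".webp")


def is_image_file(filename: str) -> bool:
    """Return True if the filename ends with a known image extension (case-insensitive)."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in IMAGE_EXTENSIONS
```

Defining this locally (or in `opensora/utils/dataset_utils.py`) would let the import resolve, though the real helper may apply different rules.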
Borrowed from OpenDiT: FastSeq boosts performance thanks to improved attention memory and communication time: https://github.com/NUS-HPC-AI-Lab/OpenDiT?tab=readme-ov-file#fastseq Example diff implementation on top of existing DiT/Latte: https://github.com/NUS-HPC-AI-Lab/OpenDiT/pull/92/commits/d7be4992cbe028c7068f35e8463f9ce1003aabb1 Utility functions reference: https://github.com/kabachuha/OpenMMDiT/blob/d7be4992cbe028c7068f35e8463f9ce1003aabb1/opendit/utils/operation.py#L1
Project page: https://piecewise-rectified-flow.github.io/ GitHub: https://github.com/magic-research/piecewise-rectified-flow/tree/main Claims to be faster than the normal Rectified Flow (used in [Stable Diffusion 3](https://github.com/PKU-YuanGroup/Open-Sora-Plan/issues/43)). I believe it will be a huge quality/speed win compared to...
# Changed
* Code Style:
  * Rewrite dit modeling, split dit.py to modeling_dit.py and configuration_dit.py
* Accelerate w/ Deepspeed training: Support training "dit" on accelerate w/ deepspeed.

# How to...
Can you share 'results/011-Latte-XL-122-F128S3-landscope_feature-Gc-Amp/checkpoints/0005000.pt'?
```
accelerate launch --num_processes 1 --main_process_port 29501 --mixed_precision bf16 opensora/sample/sample.py \
  --model Latte-XL/122 \
  --ae stabilityai/sd-vae-ft-mse \
  --ckpt results/011-Latte-XL-122-F128S3-landscope_feature-Gc-Amp/checkpoints/0005000.pt --extras 1 \
  --fps 10 --num-frames 128...
```
> This repo supports training a latent size of 225×90×90 (t×h×w), which means we are able to train 1 minute of 1080P video with 30FPS (2× interpolated frames and 2×...
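
The temporal side of the quoted claim can be checked with a short calculation. This is only a sketch built from the figures in the quote (30 FPS, 1 minute, 2× frame interpolation, latent length 225). Reading the leftover factor as temporal compression by the autoencoder is an assumption, since the quote is cut off.

```python
# Figures stated in the quoted text.
fps = 30
seconds = 60
latent_t = 225
interpolation = 2  # "2× interpolated frames"

# Frames in the final one-minute video.
output_frames = fps * seconds

# Frames the model must produce before 2× interpolation.
generated_frames = output_frames // interpolation

# Assumption: the remaining gap between generated frames and latent length
# is temporal compression by the autoencoder.
temporal_factor = generated_frames / latent_t

print(output_frames, generated_frames, temporal_factor)
```

Under that reading, the latent length of 225 is consistent with one minute of 30 FPS video.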