
[feat] Dataset embeddings/latents caching for more flexible experiments

Open kabachuha opened this issue 11 months ago • 1 comments

Running VAEs and CLIP/T5 embedders is computationally expensive, and this cost scales up fast when multiple training runs are repeated.

Since we keep these parts frozen and train only the diffusion model, we can precompute their outputs once and store them on disk as raw tensors to be reused across training runs.

See the following for a possible implementation:

https://github.com/ExponentialML/Text-To-Video-Finetuning/blob/main/train.py

https://github.com/ExponentialML/Text-To-Video-Finetuning/blob/main/utils/dataset.py
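A minimal sketch of the proposed precompute-and-cache flow. The `encode_video` and `encode_text` functions below are hypothetical stand-ins for the frozen VAE and CLIP/T5 encoders (here they just do cheap tensor ops so the sketch runs standalone); only the caching pattern itself is the point.

```python
import os
import torch
from torch.utils.data import Dataset

def encode_video(frames: torch.Tensor) -> torch.Tensor:
    # Stand-in for the frozen VAE, e.g. vae.encode(frames).latent_dist.sample()
    return frames.mean(dim=-1)

def encode_text(prompt: str) -> torch.Tensor:
    # Stand-in for the frozen CLIP/T5 text encoder
    return torch.full((1, 8), float(len(prompt)))

def precompute_cache(samples, cache_dir):
    """Run the frozen encoders once and store raw tensors on disk."""
    os.makedirs(cache_dir, exist_ok=True)
    for idx, (frames, prompt) in enumerate(samples):
        path = os.path.join(cache_dir, f"{idx}.pt")
        if os.path.exists(path):
            continue  # skip samples already cached
        torch.save(
            {"latent": encode_video(frames), "text_emb": encode_text(prompt)},
            path,
        )

class CachedLatentDataset(Dataset):
    """Serves precomputed latents/embeddings instead of re-running encoders."""

    def __init__(self, cache_dir):
        self.paths = sorted(
            os.path.join(cache_dir, f)
            for f in os.listdir(cache_dir)
            if f.endswith(".pt")
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        item = torch.load(self.paths[i])
        return item["latent"], item["text_emb"]
```

With this split, the diffusion training loop only ever touches `CachedLatentDataset`, so re-runs pay no VAE/embedder cost.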

kabachuha avatar Mar 13 '24 06:03 kabachuha

Good job, it would be beneficial for training models with large datasets.

LinB203 avatar Mar 13 '24 11:03 LinB203