generative-models

Is anyone trying to train their own SVD-based Camera Motion LoRA model?

Open DataAIPlayer opened this issue 9 months ago • 5 comments

I tried using LoRA to fine-tune the SVD U-Net, but even with a batch size of 1, training runs out of memory on an A100-80G GPU when the dataset consists of 25-frame videos. I also tried DeepSpeed, but it did not help. Does this mean that model-parallel training is required, with the model parameters distributed across multiple GPUs?

DataAIPlayer avatar May 08 '24 16:05 DataAIPlayer

Why don't you lower your image resolution? With a batch size of 1, 512 x 512 x 25 frames runs on an A6000, which has 48G of VRAM.

tykim0507 avatar May 13 '24 13:05 tykim0507

I am experimenting with fine-tuning motion LoRA and need to generate videos at a resolution of 1024x576. Do you mean that training motion LoRA at a lower resolution can achieve camera control effects when inferring at a higher resolution?
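For a rough sense of why the two resolutions in this thread behave so differently, here is a back-of-the-envelope sketch in Python. It assumes an 8x VAE downsampling factor and 4 latent channels (typical for Stable Diffusion style VAEs, not confirmed in this thread), and the helper name `latent_elements` is made up for illustration. It only counts latent elements, so it is not a real memory estimate.

```python
def latent_elements(width, height, frames, vae_downsample=8, latent_channels=4):
    """Number of latent values the U-Net sees for one clip.

    vae_downsample and latent_channels are assumptions, not values
    taken from the SVD code.
    """
    latent_w = width // vae_downsample
    latent_h = height // vae_downsample
    return frames * latent_channels * latent_h * latent_w


low_res = latent_elements(512, 512, frames=25)
high_res = latent_elements(1024, 576, frames=25)

# Activations that scale linearly with the latent size grow by this factor.
linear_ratio = high_res / low_res

# Attention that materializes a full (tokens x tokens) matrix per frame
# grows with the square of the per-frame token count.
quadratic_ratio = linear_ratio ** 2

print(f"512x512x25:  {low_res:,} latent elements")
print(f"1024x576x25: {high_res:,} latent elements")
print(f"linear growth: {linear_ratio}x, naive attention growth: {quadratic_ratio}x")
```

Under these assumptions, 1024x576 carries 2.25x the latent elements of 512x512, and any attention layer that stores the full attention matrix grows by roughly 5x, so a setup that fits in 48G at 512x512 will not necessarily fit in 80G at 1024x576.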

DataAIPlayer avatar May 15 '24 03:05 DataAIPlayer

Oh, if you must get that resolution, lowering it won't help :( Using xformers and gradient checkpointing might help, though.
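A minimal sketch of what those two options do, in plain PyTorch. This is not the SVD training code: `TinyAttentionBlock` and `TinyModel` are hypothetical stand-ins for U-Net blocks, and PyTorch's built-in `F.scaled_dot_product_attention` (PyTorch 2.0+) is swapped in here in place of xformers as the memory-efficient attention kernel. Gradient checkpointing is done with `torch.utils.checkpoint.checkpoint`, which drops intermediate activations in the forward pass and recomputes them during backward.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint


class TinyAttentionBlock(nn.Module):
    """Hypothetical stand-in for one attention block of a video U-Net."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.heads = heads
        self.norm = nn.LayerNorm(dim)
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, d = x.shape
        q, k, v = self.qkv(self.norm(x)).chunk(3, dim=-1)
        # (batch, tokens, dim) -> (batch, heads, tokens, head_dim)
        q, k, v = (
            t.view(b, n, self.heads, d // self.heads).transpose(1, 2)
            for t in (q, k, v)
        )
        # Fused attention; avoids storing the full (tokens x tokens) matrix
        # when a memory-efficient backend is available.
        attn = F.scaled_dot_product_attention(q, k, v)
        attn = attn.transpose(1, 2).reshape(b, n, d)
        return x + self.out(attn)


class TinyModel(nn.Module):
    """Hypothetical stack of blocks with optional gradient checkpointing."""

    def __init__(self, dim=64, depth=4, use_checkpointing=True):
        super().__init__()
        self.blocks = nn.ModuleList([TinyAttentionBlock(dim) for _ in range(depth)])
        self.use_checkpointing = use_checkpointing

    def forward(self, x):
        for block in self.blocks:
            if self.use_checkpointing and self.training:
                # Activations inside the block are recomputed during backward
                # instead of being kept in memory.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x


torch.manual_seed(0)
model = TinyModel().train()
tokens = torch.randn(1, 128, 64, requires_grad=True)

loss = model(tokens).pow(2).mean()
loss.backward()

grads_ok = all(p.grad is not None for p in model.parameters())
print("all parameters received gradients:", grads_ok)
```

The trade-off is extra compute for the recomputation in exchange for lower peak activation memory. If the U-Net is loaded through diffusers, the model classes there expose `enable_gradient_checkpointing()` and `enable_xformers_memory_efficient_attention()`, which would be the place to look rather than hand-rolling the above; whether that is enough for 1024x576 x 25 frames on 80G is not something this sketch can confirm.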

tykim0507 avatar May 16 '24 03:05 tykim0507

Is there an open source SVD fine-tuning method?

openchao avatar May 20 '24 03:05 openchao

https://github.com/alibaba/animate-anything/blob/main/train_svd.py

DataAIPlayer avatar May 21 '24 08:05 DataAIPlayer