Any ideas to speed up inference?
Thanks for your interesting work! I tried to run inference on a 75-frame video at 512×768 on an A100, which takes about 3 minutes. I also tried using more cards; however, that only generates more videos :( . Do you have any ideas for speeding this up?
Hi, multi-machine distributed acceleration is not implemented in the code. To speed things up, you can try changing `context_batch_size` from 1 to 8 or some other value greater than 1, but this requires you to change the shape of `model_kwargs_new` yourself. Thanks.
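For readers wondering what raising `context_batch_size` does, here is a minimal sketch of the general idea: instead of denoising one temporal context window per forward pass, several windows are stacked along the batch dimension so one pass covers them all. All names here (`denoise_step`, `run_windows`, the window layout) are hypothetical illustrations, not the repo's actual API, and the real change also requires reshaping the tensors in `model_kwargs_new` to match.

```python
import numpy as np

def denoise_step(batch):
    # Stand-in for one model forward pass; the real code would call the
    # diffusion UNet here. Batching windows amortizes per-call overhead.
    return batch * 0.5

def run_windows(latents, windows, context_batch_size=1):
    """Denoise context windows in groups of `context_batch_size`,
    so one forward pass covers several windows at once."""
    out = latents.copy()
    for i in range(0, len(windows), context_batch_size):
        group = windows[i:i + context_batch_size]
        # Stack the group's windows along the batch axis:
        # g windows of shape (win, C, H, W) -> (g * win, C, H, W)
        batch = np.concatenate([latents[w] for w in group], axis=0)
        result = denoise_step(batch)
        # Split the batched result back into per-window chunks.
        for w, chunk in zip(group, np.split(result, len(group), axis=0)):
            out[w] = chunk
    return out

frames = 8
latents = np.random.rand(frames, 4, 8, 8).astype(np.float32)
# Hypothetical overlapping context windows over the frame axis.
windows = [slice(0, 4), slice(2, 6), slice(4, 8)]
batched = run_windows(latents, windows, context_batch_size=2)
```

The output is the same as with `context_batch_size=1`; only the number of forward passes changes, which is where the speedup would come from on a GPU.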
Hi, we have uploaded the code for parallel denoising of multiple segments to accelerate long-video inference. Accelerated inference with multi-GPU parallelism is not currently supported; we welcome developers to contribute multi-GPU parallelism improvements, which we will incorporate into our code. Thanks for your contribution.
Thank you for your reply and effort! I will try it.
@wangxiang1230 Is this the change that you released on Jun 26? How much faster will `context_batch_size = 8` be?