[Question] SVD/SVD-XT performance on older GPUs (i.e. without Triton support)
Triton is only supported on GPUs with CUDA compute capability 7.0 or higher.
I ran inference for a video on an Nvidia A10G (i.e. with Triton) on 🤗 Spaces, and it took ~130 seconds for all 25 steps.
Next I ran the same inference on an Nvidia Tesla P40 (i.e. without Triton), and each step took around 150 seconds.
That works out to roughly 25 × 150 s ≈ 3750 s versus ~130 s, i.e. close to a 30× gap, while on paper the A10G is nowhere near 30 times more powerful than the Tesla P40. So it makes me wonder:
- whether such a performance difference is solely due to Triton; if anyone has capable hardware, I'd be curious to see how much of a difference just enabling/disabling Triton makes (see the quick check sketched right after this list);
- whether there is a way to improve the current performance on older, non-Triton GPUs like the aforementioned Tesla P40.
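For the first point, here is a minimal check I'd run on both cards. It is a sketch only, assuming PyTorch 2.x with CUDA; the tensor shapes are illustrative and not taken from the repo. It reports compute capability, whether Triton imports at all, and how long a single fp16 attention call takes, so the comparison is independent of the rest of the SVD pipeline:

```python
# Minimal check, assuming PyTorch 2.x with CUDA: report compute capability,
# whether Triton imports, and the time of one fp16 attention call.
import torch
import torch.nn.functional as F

print("GPU:", torch.cuda.get_device_name(0))
print("Compute capability:", torch.cuda.get_device_capability(0))  # Triton needs >= (7, 0)

try:
    import triton  # noqa: F401
    print("Triton: importable")
except ImportError:
    print("Triton: not available")

# Illustrative SVD-like shapes (batch, heads, tokens, head_dim); not from the repo.
q = torch.randn(1, 8, 4096, 64, device="cuda", dtype=torch.float16)

# Warm up once so lazy CUDA initialization does not skew the measurement.
F.scaled_dot_product_attention(q, q, q)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
out = F.scaled_dot_product_attention(q, q, q)
end.record()
torch.cuda.synchronize()
print(f"SDPA: {start.elapsed_time(end):.2f} ms")
```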
A bit off-topic, but just for comparison: on the Tesla P40, generating a short video at the same resolution/length with AnimateDiff or SD-HotshotXL takes around 200 seconds for 25 steps, which makes me think this has something to do with settings/code rather than the hardware (see the sketch below for the kind of settings I mean).
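To make the "settings/code" point concrete, this is the kind of knob comparison I have in mind, sketched with the diffusers port of SVD-XT rather than the scripts in this repo; the model id, dtype, and chunk sizes here are my assumptions, not anything from the repo:

```python
# Sketch only: run SVD-XT through diffusers with the dtype/offloading/chunking
# knobs exposed, so they can be compared on a non-Triton card.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Note: fp16 may not help on Pascal cards like the P40 (low fp16 throughput);
# comparing float16 vs float32 here is part of the experiment.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()      # lowers VRAM use at some speed cost
pipe.unet.enable_forward_chunking()  # chunk the feed-forward layers to save memory

image = load_image("input.png").resize((1024, 576))
frames = pipe(image, num_inference_steps=25, decode_chunk_size=2).frames[0]
export_to_video(frames, "output.mp4", fps=7)
```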