stable-diffusion-webui-tensorrt
No real speedup
I had hoped this would make generation faster on my P40. Unfortunately, it only seems to add a few seconds to generation time. I built the model for 512-768 and tried converting in both fp16 and fp32. It's a matter of 21 sec vs. 24 sec with TensorRT for a 704x704 image.
I will see if there's any difference on my 3090; I hope there is. OS was Linux.
So far I got a bigger speedup from getting rid of the ancient transformers and accelerate versions pushed via requirements.
Same on a V100: same speed as xformers.
Adding

```python
shared.sd_model.model.diffusion_model = current_unet
```

after `current_unet.activate()` will enable TrtUnet and speed it up.
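To illustrate why that one-line assignment matters, here is a minimal, self-contained sketch of the module-swap pattern it relies on. The classes below (`OriginalUnet`, `TrtUnet`, `Model`) are stand-ins, not the extension's real classes: the point is that activating the TensorRT wrapper is not enough on its own, because the sampler keeps calling whatever object `diffusion_model` points at until that attribute is reassigned.

```python
class OriginalUnet:
    """Stand-in for the stock PyTorch UNet."""
    def __call__(self, x):
        return ("original", x)


class TrtUnet:
    """Stand-in for the TensorRT-backed UNet wrapper."""
    def __init__(self):
        self.ready = False

    def activate(self):
        # In the real extension this would load the TensorRT engine.
        self.ready = True

    def __call__(self, x):
        return ("tensorrt", x)


class Model:
    """Stand-in for shared.sd_model.model."""
    def __init__(self):
        self.diffusion_model = OriginalUnet()


model = Model()
current_unet = TrtUnet()
current_unet.activate()

# Without this reassignment, inference still goes through OriginalUnet,
# which would explain seeing no real speedup despite a successful conversion.
model.diffusion_model = current_unet

backend, _ = model.diffusion_model(0)
print(backend)  # tensorrt
```

The swap works because Python attribute assignment simply rebinds the reference the sampler dereferences on every call, so no other code needs to change.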