diffusers
Implementation for TensorRT
Hi,
I guess the reverse diffusion process could run much faster if the UNet implementation in this library supported inference with TensorRT. Do you plan to add TensorRT support?
Hey @UdonDa,
That's a good question! Sadly, I'm not too familiar with TensorRT, but diffusion processes do often suffer from slow inference. I'd be happy to allow integrations with TensorRT. Do you have an idea of how to do so? Also cc @anton-l
Hi, @patrickvonplaten, @anton-l!
Actually, it should be easy to implement with this library: https://github.com/pytorch/TensorRT#python. It provides a compiler, so we only need to compile our network instance.
However, I have two concerns. (1) I do not know whether the compiled network still estimates the score function accurately. (2) The UNet implementation has to be rewritten before it will compile; the change only affects how sub-networks such as resblocks are defined. I have tried TensorRT, but the current implementation does not work with it as is. If that modification is applied, the pretrained weights uploaded to Hugging Face may no longer load correctly.
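For anyone who wants to probe both concerns on a small scale, here is a minimal sketch, not the diffusers UNet. `TinyResBlock` and `max_abs_error` are hypothetical names made up for illustration, and `torch.jit.trace` is used only as a CPU stand-in for the TensorRT compile step. The idea is to load the checkpoint into the unmodified module first, compile afterwards, and then measure how far the compiled outputs drift from the eager ones.

```python
import io

import torch
from torch import nn


class TinyResBlock(nn.Module):
    """Hypothetical stand-in for a UNet residual block (not the diffusers code)."""

    def __init__(self, channels: int = 8):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.conv1(self.act(x))
        h = self.conv2(self.act(h))
        return x + h  # residual connection


def max_abs_error(reference, candidate, sample: torch.Tensor) -> float:
    """Largest element-wise difference between the two models' outputs."""
    with torch.no_grad():
        return (reference(sample) - candidate(sample)).abs().max().item()


torch.manual_seed(0)

# Simulate a pretrained checkpoint: save the weights of one instance...
pretrained = TinyResBlock().eval()
buffer = io.BytesIO()
torch.save(pretrained.state_dict(), buffer)
buffer.seek(0)

# ...and load them into a fresh, unmodified module BEFORE compiling,
# so the checkpoint keys never have to change.
model = TinyResBlock().eval()
model.load_state_dict(torch.load(buffer))

sample = torch.randn(1, 8, 16, 16)

# Stand-in compile step (assumption). With torch_tensorrt and a CUDA GPU
# this would instead be something like:
#   torch_tensorrt.compile(model.cuda(), inputs=[torch_tensorrt.Input((1, 8, 16, 16))])
compiled = torch.jit.trace(model, sample)

error = max_abs_error(model, compiled, sample)
print(f"max abs error vs. eager model: {error:.2e}")
```

With reduced precision (for example FP16 engines) the error would be expected to grow, so the same check is worth repeating on real UNet inputs and over a full sampling loop before trusting the compiled model.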
Hey,
Hmm, if it requires major code additions, it might be a bit too early to add to the library at this stage. Happy to help if you're interested in adding some code though!
This is only halfway there, but it is a necessary step toward using TensorRT (through the torch_tensorrt library). Check out this implementation: converting to TorchScript and applying layer fusion increases the speed by 50%, which is pretty cool. Do you want to try adding it to the official library? https://github.com/cloneofsimo/sd-various-ideas/blob/main/create_jit.ipynb
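To make the TorchScript step concrete, here is a rough sketch of the general pattern. It is not the code from the linked notebook. `ConvBNBlock` and `mean_latency` are hypothetical names, and the toy network is chosen only because conv + batch norm pairs are something `torch.jit.freeze` can fold. Any speed-up on this toy model says nothing about the 50% figure above.

```python
import time

import torch
from torch import nn


class ConvBNBlock(nn.Module):
    """Hypothetical toy network with fusable conv + batch norm pairs."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


def mean_latency(fn, sample: torch.Tensor, runs: int = 20) -> float:
    """Average seconds per forward pass after a short warm-up."""
    with torch.no_grad():
        for _ in range(3):  # warm-up so JIT optimizations have kicked in
            fn(sample)
        start = time.perf_counter()
        for _ in range(runs):
            fn(sample)
    return (time.perf_counter() - start) / runs


torch.manual_seed(0)
model = ConvBNBlock().eval()  # freezing requires eval mode
sample = torch.randn(1, 3, 64, 64)

# Step 1: convert to TorchScript by tracing one example input.
traced = torch.jit.trace(model, sample).eval()

# Step 2: freeze, which inlines the weights as constants and runs
# fusion passes such as folding batch norm into the preceding conv.
frozen = torch.jit.freeze(traced)

with torch.no_grad():
    max_diff = (model(sample) - frozen(sample)).abs().max().item()

print(f"eager : {mean_latency(model, sample) * 1e3:.3f} ms")
print(f"frozen: {mean_latency(frozen, sample) * 1e3:.3f} ms")
print(f"max abs diff: {max_diff:.2e}")
```

Tracing records only the operations executed for the example input, so a model with data-dependent control flow would need `torch.jit.script` or some restructuring instead.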
PhotoRoom made an awesome blog post on exactly how to do this: "Making stable diffusion 25% faster using TensorRT". They explain exactly what's happening, give all the sample code, and even include performance metrics comparing the two models. They also reference the recent ONNX work in #284. Hope this helps!
Cool, also linking this to our current speed-up PRs:
- https://github.com/huggingface/diffusers/pull/532
- https://github.com/huggingface/diffusers/pull/371
- https://github.com/huggingface/diffusers/pull/511
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.