tomesd
Support torch.compile
In #40 there is some discussion about supporting torch.compile, and I'd like to open a dedicated issue for it in case anyone comes here looking for the same thing. I am working towards real-time applications, so any speed I can scrape together is a massive gain for me. Must break the 1 FPS barrier!
I've made the benchmark below and would love to extend it. In particular, I wonder whether the speedups from TomeSD and torch.compile would combine additively or multiplicatively.
| GPU | Model | Optimizations | Speed (it/s) |
|---|---|---|---|
| RTX 3090 | ControlNet(HED+TemporalNet+Depth) | Raw | 5.76 |
| RTX 3090 | ControlNet(HED+TemporalNet+Depth) | TomeSD 37.5% | 6.13 |
| RTX 3090 | ControlNet(HED+TemporalNet+Depth) | Compile reduce-overhead | 6.30 |
| RTX 3090 | ControlNet(HED+TemporalNet+Depth) | Compile max-autotune | 6.50 |
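To make the additive vs. multiplicative question concrete, here is a rough sketch that projects what a combined TomeSD + torch.compile run would show under each assumption. Only the four it/s values come from the table above; the combined numbers it prints are projections, not measurements, and the function names are made up for illustration.

```python
# Measured throughput (it/s), copied from the table above.
RAW = 5.76
TOME = 6.13  # TomeSD 37.5%
COMPILE = {"reduce-overhead": 6.30, "max-autotune": 6.50}


def project_additive(raw, a, b):
    """Assume each optimization adds a fixed it/s gain on top of the raw baseline."""
    return raw + (a - raw) + (b - raw)


def project_multiplicative(raw, a, b):
    """Assume each optimization scales throughput by its own speedup factor."""
    return raw * (a / raw) * (b / raw)


# Projected (NOT measured) combined throughput for each compile mode.
projections = {
    mode: (
        project_additive(RAW, TOME, speed),
        project_multiplicative(RAW, TOME, speed),
    )
    for mode, speed in COMPILE.items()
}

for mode, (add, mul) in projections.items():
    print(
        f"TomeSD 37.5% + compile {mode}: "
        f"additive {add:.2f} it/s, multiplicative {mul:.2f} it/s"
    )
```

With these inputs the two assumptions differ by only about 0.03 to 0.05 it/s (roughly 6.87 vs. 6.92 it/s for max-autotune), so telling them apart would probably take several repeated runs per configuration.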