rectified-flow-pytorch
Feature request: Switch off Unet for DiT
Hello,
I've been reading many of the SOTA papers on audio and video generation using Rectified Flows, and it seems most of them use Transformers instead of Unets.
Are there any plans to implement such an architecture change? Transformers seem to improve performance considerably, as in this implementation: https://github.com/cloneofsimo/minRF
Would be great to see it here, as this is a very clear and easy to understand codebase. Thanks again for open-sourcing it!
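
For reference, here's a rough, minimal sketch (in plain PyTorch, not this repo's API) of what a DiT-style velocity network could look like as a drop-in alternative to the Unet. The class name `DiTVelocity`, the simple additive time conditioning, and the assumption that the flow wrapper calls the model as `model(noised_images, times)` with `times` of shape `(batch,)` are all hypothetical and would need to be adapted to however this repo actually invokes its model:

```python
import math
import torch
from torch import nn

class SinusoidalTimeEmb(nn.Module):
    # standard sinusoidal embedding of the flow time t in [0, 1]
    def __init__(self, dim):
        super().__init__()
        self.dim = dim

    def forward(self, t):
        half = self.dim // 2
        freqs = torch.exp(-math.log(10000) * torch.arange(half, device = t.device) / half)
        args = t[:, None].float() * freqs[None]
        return torch.cat([args.sin(), args.cos()], dim = -1)

class DiTVelocity(nn.Module):
    # hypothetical transformer-based velocity model: patchify -> transformer -> unpatchify
    def __init__(self, image_size = 32, patch_size = 4, channels = 3, dim = 384, depth = 6, heads = 6):
        super().__init__()
        assert image_size % patch_size == 0
        self.patch_size = patch_size
        num_patches = (image_size // patch_size) ** 2
        patch_dim = channels * patch_size ** 2

        self.to_tokens = nn.Linear(patch_dim, dim)
        self.pos_emb = nn.Parameter(torch.randn(1, num_patches, dim) * 0.02)
        self.time_mlp = nn.Sequential(
            SinusoidalTimeEmb(dim),
            nn.Linear(dim, dim),
            nn.SiLU(),
            nn.Linear(dim, dim)
        )

        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first = True, norm_first = True)
        self.transformer = nn.TransformerEncoder(layer, depth)
        self.to_patches = nn.Linear(dim, patch_dim)

    def forward(self, x, times):
        # assumes x: (b, c, h, w) noised images, times: (b,) flow times
        b, c, h, w = x.shape
        p = self.patch_size

        # patchify: (b, c, h, w) -> (b, num_patches, patch_dim)
        x = x.unfold(2, p, p).unfold(3, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)

        tokens = self.to_tokens(x) + self.pos_emb
        tokens = tokens + self.time_mlp(times)[:, None, :]  # simple additive time conditioning
        tokens = self.transformer(tokens)

        # unpatchify back to (b, c, h, w), predicting the velocity field
        patches = self.to_patches(tokens)
        patches = patches.reshape(b, h // p, w // p, c, p, p).permute(0, 3, 1, 4, 2, 5)
        return patches.reshape(b, c, h, w)
```

A real DiT would also use adaLN-style modulation from the time embedding rather than plain additive conditioning, but even a simple sketch like this shows how small the architectural surface area is once the model is just "tokens in, velocity out".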