rectified-flow-pytorch

Feature request: Switch out Unet for DiT

Open moiseshorta opened this issue 9 months ago • 6 comments

Hello,

I've been reading a lot of the SOTA papers on audio and video generation using Rectified Flows, and it seems most are using Transformers instead of Unets.

Are there any plans to implement such an architecture change? Transformers seem to improve performance considerably, as in this implementation: https://github.com/cloneofsimo/minRF

Would be great to see it here, as the codebase is very clear and easy to understand. Thanks again for open-sourcing it!

moiseshorta, Feb 06 '25 13:02