Sana
Sana checkpoint trained with SD-3 VAE
Hi, thank you for open-sourcing your code and trained models. Could you release a Sana text2image model trained with either the SD-XL or the SD-3 VAE?
Yes, DC-AE is fast, but it ruins image details. We also tried f64, with the same result, unfortunately. We tried both realistic and anime images: https://imgsli.com/MzI0MDg3
Maybe take a look at the AuraFlow VAE? It's open-sourced as far as I know, comparable to the Flux/SD3 VAEs, and significantly better than DC-AE.
[Images: AuraFlow VAE vs. DC-AE]
Or maybe train a small model that converts from DC-AE to AuraFlow directly in latent space. What do you think, is this possible?
Makes sense. It's surprising that Sana obtains better FID scores with this VAE, despite worse reconstruction results.
From the paper:
although AE-F8C16 exhibits the best reconstruction ability (rFID: F8C16<F16C32<F32C32), we empirically find that the generation results of F32C32 are superior
I wish they'd release checkpoints trained with other VAEs, allowing users to choose the one that works best for their specific dataset when fine-tuning.
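For anyone comparing the options, here is a rough sketch of what the AE names in the quote imply for latent size. It assumes the usual reading of the naming, where F is the spatial downsampling factor and C is the number of latent channels; the helper `latent_shape` and the 1024x1024 input size are illustrative choices of mine, not something from the Sana code.

```python
import re


def latent_shape(ae_name, height, width):
    """Return (channels, latent_h, latent_w) for an AE name such as 'F32C32'.

    Assumes 'F<factor>C<channels>' means a spatial downsampling factor of
    <factor> and <channels> latent channels (hypothetical helper).
    """
    match = re.fullmatch(r"F(\d+)C(\d+)", ae_name.upper())
    if match is None:
        raise ValueError(f"unrecognized AE name: {ae_name!r}")
    factor, channels = int(match.group(1)), int(match.group(2))
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the downsampling factor")
    return channels, height // factor, width // factor


if __name__ == "__main__":
    # Compare the three autoencoders mentioned in the quote on one image size.
    for name in ("F8C16", "F16C32", "F32C32"):
        c, h, w = latent_shape(name, 1024, 1024)
        print(f"{name}: latent {c}x{h}x{w}, "
              f"{h * w} spatial positions, {c * h * w} values in total")
```

Under these assumptions, F32C32 keeps 8x fewer values per image than F8C16, which would be consistent with both the weaker reconstruction and the faster generation discussed above.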
@lawrence-cj Do you plan to release Sana trained with f8c4 / f8c16 VAEs?
@lawrence-cj gentle reminder