Add attentionless VAE support

Open Gothos opened this issue 1 year ago • 1 comments

What does this PR do?

Exposes the mid_block_add_attention parameter of Encoder and Decoder so that attentionless VAEs can be enabled through the AutoencoderKL config.

Before submitting

This PR aims to integrate the AuraDiffusion 16-channel VAE into diffusers. The change is minimal: the mid_block_add_attention parameter of the Encoder and Decoder classes (from models/autoencoders/autoencoder_kl.py) is exposed on the AutoencoderKL class, and therefore in the VAE config.
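To illustrate the idea, here is a minimal, library-free sketch of how a config flag like this can be threaded from the top-level model down to the encoder and decoder mid blocks. The Toy* classes, the from_config helper, the resnet/attention/resnet layout and the default of True are stand-ins assumed for illustration only; this is not the diffusers implementation. Only the parameter name mid_block_add_attention and the 16 latent channels come from the description above.

```python
from dataclasses import dataclass


@dataclass
class ToyMidBlock:
    """Hypothetical stand-in for a VAE mid block (not diffusers code)."""

    add_attention: bool = True

    def layers(self) -> list:
        # Assumed layout: resnet -> (optional attention) -> resnet
        names = ["resnet"]
        if self.add_attention:
            names.append("attention")
        names.append("resnet")
        return names


class ToyEncoder:
    def __init__(self, mid_block_add_attention: bool = True):
        self.mid_block = ToyMidBlock(add_attention=mid_block_add_attention)


class ToyDecoder:
    def __init__(self, mid_block_add_attention: bool = True):
        self.mid_block = ToyMidBlock(add_attention=mid_block_add_attention)


class ToyAutoencoderKL:
    """Top-level model that forwards the flag to both sub-modules."""

    def __init__(self, latent_channels: int = 4, mid_block_add_attention: bool = True):
        # Keep the flag in the config so it can be saved and reloaded.
        self.config = {
            "latent_channels": latent_channels,
            "mid_block_add_attention": mid_block_add_attention,
        }
        self.encoder = ToyEncoder(mid_block_add_attention=mid_block_add_attention)
        self.decoder = ToyDecoder(mid_block_add_attention=mid_block_add_attention)

    @classmethod
    def from_config(cls, config: dict) -> "ToyAutoencoderKL":
        return cls(**config)


# An attentionless, 16-channel toy VAE built purely from its config.
vae = ToyAutoencoderKL.from_config(
    {"latent_channels": 16, "mid_block_add_attention": False}
)
print(vae.encoder.mid_block.layers())
print(vae.decoder.mid_block.layers())
```

Leaving the flag at an assumed default of True means configs that do not mention it keep their attention layers, so only models that opt out are affected.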

Examples of image/reconstructed image pairs are below:

image image

Some metrics: image

make style && make quality fails due to some unrelated changes elsewhere in controlnets, so I have not run it for this PR. I didn't think any new tests were required since this is a minor change, but I'm more than happy to write some if needed.

Gothos avatar Jul 02 '24 11:07 Gothos

Looking forward to it!

haofanwang avatar Jul 02 '24 16:07 haofanwang

can confirm this works correctly, thank you!

bghira avatar Jul 06 '24 22:07 bghira

@Gothos can you follow https://github.com/huggingface/diffusers/actions/runs/9824367283/job/27123254814?pr=8769#step:6:1 to resolve the code quality failures?

sayakpaul avatar Jul 07 '24 03:07 sayakpaul

Will do!

Gothos avatar Jul 07 '24 03:07 Gothos

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Should be done now. I ran make style && make quality and make fix-copies.

Gothos avatar Jul 07 '24 04:07 Gothos

Would be great to see this PR merged!

donthomasitos avatar Jul 18 '24 08:07 donthomasitos

@yiyixuxu @DN6 a friendly ping here.

sayakpaul avatar Jul 18 '24 10:07 sayakpaul