                        Add absorbing diffusion
This PR implements absorbing diffusion, from the Unleashing Transformers paper.
The model is a BERT-like Transformer encoder instead of a U-Net.
No scheduler is added at the moment (as it's not required for inference; the model directly predicts latents, no noise).
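For context, a minimal sketch of the denoiser shape described above (names and sizes here are illustrative, not the code in this PR): the token vocabulary is the VQ-GAN codebook plus one reserved mask id, and the LM head only scores real codebook entries.

```python
import torch
import torch.nn as nn

class AbsorbingDiffusionDenoiser(nn.Module):
    """Illustrative BERT-like denoiser: embeds (possibly masked) codebook
    indices and predicts logits over the real codebook entries."""

    def __init__(self, codebook_size=1024, seq_len=256, dim=512, depth=8, heads=8):
        super().__init__()
        self.mask_token_id = codebook_size  # extra id reserved for [MASK]
        self.tok_emb = nn.Embedding(codebook_size + 1, dim)  # +1 for the mask token
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        # The head covers only the real codebook tokens (mask token excluded),
        # hence the off-by-one between embedding and head sizes.
        self.lm_head = nn.Linear(dim, codebook_size)

    def forward(self, token_ids):
        x = self.tok_emb(token_ids) + self.pos_emb[:, : token_ids.shape[1]]
        return self.lm_head(self.encoder(x))  # (batch, seq_len, codebook_size)
```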
Questions:
- I've implemented a Transformer, but its implementation is quite specific to this paper; the language modeling head uses vocab_size - 1 due to the use of the mask token
- as can be seen, the implementation of the VQ-VAE (actually a VQ-GAN) is also quite specific to this paper: there are no SiLU activations at the end of the encoder and decoder, and the number of layers per block of the decoder doesn't need the +1 that vae.py currently adds. It also uses neither quant_conv nor post_quant_conv, so I can't run the model with the forward method; I can only run it by calling the encoder, quantizer and decoder forwards separately (see the sketch below).
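Concretely, the only way to run that variant end to end right now is to chain the stages by hand, roughly like this (variable names are hypothetical, and the quantizer's exact return values are glossed over):

```python
import torch

# vqgan holds this PR's encoder / quantize / decoder modules (hypothetical
# handle). Since there is no quant_conv or post_quant_conv, the usual
# VQModel.forward() path can't be reused, so the stages are chained manually.
with torch.no_grad():
    h = vqgan.encoder(images)        # latents straight from the encoder
    quant = vqgan.quantize(h)[0]     # nearest-codebook embeddings
    recon = vqgan.decoder(quant)     # decoded without a post_quant_conv
```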
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
> Move self-attention and attention block code into attention.py

Ok, done.

> Create a scheduler class for the loop (happy to help here)

Yes, it would be great if you could help me here.
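For reference, a rough sketch of what such a scheduler could encapsulate (the class name, the step signature, and the 1/t reveal rule below are a sketch of the paper's sampling loop, not a final diffusers API):

```python
import torch

class AbsorbingDiffusionScheduler:
    """Hypothetical sketch: iteratively un-mask token positions over
    num_steps, sampling values for newly revealed positions from the
    denoiser's logits."""

    def __init__(self, num_steps=256, mask_token_id=1024):
        self.num_steps = num_steps
        self.mask_token_id = mask_token_id

    def step(self, logits, tokens, t):
        # Sample a candidate token for every position from the logits.
        probs = logits.softmax(dim=-1)
        sampled = torch.multinomial(probs.flatten(0, 1), 1).view(tokens.shape)
        # Reveal each still-masked position with probability 1 / t; at t = 1
        # every remaining mask is revealed, so the loop ends fully unmasked.
        still_masked = tokens == self.mask_token_id
        reveal = still_masked & (torch.rand_like(logits[..., 0]) < 1.0 / t)
        return torch.where(reveal, sampled, tokens)
```

Looping t from num_steps down to 1, starting from an all-mask sequence, would then reproduce the loop the pipeline currently inlines.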
We cannot change the VAE the way it was done here. Could you maybe just add a config attribute with a default value (num_decoder_layers)?
I have added an attribute decoder_layers_per_block to align with layers_per_block. However, that's of course not ideal, as the latter refers to the encoder.
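For illustration, one way the default could work while keeping vae.py's current behaviour intact (a sketch under that assumption, not the merged code):

```python
import torch.nn as nn

class Decoder(nn.Module):  # sketch, not the actual diffusers class
    def __init__(self, layers_per_block=2, decoder_layers_per_block=None):
        super().__init__()
        # None keeps today's behaviour (encoder count + 1); the absorbing
        # diffusion VQ-GAN passes an explicit value to drop the +1.
        if decoder_layers_per_block is None:
            decoder_layers_per_block = layers_per_block + 1
        self.decoder_layers_per_block = decoder_layers_per_block
```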
@NielsRogge I can take this PR over if you want. We should create a scheduler class here.
Yes, please do so :)
@patrickvonplaten gentle ping - should I rebase onto the main branch to fix the conflicts?
Hey @NielsRogge,
Yes, this would be super nice - sorry, I hope to be able to look into it at the end of this week.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
cc @patrickvonplaten @patil-suraj @anton-l - could any one of you pick this up?
Pipeline is ready.
@NielsRogge, sorry, I don't think I'll find bandwidth for this any time soon. As mentioned above, one of the major things to change here is to add a scheduler class.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Closing for now due to inactivity.