diffusers Combined loss term for VQ-VAE (`diffusers.VQModel`)

For training the VQ-VAE component of a latent diffusion model a la CompVis/ldm-celebahq-256 (which uses `diffusers.VQModel`), is there a single combined loss term that sums the losses described by the authors: reconstruction loss, VQ loss, and commitment loss?

I see the VQ loss term is computed in `VectorQuantizer`, but it does not seem to be used anywhere else: https://github.com/huggingface/diffusers/blob/ebc99a77aad647c5d33eb36a33c23f7b3949cb40/src/diffusers/models/autoencoders/vae.py#L726-L730
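For reference, the three terms in question can be combined by hand. The following is a minimal sketch in plain PyTorch, not the diffusers API: the function name `vq_vae_loss`, its arguments, and the returned dictionary are hypothetical, and it assumes you already have the encoder output `z_e`, the selected codebook vectors `z_q`, and the reconstruction `x_recon`. The default `beta=0.25` follows the value used in the original VQ-VAE paper.

```python
import torch
import torch.nn.functional as F


def vq_vae_loss(x, x_recon, z_e, z_q, beta=0.25):
    """Combine reconstruction, codebook (VQ), and commitment losses.

    Hypothetical helper, not part of diffusers.
    x:       input batch
    x_recon: decoder output, same shape as x
    z_e:     encoder output before quantization
    z_q:     nearest codebook vectors, same shape as z_e
    beta:    weight of the commitment term
    """
    # Reconstruction term: trains the decoder (and the encoder, if the
    # decoder input was built with a straight-through estimator).
    recon = F.mse_loss(x_recon, x)
    # Codebook (VQ) term: moves codebook vectors toward the frozen encoder output.
    codebook = F.mse_loss(z_q, z_e.detach())
    # Commitment term: keeps the encoder output close to the frozen codebook vectors.
    commit = F.mse_loss(z_e, z_q.detach())

    total = recon + codebook + beta * commit
    parts = {
        "recon": recon.detach(),
        "codebook": codebook.detach(),
        "commit": commit.detach(),
    }
    return total, parts


# Random tensors stand in for real model outputs.
torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8)
x_recon = torch.randn(2, 3, 8, 8, requires_grad=True)
z_e = torch.randn(2, 4, 2, 2, requires_grad=True)
z_q = torch.randn(2, 4, 2, 2, requires_grad=True)

total, parts = vq_vae_loss(x, x_recon, z_e, z_q)
total.backward()
```

The codebook and commitment terms have the same forward value; they differ only in which side is detached, so each term sends gradients to a different part of the model.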

I'm also open to alternatives to `VQModel`, like `AutoencoderKL`, if they make it easier to collect the loss terms.

Thank you!

Apr 26 '24 16:04 asy51

For example, `Seq2SeqQuestionAnsweringModelOutput` has a `loss` attribute (https://github.com/huggingface/transformers/blob/9fe3f585bb4ea29f209dc705d269fbe292e1128f/src/transformers/modeling_outputs.py#L1169), which can be used to train transformers.T5... I'm looking for something similar in `VQModel`, or in any other VAE for that matter.
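To illustrate the pattern being asked for, here is a rough sketch of a model whose forward pass returns an output object carrying a `loss` attribute. Everything in it is an assumption made for illustration: `VQTrainingOutput` and `ToyVQAutoencoder` are hypothetical names, the linear encoder/decoder is a toy stand-in, and none of it reflects how `VQModel` is actually implemented in diffusers.

```python
from dataclasses import dataclass
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


@dataclass
class VQTrainingOutput:  # hypothetical; not a diffusers or transformers class
    sample: torch.Tensor
    loss: Optional[torch.Tensor] = None


class ToyVQAutoencoder(nn.Module):
    """Tiny stand-in for a VQ autoencoder (not the diffusers architecture)."""

    def __init__(self, in_dim=8, latent_dim=4, num_codes=16, beta=0.25):
        super().__init__()
        self.encoder = nn.Linear(in_dim, latent_dim)
        self.decoder = nn.Linear(latent_dim, in_dim)
        self.codebook = nn.Embedding(num_codes, latent_dim)
        self.beta = beta

    def forward(self, x):
        z_e = self.encoder(x)
        # Pick the nearest codebook entry for each latent vector.
        indices = torch.cdist(z_e, self.codebook.weight).argmin(dim=1)
        z_q = self.codebook(indices)
        # Codebook term plus weighted commitment term.
        vq_loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        # Straight-through estimator: the forward pass uses z_q, while the
        # backward pass copies decoder gradients onto z_e.
        z_q_st = z_e + (z_q - z_e).detach()
        recon = self.decoder(z_q_st)
        loss = F.mse_loss(recon, x) + vq_loss
        return VQTrainingOutput(sample=recon, loss=loss)


# One training step, in the style of a transformers model output with `.loss`.
torch.manual_seed(0)
model = ToyVQAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 8)

out = model(x)
optimizer.zero_grad()
out.loss.backward()
optimizer.step()
```

Bundling the loss into the output keeps the training loop to a single `out.loss.backward()` call, at the cost of fixing the loss weighting inside the model.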

Apr 26 '24 23:04 asy51

Yes, in the deep-floyd/IF project we see these: https://github.com/deep-floyd/IF/blob/develop/deepfloyd_if/model/gaussian_diffusion.py#L739, but I can't remember seeing them anywhere in the diffusers project.

Apr 27 '24 02:04 bghira

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Sep 14 '24 15:09 github-actions[bot]