`AutoencoderKL.encode` returns diagonal Gaussian instead of hidden states
This is mostly a question.
I find it a little weird that `encode` in `AutoencoderKL` returns the diagonal Gaussian distribution from which the latents are sampled, rather than the latents themselves: https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/vae.py#L436. The latents are only sampled from that distribution during the `forward` call.
This breaks my assumption that calling `decode` on the output of `encode` should just work.
I checked the CompVis codebase and they do the same, so I'm intrigued. Is there a reason for this approach that I'm surely missing?
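For context, here is a minimal NumPy sketch of what a diagonal Gaussian posterior looks like (this is an illustrative simplification, not the actual diffusers `DiagonalGaussianDistribution` implementation): the encoder predicts a mean and a log-variance per latent dimension, and the latents are obtained by sampling via the reparameterization trick. Returning the distribution rather than a sample lets callers choose between `sample()` and the mode, and gives access to the KL term for training.

```python
import numpy as np

class DiagonalGaussian:
    """Illustrative sketch of a diagonal Gaussian posterior (not the
    actual diffusers implementation). The encoder output is interpreted
    as a mean and a log-variance, concatenated on the channel axis."""

    def __init__(self, parameters):
        # Split the encoder output into mean and log-variance halves.
        self.mean, self.logvar = np.split(parameters, 2, axis=1)
        self.std = np.exp(0.5 * self.logvar)

    def sample(self, rng=None):
        # Reparameterization trick: z = mu + sigma * eps.
        rng = rng or np.random.default_rng()
        return self.mean + self.std * rng.standard_normal(self.mean.shape)

    def kl(self):
        # KL divergence to a standard normal prior, summed over latent dims.
        return 0.5 * np.sum(
            self.mean**2 + np.exp(self.logvar) - 1.0 - self.logvar,
            axis=(1, 2, 3),
        )

# Fake "encoder output": batch of 1, 2*4 channels (4 latent channels), 8x8.
params = np.zeros((1, 8, 8, 8))
posterior = DiagonalGaussian(params)
z = posterior.sample()
print(z.shape)         # (1, 4, 8, 8)
print(posterior.kl())  # [0.] -- posterior equals the standard-normal prior
```

So the two-step pattern is roughly `latents = encode(x).sample()` followed by `decode(latents)`, with the sampling step left to the caller.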