
Question regarding foreground and background LoRA.

Open · xiankgx opened this issue 1 year ago · 3 comments

In Fig. 3 of your paper, you mention two LoRAs, a foreground LoRA and a background LoRA, trained on top of the base model.

You also mention that when training the base diffusion model (a), all model weights are trainable.

However, it seems that the base diffusion model is a single UNet with LoRA layers. If this is the foreground model, where is the "background LoRA"?

xiankgx · Mar 05 '24 02:03

That model is the joint layer model, which will be released soon (tomorrow, or the day after tomorrow).

We are finalizing it because we want it to run on 8 GB of VRAM without triggering offload.

Running two SDXL models in parallel consumes a lot of VRAM, and on top of that, attention sharing becomes a nontrivial engineering problem to implement.
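For readers unfamiliar with the term, below is a minimal sketch of one common way to share attention between two parallel branches: each branch's queries attend over the concatenated keys/values of both branches. The concatenation scheme and the toy shapes are illustrative assumptions, not necessarily what this repo implements.

```python
import torch
import torch.nn.functional as F

def shared_self_attention(q_fg, k_fg, v_fg, q_bg, k_bg, v_bg):
    # Tensors are (batch, heads, tokens, head_dim). Each branch's queries
    # attend over the keys/values of BOTH branches, which is what couples
    # the foreground and background samples during denoising.
    k = torch.cat([k_fg, k_bg], dim=-2)  # concatenate along the token axis
    v = torch.cat([v_fg, v_bg], dim=-2)
    out_fg = F.scaled_dot_product_attention(q_fg, k, v)
    out_bg = F.scaled_dot_product_attention(q_bg, k, v)
    return out_fg, out_bg

# Toy shapes: batch 1, 8 heads, 64 latent tokens, 40-dim heads.
t = lambda: torch.randn(1, 8, 64, 40)
fg, bg = shared_self_attention(t(), t(), t(), t(), t(), t())
print(fg.shape, bg.shape)  # both torch.Size([1, 8, 64, 40])
```

Note how the key/value tensors double in length: the VRAM pressure mentioned above comes from holding two full sets of activations (and, without weight sharing, two UNet copies) at once.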

Please be patient, and see also the model notes.

layerdiffusion · Mar 05 '24 03:03

I have a similar question/confusion.

If I understand Fig. 3 correctly, when we do foreground and background generation, two (SDXL + LoRA) branches should run in parallel with attention sharing, resulting in two RGBA images, one for the foreground and one for the background.
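In loop form, that reading of Fig. 3 would look roughly like the toy sketch below; `joint_denoise`, the update rule, and the dummy UNet callables are all stand-ins, not the actual models or scheduler:

```python
import torch

def joint_denoise(unet_fg, unet_bg, steps=4):
    # Toy latent shapes; real SDXL latents would be (1, 4, 128, 128) at 1024px.
    z_fg = torch.randn(1, 4, 8, 8)
    z_bg = torch.randn(1, 4, 8, 8)
    for t in reversed(range(steps)):
        # Each branch sees the other's latent, standing in for the
        # key/value exchange that attention sharing would perform.
        eps_fg = unet_fg(z_fg, z_bg, t)
        eps_bg = unet_bg(z_bg, z_fg, t)
        z_fg = z_fg - 0.1 * eps_fg  # stand-in for a real scheduler update
        z_bg = z_bg - 0.1 * eps_bg
    return z_fg, z_bg  # decode each to get the two layer images

# Dummy "UNets": in the paper's setup these would be one SDXL UNet with the
# foreground LoRA applied and one with the background LoRA applied.
fake_unet = lambda z, z_other, t: torch.tanh(z + 0.01 * z_other)
fg, bg = joint_denoise(fake_unet, fake_unet)
print(fg.shape, bg.shape)
```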

With the trick in Sec. 4.2 (Conditional Layer Generation), one of the branches is essentially inert (it does not denoise). But in the current README, the workflow for foreground-conditioned generation is to first run foreground-to-blend and then blend-to-background. With the "joint layer model" you just mentioned, getting the background from foreground-conditioned generation should be much simpler.
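For contrast, a hypothetical wrapper around the README's current two-stage route might look like this; `fg2blend` and `blend2bg` are made-up names for the two conditional models, not the repo's actual API:

```python
from typing import Callable

def background_from_foreground(fg_rgba, prompt: str,
                               fg2blend: Callable, blend2bg: Callable):
    # Stage 1: composite the given foreground into a full blended image.
    blended = fg2blend(prompt, fg_rgba)
    # Stage 2: recover the background layer from the blended result.
    return blend2bg(prompt, blended, fg_rgba)

# Dummy stand-ins so the sketch runs end to end.
bg = background_from_foreground("fg.png", "a beach at sunset",
                                lambda p, f: "blended.png",
                                lambda p, b, f: "background.png")
print(bg)  # background.png
```

A joint layer model would collapse both stages into a single pass that denoises the background branch directly, conditioned on the foreground.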

ifsheldon · Mar 07 '24 04:03

Any status on this? Joint SDXL would be a Big Deal. The increased resource requirement isn't a concern.

doctorpangloss · Mar 28 '24 19:03