
Why can conv offsets be applied to ANY SDXL model?

DarrenZhaoFR opened this issue 1 year ago · 1 comment

Hi, awesome work! As you mentioned, the safetensors you released, which are basically weight offsets (in my understanding), can be applied to any SDXL model. I can understand that if the base model is the same (same params and architecture), applying the offsets yields the same resulting model. However, I tried layer_xl_transparent_conv.safetensors on my own fine-tuned model (whose UNet params have changed), and it still works pretty well. Is there a theory behind this? I hope you can share some insights. Thanks!
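
For reference, here is roughly what I mean by "applying offsets", as a minimal sketch (the file name for my checkpoint and the assumption that offset keys match UNet parameter names 1:1 are mine, not from the repo):

```python
from safetensors.torch import load_file

# Load the fine-tuned UNet weights and the released offsets.
# "my_finetuned_unet.safetensors" is a placeholder path.
unet_state = load_file("my_finetuned_unet.safetensors")
offsets = load_file("layer_xl_transparent_conv.safetensors")

for name, delta in offsets.items():
    if name in unet_state and unet_state[name].shape == delta.shape:
        # "Applying" an offset is just an element-wise addition
        # to the matching weight tensor.
        unet_state[name] = unet_state[name] + delta.to(unet_state[name].dtype)
```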

DarrenZhaoFR commented Mar 04 '24, 03:03

The encoder has been trained to produce latents that do not upset the base model. More details are in Section 3 of the paper: https://arxiv.org/pdf/2402.17113.pdf
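
Very roughly, the training setup looks something like the sketch below (my own paraphrase of Section 3; the function and variable names are made up, and the real losses differ in detail). The key point is the "harmlessness" term, which forces the transparency offset to be invisible to the frozen pretrained model:

```python
import torch
import torch.nn.functional as F

def training_step(vae, transparency_encoder, transparency_decoder, rgb, alpha):
    # Encode with the frozen, pretrained VAE.
    with torch.no_grad():
        latent = vae.encode(rgb)

    # The learned encoder hides the alpha channel as a small latent offset.
    offset = transparency_encoder(rgb, alpha)
    adjusted = latent + offset

    # "Harmlessness": the frozen VAE should barely notice the offset,
    # which is why weights tuned for the base latent space keep working.
    identity_loss = F.mse_loss(vae.decode(adjusted), vae.decode(latent))

    # Reconstruction: a trained decoder must recover RGBA from the
    # adjusted latent, so the transparency information is not lost.
    rgb_hat, alpha_hat = transparency_decoder(adjusted)
    recon_loss = F.mse_loss(rgb_hat, rgb) + F.mse_loss(alpha_hat, alpha)

    return identity_loss + recon_loss
```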

Since your model is fine-tuned from the base model, it behaves similarly. If you had trained it from scratch, it would no longer work. Here is a paper that researches this behavior in more detail: https://arxiv.org/pdf/2305.12827.pdf The paper is about LLMs rather than SDXL, but I presume the same concepts apply here.
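
If you want to check this for your own checkpoint, something like the following sketch (file names are placeholders) will show how far a fine-tune has actually moved from the base in weight space; for typical fine-tunes the relative change per tensor tends to be small:

```python
from safetensors.torch import load_file

# Point these at the base SDXL UNet and your fine-tuned UNet.
base = load_file("sdxl_base_unet.safetensors")
tuned = load_file("my_finetuned_unet.safetensors")

for name in base:
    if name in tuned and base[name].shape == tuned[name].shape:
        diff = (tuned[name].float() - base[name].float()).norm()
        rel = diff / (base[name].float().norm() + 1e-8)
        # Relative L2 distance per tensor; small values mean the
        # fine-tune stayed close to the base model.
        print(f"{name}: relative change {rel.item():.4f}")
```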

99991 commented Mar 04 '24, 07:03