
For inpaint_v26.fooocus.patch

Open peki12345 opened this issue 5 months ago • 8 comments

Hi, I'm interested in learning about the training process for the "inpaint_v26.fooocus.patch" used for inpainting and outpainting tasks. Is there any training code or a paper related to this?

peki12345 avatar Jan 18 '24 07:01 peki12345

+1

andriyrizhiy avatar Jan 26 '24 09:01 andriyrizhiy

It's a LORA model for XL, but I don't know the exact training method either

zhangzhengyi12 avatar Jan 30 '24 02:01 zhangzhengyi12

The inpainting in Fooocus gives the best blending I've ever seen, and I'm looking forward to more information from the author about the training!

zhangzhengyi12 avatar Jan 30 '24 02:01 zhangzhengyi12

I am not sure that it is LORA. Usually LORA is much smaller than 1.32GB
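To make the size argument concrete, here is a rough back-of-the-envelope sketch comparing a low-rank update with a full per-weight delta for a single layer. The layer size (1280 x 1280) and the rank (16) are illustrative assumptions, not values measured from the patch file.

```python
def full_delta_params(d_out: int, d_in: int) -> int:
    """Parameters needed to store one delta value per weight of a d_out x d_in layer."""
    return d_out * d_in


def low_rank_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters needed for a rank-r update stored as two factors (d_out x r and r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative numbers only (assumed, not taken from inpaint_v26.fooocus.patch).
d_out, d_in, rank = 1280, 1280, 16
full = full_delta_params(d_out, d_in)
low = low_rank_params(d_out, d_in, rank)
ratio = full / low
print(full, low, ratio)
```

With these assumed numbers the low-rank form is 40 times smaller for that layer, which is why a file over 1 GB looks unlike a typical low-rank LORA.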

andriyrizhiy avatar Jan 30 '24 09:01 andriyrizhiy

> I am not sure that it is LORA. Usually LORA is much smaller than 1.32GB

Actually, it is a LORA in the sense that it fits the definition: it only adjusts the network weights without altering the network structure. Before merging it, the inpaint result was green; after merging, the result is correct. It's quite impressive that this LORA can turn any t2i_xl model into an inpaint_xl model. I'm curious about how it was developed.
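As a rough illustration of "adjusting the weights without altering the structure", merging a patch can be pictured as adding a stored delta to each matching tensor of the base model. This is only a sketch: the key names, the flat-list weights, and the `strength` parameter are hypothetical, and it is not how Fooocus actually loads the patch.

```python
def merge_patch(base, patch, strength=1.0):
    """Return new weights where each patched entry is base + strength * delta.

    The set of keys (the "structure") is unchanged; only values move.
    """
    merged = {}
    for name, weights in base.items():
        delta = patch.get(name)
        if delta is None:
            # Layer not covered by the patch: copy it through untouched.
            merged[name] = list(weights)
            continue
        if len(delta) != len(weights):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [w + strength * d for w, d in zip(weights, delta)]
    return merged


# Hypothetical toy weights; real checkpoints hold large tensors per key.
base = {"conv_in.weight": [0.5, -1.0, 2.0], "mid.weight": [1.0, 1.0]}
patch = {"conv_in.weight": [0.25, 0.5, -1.0]}
merged = merge_patch(base, patch)
print(merged)
```

Because the patch is just a delta, the same file can in principle be added to any base checkpoint with matching keys and shapes.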

peki12345 avatar Jan 31 '24 01:01 peki12345

> I am not sure that it is LORA. Usually LORA is much smaller than 1.32GB

> Actually, it's a LORA, it fits the definition of LORA, just adjusting the network weights without altering the network structure. Before merge, the inpaint result was green, but post-merge, the correct result can be achieved. It's quite impressive that this LORA can transform all the t2i_xl model into inpaint_xl model. I'm curious about how the LORA was developed.

I'm guessing that the ability to convert an arbitrary model into an inpaint model comes in part from the fooocus_inpaint_head model, a small convolutional network that compresses 9 channels into 4 (the standard model's UNet takes only 4 input channels, whereas an inpaint model takes 9).

In terms of weights, the inpaint_model seems to be quite different from a normal LORA.
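The channel-compression idea guessed at here can be sketched with a 1x1 convolution that maps 9 input channels to 4. Everything below is an assumption for illustration: the 4 + 1 + 4 channel layout (noisy latent, mask, masked-image latent) follows the usual 9-channel inpaint UNet input, the weights are hand-picked rather than trained, and the real head may use a different shape or kernel size.

```python
def conv1x1(x, weight, bias):
    """Apply a 1x1 convolution.

    x:      [C_in][H][W] nested lists
    weight: [C_out][C_in]
    bias:   [C_out]
    returns [C_out][H][W]
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for o, row in enumerate(weight):
        if len(row) != c_in:
            raise ValueError("weight row does not match input channels")
        plane = [
            [bias[o] + sum(row[c] * x[c][i][j] for c in range(c_in)) for j in range(w)]
            for i in range(h)
        ]
        out.append(plane)
    return out


# Toy 2x2 input: channel c is filled with the constant value c + 1.
x = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(9)]

# Hand-picked weights (not trained): output o = latent channel o + masked-latent channel o.
weight = [[0.0] * 9 for _ in range(4)]
for o in range(4):
    weight[o][o] = 1.0      # channels 0-3: noisy latent
    weight[o][5 + o] = 1.0  # channels 5-8: masked-image latent (channel 4 is the mask)
bias = [0.0] * 4

y = conv1x1(x, weight, bias)
print(len(y), len(y[0]), len(y[0][0]))
```

The output has 4 channels, so it could be fed to a UNet whose first layer expects the standard 4-channel latent.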

zhangzhengyi12 avatar Jan 31 '24 07:01 zhangzhengyi12

> I am not sure that it is LORA. Usually LORA is much smaller than 1.32GB

> Actually, it's a LORA, it fits the definition of LORA, just adjusting the network weights without altering the network structure. Before merge, the inpaint result was green, but post-merge, the correct result can be achieved. It's quite impressive that this LORA can transform all the t2i_xl model into inpaint_xl model. I'm curious about how the LORA was developed.

I feel that inpaint_model_head and inpaint_model are meant to work as a pair, with the head responsible for channel compression and the latter responsible for adding inpainting ability to arbitrary models. Maybe the training method is based on copying: copy the original UNet, freeze the original, and train only the copy so that it learns as much as possible about inpainting, then merge the weights at inference time.
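The guessed procedure (freeze the original, train a copy, keep the difference, merge at inference) can be sketched on a toy linear model. This only illustrates the hypothesis in this comment; the data, learning rate, and step count are made up, and nothing here is confirmed to be the author's actual training method.

```python
def train_copy(base_w, data, lr=0.1, steps=200):
    """Train a copy of base_w with plain gradient descent on mean squared error.

    base_w is never modified, which plays the role of the frozen original.
    """
    w = list(base_w)  # trainable copy
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            for k, xi in enumerate(x):
                grad[k] += 2.0 * err * xi / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


# Hypothetical "pretrained" weights and a toy task whose solution is w = [1, 2].
base_w = [1.0, 0.0]
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0)]

tuned = train_copy(base_w, data)

# The distributable patch is the difference between the trained copy and the frozen base.
patch = [t - b for t, b in zip(tuned, base_w)]

# At inference time, adding the patch back onto the base recovers the trained weights.
merged = [b + p for b, p in zip(base_w, patch)]
print(patch, merged)
```

A full-size delta like this `patch` has as many values as the layers it covers, which would be consistent with a file much larger than a typical low-rank LORA.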

zhangzhengyi12 avatar Jan 31 '24 08:01 zhangzhengyi12

Is there still no info about this?

Hetaneko avatar Apr 04 '24 20:04 Hetaneko