ladi-vton

Problem about training

Open Kangkang625 opened this issue 1 year ago • 4 comments

Hi, thank you for your great work!

I was trying to write training code and run some training, but I was confused by this passage: "We first train the EMASC modules, the textual-inversion adapter, and the warping component. Then, we freeze all the weights of all modules except for the textual inversion adapter and train the proposed enhanced Stable Diffusion pipeline" in 4.2. Should I first freeze the other weights, including the unet, and train only the textual inversion adapter, or should I unfreeze the other weights and train the textual inversion adapter and the unet together?

Kangkang625, Jul 31 '23 01:07

I am wondering about this too.

snaiws, Aug 07 '23 03:08

Hi @Kangkang625, thanks for your interest in our work!

> should I first freeze other weights including unet and train textual inversion adapter or should I free other weight and train textual inversion adapter and unet together

First, you should pre-train the inversion adapter, keeping all the other weights (including the unet) frozen. Then, keeping the EMASC and the warping module frozen, you should train the unet and the (pre-trained) inversion adapter together.
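
The two phases described above can be sketched in PyTorch as follows. This is only a minimal illustration, not the repository's training code: the `nn.Linear` layers are hypothetical stand-ins for the real networks, the names `PHASES` and `configure_phase` are made up for this example, the choice of AdamW and the learning rate are arbitrary assumptions, and the EMASC and warping modules are assumed to have been trained beforehand.

```python
import torch
from torch import nn

# Tiny stand-ins for the real networks (hypothetical, for illustration only).
modules = {
    "unet": nn.Linear(8, 8),
    "inversion_adapter": nn.Linear(8, 8),
    "emasc": nn.Linear(8, 8),
    "warping": nn.Linear(8, 8),
}

# Which modules receive gradients in each phase, following the answer above.
PHASES = {
    "adapter_pretraining": {"inversion_adapter"},        # everything else frozen
    "joint_training": {"unet", "inversion_adapter"},     # EMASC and warping frozen
}


def configure_phase(modules, phase):
    """Freeze every module not trained in `phase`; return the trainable parameters."""
    trainable = PHASES[phase]
    for name, module in modules.items():
        module.requires_grad_(name in trainable)
    return [p for m in modules.values() for p in m.parameters() if p.requires_grad]


# Phase 1: pre-train the inversion adapter only.
optimizer = torch.optim.AdamW(configure_phase(modules, "adapter_pretraining"), lr=1e-5)
# ... run the phase 1 training loop here ...

# Phase 2: train the unet together with the pre-trained inversion adapter.
optimizer = torch.optim.AdamW(configure_phase(modules, "joint_training"), lr=1e-5)
# ... run the phase 2 training loop here ...
```

Building a fresh optimizer for each phase means it only ever tracks the parameters that are actually being updated in that phase.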

I hope this clarifies your doubts.
Alberto

ABaldrati, Aug 07 '23 09:08

Thanks for your answer, @ABaldrati! It's very helpful for my further study, but I am still a little confused about the unet training.

As I understand it, the unet should be an extended version of the unet from the Stable Diffusion pipeline. Should I extend the unet, randomly initialize the weights of the changed part, and then freeze it right away while pre-training the textual inversion adapter?

Thanks again for your great work and detailed answer!

Kangkang625, Aug 07 '23 09:08

> According to my understanding, the unet should be extended based on the unet of stable diffusion pipeline. Should I extend the unet, initialize the changed part weight randomly and directly freeze it to pre-train the textual inversion adapter ?

When we pre-train the inversion adapter, we use the standard Stable Diffusion inpainting model. In this phase, we do not extend the unet.
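
The thread does not say which part of the unet is extended for the later phase or how the new weights are initialised. One common way to add input channels to a pretrained network, shown here purely as an assumption and not as the authors' method, is to widen the first convolution, copy the pretrained kernels, and zero-initialise the new channels so that the extended layer initially behaves like the original. The helper name `extend_input_channels` and the number of extra channels are hypothetical; the 9 input and 320 output channels match the first convolution of the Stable Diffusion inpainting unet.

```python
import torch
from torch import nn


def extend_input_channels(conv: nn.Conv2d, extra_channels: int) -> nn.Conv2d:
    """Return a copy of `conv` that accepts `extra_channels` additional input channels.

    Pretrained kernels are copied over; kernels for the new channels start at zero,
    so the new layer's output initially ignores the extra inputs.
    """
    new_conv = nn.Conv2d(
        conv.in_channels + extra_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        new_conv.weight.zero_()
        new_conv.weight[:, : conv.in_channels] = conv.weight
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in for the first convolution of an inpainting unet (9 input channels).
old_conv = nn.Conv2d(9, 320, kernel_size=3, padding=1)
# The number of extra channels below is a made-up value for illustration.
new_conv = extend_input_channels(old_conv, extra_channels=4)
```

With this kind of initialisation the extended unet starts phase two producing the same outputs as the standard inpainting unet, and the new channels are learned during training.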

ABaldrati, Sep 03 '23 16:09