
What's the purpose of these lines?

Open XavierXiao opened this issue 2 years ago • 3 comments

Thanks for your amazing work! I have one quick question: what is the purpose of these lines, in the modified CrossAttention's forward function? It seems like you disable the gradient of the first token in the embedding? Can you explain a bit?

Thanks!

XavierXiao avatar Feb 01 '23 22:02 XavierXiao

Hi,

Since the first token, the start-of-sentence token, is always fixed, I noticed a small improvement when detaching it during training. I guess this helps build a better association between the "V* category" and the target image, and thus improves generation for inference-time prompts.
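For readers following along, here is a minimal sketch of what detaching the first token can look like, assuming PyTorch and a key or value tensor of shape (batch, tokens, dim). The helper name `detach_first_token` and the toy `to_k` layer are illustrative assumptions, not a copy of the repository's code.

```python
import torch


def detach_first_token(t: torch.Tensor) -> torch.Tensor:
    """Return a tensor with the same values as `t`, but with no gradient
    flowing through the first token (index 0 along dim 1)."""
    # mask is 0 for the first token and 1 for every other token.
    mask = torch.ones_like(t)
    mask[:, :1, :] = 0.0
    # Where mask == 1 the original (differentiable) values are used;
    # where mask == 0 the detached copy is used, so no gradient passes.
    return mask * t + (1.0 - mask) * t.detach()


# Toy stand-ins for the text embedding and a key projection (hypothetical sizes).
torch.manual_seed(0)
context = torch.randn(2, 4, 8, requires_grad=True)  # (batch, tokens, dim)
to_k = torch.nn.Linear(8, 8, bias=False)

k = detach_first_token(to_k(context))
k.sum().backward()

# Gradient magnitude reaching the first token vs. the remaining tokens.
first_token_grad = context.grad[:, 0].abs().sum().item()
other_tokens_grad = context.grad[:, 1:].abs().sum().item()
```

The forward values are unchanged; only the backward pass differs, so the fixed start token no longer contributes to the weight updates.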

Thanks.

nupurkmr9 avatar Feb 03 '23 02:02 nupurkmr9

Thanks! Another possible issue I spotted is here, where it always assumes that --freeze_model is 'crossattn_kv'; if I set this argument to 'crossattn', this line will disregard it.
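To illustrate the kind of check that honors the flag, here is a hedged sketch in plain Python. The function `is_trainable` and the parameter-name patterns (`attn2`, `attn2.to_k`, `attn2.to_v`) are assumptions made for illustration; the only names taken from this thread are the flag values 'crossattn_kv' and 'crossattn'.

```python
def is_trainable(param_name: str, freeze_model: str) -> bool:
    """Decide whether a parameter should be trained for a given
    --freeze_model setting (hypothetical helper, not the repo's code)."""
    # Assumption: cross-attention parameters contain "attn2" in their name.
    if "attn2" not in param_name:
        return False
    if freeze_model == "crossattn_kv":
        # Train only the key and value projections of cross-attention.
        return "attn2.to_k" in param_name or "attn2.to_v" in param_name
    if freeze_model == "crossattn":
        # Train every cross-attention parameter.
        return True
    raise ValueError(f"unknown freeze_model: {freeze_model!r}")


# Hypothetical parameter names used only to exercise the helper.
names = [
    "blocks.0.attn1.to_q.weight",
    "blocks.0.attn2.to_q.weight",
    "blocks.0.attn2.to_k.weight",
    "blocks.0.attn2.to_v.weight",
]
kv_only = [n for n in names if is_trainable(n, "crossattn_kv")]
all_cross = [n for n in names if is_trainable(n, "crossattn")]
```

Branching on the flag's value, rather than hard-coding one mode, is what keeps a 'crossattn' setting from being silently ignored.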

XavierXiao avatar Feb 03 '23 23:02 XavierXiao

Ohh yeah. Thanks so much for catching it!! I have corrected it now.

nupurkmr9 avatar Feb 04 '23 12:02 nupurkmr9