DreamArtist-sd-webui-extension
One-shot training results are extremely poor; I cannot reproduce the results of the paper
When I train with a picture from the paper
my training results are as follows with the prompt "a painting of city, art by ch_fg4; ch_fg-neg" :
but the results in the paper are like this:
My training parameters are as follows:
Is there something wrong with my training process causing this poor result?
Thanks!
I would try it with fewer tokens and no initialisation text. Could also try lower CFG.
One possible issue is that reconstruction is broken and there has been no update to fix it, though I'm not sure. I'll give it a shot later and try a few things to see if I can get anything vaguely similar.
Hm, I wonder if it's the lack of reconstruction? I checked the paper, and it was pretty clear: three positive tokens, three negative tokens, a 0.0025 learning rate, 2-8k steps, and a mysterious gamma that I think must be the CFG scale, as it was set to 5. So I set it up to train that way, alongside a comparison run with a much higher learning rate, EMA below 1, and 6 negative tokens, and it's been kind of nonfunctional. The one with the really high learning rate sort of got something:
But nothing like the paper results. Those all seem to generate generic cities, e.g.:
Graphing the loss and looking at the vectors (using https://github.com/Zyin055/Inspect-Embedding-Training), it's definitely learning something, but I have no clue what:
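If the gamma from the paper really is the classifier-free guidance scale, as guessed above, then it would control how far the prediction is pushed from the negative-prompt output toward the positive-prompt output. Here is a minimal sketch of that standard CFG combination in plain Python. The function name and the use of flat lists are illustrative assumptions, not code from the extension:

```python
from typing import List, Sequence


def cfg_combine(eps_neg: Sequence[float],
                eps_pos: Sequence[float],
                gamma: float) -> List[float]:
    """Standard classifier-free guidance mix of two noise predictions.

    eps_neg: prediction conditioned on the negative prompt/embedding
    eps_pos: prediction conditioned on the positive prompt/embedding
    gamma:   guidance scale (assumed here to be the paper's gamma)
    """
    if len(eps_neg) != len(eps_pos):
        raise ValueError("predictions must have the same length")
    # Start at the negative prediction and move gamma times the
    # difference toward the positive prediction.
    return [n + gamma * (p - n) for n, p in zip(eps_neg, eps_pos)]


if __name__ == "__main__":
    # With gamma = 1 the result is just the positive prediction.
    print(cfg_combine([0.0, 1.0], [1.0, 1.0], 1.0))
    # With gamma = 5 the difference is amplified five times.
    print(cfg_combine([0.0, 1.0], [1.0, 1.0], 5.0))
```

Under this reading, training at gamma = 5 and then sampling at a very different CFG scale could plausibly change how the positive and negative embeddings interact, which may be why a lower CFG was suggested earlier.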
Setting the learning rate to 0.003 would be better; 0.0003 is too small.
You can also try the improved version, DreamArtist++, which adds LoRA, for better results: HCP-Diffusion
This might be a shot in the dark, but if you still have the embeddings could you try using the negative embedding in the positive and the positive in the negative?
Recently I found that some of my embeds that previously looked bad actually worked pretty well in reverse, capturing elements of the training images much better than the non-reversed prompt did. This happened several times, and I don't know why, or whether I made a bad edit to the script. As far as I can tell, I only changed how the negative prompt is built. Before, the negative prompt was the entire positive prompt from the prompt template, with the trained embedding swapped for its negative version. Now it should use only the negative embedding, without the positive prompt from the template.
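To make the two ideas above concrete, here is a small sketch of the prompt handling being described: the two ways of building the negative prompt, and the "reversed" test where the embeddings trade places. All names here (`my_embed`, `my_embed-neg`, the helper functions, the template string) are hypothetical and only illustrate the description; this is not the extension's actual code:

```python
from typing import Tuple


def negative_from_template(template: str, neg_token: str) -> str:
    """Old behaviour as described: reuse the whole positive template,
    with the trained embedding replaced by its negative version."""
    return template.replace("[name]", neg_token)


def negative_only(neg_token: str) -> str:
    """New behaviour as described: the negative prompt is just the
    negative embedding, with no text from the template."""
    return neg_token


def reversed_prompts(positive: str, negative: str,
                     pos_token: str, neg_token: str) -> Tuple[str, str]:
    """Swap the embeddings: the negative embedding goes into the
    positive prompt and the positive embedding into the negative."""
    new_positive = positive.replace(pos_token, neg_token)
    new_negative = negative.replace(neg_token, pos_token)
    return new_positive, new_negative


if __name__ == "__main__":
    template = "a painting of city, art by [name]"
    pos_token, neg_token = "my_embed", "my_embed-neg"

    positive = template.replace("[name]", pos_token)
    print(negative_from_template(template, neg_token))
    print(negative_only(neg_token))
    print(reversed_prompts(positive, negative_only(neg_token),
                           pos_token, neg_token))
```

Generating the same seed with both the normal and the reversed pair is a cheap way to check whether the roles of the two embeddings ended up flipped during training.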