rinongal comments

Results: 79 comments of rinongal

Any way to force training into ignoring background and focus just on a subject ?

I wouldn't say expected, no. LDM does avoid this problem successfully, so it's not some universal property of text encoders. There is almost certainly a way to detect and avoid this...

Any way to force training into ignoring background and focus just on a subject ?

@TaleirOfDeynai The per_image_tokens approach wasn't helpful in any of our experiments, so I wouldn't count on it too much. Tuning the specific prompts to your set is a good way...

Any way to force training into ignoring background and focus just on a subject ?

I'm closing this due to lack of activity. Feel free to reopen if you need further help.

Artifacts arising from 256x256 data

I'm not sure I understood the problem and where you are seeing it. Is the encoder-decoder part (i.e. the reconstruction images in the log dir) creating additional faces in images...

Artifacts arising from 256x256 data

Sorry, seems like I completely missed your followup here. Do you still need help with this issue?

Artifacts arising from 256x256 data

You could try to have a look at https://github.com/AUTOMATIC1111/stable-diffusion-webui. They have an alternative implementation and plenty of optimizations. I think I saw someone say they managed to get it working...

Allowing initializer words which map to >1 token if num_vectors_per_token supports it

That's interesting. Did you use the same number of vectors in the runs that took more time to converge? I'd be happy to experiment with this change and add it...

How about training on more images of a domain? For example, 100~200 images?

It probably depends on the complexity of the domain and how large a batch size you can use. For styles it will likely be fine. We haven't tested on something...

Is there a way to wire this up to ffhq dataset?

The FFHQ models are not text conditioned (or conditioned at all?) so you'll likely have to train a new conditional model for this. I believe there's a CelebA-like set with...

Model training effect is not good

@Luffffffy As others have stated, I'd try to make sure the images are taken from roughly the same angle. Specifically, try to make sure the cat head is facing up (like in...