rinongal

79 comments by rinongal

I wouldn't say expected, no. LDM does avoid this problem successfully, so it's not some universal property of text encoders. There is almost certainly a way to detect and avoid this...

@TaleirOfDeynai The per_image_tokens approach wasn't helpful in any of our experiments, so I wouldn't count on it too much. Tuning the specific prompts to your set is a good way...

I'm closing this due to lack of activity. Feel free to reopen if you need further help.

I'm not sure I understood the problem and where you are seeing it. Is the encoder-decoder part (i.e. the reconstruction images in the log dir) creating additional faces in images...

Sorry, it seems I completely missed your follow-up here. Do you still need help with this issue?

You could try to have a look at https://github.com/AUTOMATIC1111/stable-diffusion-webui. They have an alternative implementation and plenty of optimizations. I think I saw someone say they managed to get it working...

That's interesting. Did you use the same number of vectors in the runs that took more time to converge? I'd be happy to experiment with this change and add it...

It probably depends on the complexity of the domain and how large a batch size you can use. For styles it will likely be fine. We haven't tested on something...

The FFHQ models are not text conditioned (or conditioned at all?) so you'll likely have to train a new conditional model for this. I believe there's a CelebA-like set with...

@Luffffffy As others have stated, I'd try to make sure the images are taken from roughly the same angle. Specifically, try to make sure the cat head is facing up (like in...