textual_inversion
How about training on more images of a domain? For example, 100~200 images?
In the paper, the training set is about 3-5 images of one object or 3-5 images of one style. How about training on more images of a domain? For example, 100~200 images? Then the trained model could do image generation like pix2pix or CycleGAN. Is this possible?
It probably depends on the complexity of the domain and how large a batch size you can use. For styles it will likely be fine. We haven't tested on anything with a lot of variability (e.g. training on all Pokemon).
With that said, if you have access to something like 1k or more images, you probably want to look at a full fine-tuning method like this one.
Closing due to lack of activity. Feel free to reopen if you need more help.