
Clearer handling of cropping and resolutions:

Open MartinoCesaratto opened this issue 10 months ago • 4 comments

Describe your use-case.

Right now the quick start guide suggests that I don't need to resize my dataset images, because OneTrainer handles this when resolution bucketing is enabled. However, I noticed that with multiple training resolutions selected, a batch size of 1 uses all samples, whereas a batch size of 2 gives fewer than half as many steps, so some images are no longer used.
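To illustrate the step count I am seeing, here is a minimal sketch of one possible explanation. It assumes that each resolution bucket only yields full batches and drops the remainder; that is my guess, not confirmed OneTrainer behavior, and the bucket counts and function names are made up for the example.

```python
# Hypothetical model: each bucket only produces full batches, and the
# leftover images in a bucket are skipped. This is an assumption, not
# OneTrainer's actual implementation.

def steps_per_epoch(bucket_sizes, batch_size):
    """Number of full batches across all buckets."""
    return sum(count // batch_size for count in bucket_sizes.values())


def usable_samples(bucket_sizes, batch_size):
    """Number of images that end up in a full batch."""
    return sum((count // batch_size) * batch_size for count in bucket_sizes.values())


# Made-up dataset: 10 images spread over four (width, height) buckets.
buckets = {(512, 640): 5, (640, 512): 3, (512, 512): 1, (768, 960): 1}

for batch_size in (1, 2):
    print(
        batch_size,
        steps_per_epoch(buckets, batch_size),
        usable_samples(buckets, batch_size),
    )
```

Under this assumption, batch size 1 gives 10 steps over 10 images, while batch size 2 gives 3 steps over 6 images, which is fewer than half the steps because buckets with an odd number of images lose one image each.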

What actually happens is not really clear. For example:

  • I set training resolutions to 512, 640, 768, 960.
  • I have a 639x641 image, is it always cropped to 512x640, or sometimes to 512x512?
  • I have a 256x320 image, is it upscaled to 512x640 or can sometimes end up at 768x960?

I also noticed that the preview is static even with crop jitter enabled. If I have a 1024x512 image, do I get crops such as image[0:960, 0:512] and image[64:1024, 0:512], or are the crops always centered? Will it sometimes be cropped to a resolution other than 960x512?
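To make the crop question concrete, here is a small sketch of the two behaviors I am asking about. The function `crop_box` is hypothetical and only shows the difference between a centered crop and a randomly offset one; I don't know which of these (if either) OneTrainer actually does.

```python
import random


def crop_box(image_w, image_h, crop_w, crop_h, jitter=False, rng=None):
    """Return (left, top, right, bottom) for a crop of size crop_w x crop_h.

    Hypothetical helper: without jitter the crop is centered, with jitter
    the offset is drawn uniformly from every position that still fits.
    """
    max_left = image_w - crop_w
    max_top = image_h - crop_h
    if max_left < 0 or max_top < 0:
        raise ValueError("crop is larger than the image")

    if jitter:
        rng = rng or random.Random()
        left = rng.randint(0, max_left)  # randint includes both endpoints
        top = rng.randint(0, max_top)
    else:
        left = max_left // 2
        top = max_top // 2

    return (left, top, left + crop_w, top + crop_h)


# 1024x512 image cropped to 960x512.
centered = crop_box(1024, 512, 960, 512)
jittered = crop_box(1024, 512, 960, 512, jitter=True, rng=random.Random(0))
print(centered, jittered)
```

With a centered crop the box is always (32, 0, 992, 512). With jitter as sketched here, the left edge can land anywhere from 0 to 64, which covers the [0:960] and [64:1024] cases above.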

What would you like to see as a solution?

I have 5 proposals to improve both clarity and training:

  1. "Use all images" option: when batch size > 1, always try to fill a batch of batch_size images for every resolution, even if that means using crops that cover less of the original images.
  2. Correctly show the effect of crop jitter in the preview (assuming that right now it only shows a centered square crop and not what is actually used).
  3. "Vary scaling" option: if possible, also use samples downscaled to lower resolutions, not only the maximum one.
  4. When using samples below a set resolution (even if upscaled), optionally add a set tag to the prompt (for example "low resolution, low quality"), and likewise for samples above a certain resolution (for example "high resolution").
  5. Allow setting both horizontal and vertical resolution, so that I can enter something like "384, 512x512, 768" and get "384x384, 384x768, 512x512, 768x768, 768x384" as the set of allowed resolutions.
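For the resolution setting, this is roughly the parsing I have in mind. It is only a sketch of the proposal: `expand_resolutions` is a made-up name, and it assumes that single values are combined with each other in every width/height pairing while explicit `WxH` entries are taken as written.

```python
from itertools import product


def expand_resolutions(spec):
    """Expand a spec like "384, 512x512, 768" into (width, height) pairs.

    Hypothetical rule: bare values are paired with every bare value
    (including themselves); "WxH" entries are added unchanged.
    """
    singles = []
    explicit = []
    for token in spec.split(","):
        token = token.strip().lower()
        if not token:
            continue
        if "x" in token:
            width, height = token.split("x")
            explicit.append((int(width), int(height)))
        else:
            singles.append(int(token))

    # Every width/height combination of the bare values, plus explicit pairs.
    resolutions = set(product(singles, repeat=2)) | set(explicit)
    return sorted(resolutions)


allowed = expand_resolutions("384, 512x512, 768")
print(allowed)
```

For "384, 512x512, 768" this gives 384x384, 384x768, 512x512, 768x384 and 768x768, the same set as in the example above.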

Have you considered alternatives? List them here.

Right now I could probably keep multiple copies of each image at different resolutions, aspect ratios, and crops, but it would take a lot of copies to truly cover every possible crop of each image.

MartinoCesaratto, Apr 26 '24 13:04