Inquiry About Using Non-Square Images for ControlNet Training
Subject: Inquiry About Using Non-Square Images for ControlNet Training
Dear [Team/Developer],
First of all, thank you for providing the training code for ControlNet. I have recently been using this code to train a ControlNet for an SDXL model, and I am quite satisfied with the results. However, I believe there is room for improvement.
During the training process, I noticed that the program crops images into squares. The images I am using are from interior design, which are rarely square and tend to lose their integrity when cropped. I am wondering if it is possible to train with longer or wider images, such as those with aspect ratios of 3:2 or 2:3. Would it be sufficient to modify the image processing part of the code to accommodate these dimensions?
I look forward to your response and thank you very much for your support!
Best regards
I don't see why you couldn't use non-square images for training with ControlNet; it works the same way as for the UNet. I suggest you use the same aspect ratios and sizes as the original SDXL training.
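To make the resolution suggestion concrete, here is a rough sketch of picking a training size for a given aspect ratio. It assumes a pixel budget of about 1024x1024 and sides that are multiples of 64, which is how SDXL-style resolutions are commonly described; `make_buckets` and `nearest_bucket` are hypothetical helper names and are not part of the diffusers training scripts.

```python
import math


def make_buckets(max_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) pairs whose sides are multiples of `step`
    and whose area does not exceed `max_area` (assumed pixel budget)."""
    buckets = []
    for width in range(min_side, max_side + 1, step):
        # Largest height, rounded down to a multiple of `step`, that fits the budget.
        height = (max_area // width) // step * step
        if min_side <= height <= max_side:
            buckets.append((width, height))
    return buckets


def nearest_bucket(aspect_ratio, buckets):
    """Return the bucket whose width/height ratio is closest to `aspect_ratio`.
    The comparison is done in log space so 3:2 and 2:3 are treated symmetrically."""
    target = math.log(aspect_ratio)
    return min(buckets, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


buckets = make_buckets()
landscape = nearest_bucket(3 / 2, buckets)  # (1216, 832)
portrait = nearest_bucket(2 / 3, buckets)   # (832, 1216)
```

With these assumptions, 3:2 images would be trained at 1216x832 and 2:3 images at 832x1216, so only a small strip is cropped instead of a full square.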
The training scripts in diffusers are minimal examples. That's why they don't support different bucket resolutions or aspect ratios; it's just to keep the code simple.
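For anyone who wants to add bucketing themselves, the core idea can be sketched as follows: assign every image to the bucket with the closest aspect ratio, then build batches from a single bucket at a time so all tensors in a batch share one shape. This is only an illustration; `assign_bucket` and `bucketed_batches` are hypothetical names, the bucket list is an assumed example, and none of this exists in the example training scripts. For ControlNet, the conditioning image would need exactly the same resize and crop as its target image.

```python
import math
from collections import defaultdict


def assign_bucket(width, height, buckets):
    """Pick the (width, height) bucket closest in aspect ratio to the image."""
    target = math.log(width / height)
    return min(buckets, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


def bucketed_batches(image_sizes, buckets, batch_size):
    """Yield (bucket, indices) pairs where every index in a batch belongs
    to the same bucket, so the batch can be stacked into one tensor."""
    groups = defaultdict(list)
    for index, (width, height) in enumerate(image_sizes):
        groups[assign_bucket(width, height, buckets)].append(index)
    for bucket, indices in groups.items():
        for start in range(0, len(indices), batch_size):
            yield bucket, indices[start:start + batch_size]


# Assumed example: original image sizes in pixels and a small bucket list.
sizes = [(3000, 2000), (2000, 3000), (1500, 1000), (1024, 1024)]
example_buckets = [(1216, 832), (832, 1216), (1024, 1024)]
batches = list(bucketed_batches(sizes, example_buckets, batch_size=2))
```

Each image would then be resized and center-cropped to its bucket's resolution before being stacked into the batch.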
Also, make sure you actually need this. I don't really see the need to use the whole image, and I'm not sure what you mean by the images losing their integrity. For most use cases, the model just needs to understand the features in the image, not the whole composition.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.