
[enhancement]: Scaled generation on unified canvas

Open whosawhatsis opened this issue 3 years ago • 1 comment

Is there an existing issue for this?

  • [X] I have searched the existing issues

Contact Details

No response

What should this feature add?

It would be really nice if the unified canvas had a way to work on images that are not at its native resolution. How I imagine this working: you enter a scaling factor that effectively shrinks the image on the unified canvas's grid while retaining the full-resolution image data, much like Photoshop's smart objects.

This might be restricted, for example, to 2x or 4x scales, but ideally it would be fully continuous, so that you could add two images to the canvas (once that's possible) and scale one until its subject matches the scale of the subject in the other. Importantly, this would also let you scale the source image to match the limited range of subject scales that Stable Diffusion can produce, given the size of its training images.

Then, when performing inpainting (and outpainting in particular), the full-size image would be scaled down by that factor before being fed to Stable Diffusion, so that it can recognize features and generate comparably sized ones. The results would then be upscaled with ESRGAN and, if necessary, scaled down a bit from their 2x or 4x size to match the resolution of the base image.
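The size arithmetic in that flow can be traced with a tiny helper. This is only a sketch of the proposal, not anything in InvokeAI; the function name and parameters are made up for illustration:

```python
def outpaint_sizes(region_size, canvas_scale, esrgan_scale):
    """Trace pixel sizes through the proposed flow: the full-resolution
    region is shrunk by canvas_scale before diffusion, upscaled
    esrgan_scale x by ESRGAN, then resized to match the base image."""
    generated = region_size // canvas_scale   # size fed to Stable Diffusion
    upscaled = generated * esrgan_scale       # size after ESRGAN upscaling
    # Factor needed to bring the ESRGAN output back to the region's
    # original resolution (often slightly < 1, i.e. "scaled down a bit").
    final_factor = region_size / upscaled
    return generated, upscaled, final_factor
```

For example, a 1024px region at a 2x canvas scale is generated at 512px, ESRGAN-upscaled 4x to 2048px, and finally halved to land back at 1024px.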

Alternatively (and probably a better idea), the generated portion could be maintained as a separate object/layer, which could optionally be upscaled independently of the rest of the image. Each of these layers could have its own internal resolution, and their Z-order could be changed to keep the best version of the overlapping regions on top. The grid scaling factor could be changed at any time, independently of the image layers, so that subsequent generations can be done at a scale appropriate for the model and the subject you're trying to render. Only when exporting an image from the canvas would all of the layers need to be resampled to matching resolutions.

Alternatives

No response

Additional Content

No response

whosawhatsis commented on Dec 28 '22

Yeah, it's basically "Inpaint at full resolution" from the Auto UI (AUTOMATIC1111). It's probably the biggest thing I'm missing in Invoke as well; in my experience, it gives the best possible inpainting results.

If this is ever implemented, being able to select the model used for the final upscaling would also be a necessary feature.

For those unfamiliar, as far as I understand, this is how it works:

  1. A minimum size square area that contains the mask + some padding is created.
  2. The area is upscaled or downscaled to 512×512.
  3. The resulting image is used to generate whatever is required.
  4. The result is scaled back to match the original resolution of the input image and pasted into it.
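The geometry in steps 1, 2, and 4 can be sketched as a helper that computes just the crop box and scale factor. The function name, the padding default, and the square-expansion details are assumptions for illustration; the actual implementations may clamp differently:

```python
def inpaint_crop(mask_bbox, image_size, padding=32, working_size=512):
    """Return the padded square crop box around the mask and the scale
    factor used to resize that crop to the working resolution; the same
    factor is inverted when pasting the result back (step 4)."""
    left, top, right, bottom = mask_bbox
    width, height = image_size
    # Step 1: pad the mask's bounding box, clamped to the image edges.
    left = max(left - padding, 0)
    top = max(top - padding, 0)
    right = min(right + padding, width)
    bottom = min(bottom + padding, height)
    # Expand the shorter side so the crop is square (clamped to the image,
    # so near an edge the box may end up slightly non-square).
    side = max(right - left, bottom - top)
    right = min(left + side, width)
    bottom = min(top + side, height)
    # Step 2: factor mapping the crop to working_size x working_size.
    scale = working_size / side
    return (left, top, right, bottom), scale
```

For a 100×100 mask in the middle of a 1024×1024 image, the crop is a 164×164 square (100px plus 32px padding on each side), which gets upscaled by about 3.1x to reach 512×512 before generation.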

Alphyn-gunner commented on Jan 01 '23

There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.

github-actions[bot] commented on Mar 13 '23