stable-diffusion-webui
Improved hi-res fix (separate sampler, separate prompt)
Describe what this pull request is trying to achieve.
✅ Separate sampler ✅ Separate prompt (including negative prompt) ✅ Image metadata
Additional notes and description of your changes
The value '---' means the hi-res fix pass uses the same sampler as txt2img. It is also the default value, so the workflow stays the same unless you deliberately choose a different sampler.
Leaving the prompt textbox empty results in the hi-res pass using the same prompt as the initial txt2img generation.
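The fallback behaviour described in these two notes can be sketched roughly as follows. This is only an illustration, not the code from this PR: `PassSettings`, `resolve_hires_settings` and the parameter names are hypothetical, and treating an empty negative prompt the same way as an empty positive prompt is an assumption.

```python
from dataclasses import dataclass

SAME_AS_FIRST_PASS = "---"  # the dropdown value described in the notes above


@dataclass
class PassSettings:
    """Hypothetical container for the settings of one sampling pass."""
    sampler: str
    prompt: str
    negative_prompt: str


def resolve_hires_settings(first, hr_sampler=SAME_AS_FIRST_PASS,
                           hr_prompt="", hr_negative_prompt=""):
    """Return the settings for the hi-res pass, falling back to the first pass."""
    return PassSettings(
        # '---' means: reuse the txt2img sampler
        sampler=first.sampler if hr_sampler == SAME_AS_FIRST_PASS else hr_sampler,
        # an empty textbox means: reuse the txt2img prompt
        prompt=hr_prompt or first.prompt,
        # assumption: the negative prompt falls back the same way
        negative_prompt=hr_negative_prompt or first.negative_prompt,
    )
```

With the defaults left untouched, the hi-res pass gets exactly the first-pass settings, which matches the claim that the workflow does not change unless a different sampler or prompt is chosen.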
The conditioning code was modified so that it can handle a prompt different from the one used in the initial txt2img generation. I'm not sure whether it can be done in a better way, but it works.
Image metadata works by putting the prompt into brackets and replacing commas with semicolons; when the image info is pasted using the paste button, the brackets are removed and the semicolons are replaced with commas again. This was done because otherwise I would need to modify the param regex, which is not a good idea as it can break the program.
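A minimal sketch of the metadata round trip described above, assuming square brackets are the bracket type used (the description does not say which). The function names `encode_hr_prompt` and `decode_hr_prompt` are hypothetical and are not the names used in this PR.

```python
def encode_hr_prompt(prompt: str) -> str:
    """Prepare a hi-res prompt for the image info text.

    Commas are swapped for semicolons so the value does not collide with
    the comma-separated parameter list, then the result is wrapped in
    brackets (square brackets are an assumption).
    """
    return "[" + prompt.replace(",", ";") + "]"


def decode_hr_prompt(field: str) -> str:
    """Undo encode_hr_prompt when the image info is pasted back."""
    # strip exactly one outer bracket pair, if present
    if field.startswith("[") and field.endswith("]"):
        field = field[1:-1]
    # restore the commas
    return field.replace(";", ",")
```

One consequence of this scheme is that it is not fully reversible: a semicolon that was already in the original prompt comes back as a comma after pasting.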
Separate prompt works as expected, but it seems to be useful only in some niche situations as it makes the image look weird.
I would also like to apologize for the large number of commits. My PC is pretty weak, so I had to push even the slightest changes in order to test them on Google Colab.
Environment this was tested in
- OS: Linux (Colab)
- Browser: Chromium
- Graphics card: Tesla T4
Screenshots or videos of your changes
An example of a photo generated with SDE Karras and then processed with M2 Karras (twice as fast hires pass and pretty much the same quality):
pls, add also the ability to set a different prompt for hi-res pass
the new concept of hi-res is great, but feel free to throw it out of txt2img and implement it completely in img2img. i can't see a single good reason why it should be in txt2img at all. i generate 100 images and 99 of them are scaled up for nothing because i already didn't like the source material. so time is the last argument in favour of txt2img hi-res, neither is flexibility. that's what img2img is for, it allows additional prompt, different sampler or even a different model. because the PR just fits thematically i am so free to say that.
i generate 100 images and 99 of them are scaled
@dominikmau, if you didn't get good result with high res fix, doesn't mean others have same results. play around with upscalers, first pass resolutions and second pass resolutions, positive and negative prompt
Converted to draft to add some additional features like separate prompt and metadata
@InvincibleDude, thanks a lot for your work! also, if possible, please add corresponding options to x/y
Thank you! Sorry, but this pull request isn't related to x/y plot in any way. Maybe I will consider making another pull request that adds those features to x/y.
Another idea (it works rly) Automatic creation of an aesthetic embedding from the First Pass and guiding hi-res pass by it. Then delete new ae embedding
i generate 100 images and 99 of them are scaled
@dominikmau, if you didn't get good result with high res fix, doesn't mean others have same results.
you're right, of course, assuming none of you are using sd1.4-2.1 or any model based on it. what is better in the same time?
a) generate 20 images with hi-res without knowing if the images are without errors/deformations or if you like the image at all, etc. (you eat what you get)
b) 100 images and you pick the handful you like from the appearance and scale them up yourself with a hi-res function in img2img which has the same functionality as in txt2img. (you eat what you choose)
what is the advantage of hi-res in txt2img over img2img? tell me one and i'll shut up
I don't think this is a good place for discussing such things (as they are not directly connected to the pull request). There is a tab called "Discussions" that is made specifically for this
Possible to also add a separate setting for batch count/size? Personally I never use the same batch setting when switching to highres fix, and have to switch back and forth which can get tedious.
Maybe should also add the number of hi-res passes. Sometimes I manually do this. This allows for better preservation of some details and textures.
For example: 1.25 upscale, 2 hi res passes. 512x 1st pass, 640x 1st hi-res, 800x 2nd hi-res pass
In combination with the hi-res prompt function, this will be a very powerful tool
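The arithmetic in this suggestion can be sketched as below. The multi-pass feature is only a proposal in this thread, so `hires_pass_sizes` is a hypothetical helper; snapping each size to a multiple of 8 is an added assumption based on Stable Diffusion's latent space being 8x smaller than the image.

```python
def hires_pass_sizes(base: int, scale: float, passes: int, multiple: int = 8) -> list[int]:
    """Return the side length used by each successive hi-res pass.

    Every pass upscales the previous result by `scale`; the result is
    snapped to the nearest multiple of `multiple` (an assumption, see above).
    """
    sizes = []
    size = base
    for _ in range(passes):
        size = round(size * scale / multiple) * multiple
        sizes.append(size)
    return sizes


print(hires_pass_sizes(512, 1.25, 2))  # [640, 800], as in the example above
```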
Rn i'm not planning on adding anything else. The pull request is already pretty big
This seems like a great idea, but it doesn't seem to unload the extra networks from the original prompt or load new ones from the highres prompt.
I'll look into this. But it shouldn't be like that, as I didn't change any code related to this. Plus, while using my own fork I didn't have any issues with memory leaks.
@AUTOMATIC1111 pls, merge it, its sota kekw
Converted to draft to fix extra networks
i can't see a single good reason why it should be in txt2img at all.
txt2img is source image agnostic, so you could load the prompt and start generating the image in one click. Makes sharing the prompts much easier.
Seems like it's working as intended now - hypernets and LORAs are loading even when in hrfix prompt
Can't wait for this, thanks!
after recent changes hires sampler and hires prompt aren't saved in image info
only the used embedding indicates that there is a second prompt :)
Everything works fine for me (latest main repo commit with this pull request applied)
@InvincibleDude, info about 2pass prompt and sampler is in info when "Upscale by" is used. when using "Resize width to" or/and "Resize height to", info about 2pass prompt and sampler is missing
Ok. I'll try to fix this tomorrow
Done - everything works now as intended. Thanks for telling me!
@InvincibleDude, thanks a lot! now everything is correct :)
2 side-notes:
🔹 hotkey ctrl+up/down not working in second pass prompt window (not critical)
🔹 when reading image info in PNG Info: if the second pass [sampler, positive prompt, negative prompt] is not present in the image info and the info is then sent to txt2img, the second pass info left over from the previous generation is not cleared
1 addition:
cfg for second pass (not critical)
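One way the second side-note (stale second pass fields) could be addressed is to reset those fields to their defaults whenever the pasted info does not mention them. The sketch below is only an illustration under assumptions: the key names, `HR_DEFAULTS` and `apply_pasted_info` are hypothetical and do not reflect how the webui actually stores its UI state.

```python
# Hypothetical defaults: '---' and an empty prompt mean "same as first pass".
HR_DEFAULTS = {
    "hr_sampler": "---",
    "hr_prompt": "",
    "hr_negative_prompt": "",
}


def apply_pasted_info(ui_state: dict, pasted: dict) -> dict:
    """Return a new UI state after pasting parsed image info into it."""
    new_state = dict(ui_state)
    new_state.update(pasted)
    # Clear second pass fields that the pasted image info does not contain,
    # so values from a previous generation do not linger.
    for key, default in HR_DEFAULTS.items():
        if key not in pasted:
            new_state[key] = default
    return new_state
```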
Waiting it so much
Would it be possible to unload the loras during the hires pass that the hires prompt does not re-specify, if it is not empty? Unless I tested it wrong, it looks like they stay loaded in hires prompt. Also, are loras handled if only specified in the hires prompt?
Yup, they are staying in the second prompt. If I remember correctly, it would require some drastic changes to the code. I will try to find a way to do it without changing too much, but right now I'm taking a rest, so I'm not sure when I'll start on this.
i dunno why a1 won't merge it. maybe it's possible to convert the pr to an extension?
Once the conflicts have been resolved in this PR, if you don't want to wait for it to be merged, you can merge it locally:
git remote add invincible-dude https://github.com/InvincibleDude/stable-diffusion-webui.git
git pull invincible-dude improved-hr-conflict-test
To revert these changes:
git reset --hard HEAD~60
git pull origin master
git remote remove invincible-dude
Well, there are some things that I would like to clarify about the state of this PR:
- I already have a commit to fix the conflict. The only problem is that I forgot to push it.
- I'm not sure if I'm just dumb, but I don't think there is a way to make the prompt for the second pass work without extra networks.
- IMHO this PR's code is kinda dirty and ugly. I'll probably reimplement it as a script so the code is cleaner, because the dirtiness of the code seems to be the reason AUTO1111 hasn't accepted it.
UPD. sry for closing the pr, it was a misclick