sd-dynamic-prompts
Incorrect prompts when using loras within combinations on batch size > 1
Given this prompt: {<lora:A:1>|<lora:B:0.{1|9}>}, the parser will return either <lora:A:1>, <lora:B:0.1> or <lora:B:0.9> for each image within the same batch.
However, let's say that the first image received <lora:A:1> - all the other images in the same batch will still be using that lora even if the parser returned something else. So we might get images that show <lora:B:0.1> in their metadata even though they were actually created with <lora:A:1> instead.
The same applies to weights: if <lora:B:0.1> was picked first, then any image that shows <lora:B:0.9> was actually created with a weight of 0.1.
Due to the way SD works I'm not sure it's possible to actually make this work as one would want - but at least the incorrect prompts should be fixed. Also, although I haven't tested it yet, this probably also occurs with the other types of external networks.
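For illustration only, here is a small Python sketch (not the extension's actual code; resolve() is a toy stand-in for the dynamic-prompts parser) of how each image in a batch can record its own resolved prompt while only the first prompt's networks are actually applied:

```python
import random
import re

# Toy stand-in for the dynamic-prompts parser: repeatedly resolves the
# innermost {a|b|...} group until no braces remain.
def resolve(template, rng):
    group = re.compile(r"\{([^{}]*)\}")
    while True:
        match = group.search(template)
        if match is None:
            return template
        choice = rng.choice(match.group(1).split("|"))
        template = template[:match.start()] + choice + template[match.end():]

template = "{<lora:A:1>|<lora:B:0.{1|9}>}"
rng = random.Random(0)
batch_size = 3

# One resolved prompt per image - this is what ends up in each PNG's metadata.
prompts = [resolve(template, rng) for _ in range(batch_size)]

# The whole batch runs through one patched model, so only the networks
# from the first prompt are actually in effect for every image.
applied = re.findall(r"<lora:[^>]+>", prompts[0])

for i, prompt in enumerate(prompts):
    print(f"image {i}: metadata says {prompt!r}, actually generated with {applied}")
```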
This works for me, but I'm using a webui and extension version from before the whole gradio update mess. Changing loras using {} as you described, or inside wildcards, both take effect correctly. Can someone confirm whether this broke? I'm not really willing to update yet, especially if this is a new issue.
Also, since there are other issues with batches and dynamic prompts, such as incorrect seeds for modified prompts or block weights not being applied, you can instead use generate forever (right-click on Generate) or an X/Y/Z plot (-1 seeds on 2 axes, disable layout, include sub-images - same effect as batches without any of the bugs).
I should've given more details.
I'm using the A1111 UI ported to work with DirectML on AMD (Windows): https://github.com/lshqqytiger/stable-diffusion-webui-directml. It's essentially the same as the native A1111 (same features, and they all seem to work so far), but it is possible that could be the cause of this issue. I'm also running with --lowvram, which splits some components and loads/unloads them only when necessary to keep VRAM usage low - this could also be a cause.
On top of that, I have the Additional Networks and Locon extensions, both of which have an impact on LoRAs - but then again, I was not using any of their settings, just adding the loras directly to the prompt.
I'll try the native A1111 on CPU using a commit before the last buggy update with both those extensions disabled and update this issue with my findings.
EDIT: @scrumpyman Thank you for your time on this. I did think about generating forever, or manually editing the UI config to allow much bigger batch counts and doing so with a size of 1, BUT the problem is speed. The difference in speed between batch size 1 and 3 is minimal for me, so it would be a waste of time to do batches of 1 - that's all this really boils down to, although the incorrect prompts are also not welcome. I'll check whether I still have these issues on a clean native A1111 with just this extension enabled ASAP.
I've just finished testing it on a clean, safe version of native A1111 (commit a9fed7c364061ae6efb37f797b6b522cb3cf7aa2) without any other extension except for this one - the issue still occurs.
Since I had to do it on CPU (AMD card here) I tested with low steps and low-res images, but it's pretty easy to tell anyway:
Notice how the second image generated with batch size = 2 is not using the right lora.
My bad, I was doing batch count, not batch size. That indeed doesn't work, but I wouldn't expect it to ever work, since it would need to load and use 2 different models at the same time. Even if it were possible, it would probably cancel out any speed benefit from using batch size.
Edit: found confirmation on this probably not being fixable while looking for something else https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/7789#issuecomment-1428188741
A feature request occurred to me that could also be a possible fix for this issue, so I'll post it here:
Detect loras (and maybe other types of external networks as well) within combinations, wildcards, etc., randomize only on the first prompt, and port the results of those randomizations to all other prompts within the same batch. (Every randomization that does not contain an external network as a possibility should still be regenerated on all the other prompts.)
This would allow us to do lora combinations across multiple batches while fixing the incorrect prompts per batch at the same time.
I'm not sure I understand your suggestion, but the core of the problem is that when you apply a lora, you are modifying the weights of the whole 2-4+ GB model in memory, I think. If you want a batch size of multiple images generating at the same time with different loras or lora weights, you'd have to double/triple/etc. the entire VRAM/RAM usage, and even if you could do it, I think the speed would suffer from each generation accessing a different model.
Again just to confirm, this is for batch size, the one that generates multiple images at the same time. Batch count should work fine for the most part.
@scrumpyman As you said, it's not really possible to make multiple different loras work on individual images within the same batch - and even if it were, it would very easily cause out-of-memory errors. So this issue is not about getting that to work, but rather about fixing the metadata prompts that show incorrect loras as a result of this.
When you use a batch size of 2 you are sending 2 prompts. Both run through code that parses combinations, etc. My suggestion is to detect whether a combination (etc.) has at least 1 lora as a possibility, and if so, the choice given to the first prompt of that batch must persist throughout all other images of that batch.
Example:
{0.25::<lora:A:1>|0.25::<lora:B:0.5>|0.5::by __artist__}, {tag1|tag2},
Let's say the first returned prompt was:
<lora:A:1>, tag1,
Then every other prompt for that batch must outright replace {0.25::<lora:A:1>|0.25::<lora:B:0.5>|0.5::by __artist__}, with <lora:A:1>, before the parsing takes place (or the parser could just force the choice somehow). Images of that batch will all have <lora:A:1> and either tag1 or tag2.
Additionally, even if the choice for the first prompt was not a lora, that doesn't change anything, as the choice must still be ported nonetheless.
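A rough sketch of that substitution step, purely to make the idea concrete (the function names and the network-detection regex are made up, this is not the extension's real API, and it reuses the toy resolve() from the sketch above): detect top-level {...} groups that could emit an extra network and resolve them once, so every prompt in the batch inherits the same choice while other groups still vary.

```python
import re

# Simplified pattern for extra-network tokens (an assumption, not the real one).
NETWORK_RE = re.compile(r"<(?:lora|lyco|hypernet):[^>]+>")

def top_level_groups(template):
    """Return (start, end) spans of top-level {...} groups, handling nesting."""
    spans, depth, start = [], 0, None
    for i, ch in enumerate(template):
        if ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append((start, i + 1))
    return spans

def freeze_network_groups(template, resolve, rng):
    """Resolve, once for the whole batch, every group that could output an
    extra network, and splice the result back into the template. Groups
    without networks are left alone so they still vary per image."""
    pieces, last = [], 0
    for start, end in top_level_groups(template):
        group = template[start:end]
        pieces.append(template[last:start])
        pieces.append(resolve(group, rng) if NETWORK_RE.search(group) else group)
        last = end
    pieces.append(template[last:])
    return "".join(pieces)

# Hypothetical usage: the lora group is frozen for the batch, {tag1|tag2} is not.
# frozen = freeze_network_groups(
#     "{0.25::<lora:A:1>|0.25::<lora:B:0.5>|0.5::by __artist__}, {tag1|tag2},",
#     resolve, random.Random(batch_seed))
# prompts = [resolve(frozen, per_image_rng) for _ in range(batch_size)]
```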
Oh, hm. That seems like a hacky fix for a very specific issue though. Detecting if batch size is on, detecting if extra network is in a choice, saving all these choices and forcibly applying them in other prompts of the same "batch size" but not "batch count" if both are selected.
And what if you have {tagA <lora:A:0.5> | tagB <lora:B:0.5>}, maybe with lots of tags alongside the lora - would it always be desired to freeze that choice within a batch? It probably would be better for most things, since in something like a character list with character loras and associated trigger words and description tags you wouldn't want the wrong lora applied, but it might not be so for every kind of choice. What if the lora is the same, or just has slightly different weights? Maybe add it as an option.
I just ran into this as well. @GreenLandisaLie's suggestion seems like the best short-term fix here. As they note, this is already exactly what happens; the problem is that the other items in the batch write the wrong thing into the image's metadata.
> Oh, hm. That seems like a hacky fix for a very specific issue though. Detecting if batch size is on, detecting if extra network is in a choice, saving all these choices and forcibly applying them in other prompts of the same "batch size" but not "batch count" if both are selected.
> And what if you have {tagA <lora:A:0.5> | tagB <lora:B:0.5>}, maybe with lots of tags alongside the lora - would it always be desired to freeze that choice within a batch? It probably would be better for most things, since in something like a character list with character loras and associated trigger words and description tags you wouldn't want the wrong lora applied, but it might not be so for every kind of choice. What if the lora is the same, or just has slightly different weights? Maybe add it as an option.
The point here is that by definition, we cannot change the lora within a batch. It's not technically feasible with how batches work, which is running the same loaded model in parallel on multiple images. So while the ideal fix would be to make it work as expected, the best that can really be done with how everything presently works is to just reflect what actually happens in the PNG info (and maybe update the docs to warn about this)
There are still multiple ways to make the PNG info match the prompt actually used with batch size:
1. Freeze choices that could output extra networks in their entirety (but then, what if you use a huge, complex nested wildcard that can output a lora 3 wildcard layers deep with a 0.01% probability - is that always frozen?).
2. Only freeze the extra networks (effectively what happens already) but correctly write the PNG info to reflect the behavior, parsing out the extra networks for image 2+ and writing the ones in effect from the first image's choice at the end or beginning of the other choices.
I don't think this should be added if it's not an option. And it seems too obscure of an issue to add an option for.
> 2. Only freeze the extra networks (effectively what happens already) but correctly write the PNG info to reflect the behavior, parsing out the extra networks for image 2+ and writing the ones in effect from the first image's choice at the end or beginning of the other choices.
This seems like the best fix, and what I was imagining. You're right that if the choices are mixed in with non-lora components, then trying to freeze them together with the loras just changes it to a different set of unexpected behavior.
> I don't think this should be added if it's not an option. And it seems too obscure of an issue to add an option for.
Agreed, assuming you're referring to (1) here? For (2) there's no behavioral change to configure.
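To make (2) concrete, a minimal sketch of what the PNG-info correction could look like (not the extension's actual code; the function name and the network regex are assumptions): strip the networks each later image claims to use and append the ones from the first image's prompt, which are the ones really in effect.

```python
import re

# Simplified pattern for extra-network tokens (an assumption, not the real one).
NETWORK_RE = re.compile(r"<(?:lora|lyco|hypernet):[^>]+>")

def fix_infotext_prompts(prompts):
    """Rewrite the recorded prompts so images 2+ show the extra networks
    that were actually applied: the ones from the first prompt of the batch."""
    if not prompts:
        return prompts
    applied = NETWORK_RE.findall(prompts[0])
    fixed = [prompts[0]]
    for prompt in prompts[1:]:
        # Drop the networks this prompt claims to use...
        stripped = NETWORK_RE.sub("", prompt)
        stripped = re.sub(r"(,\s*)+", ", ", stripped).strip(" ,")
        # ...and append the ones that were really in effect.
        fixed.append(", ".join(filter(None, [stripped] + applied)))
    return fixed

# Example with a hypothetical batch of 2:
batch = ["masterpiece, <lora:A:1>, tag1", "masterpiece, <lora:B:0.9>, tag2"]
print(fix_infotext_prompts(batch))
# ['masterpiece, <lora:A:1>, tag1', 'masterpiece, tag2, <lora:A:1>']
```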
There's also the issue where you might have a choice that really doesn't make sense if a lora from option 1 is used with the tags from option 2, in which case freezing the entire choice might be preferable. What I meant is that there should be an option to select fix 1 or fix 2. But I guess implementing fix 2 would at least be an improvement on the current behavior.
I made a childish modification to my dynamic_prompting.py file to allow freezing combinations that use @{ | }@ per batch, and it works, but then I ran into an issue.
After a couple of batches I get out-of-memory crashes because SD is not unloading the external networks used in the previous batches, so they keep stacking up in memory - at least that's my theory.
I'm not sure if this is something that can be solved through an extension, and if it's not, then I'm no longer sure how to solve this issue.
EDIT: using my modified .py I can set the batch count to 1 (and batch size > 1, ofc), then press 'Generate Forever', and it works flawlessly. It's interesting that SD does unload the external networks properly with this option but not when setting the batch count to a value bigger than 1.
Some kind of general option to make a variant only change per batch would be nice to have. Possibly only an option to have the same start seed per batch (though that might be out of scope).
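A sketch of that general idea (hypothetical helper, reusing the toy resolve() from the first sketch): seed the variant RNG from the batch's seed rather than per image, so every image in one batch resolves the template the same way and the next batch re-rolls.

```python
import random

def prompts_for_batch(template, resolve, batch_seed, batch_size):
    """Resolve the template once per batch: all images in the batch share the
    same choices, and a new batch (new seed) re-rolls them."""
    rng = random.Random(batch_seed)   # seeded per batch, not per image
    prompt = resolve(template, rng)   # one resolution shared by the whole batch
    return [prompt] * batch_size

# Hypothetical usage across 3 batches of size 2:
# for batch_seed in (101, 102, 103):
#     prompts = prompts_for_batch("{<lora:A:1>|<lora:B:0.9>}, {tag1|tag2},",
#                                 resolve, batch_seed, 2)
```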