ComfyUI icon indicating copy to clipboard operation
ComfyUI copied to clipboard

Nested prompt weights are not multiplied together.

Open nothings opened this issue 1 year ago • 9 comments

Expected Behavior

A prompt like "((foo))" receives weight 1.1 * 1.1 A prompt like "((foo):1.1) receives weight 1.1 * 1.1 A prompt like "((foo:1.1))" receives weight 1.1 * 1.1. A prompt like "((foo:1.0):0.0)" receives weight 1.0 * 0.0 = 0.0

Actual Behavior

A prompt like "((foo))" receives weight 1.1 * 1.1 (correct) A prompt like "((foo):1.1) receives weight 1.1 * 1.1 (correct) A prompt like "((foo:1.1))" receives weight 1.1 (wrong) A prompt like "((foo:1.0):0.0)" receives weight 1.0 (wrong)

Steps to Reproduce

See above.

Debug Logs

no

Other

Bug is in line 270 and 273 of sd1_clip.py: https://github.com/comfyanonymous/ComfyUI/blob/9230f658232fd94d0beeddb94aed093a1eca82b5/comfy/sd1_clip.py#L270 https://github.com/comfyanonymous/ComfyUI/blob/9230f658232fd94d0beeddb94aed093a1eca82b5/comfy/sd1_clip.py#L273

The (foo) case multiplies by current_weight (because weight is initialized to current_weight, then uses *=), the (foo:0.5) case does not. Replacing the second line with weight = current_weight * float(x[xx+1:]) should fix it, though the code should ideally be refactored to handle both cases the same way.

nothings avatar Aug 26 '24 04:08 nothings

This issue has been mentioned before in the past. It's not a bug, but intentional.

Also, if it were to be changed, it would cause problems with the reproducibility of all existing workflows, so it should not be modified.

Personally, I think it could be applied at the custom node level.

ltdrdata avatar Aug 26 '24 05:08 ltdrdata

If it's intentional, then it's a documentation bug because it needs to be documented in user-facing documentation. I shouldn't have to grovel through 15 files to find these two lines of code to be sure I'm not crazy. You also shouldn't make me waste my time like that because you don't document it. Also, if it's intentional, there should be a comment in the code to that effect.

nothings avatar Aug 26 '24 05:08 nothings

If it's intentional, then it's a documentation bug because it needs to be documented. I shouldn't have to grovel through 15 files to find these two lines of code to be sure I'm not crazy.

Yes, it's true that ComfyUI features have not been well documented up until now. The documentation process will be carried out gradually, step by step.

ltdrdata avatar Aug 26 '24 05:08 ltdrdata

I shouldn't have to grovel through 15 files to find these two lines of code to be sure I'm not crazy. You also shouldn't make me waste my time like that because you don't document it.

What's with the hostility? Why go on a twitter rant bashing people just working on OSS? You make a good point, but nobody here did anything to personally harm you.

SkleKng avatar Aug 26 '24 21:08 SkleKng

I'm not sure why it's relevant, but you brought it up, so: I didn't go on a Twitter rant bashing people "just working on OSS". I reiterated to my friends on Twitter why every time I waste my time trying to report bugs on OSS, I'm reminded of why I never report bugs, because doing so is a waste of time. Which it is! As can be seen in this thread! Which I anonymized when I tweeted it. The fact that a bunch of people RTd it and apparently spam-liked my post in this thread is not my fault. Didn't ask for it, didn't want it. Those likes don't belong in this thread, your question doesn't belong in this thread, this reply doesn't belong in this thread.

PS: If you say "we knew about this bug but we didn't tell anybody about it", you are literally announcing to the world that you chose a plan that you knew would lead to people wasting their time debugging it again. Literally chose it. It's not "hostile" to point out that it is anti-collaborative to take actions that make potential collaborators waste their time and which turn them off from contributing to the project. Again, none of this belongs in this thread, and I'm unsubscribing from it.

nothings avatar Aug 26 '24 23:08 nothings

I'm not sure why it's relevant, but you brought it up, so: I didn't go on a Twitter rant bashing people "just working on OSS". I reiterated to my friends on Twitter why every time I waste my time trying to report bugs on OSS, I'm reminded of why I never report bugs, because doing so is a waste of time. Which it is! As can be seen in this thread! Which I anonymized when I tweeted it. The fact that a bunch of people RTd it and apparently spam-liked my post in this thread is not my fault. Didn't ask for it, didn't want it. Those likes don't belong in this thread, your question doesn't belong in this thread, this reply doesn't belong in this thread.

PS: If you say "we knew about this bug but we didn't tell anybody about it", you are literally announcing to the world that you chose a plan that you knew would lead to people wasting their time debugging it again. Literally chose it. It's not "hostile" to point out that it is anti-collaborative to take actions that make potential collaborators waste their time and which turn them off from contributing to the project. Again, none of this belongs in this thread, and I'm unsubscribing from it.

There's a difference between "point[ing] out that it is anti-collaborative to take actions that make potential collaborators waste their time and which turn them off from contributing to the project" and posting about the same missing documentation on Twitter for eighteen hours straight. Telling people that they are "incapable of understanding and empathy" over an issue being marked wontfix is usually considered hostile (granted, the person you said that to was rather rude). Same with saying "I shouldn't have to grovel through 15 files to find these two lines of code to be sure I'm not crazy. You also shouldn't make me waste my time like that because you don't document it", that is usually considered hostile. Anyways, I'll take a crack at adding some documentation for that :)

thetacola avatar Aug 27 '24 00:08 thetacola

wontfix means that there will be no changes to the implementation in question, and the issue of appropriate documentation is a separate matter that needs to be improved.

I hope that no one's feelings are hurt any further regarding this matter.

ltdrdata avatar Aug 27 '24 01:08 ltdrdata

This feature is documented in the README as follows:

You can use () to change emphasis of a word or phrase like: (good code:1.2) or (bad code:0.8). The default emphasis for () is 1.1. To use () characters in your actual prompt escape them like ( or ).

The docs say nothing about nesting. It should be documented as follows. First, parentheses without argument can be used without argument to pump up the weight. The more you have, the bigger the weight gets:

token_weights('foo bar baz', 1.0) -> [
    ('foo bar baz', 1),
]

token_weights('foo (bar) baz', 1.0) -> [
    ('foo ', 1),
    ('bar', 1.1),
    (' baz', 1),
]

token_weights('foo ((bar)) baz', 1.0) -> [
    ('foo ', 1),
    ('bar', 1.21),
    (' baz', 1),
]

token_weights('foo (((bar))) baz', 1.0) -> [
    ('foo ', 1),
    ('bar', 1.331),
    (' baz', 1),
]

token_weights('foo ((((bar)))) baz', 1.0) -> [
    ('foo ', 1),
    ('bar', 1.4641),
    (' baz', 1),
]

The parser implements this internally by saying weight *= 1.1 when parentheses are encountered. Since successive multiplications of 1.0 would yield 1.0. So that intent is clear.

Next, a :float argument may also be specified to explicitly set a weight. This will be used as-is without interpretation.

token_weights('foo ((((lol (cat:666) attack)))) baz', 1.0) -> [
    ('foo ', 1),
    ('lol ', 1.4641),
    ('cat', 666),
    (' attack', 1.4641),
    (' baz', 1),
]

Now let's say I want to specify an explicit weight for the lol/attack tokens. I can do that as follows:

token_weights('foo ((((lol (cat:666) attack:100)))) baz', 1.0) -> [
    ('foo ', 1),
    ('lol ', 100),
    ('cat', 666),
    (' attack', 100),
    (' baz', 1),
]

Now here's where things get a little screwy. If I put the :100 on the outside of the parentheses:

token_weights('foo ((((lol (cat:666) attack):100))) baz', 1.0) -> [
    ('foo ', 1),
    ('lol ', 110),
    ('cat', 666),
    (' attack', 110),
    (' baz', 1),
]

Then what's happening is the parentheses on the inside will boost the weight. That's because parentheses by default should be thought of as weight amplifiers. If you document that mental model, then I think prolific engineers like @nothings who normally have outstanding judgement, will have a better happier experience with this project.

jart avatar Aug 27 '24 05:08 jart

engineers like @nothings

please don't tag me into this thread, i unsubscribed from it for a reason, tagging me sends me a notification that i don't want. i don't know why feckin weirdoes are obssessing about twitter on a bug report for comfyui on github and writing scurrilous lies about my behavior in this thread, but i don't really want to know, don't want to see it, bye. some mod should just delete all these off-topic replies (not @jart's, which is not off-topic), but what do i know, maybe comfyui prefers having this nonsense around

nothings avatar Aug 27 '24 06:08 nothings