Execution Model Inversion
This PR inverts the execution model -- from recursively calling nodes to using a topological sort of the nodes. This change allows the node graph to be modified during execution, which enables two major advantages:
1. The implementation of lazy evaluation in nodes. For example, if a "Mix Images" node has a mix factor of exactly 0.0, the second image input doesn't need to be evaluated at all (and vice versa if the mix factor is 1.0). (A sketch of such a node appears right after this intro.)
2. Dynamic expansion of nodes. This allows for the creation of dynamic "node groups". Specifically, custom nodes can return subgraphs that replace the original node in the graph. This is an incredibly powerful concept. Using this functionality, it was easy to implement:
a. Components (a.k.a. node groups)
b. Flow control (i.e. while loops) via tail recursion
c. All-in-one nodes that replicate the WebUI functionality
d. and more
(A sketch of an expanding node appears after the links below.)
All of those were able to be implemented entirely via custom nodes, so those features are *not* a part of this PR. (There are some front-end changes that should occur before that functionality is made widely available, particularly around variant sockets.)
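To make the first advantage concrete, here is a minimal sketch of what a lazy "Mix Images" node can look like, using the lazy-input mechanism from this PR (the "lazy" input option plus a check_lazy_status hook). The node itself and its blend math are illustrative, not part of the PR:

class MixImages:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "image1": ("IMAGE", {"lazy": True}),
            "image2": ("IMAGE", {"lazy": True}),
            "mix_factor": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0}),
        }}

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "mix"
    CATEGORY = "image"

    def check_lazy_status(self, mix_factor, image1, image2):
        # Lazy inputs arrive as None until evaluated; ask the executor to
        # evaluate only the inputs this node will actually use.
        needed = []
        if mix_factor < 1.0 and image1 is None:
            needed.append("image1")
        if mix_factor > 0.0 and image2 is None:
            needed.append("image2")
        return needed

    def mix(self, mix_factor, image1, image2):
        if mix_factor == 0.0:
            return (image1,)
        if mix_factor == 1.0:
            return (image2,)
        return (image1 * (1.0 - mix_factor) + image2 * mix_factor,)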
The custom nodes associated with this PR can be found at: https://github.com/BadCafeCode/execution-inversion-demo-comfyui
Note that some of them require that variant socket types ("*") be enabled.
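And to make the second advantage concrete, here is a minimal sketch of a node that expands into a subgraph at execution time, assuming the GraphBuilder helper added in this PR (comfy/graph_utils.py). The node itself is hypothetical:

from comfy.graph_utils import GraphBuilder

class DecodeAndSave:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "samples": ("LATENT",),
            "vae": ("VAE",),
        }}

    RETURN_TYPES = ()
    OUTPUT_NODE = True
    FUNCTION = "expand"
    CATEGORY = "_for_testing"

    def expand(self, samples, vae):
        # Build a two-node subgraph that replaces this node in the
        # running graph.
        graph = GraphBuilder()
        decode = graph.node("VAEDecode", samples=samples, vae=vae)
        graph.node("SaveImage", images=decode.out(0), filename_prefix="expanded")
        # Returning an "expand" key tells the executor to splice the
        # subgraph in and re-resolve dependencies before continuing.
        return {
            "result": (),
            "expand": graph.finalize(),
        }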
If anyone can test this with their most complex workflows and report whether or not it works, that would be very helpful.
All the errors I get on this one seem to be from custom nodes. The missing-input and invalid-image-file errors happen regardless, due to how this workflow is built. (Workflow: Interaction OpenPose.json)
got prompt
ERROR:root:Failed to validate prompt for output 2003:
ERROR:root:* ImageReceiver 3135:
ERROR:root: - Custom validation failed for node: image - Invalid image file: ImgSender_temp_uvqrp_00001_.png [temp]
ERROR:root: - Custom validation failed for node: link_id - Invalid image file: ImgSender_temp_uvqrp_00001_.png [temp]
ERROR:root: - Custom validation failed for node: save_to_workflow - Invalid image file: ImgSender_temp_uvqrp_00001_.png [temp]
ERROR:root: - Custom validation failed for node: image_data - Invalid image file: ImgSender_temp_uvqrp_00001_.png [temp]
ERROR:root: - Custom validation failed for node: trigger_always - Invalid image file: ImgSender_temp_uvqrp_00001_.png [temp]
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3862:6:
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3133:5:
ERROR:root:* (prompt):
ERROR:root: - Return type mismatch between linked nodes: a, INT != INT,FLOAT,IMAGE,LATENT
ERROR:root: - Return type mismatch between linked nodes: b, FLOAT != INT,FLOAT,IMAGE,LATENT
ERROR:root:* MathExpression|pysssss 3133:5:
ERROR:root: - Return type mismatch between linked nodes: a, INT != INT,FLOAT,IMAGE,LATENT
ERROR:root: - Return type mismatch between linked nodes: b, FLOAT != INT,FLOAT,IMAGE,LATENT
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3059:11:
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3822:16:
ERROR:root:* MathExpression|pysssss 3133:4:
ERROR:root: - Return type mismatch between linked nodes: a, FLOAT != INT,FLOAT,IMAGE,LATENT
ERROR:root: - Return type mismatch between linked nodes: b, INT != INT,FLOAT,IMAGE,LATENT
ERROR:root:* ImpactConditionalBranch 3822:15:
ERROR:root: - Required input is missing: tt_value
ERROR:root:* ImpactConditionalBranch 3822:14:
ERROR:root: - Required input is missing: tt_value
ERROR:root:* KSampler 3822:3:
ERROR:root: - Required input is missing: latent_image
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3678:6:
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3244:
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 2322:
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3861:6:
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 3133:4:
ERROR:root:* (prompt):
ERROR:root: - Return type mismatch between linked nodes: a, FLOAT != INT,FLOAT,IMAGE,LATENT
ERROR:root: - Return type mismatch between linked nodes: b, INT != INT,FLOAT,IMAGE,LATENT
ERROR:root:Output will be ignored
Exception in thread Thread-8 (prompt_worker):
Traceback (most recent call last):
File "threading.py", line 1045, in _bootstrap_inner
File "threading.py", line 982, in run
File "C:\AI\ComfyUI_windows_portable\ComfyUI\main.py", line 111, in prompt_worker
e.execute(item[2], prompt_id, item[3], item[4])
File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 470, in execute
execution_list.add_node(node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 110, in add_node
self.add_strong_link(from_node_id, from_socket, unique_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 136, in add_strong_link
super().add_strong_link(from_node_id, from_socket, to_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 87, in add_strong_link
self.add_node(from_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 110, in add_node
self.add_strong_link(from_node_id, from_socket, unique_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 136, in add_strong_link
super().add_strong_link(from_node_id, from_socket, to_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 87, in add_strong_link
self.add_node(from_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 110, in add_node
self.add_strong_link(from_node_id, from_socket, unique_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 136, in add_strong_link
super().add_strong_link(from_node_id, from_socket, to_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 87, in add_strong_link
self.add_node(from_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 110, in add_node
self.add_strong_link(from_node_id, from_socket, unique_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 136, in add_strong_link
super().add_strong_link(from_node_id, from_socket, to_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 87, in add_strong_link
self.add_node(from_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 110, in add_node
self.add_strong_link(from_node_id, from_socket, unique_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 136, in add_strong_link
super().add_strong_link(from_node_id, from_socket, to_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 87, in add_strong_link
self.add_node(from_node_id)
File "C:\AI\ComfyUI_windows_portable\ComfyUI\comfy\graph.py", line 108, in add_node
is_lazy = "lazy" in input_info and input_info["lazy"]
^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable
started trying to test this. first the job-iterator extension has to be disabled. it also is not compatible with rgthree's nodes even if you disable the executor stuff in it because it will always try to patch the executor regardless. i commented out that part and rgthree didn't seem to be causing problems after that.
those problems are basically expected, stuff that messes with the executor is not going to be compatible with these changes.
>> info 363 VAE -- None None None
Exception in thread Thread-6 (prompt_worker):
Traceback (most recent call last):
File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
self.run()
File "/usr/lib/python3.11/threading.py", line 982, in run
self._target(*self._args, **self._kwargs)
File "/raid/vantec/ai/models/sd/ComfyUI/main.py", line 111, in prompt_worker
e.execute(item[2], prompt_id, item[3], item[4])
File "/raid/vantec/ai/models/sd/ComfyUI/execution.py", line 470, in execute
execution_list.add_node(node_id)
File "/raid/vantec/ai/models/sd/ComfyUI/comfy/graph.py", line 109, in add_node
is_lazy = "lazy" in input_info and input_info["lazy"]
^^^^^^^^^^^^^^^^^^^^
TypeError: argument of type 'NoneType' is not iterable
this is more of an issue. i added a debug print after line 107 (which calls self.get_input_info): print(">> info", unique_id, input_name, "--", input_type, input_category, input_info)
the method can return None, however the downstream code doesn't check for that.
>> info 363 VAE -- None None None
seems like it failed fetching info for the standard VAE node. maybe other weird stuff is going on here but there definitely should be more graceful handling for methods that can return None.
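One possible guard, sketched against add_node in this PR's comfy/graph.py (assuming input_info is either a dict or None; the variable names are taken from the debug print above):

# In add_node, treat missing input info as "not lazy" instead of crashing:
input_type, input_category, input_info = self.get_input_info(unique_id, input_name)
is_lazy = input_info is not None and input_info.get("lazy", False)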
don't want to seem negative, i definitely appreciate the work you put into these changes and a better approach to execution is certainly very welcome and needed!
edit: did some more digging, the issue seemed to be an incompatibility with the use everywhere nodes (was using that to broadcast the VAE so maybe it wasn't a standard VAE node after all).
are these changes expected to be compatible with use everywhere?
ping @ali1234, @rgthree.
I see that compatibility with existing custom nodes may be compromised by the new structure.
More important than the immediate breakage is verifying whether each extension can provide compatibility patches for the new structure.
Job iterator only patches the executor to let it run multiple times in a loop. This pull request should make it obsolete.
Thanks for the ping and opportunity to fix before breaking. The execution optimization was meant to be forward compatible, but it did assume methods weren't being removed.
I just pushed https://github.com/rgthree/rgthree-comfy/commit/6aa039235f3564e8474ff9732abb90fef861753d which is forward compatible with this PR: it no longer attempts to patch ComfyUI's execution when the recursive methods don't exist, as is the case in this PR. I've patched this PR in and verified it.
Question: I assume this PR will render the rgthree optimization obsolete? Currently, the patch I provide reduces iteration counts and times dramatically (from 250,496,808 iterations to just 142, and from 158.13 seconds to 0.0, as tested on my machine).
Also, I noticed the client API events have changed (the client-side progress bar shows 100% complete even though the workflow is still running). Are there more details on breaking changes?
Crashed for the following nodes while attempting to create info['input_order']:
ttN pipeLoader
ttN pipeLoaderSDXL
Potentially problematic line: https://github.com/TinyTerra/ComfyUI_tinyterraNodes/blob/main/tinyterraNodes.py#L1534
It looks like the problematic key is my_unique_id, which maps to the string "UNIQUE_ID", so the .keys() call fails.
[FIXED]
Assertion crash during execution. Looks like there's some incorrect assumption when calling BasicCache.set_prompt. The following is a simplified call hierarchy:
https://github.com/guill/ComfyUI/blob/36b2214e30db955a10b27ae0d58453bab99dac96/execution.py#L457
https://github.com/guill/ComfyUI/blob/36b2214e30db955a10b27ae0d58453bab99dac96/comfy/caching.py#L141
Then the rest:
CacheKeySetInputSignature.add_keys
CacheKeySetInputSignature.get_node_signature
CacheKeySetInputSignature.get_immediate_node_signature
IsChangedCache.get
get_input_data
cached_output = outputs.get(input_unique_id)
BasicCache._get_immediate
Crash at https://github.com/guill/ComfyUI/blob/36b2214e30db955a10b27ae0d58453bab99dac96/comfy/caching.py#L175
The root cause is kinda hard to explain, but take this workflow for instance (workflow embedded in the image):
This should run fine as-is (notice that the node id order MUST BE 10, 8, 9 such that the middle node always has the lowest id). But the moment I change the source code of Concat Text_O to this:
class concat_text_O:
    """
    This node will concatenate two strings together
    """
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "text1": ("STRING", {"multiline": True, "defaultBehavior": "input"}),
            "text2": ("STRING", {"multiline": True, "defaultBehavior": "input"}),
            "separator": ("STRING", {"multiline": False, "default": ","}),
        }}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "fun"
    CATEGORY = "O/text/operations"

    @staticmethod
    def fun(text1, separator, text2):
        return (text1 + separator + text2,)

    @classmethod
    def IS_CHANGED(cls, *args, **kwargs):
        return float("NaN")
Compared with https://github.com/omar92/ComfyUI-QualityOfLifeSuit_Omar92/blob/ebcaad0edbfbd8783eb6ad3cb979f23ee3e71c5e/src/QualityOfLifeSuit_Omar92.py#L1189, the new code adds IS_CHANGED, which is typical for some nodes. However, IsChangedCache.get is now forced to call get_input_data, and at that point BasicCache.cache_key_set is not initialized yet, hence the assertion crash.
[FIXED]
Another bug: a node is added but never gets executed. This occurs when all of the node's inputs are optional (which may include link inputs, not value inputs) AND when the same output is connected to two nodes: the process node and the prompt expand node. When this node gets added to expand by the prompt expand node, the executor somehow ignores it and it will no longer execute -- and this includes all subsequent process nodes.
Not sure how to proceed with this. The only way I can think of with the current codebase is adding a way to force-add a node to execution_list through the same expand dict by checking certain keys.
I managed to trigger an error like this by running out of VRAM mid-generation and then trying to run another generation.
Traceback (most recent call last):
File "/home/sd/.conda/envs/sd/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
self.run()
File "/home/sd/.conda/envs/sd/lib/python3.11/threading.py", line 982, in run
self._target(*self._args, **self._kwargs)
File "/home/sd/git/ComfyUI/main.py", line 111, in prompt_worker e.execute(item[2], prompt_id, item[3], item[4])
File "/home/sd/git/ComfyUI/execution.py", line 476, in execute
self.handle_execution_error(prompt_id, dynamic_prompt.original_prompt, current_outputs, executed, error, ex)
File "/home/sd/git/ComfyUI/execution.py", line 438, in handle_execution_error
"current_outputs": error["current_outputs"],
~~~~~^^^^^^^^^^^^^^^^^^^
KeyError: 'current_outputs'
ComfyUI then seems to get stuck unable to do anything.
I use ComfyBox as a frontend to ComfyUI. Didn't yet try reproducing this without it.
I fixed it by changing the code to use error.get("current_outputs", []) instead. I don't know why current_outputs is missing in this case, but using get makes ComfyUI recover from the OOM without having to restart.
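For reference, the change sketched against handle_execution_error (names taken from the traceback above; the surrounding code is abbreviated, not the exact diff):

def handle_execution_error(self, prompt_id, prompt, current_outputs, executed, error, ex):
    mes = {
        "prompt_id": prompt_id,
        # "current_outputs" may be absent (e.g. after an OOM mid-run), so
        # fall back to an empty list instead of raising KeyError:
        "current_outputs": error.get("current_outputs", []),
    }
    # ... rest of the error handling unchanged ...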
Update: I've managed to test the PR now, and apart from having to disable rgthree, my flows seem to work fine. My fear that mapping over lists might be broken seems to be unfounded.
Original: I'm not able to test this at the moment, but does this preserve the behaviour that a node will map over lists given as inputs? Looking at the change it appears like it might have removed it.
For example, using the impact pack read prompts from file followed by unzip prompts will give you lists of +ve and -ve prompts, which a CLIP prompt encode will turn into lists of conditioning, which a ksampler will turn into lists of latents.
Also really useful for shmoo-ing parameters over ranges with CR Float / Integer range list.
ERROR:root:Failed to validate prompt for output 197:
ERROR:root:* (prompt):
ERROR:root: - Return type mismatch between linked nodes: model, * != MODEL
ERROR:root: - Return type mismatch between linked nodes: latent_image, * != LATENT
ERROR:root:* Efficient Loader 190:
ERROR:root: - Return type mismatch between linked nodes: positive, * != STRING
ERROR:root:* KSampler (Efficient) 197:
ERROR:root: - Return type mismatch between linked nodes: model, * != MODEL
ERROR:root: - Return type mismatch between linked nodes: latent_image, * != LATENT
ERROR:root:Output will be ignored
ERROR:root:Failed to validate prompt for output 9:
ERROR:root:* (prompt):
ERROR:root: - Return type mismatch between linked nodes: images, * != IMAGE
ERROR:root:* SaveImage 9:
ERROR:root: - Return type mismatch between linked nodes: images, * != IMAGE
ERROR:root:Output will be ignored
invalid prompt: {'type': 'prompt_outputs_failed_validation', 'message': 'Prompt outputs failed validation', 'details': 'Return type mismatch between linked nodes: model, * != MODEL\nReturn type mismatch between linked nodes: latent_image, * != LATENT\nReturn type mismatch between linked nodes: images, * != IMAGE', 'extra_info': {}}
suppose i already have a way to distribute computation of whole workflows robustly and transparently inside python. with these changes, is it a small lift to distribute pieces of the graph / individual nodes? the idea would be to make it practicable to integrate many models together performantly -- each individual model may take up the entire VRAM, but distributed on different machines. this is coming from the POV of having a working implementation.
if the signature of node execution were async, it would be a very small lift to parallelize individual node execution among consumers (aka workers). it would bring no changes to the current blocking behavior in the ordinary 1-producer-1-consumer case.
Update: I've managed to test the PR now, and apart from having to disable rgthree, my flows seem to work fine. My fear that mapping over lists might be broken seems to be unfounded.
@WeeBull Yeah, the intention is that functionality is maintained. I tested it via some contrived test cases, but it's not functionality I use much in my actual workflows. Good to hear that it seems to work for you!
@Seedsa As mentioned in the PR summary, this PR does not enable "*"-type sockets. You'll have to manually re-enable those via a patch just as you would on mainline. @comfyanonymous How do you feel about a command-line argument that enables "*" types so people don't have to circulate another patch file for it after this PR?
@doctorpangloss That should be a good bit easier with this architecture than the old one, though it's likely any old working code won't transfer over unchanged. In this execution model, nodes can already return ExecutionResult.SLEEPING to say "do other nodes that don't require my result and come back to me later". You could fairly easily kick off whatever remote process you want in that first call to Execute, add an artificial prerequisite to the node (that gets removed when you get the result of the remote process), and sleep in the meantime.
There might be a way to wrap the Execute call in such a way that normal Python async semantics do that automatically for you, but I'd have to do some research. Python isn't really my area of expertise. 😅
@Trung0246 I believe the issue with ttN pipeLoader is actually a bug in the node. The declaration of "my_unique_id": "UNIQUE_ID" should be inside of the "hidden" category for it to actually function. Right now, it's doing nothing at all (and the my_unique_id argument will always be None). Honestly, an error with a callstack might be preferable to the current situation where it silently ignores that argument. I can add additional checks if people disagree though.
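For reference, a minimal sketch of a corrected declaration (the node and its required input are placeholders):

class ExampleNode:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "text": ("STRING", {"default": ""}),
            },
            # Hidden inputs such as UNIQUE_ID belong in the "hidden"
            # group, not in "required":
            "hidden": {
                "my_unique_id": "UNIQUE_ID",
            },
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "run"
    CATEGORY = "_for_testing"

    def run(self, text, my_unique_id=None):
        return (text,)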
I'm unable to reproduce any of your other issues. Any chance you could upload some workflows? (I'm not sure what node pack the 'process' or 'prompt_expand' nodes are from.)
Pasting a code block that @Trung0246 gave me on Matrix here so that it's documented:
class TautologyStr(str):
    def __ne__(self, other):
        return False

class ByPassTypeTuple(tuple):
    def __getitem__(self, index):
        if index > 0:
            index = 0
        item = super().__getitem__(index)
        if isinstance(item, str):
            return TautologyStr(item)
        return item

class StubNode:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "text": ("STRING", {
                    "default": "TEST",
                    "multiline": False
                }),
            },
            "optional": {
                "_stub_in": (TautologyStr("*"), ),
                "_stub": ("STUB_TYPE", )
            },
            "hidden": {
                "_id": "UNIQUE_ID",
                "_prompt": "PROMPT",
                "_workflow": "EXTRA_PNGINFO"
            }
        }

    RETURN_TYPES = ByPassTypeTuple(("*", "*"))
    RETURN_NAMES = ("_stub_out", "_stub_out_all")
    INPUT_IS_LIST = True
    OUTPUT_IS_LIST = (True, True)
    # OUTPUT_NODE = True
    FUNCTION = "execute"
    CATEGORY = "_for_testing"

    def execute(self, **kwargs):
        return (kwargs.get("text", ["???"]), kwargs.get("_stub_in", ["STUB"]))
I believe all the issues reported in this PR so far have been addressed. Please let me know if you encounter any new ones (or I'm wrong about the existing reports being resolved).
Getting this error in a complex workflow - working on identifying the source so I can upload a minimal test case:
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
File "D:\Projects\ai\comfyui\ComfyUI\execution.py", line 313, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "D:\Projects\ai\comfyui\ComfyUI\execution.py", line 191, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "D:\Projects\ai\comfyui\ComfyUI\execution.py", line 166, in map_node_over_list
results.append(getattr(obj, func)(**input_dict))
TypeError: LoadImage.load_image() got an unexpected keyword argument 'upload'
Looks like just the Load Image node will cause this.
Edit: I verified this is still happening after merging in the latest master branch, so something is breaking the LoadImage node.
I've fixed the issue with LoadImage. It was a result of a change I made to support undeclared inputs.
Now, undeclared inputs should function just like they do on master -- undeclared inputs that are socket connections will be passed through to the execution function, but undeclared inputs that are literals will be ignored. I'm not sure that's actually the behavior we want long-term (it seems rather arbitrary), but it's at least consistent with what exists today.
I tested my most complex workflows and everything is working great! Thanks @guill for the great work!
but undeclared inputs that are literals will be ignored
Hm, I tried to add this change and it immediately crashes in my workflow, with some nodes expecting literal undeclared inputs not to be dropped.
I think at this point it would be better to just detect whether the node's class function has something like **kwargs, and in that case not drop them (see the sketch after this comment).
Or maybe introduce another class member constant like NOT_IDEMPOTENT to keep this behavior.
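A minimal sketch of that **kwargs detection using Python's inspect module (the function name is made up):

import inspect

def accepts_var_kwargs(fn) -> bool:
    # True if fn declares a **kwargs bucket, in which case undeclared
    # literal inputs could be passed through rather than dropped.
    return any(
        p.kind == inspect.Parameter.VAR_KEYWORD
        for p in inspect.signature(fn).parameters.values()
    )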
just tried out the latest revision: it fails due to
ERROR:root:Failed to validate prompt for output 145:
ERROR:root:* SimpleMath+ 54:
ERROR:root: - Return type mismatch between linked nodes: a, FLOAT != INT,FLOAT
using --enable-variants doesn't help here because it's a comma-separated list of types, not a wildcard type.
there also seem to be some compatibility issues with https://github.com/TinyTerra/ComfyUI_tinyterraNodes :
[ERROR] An error occurred while retrieving information for the 'ttN pipeLoader' node.
Traceback (most recent call last):
File "/path/ComfyUI/server.py", line 421, in get_object_info
out[x] = node_info(x)
^^^^^^^^^^^^
File "/path/ComfyUI/server.py", line 399, in node_info
info['input_order'] = {key: list(value.keys()) for (key, value) in obj_class.INPUT_TYPES().items()}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/path/ComfyUI/server.py", line 399, in <dictcomp>
info['input_order'] = {key: list(value.keys()) for (key, value) in obj_class.INPUT_TYPES().items()}
^^^^^^^^^^
AttributeError: 'str' object has no attribute 'keys'
same issue with the ttN pipeLoaderSDXL node from that collection. those errors appear in the console just from reloading the web interface; i am not actually even using those nodes in the workflow i was trying to test.
edit: changed the math around in my workflow to avoid the node i mentioned and was able to successfully complete it (and it's a fairly complex one too). i think fixing the comma-separated type nodes is pretty important though since those are pretty common. (i'd also say --enable-variants should be the default. is there a case where anyone would actually want to disable it?)
Just tried this and have a lot of errors on all sorts of nodes.
Exception when validating inner node: '<' not supported between instances of 'dict' and 'int'
Seems this is related to using **kw as the arg bucket:
class ConstantNode(JOVImageMultiple):
    @classmethod
    def INPUT_TYPES(cls) -> dict:
        return deep_merge_dict(IT_REQUIRED, IT_WH, IT_RGBA_A)

    def run(self, **kw) -> tuple[torch.Tensor, torch.Tensor]:
        wihi = parse_tuple(Lexicon.WH, kw)
Seems it expects a list of args, not the dict it has.
@blepping The issue in TinyTerra is a bug in those nodes. The declaration of UNIQUE_ID is outside of the hidden group when it must be inside it. With this PR, that quiet error just became a loud error -- I think that's actually a good thing, since the node isn't actually doing what the author thinks it is.
I'm looking into the comma-separated list of types, as that likely is something I should support.
@comfyanonymous The remaining backward-incompatibilities -- the JOVConstantNode issue and the multi-type (FLOAT,INT) issues -- are both a result of the fact that optional inputs are now validated by the back-end when they weren't previously. Could you weigh in on how we want to solve this?
In the case of JOVConstantNode, the issue is that one of the input types is a vector (specifically VEC2), but the declared min and max values are single integers (32 and 8192). I could change the back-end to only validate min and max for INT and FLOAT types, but that feels like it's reducing robustness for the sake of a slightly weird declaration.
For the multi-type inputs, I could update the back-end validation to support the Litegraph concept of comma-delimited types pretty easily, but I'm not sure that's how we want to handle typing in the long run. Node packs can resolve this issue themselves the same way they would currently resolve it for non-optional inputs (by having a type that overrides __ne__).
There are likely other node packs with invalid combinations that simply weren't caught previously due to the lack of validation of optional inputs. There are a couple directions we could go:
- Revert to having no validation on optional input paths. This seems like a long-term detriment to ComfyUI just to keep things fully backwards-compatible.
- Make individual fixes to support all the existing cases that work only because they happen to be on optional inputs (i.e. special-casing min/max, adding explicit support for polymorphism in the validator, etc.).
- Make the back-end validation less thorough -- e.g. make min/max a purely front-end concern and ignore mismatched types (and assume they'll be implicitly cast). I think this is a reasonable route, but it's a significant design departure I would want your approval on.
- Ask node authors to update their optional inputs in the same way they would if they happened to be required inputs.
The third option (making the back-end validation less thorough) makes the most sense to me:
- It is fully backwards compatible (adding validation breaks things, but taking it away from the server side does not)
- Any validation that the server was doing can be moved to the default UI (if needed).
- Any client using the comfyUI api would have access to min/max metadata and would be responsible for enforcing it (or not).
- It allows for more hacking on nodes by letting people bypass min/max and try new things not expected (like negative ControlNet weights)
- It allows for more hacking on nodes by letting people bypass min/max and try new things not expected (like negative ControlNet weights)
I agree with this. The min/max spec of many parameters is usually just a hint to the user about sane values, not anything that truly prevents breakage.
If the world were to explode when a user uses values outside a node's limits, the node can just have an assert in it to enforce them.
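For example, a node that truly can't tolerate out-of-range values could enforce its own limits with something like this (a hypothetical sketch; the node and parameter are made up):

class StrictWeightNode:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "weight": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 2.0}),
        }}

    RETURN_TYPES = ("FLOAT",)
    FUNCTION = "run"
    CATEGORY = "_for_testing"

    def run(self, weight):
        # Enforce the limit in the node itself rather than relying on
        # back-end validation of the min/max hints:
        assert 0.0 <= weight <= 2.0, f"weight out of range: {weight}"
        return (weight,)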
Edit: This was a client issue where the client used null instead of the 'None' string.
Found a weird setup that looks like it is sometimes an issue.
In the efficiency nodes lora stacker (and others), the dynamic lists are all marked as required. However, it also lists 'None' as the default for lora, and that breaks in some clients:
Documenting conversation from Matrix: The error posted by @ricklove is a result of CushyStudio saving fields that exist as 'None' in the default UI as null. This previously wasn't caught during validation because this node was only connected to output via an optional socket (through the Efficient Loader node), so it wasn't validated at all.
This falls under the category of "quiet errors that have become loud errors". It can either be fixed in CushyStudio or by adding a custom VALIDATE_INPUTS function that allows null values.
After talking it over with @comfyanonymous in Matrix, here's the solution I've implemented:
Argument Validation Skipping
When a custom VALIDATE_INPUTS function exists, any inputs that exist as an argument to that function will not go through standard validation passes. So if someone doesn't want the min/max to be validated by the back-end, they can make a function like:
@classmethod
def VALIDATE_INPUTS(cls, width_height):
    return True
and the back-end will no longer automatically validate the width_height input.
(In retrospect, I realize that Joviex can't use VALIDATE_INPUTS at all because his input names are emoji, which can't be used as variable names in Python, so he might just have to change min/max to vec2_min and vec2_max.)
Socket Input Type Checking
When a custom VALIDATE_INPUTS function takes an argument named input_types, the value of that argument will be a dictionary containing a mapping from input names to the socket output types being used at those inputs. Note that this will only exist for socket connections and not for literals. When this argument exists, standard socket type checking is skipped.
Because this removes the last impediment to true variant types being implemented entirely in custom nodes, I've removed the --enable-variants command-line option that I had previously added. In the unit tests, I've also added an example @VariantSupport() decorator that enables variant usage on a node (both "*" support and "INT,FLOAT" support for both input and output sockets).
Example usage of validating types (without just using @VariantSupport) is in the unit tests and looks like:
# Example is for an input of type `IMAGE,FLOAT`
@classmethod
def VALIDATE_INPUTS(cls, input_types, input1=None, input2=None):
    # Validate literals
    if input1 is not None:
        if not isinstance(input1, (torch.Tensor, float)):
            return f"Invalid type of input1: {type(input1)}"
    # Validate socket connections
    if 'input1' in input_types:
        if input_types['input1'] not in ["IMAGE", "FLOAT"]:
            return f"Invalid type of input1: {input_types['input1']}"
    return True
something really strange is going on in the latest version. i ran my workflow from the "broken seed initialization when add noise is turned off" pull (#2841) and got:
i thought comfy somehow had snuck in a fix but... it actually only ran two of the four samplers and duplicated the output latent? not really sure what's going on, but it works as expected on master. hopefully github didn't eat the metadata from that image but if you need any more information or the actual metadata YAML, please let me know.
edit: the problem also occurs if you just connect the empty latent directly to the samplers (the noise injection bit was actually to try to fix it but it didn't make any difference)