
Simple changes to massively simplify ComfyUI in basic use-cases

Open oxysoft opened this issue 2 years ago • 19 comments

Hi, I see that ComfyUI is getting a lot of ridicule on socials because of its overly complicated workflow, and users are starting to doubt that this is really optimal. When I look at the basic T2I workflow on the main page, it already strikes me as far too much: 7 nodes for what should be one or two, and hints of spaghetti already!! This cannot be taken lightly; it is a drastic realization.


Let me paint a picture: I was recently stranded at a bar in Montréal around 1 AM. I was already having trouble managing the zippers on my bag, and now I had to get home. I downloaded Uber for the first time, and it turns out that the app is a lot like ComfyUI: a labyrinth of bad mobile UI screens. The problem, you see, is that I was very tired, high, and drunk. Eventually I managed to make it home, but right before passing out at 3 AM I have a sudden spark of creativity, and when I get to work I'm greeted with this monstrosity:

[Image: comfyui_screenshot]

As an artist, I represent your target demographic, and it's imperative the UX be designed as if the user could only use 1% of their brain at any time.


Thankfully, just a few tweaks massively improve the situation and make ComfyUI much more comfy to use for most people:

  1. Make the save node optional and add a toggle to automatically save all unused latent or image outputs.
  2. Allow wiring latents to image inputs, with an implicit VAE decode.
  3. You could drastically reduce spaghetti by holding state along the traversal and automatically resolving inputs that have nothing passed in. For example, the VAE loaded with the model could be inferred whenever the loader is connected upstream, so in the image above you wouldn't have to pipe VAE from Load Checkpoint to VAE Decode. The way this would work is that you drag a connection from the header of the node (you can already see a dot next to the name; imagine one on the right side of the header as well). That connection carries all of the node's inputs/outputs, auto-assigns them by name, and does so by overriding entries in a dictionary.
  4. Eliminate the Empty Latent Image node and make it implicit when no latent is provided.
  5. Detect implicit globals to present when data is missing. In the overwhelming majority of cases, the user doesn't need to load more than one checkpoint. This also solves a problem created by making the Empty Latent Image implicit: if you remove these two nodes (Load Checkpoint and Empty Latent Image), ComfyUI would detect the following implicit globals to assign: ckpt_name, width, height, batch_size.
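
To make points 3 to 5 a bit more concrete, here is a rough sketch in plain Python of how inputs might be resolved. It is not ComfyUI's actual API: `Node`, `plan`, and `IMPLICIT_GLOBALS` are made-up names, the node inputs are heavily simplified, and the graph is assumed to be a single linear chain of header-to-header connections. The lookup order (explicit wire, then state carried from upstream, then implicit global) is also just one possible choice.

```python
from dataclasses import dataclass, field

# Global names taken from point 5; everything else here is hypothetical.
IMPLICIT_GLOBALS = ("ckpt_name", "width", "height", "batch_size")


@dataclass
class Node:
    name: str
    inputs: tuple    # names of the inputs this node needs
    outputs: tuple   # names of the outputs this node produces
    explicit: dict = field(default_factory=dict)  # values the user wired or typed in


def plan(chain):
    """Decide where each input of each node in a linear chain comes from."""
    carried = {}    # state held along the traversal, keyed by output name
    resolved = {}
    for node in chain:
        sources = {}
        for key in node.inputs:
            if key in node.explicit:          # an explicit connection always wins
                sources[key] = node.explicit[key]
            elif key in carried:              # otherwise inherit from upstream by name
                sources[key] = carried[key]
            elif key in IMPLICIT_GLOBALS:     # otherwise surface it as a global setting
                sources[key] = f"global:{key}"
            else:
                raise KeyError(f"{node.name}: nothing provides '{key}'")
        resolved[node.name] = sources
        # Outputs override earlier entries with the same name, like a dict update.
        for out in node.outputs:
            carried[out] = f"{node.name}.{out}"
    return resolved


# Node names mirror the screenshot; their inputs are simplified for illustration.
chain = [
    Node("LoadCheckpoint", ("ckpt_name",), ("model", "clip", "vae")),
    Node("KSampler", ("model", "prompt", "width", "height", "batch_size"),
         ("latent",), explicit={"prompt": "a cat"}),
    Node("VAEDecode", ("latent", "vae"), ("image",)),
]
resolved = plan(chain)
print(resolved["VAEDecode"])
# {'latent': 'KSampler.latent', 'vae': 'LoadCheckpoint.vae'}
```

In this sketch the VAE reaches VAE Decode without a dedicated wire, and the sampler picks up width, height, and batch_size as globals instead of needing an Empty Latent Image node. A real implementation would also have to handle branching graphs and type checking, which this leaves out.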

To a node purist, these may seem like a misuse of the node-based approach to software UX, but in the context of a sane user workflow they are simple compromises that help manage the complexity and overhead, much like syntactic sugar in programming languages. I hope you will consider these changes; I think everyone would be extremely excited to hear about them!

oxysoft, Aug 06 '23 18:08