ComfyUI icon indicating copy to clipboard operation
ComfyUI copied to clipboard

Make default hash lib configurable without code changes via CLI argument

Open teward opened this issue 1 year ago • 5 comments

This is an extension to #3465 in which we add an option via CLI arguments to specify the comparison hashing function used in the operation of compare_image_hash.

While it uses an eval(), the eval is safe because we use a CLI argument that only allows one of 4 strings to specifically be selected, thereby preventing unsafe input. The other option is to write a parser around that that parses the string and returns the hash function, but an eval in line for this case is safe.

Since there's no way to specify the hash function in the fix for #3465, this makes it configurable, which allows the use of other common (and known to be part of hashlib) hashing functions.

teward avatar Jul 04 '24 18:07 teward

Is there a reason why you would want to use other hash functions?

comfyanonymous avatar Jul 04 '24 18:07 comfyanonymous

Is there a reason why you would want to use other hash functions?

In some cases, people may have resource availability in CPU issues that make using slower hash functions unfeasible. This was mentioned in https://github.com/comfyanonymous/ComfyUI/issues/3465#issuecomment-2209421557 stating "It can be swapped out for whatever hash function you want", but that requires a code diversion that will break updates/pulls to server.py in local trees/branches

By making it configurable this lets people who want to use weaker and faster hash functions the option to do so.

Personally, I'd prefer using SHA512 sums and only SHA512 sums, but not everyone has powerful enough CPUs to rapidly process and compute those hashes.

teward avatar Jul 04 '24 18:07 teward

Is there a reason why you would want to use other hash functions?

I can't think of one unless there are issues with a specific function on specific platforms.

shawnington avatar Jul 04 '24 18:07 shawnington

Is there a reason why you would want to use other hash functions?

In some cases, people may have resource availability in CPU issues that make using slower hash functions unfeasible. This was mentioned in #3465 (comment) stating "It can be swapped out for whatever hash function you want", but that requires a code diversion that will break updates/pulls to server.py in local trees/branches

By making it configurable this lets people who want to use weaker and faster hash functions the option to do so.

Personally, I'd prefer using SHA512 sums and only SHA512 sums, but not everyone has powerful enough CPUs to rapidly process and compute those hashes.

In the case of this function, you are only hashing images, and only when you load a new image, and only if there is a name collision. Even on a very slow system where you run into an extreme edge case of having say 100 images with the same name, hashing completes very fast.

I personally have no opposition to having the ability to select a hash function through cli though, as this is not the only node to use hashlib.

shawnington avatar Jul 04 '24 19:07 shawnington

FYI I just took @shawnington's recommendation and moved hasher() into node_helpers so it can be used elsewhere. It also changes the argument to be default-hashing-function so we can use the helper library in node_helpers for other cases of hashing function selection for such default 'hashing' function selection in other nodes. This can be used, then, by many other nodes even custom nodes to default to specific hashing functions based on CLI selection. Which in turn defaults to sha256 because of the configuration of arguments.

Working from a laptop without my IDE, so feel free to squash commits at merge time, 'cause i'm stuck using the GitHub editor.

teward avatar Jul 04 '24 19:07 teward

@mcmonkey4eva updates made, should address your concern about eval. My 'switch' is a dict with key-values instead of long if statements chained together for switches. That's just me though. :)

teward avatar Jul 16 '24 01:07 teward

  File "/mnt/fast_ssd/stable_diff/comfy_ui/node_helpers.py", line 37
    return hashfuncs[args.default_hashing_function]
                                                   ^
IndentationError: unindent does not match any outer indentation level

I get an error.

comfyanonymous avatar Jul 16 '24 14:07 comfyanonymous

Interesting. Let me check the indentation

teward avatar Jul 16 '24 15:07 teward

Interesting. Let me check the indentation

Fixed, looks like when I was using the github editor I didn't notice a missing space in my typing to bring things back into line indentation wise. Fixed!

teward avatar Jul 16 '24 15:07 teward