Make default hash lib configurable without code changes via CLI argument
This is an extension to #3465 in which we add an option via CLI arguments to specify the comparison hashing function used in the operation of compare_image_hash.
While it uses an eval(), the eval is safe because we use a CLI argument that only allows one of 4 strings to specifically be selected, thereby preventing unsafe input. The other option is to write a parser around that that parses the string and returns the hash function, but an eval in line for this case is safe.
Since there's no way to specify the hash function in the fix for #3465, this makes it configurable, which allows the use of other common (and known to be part of hashlib) hashing functions.
Is there a reason why you would want to use other hash functions?
Is there a reason why you would want to use other hash functions?
In some cases, people may have resource availability in CPU issues that make using slower hash functions unfeasible. This was mentioned in https://github.com/comfyanonymous/ComfyUI/issues/3465#issuecomment-2209421557 stating "It can be swapped out for whatever hash function you want", but that requires a code diversion that will break updates/pulls to server.py in local trees/branches
By making it configurable this lets people who want to use weaker and faster hash functions the option to do so.
Personally, I'd prefer using SHA512 sums and only SHA512 sums, but not everyone has powerful enough CPUs to rapidly process and compute those hashes.
Is there a reason why you would want to use other hash functions?
I can't think of one unless there are issues with a specific function on specific platforms.
Is there a reason why you would want to use other hash functions?
In some cases, people may have resource availability in CPU issues that make using slower hash functions unfeasible. This was mentioned in #3465 (comment) stating "It can be swapped out for whatever hash function you want", but that requires a code diversion that will break updates/pulls to server.py in local trees/branches
By making it configurable this lets people who want to use weaker and faster hash functions the option to do so.
Personally, I'd prefer using SHA512 sums and only SHA512 sums, but not everyone has powerful enough CPUs to rapidly process and compute those hashes.
In the case of this function, you are only hashing images, and only when you load a new image, and only if there is a name collision. Even on a very slow system where you run into an extreme edge case of having say 100 images with the same name, hashing completes very fast.
I personally have no opposition to having the ability to select a hash function through cli though, as this is not the only node to use hashlib.
FYI I just took @shawnington's recommendation and moved hasher() into node_helpers so it can be used elsewhere. It also changes the argument to be default-hashing-function so we can use the helper library in node_helpers for other cases of hashing function selection for such default 'hashing' function selection in other nodes. This can be used, then, by many other nodes even custom nodes to default to specific hashing functions based on CLI selection. Which in turn defaults to sha256 because of the configuration of arguments.
Working from a laptop without my IDE, so feel free to squash commits at merge time, 'cause i'm stuck using the GitHub editor.
@mcmonkey4eva updates made, should address your concern about eval. My 'switch' is a dict with key-values instead of long if statements chained together for switches. That's just me though. :)
File "/mnt/fast_ssd/stable_diff/comfy_ui/node_helpers.py", line 37
return hashfuncs[args.default_hashing_function]
^
IndentationError: unindent does not match any outer indentation level
I get an error.
Interesting. Let me check the indentation
Interesting. Let me check the indentation
Fixed, looks like when I was using the github editor I didn't notice a missing space in my typing to bring things back into line indentation wise. Fixed!