memoize

`str` keys are inconsistent and unpredictable

Open omerfarukdogan opened this issue 3 years ago • 1 comments

The output of __str__ is not guaranteed to be unique and should not be used to determine whether two objects are equal. Currently, cache keys are generated with the str function, which calls the __str__ methods of the arguments. This can lead to duplicate method calls (equal arguments with different string forms) or incorrectly cached results (unequal arguments with the same string form).

For example, consider the built-in dict type.

>>> x = {"a": "1", "b": "2"}
>>> y = {"b": "2", "a": "1"}
>>> x == y
True
>>> str(x) == str(y)
False

These two objects are equal, but their string representations differ. The library therefore produces two different cache keys for them, so a call with y misses the cache entry created for x and the method runs again.
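A minimal sketch of the duplicate-call case. The `cache`, `str_key`, and `expensive` names are hypothetical stand-ins for a str-keyed cache and are not the library's actual code:

```python
# Toy str-keyed cache; a hypothetical stand-in, not the library's implementation.
cache = {}
calls = []  # records every real (non-cached) invocation


def str_key(args, kwargs):
    # Assumed key scheme: stringify the positional and keyword arguments.
    return str((args, kwargs))


def expensive(config):
    key = str_key((config,), {})
    if key not in cache:
        calls.append(config)
        cache[key] = sorted(config.items())
    return cache[key]


x = {"a": "1", "b": "2"}
y = {"b": "2", "a": "1"}

expensive(x)
expensive(y)  # equal argument, but a different key, so the body runs again
print(len(calls))  # prints 2
```

Both calls return the same value, yet the function body runs twice because the two keys differ.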

For another example, let's say that we have the following class:

class MyClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Only x appears in the string form; y is ignored.
        return f"MyClass(x={self.x})"

In this case, the cache returns incorrect results, because two unequal objects produce the same key.

>>> x = MyClass("1", "1")
>>> y = MyClass("1", "555")
>>> x == y
False
>>> str(x) == str(y)
True
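A minimal sketch of how this collision turns into a wrong result. The `cache` dict and the `total` function are hypothetical and only imitate a str-keyed cache:

```python
class MyClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Only x appears in the string form; y is ignored.
        return f"MyClass(x={self.x})"


cache = {}  # hypothetical str-keyed cache, not the library's implementation


def total(obj):
    key = str(obj)
    if key not in cache:
        cache[key] = int(obj.x) + int(obj.y)
    return cache[key]


first = total(MyClass("1", "1"))     # computed: 1 + 1
second = total(MyClass("1", "555"))  # served from the cache entry of the first call
print(first, second)  # prints "2 2", although the second result should be 556
```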

I think the default KeyExtractors provided by the library should not behave this unpredictably. As a workaround, I created a custom KeyExtractor whose format_key method returns the arguments themselves (which does not comply with the str return type hint).
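A rough sketch of that kind of workaround. `ArgumentsKeyExtractor`, `Point`, and `compute` are hypothetical names, the class deliberately does not subclass the library's KeyExtractor so that the snippet runs on its own, and the `format_key(method_reference, call_args, call_kwargs)` parameter list is an assumption about the interface:

```python
from dataclasses import dataclass


class ArgumentsKeyExtractor:
    """Hypothetical extractor that keys on the arguments themselves."""

    def format_key(self, method_reference, call_args, call_kwargs):
        # Returns a hashable tuple instead of a str, so equality of keys
        # follows __eq__/__hash__ of the arguments rather than __str__.
        return (
            method_reference.__qualname__,
            tuple(call_args),
            tuple(sorted(call_kwargs.items())),  # keyword order must not matter
        )


@dataclass(frozen=True)  # frozen dataclasses get __eq__ and __hash__ from their fields
class Point:
    x: str
    y: str


def compute(point):
    return int(point.x) + int(point.y)


extractor = ArgumentsKeyExtractor()
key_a = extractor.format_key(compute, (Point("1", "1"),), {})
key_b = extractor.format_key(compute, (Point("1", "555"),), {})
key_c = extractor.format_key(compute, (Point("1", "1"),), {})
print(key_a == key_b, key_a == key_c)  # prints "False True"
```

This only works when every argument is hashable and defines equality sensibly. Plain dict arguments are not hashable, so they would first have to be converted, for example to a sorted tuple of items.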

omerfarukdogan, Jan 19 '22 09:01

I'd appreciate suggestions for a better default KeyExtractor. The current one (based on str) is far from foolproof, and I'm unsure about what would truly be foolproof.

Alternatively, perhaps there shouldn't be a default KeyExtractor at all, forcing the user to make an informed decision on how to implement keys? However, this approach might make usage more verbose (even in scenarios where str would work just fine).

zmumi, Jan 12 '24 15:01

Closed due to inactivity.

zmumi, May 04 '24 16:05