memoize `str` keys are inconsistent and unpredictable

Open omerfarukdogan opened this issue 3 years ago • 1 comments

The output of `__str__` is not guaranteed to be unique and should not be used to determine equality of objects. Currently, cache keys are generated with the `str` function, which calls the `__str__` method of each argument. This can lead to duplicate method calls (equal arguments that produce different keys) or incorrectly cached results (unequal arguments that produce the same key).

For example, let's consider the dict type from Python Standard Library.

>>> x = {"a": "1", "b": "2"}
>>> y = {"b": "2", "a": "1"}
>>> x == y
True
>>> str(x) == str(y)
False

These two objects are equal, but their string representations differ, so they produce different cache keys. A call with y misses the entry cached for x, and the method runs again.
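To make the duplicate call concrete, here is a minimal sketch with a toy cache keyed on `str(arg)`. The `str_keyed` decorator and `total_length` function are hypothetical stand-ins written for illustration, not the library's implementation.

```python
# Toy stand-in for a str-keyed cache; NOT the memoize library's code.
calls = []   # records every real invocation of the wrapped function
cache = {}   # maps str(arg) -> cached result


def str_keyed(func):
    def wrapper(arg):
        key = str(arg)  # key derived from the argument's string form
        if key not in cache:
            cache[key] = func(arg)
        return cache[key]
    return wrapper


@str_keyed
def total_length(mapping):
    calls.append(mapping)
    return sum(len(value) for value in mapping.values())


x = {"a": "1", "b": "2"}
y = {"b": "2", "a": "1"}

total_length(x)
total_length(y)

# x == y, yet the function body ran once per insertion order.
print(len(calls))  # 2
```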

For another example, let's say that we have the following class:

class MyClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Only x appears in the string form; y is ignored.
        return f"MyClass(x={self.x})"

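In this case, the cache is unreliable and can return incorrect results: the two objects below are not equal, yet they share a string representation and therefore a cache key.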

>>> x = MyClass("1", "1")
>>> y = MyClass("1", "555")
>>> x == y
False
>>> str(x) == str(y)
True
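The following sketch shows how such a collision could surface as a wrong result. The `describe` function and the module-level `cache` dict are assumptions made for illustration and only mimic a cache keyed on `str(obj)`.

```python
class MyClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Only x appears in the string form; y is ignored.
        return f"MyClass(x={self.x})"


cache = {}  # toy cache keyed on str(obj); not the library's storage


def describe(obj):
    key = str(obj)
    if key not in cache:
        cache[key] = f"x={obj.x}, y={obj.y}"
    return cache[key]


first = describe(MyClass("1", "1"))
second = describe(MyClass("1", "555"))  # different object, same key

print(first)   # x=1, y=1
print(second)  # x=1, y=1 (stale entry; y=555 was never computed)
```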

I think the default KeyExtractors provided by the library should not behave this unpredictably. As a workaround, I created a custom KeyExtractor whose `format_key` method returns the arguments themselves (which does not comply with the `str` return type hint).
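A rough sketch of that workaround follows. `ArgsKeyExtractor` is a hypothetical, standalone class: it does not subclass the library's KeyExtractor, and the `format_key` parameter names are assumptions. It returns a hashable tuple so that key comparison relies on the arguments' own `__eq__` and `__hash__` rather than on `__str__`.

```python
from dataclasses import dataclass


class ArgsKeyExtractor:
    """Hypothetical stand-in; parameter names are assumed, not confirmed."""

    def format_key(self, method_reference, call_args, call_kwargs):
        # Return the arguments themselves instead of their string form.
        return (
            method_reference.__qualname__,
            tuple(call_args),
            tuple(sorted(call_kwargs.items())),
        )


@dataclass(frozen=True)  # frozen + eq gives value-based __eq__ and __hash__
class Point:
    x: str
    y: str


def lookup(point):
    return point


extractor = ArgsKeyExtractor()
key_a = extractor.format_key(lookup, (Point("1", "1"),), {})
key_b = extractor.format_key(lookup, (Point("1", "555"),), {})
key_c = extractor.format_key(lookup, (Point("1", "1"),), {})

cache = {key_a: "result for a"}
print(key_b in cache)  # False: unequal arguments no longer collide
print(key_c in cache)  # True: equal arguments share one entry
```

This only works when every argument is hashable. Passing a plain dict, for instance, would raise `TypeError` as soon as the key is used in a dict-based cache, so it is not a drop-in default.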

Jan 19 '22 09:01 omerfarukdogan

I'd appreciate suggestions for a better default KeyExtractor. The current one (based on str) is far from foolproof, and I'm unsure about what would truly be foolproof.

Alternatively, perhaps there shouldn't be a default KeyExtractor at all, forcing the user to make an informed decision on how to implement keys? However, this approach might make usage more verbose (even in scenarios where str would work just fine).

Jan 12 '24 15:01 zmumi

closed due to inactivity

May 04 '24 16:05 zmumi
