python-nnf
python-nnf copied to clipboard
Very slow equality checks for certain sentences
Sentences where many nodes have multiple parents can be slow to compare to each other.
This happens when two sentences are identical or almost identical, but exist as separate objects. Each equality check naively compares the children of nodes, but that means nodes can be checked a lot of times.
For example, using examples/socialchoice.py
:
$ python3 -m timeit -s 'from socialchoice import lin; s = lin(range(10)); t = lin(range(10))' 's == t'
1 loop, best of 5: 7.47 sec per loop
There's no straightforward way to solve this, because the equality checks use frozenset
equality. The problem is solved in other methods with a temporary cache around a function defined inside the method, but comparing frozenset
s just invokes the __eq__
methods of the children again.
Two potential solutions:
- Enable a global cache when
__eq__
is called, and clear the cache when the call to__eq__
that enabled the cache is finished. This probably interacts badly with threads, but only by potentially keeping objects in memory longer than necessary or discarding the cache too early, without outright incorrect behavior. It also adds at least a little overhead to typical__eq__
calls. - Stop using
frozenset
equality, and implement all of the equality checks from scratch. This could make performance much worse for typical cases.
The problem only occurs when the objects are constructed separately. If an object is compared to itself Python is smart enough to notice and skip the lengthier check.
Solving it might not be worth it, unless a concrete case where it occurs pops up.