traits icon indicating copy to clipboard operation
traits copied to clipboard

chained cached properties get evaluated too often

Open burnpanck opened this issue 9 years ago • 3 comments

When a cached property depends on an intermediate cached property, the intermediate property gets immediately evaluated when it's dependencies change. This is suboptimal, as cached properties are typically expensive to compute and therefore should only be computed when needed. In particular, during object construction, when the object is unlikely to be in a valid state anyway, properties are likely to be called several times. Take the following example:

class Test(HasTraits):
    a = Int
    b = Int
    c = Property(depends_on='a,b')
    d = Property(depends_on='a,c')
    e = Property(depends_on='c,d')

    @cached_property
    def _get_c(self):
        print('Test._get_c')
        return self.a*self.b

    @cached_property
    def _get_d(self):
        print('Test._get_d')
        return self.a*self.c

    @cached_property
    def _get_e(self):
        print('Test._get_e')
        return self.c+self.d

t = Test(a=1,b=2)

This will print

Test._get_c
Test._get_d
Test._get_c
Test._get_d

so two calls each, without really needed at all.

The reason is of course the same as in issue #94: The traits notification invalidating the chained property's cache triggers the computation of the old value of the intermediate property. However, contrary to the general case, here one could solve this by always traversing the dependency chain to it's end, i.e. in the example above, both d and e should depends_on='a,b' instead. However, it would be nice if traits would do this for us automatically, as in some cases these properties could be on different objects, and encapsulation should not require us to know if these traits on the other object are actually properties or not, or on what traits these depend.

burnpanck avatar Jul 14 '14 15:07 burnpanck

Unfortunately, it's the last point that makes this difficult for Traits to do internally, too. The dependency chain (well, graph) is generally distributed between several objects, and the graph mutates in the middle, too (when something depends_on=['foo.x'] and foo is changed, the x listeners on the old foo object are removed and x listeners are added to the new foo instance).

You are welcome to look deeper and propose a concrete solution, but it's not clear to me that there is one. There may be an opportunity to optimize the entirely-local case at the expense of possibly creating variant behavior between the local and distributed cases.

rkern avatar Jul 14 '14 15:07 rkern

I guess the general idea would be that whenever traits wants to install a trait listener on some HasTraits instance and the target trait is a cached property, it would instead recurse to the traits dependencies instead. This would then also work in the depends_on='foo.x' case, since the change handler for foo, who is responsible to install the notification on the new foo's x trait would do the recursion whenever the new foo's x is a property.

Of course, there is the danger that the full tree traversal itself could get too expensive. A possible solution would be a separate "property's dependencies changed" notification list which would not trigger the property's getter itself.

On the other hand, the implementation of a separate notification list that does not trigger getters (or equivalently a list that knows if there is any interest in old values) is a more general solution that would also solve issue #94 at the same time. So maybe I should have commented there instead...

burnpanck avatar Jul 14 '14 15:07 burnpanck

I just closed #1457 as a duplicate of this issue; there's a solution proposed in the second message of that issue, but I don't think it address the cache invalidation issue.

mdickinson avatar Apr 19 '21 08:04 mdickinson