traits
traits copied to clipboard
chained cached properties get evaluated too often
When a cached property depends on an intermediate cached property, the intermediate property gets immediately evaluated when it's dependencies change. This is suboptimal, as cached properties are typically expensive to compute and therefore should only be computed when needed. In particular, during object construction, when the object is unlikely to be in a valid state anyway, properties are likely to be called several times. Take the following example:
class Test(HasTraits):
a = Int
b = Int
c = Property(depends_on='a,b')
d = Property(depends_on='a,c')
e = Property(depends_on='c,d')
@cached_property
def _get_c(self):
print('Test._get_c')
return self.a*self.b
@cached_property
def _get_d(self):
print('Test._get_d')
return self.a*self.c
@cached_property
def _get_e(self):
print('Test._get_e')
return self.c+self.d
t = Test(a=1,b=2)
This will print
Test._get_c
Test._get_d
Test._get_c
Test._get_d
so two calls each, without really needed at all.
The reason is of course the same as in issue #94: The traits notification invalidating the chained property's cache triggers the computation of the old
value of the intermediate property. However, contrary to the general case, here one could solve this by always traversing the dependency chain to it's end, i.e. in the example above, both d
and e
should depends_on='a,b'
instead. However, it would be nice if traits would do this for us automatically, as in some cases these properties could be on different objects, and encapsulation should not require us to know if these traits on the other object are actually properties or not, or on what traits these depend.
Unfortunately, it's the last point that makes this difficult for Traits to do internally, too. The dependency chain (well, graph) is generally distributed between several objects, and the graph mutates in the middle, too (when something depends_on=['foo.x']
and foo
is changed, the x
listeners on the old foo
object are removed and x
listeners are added to the new foo
instance).
You are welcome to look deeper and propose a concrete solution, but it's not clear to me that there is one. There may be an opportunity to optimize the entirely-local case at the expense of possibly creating variant behavior between the local and distributed cases.
I guess the general idea would be that whenever traits wants to install a trait listener on some HasTraits
instance and the target trait is a cached property, it would instead recurse to the traits dependencies instead. This would then also work in the depends_on='foo.x'
case, since the change handler for foo
, who is responsible to install the notification on the new foo
's x
trait would do the recursion whenever the new foo
's x
is a property.
Of course, there is the danger that the full tree traversal itself could get too expensive. A possible solution would be a separate "property's dependencies changed" notification list which would not trigger the property's getter itself.
On the other hand, the implementation of a separate notification list that does not trigger getters (or equivalently a list that knows if there is any interest in old
values) is a more general solution that would also solve issue #94 at the same time. So maybe I should have commented there instead...
I just closed #1457 as a duplicate of this issue; there's a solution proposed in the second message of that issue, but I don't think it address the cache invalidation issue.