Enhance memo key with lastModified date for `loc` parameters of memoized functions
Sometimes expensive operations map file locations to intermediate models:
@memo MyModel myFunction(loc sourceFile);
It can be very practical to @memo such a function, but the cached result should be invalidated when the contents of the file change, not only when the value of the loc itself changes. Currently, the Rascal implementation of @memo will always return the old cached value, even though it is in principle out of date.
The proposal is to include the lastModified timestamp of the resource that the loc points to when storing and looking up memoized values.
One trick would be to simply include the timestamp in the lookup value, that is, the tuple <loc, timestamp> would become the new storage key. But this is not so nice, since all the old values for the file would remain in the cache. I'd prefer a solution where, on lookup, the modification time stored with the loc is compared to the current modification time, and the existing cache entry is cleared when a newer file is available.
For nested locs, as in @memo MyModel myFunction(list[loc] locs), the behavior should remain unchanged. It is the list of locs that is the cache key, not the nested locs. Of course, this does require some good explanation in the docs.
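Roughly, the lookup behavior I have in mind looks like the hand-written cache below. This is only a sketch, not how @memo is implemented: the `cache` map, the `model` constructor and the line-counting body are placeholders for illustration, and it assumes `lastModified` and `readFileLines` from the `IO` module.

```rascal
module StaleCheckSketch

import IO;
import List;

// Placeholder model: just remembers how many lines the file had.
data MyModel = model(int lineCount);

// One entry per loc, stored together with the timestamp seen at computation time.
private map[loc, tuple[datetime stamp, MyModel cached]] cache = ();

MyModel myFunction(loc sourceFile) {
  datetime current = lastModified(sourceFile);

  // Hit only if the file has not been modified since the entry was stored.
  if (sourceFile in cache, cache[sourceFile].stamp == current) {
    return cache[sourceFile].cached;
  }

  // Miss or stale: recompute and overwrite, so the old value is dropped immediately.
  MyModel result = model(size(readFileLines(sourceFile)));
  cache[sourceFile] = <current, result>;
  return result;
}
```

The point is that the key stays the bare loc, so there is never more than one entry per file.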
@DavyLandman your opinion on this idea?
I've solved this in 2 ways: myFunction(loc, datetime), or adding the datetime as a fragment of the loc.
ok great, I'll use that in the meantime!
But both have the unfortunate situation that the old value remains in the cache until the GC finally comes to get it. Is that an issue, or not?
Okay, my remark was a bit too short. I'll lengthen it.
Yes indeed, cache invalidation is a problem with this scheme. So I always pair it with a config to expire stuff in a reasonable time. For example:
@memo={expireAfter(minutes=2), maximumSize(100)}
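Put together, the extra-argument variant could look like the sketch below. The names `buildModel` and `model`, and the line-counting body, are made up for illustration; it assumes `lastModified` from the `IO` module returns a datetime and that the expiry options come from `util::Memo`.

```rascal
module MemoByTimestamp

import IO;
import List;
import util::Memo;

// Placeholder model: just remembers how many lines the file had.
data MyModel = model(int lineCount);

// The datetime is part of the memo key, so a modified file causes a cache miss.
// Old entries are not evicted on change; the expiry settings bound how long they linger.
@memo={expireAfter(minutes=2), maximumSize(100)}
MyModel buildModel(loc sourceFile, datetime modified)
  = model(size(readFileLines(sourceFile)));

// The caller pays for one lastModified call per lookup.
MyModel myFunction(loc sourceFile)
  = buildModel(sourceFile, lastModified(sourceFile));
```

If the caller already has the timestamp at hand, it can call `buildModel` directly and skip the second `lastModified` call.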
So I don't like the direction of your proposal for 2 reasons:
- it's not value equality. It gives a special meaning to a "bare" loc, and maybe that's not always what you want. For example, what if you don't care about the file changing?
- it offers no opportunity for reducing the number of lastModified calls. If we move that internally while we also need it at the caller side, we might be calling lastModified more often than needed, and on a hot path this can be slow.
We could think about a special addition to util::Memo where we can say: @memo={invalidateLastModified(arg=0)} or something. That would make this behavior opt-in. We would still pay for the double lastModified invocations, but only if you request it.
Right, good point on the implicit exceptional behavior. The special configuration option seems nice, but then again it is just as much code to simply call lastModified yourself and pass it as an extra argument.
BTW, what does @memo do with keywordParameters?
I think it calls IValue.equals.
And the extra invalidate option would help us clear out memory before GC or expireAcces options.