apcu
Expunge of expired entries
It is my understanding (from the documentation) that a cache entry is expunged when the entry is next read after its ttl has expired.
This is not very effective when an application caches an object (e.g. a database query result) with no guarantee that the same query will be run again in the foreseeable future. These stale orphan objects gradually fill the cache, leading to an unnecessarily high level of fragmentation.
I am not aware that there is any system of garbage collection currently built into apcu. This seems to be a strange omission.
I am thinking of a routine that steps through the cache entries deleting them if stale, maybe checking one entry on every add or store operation.
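To make the idea concrete, here is a rough userland sketch of such a sweep. It is not how APCu works internally. It assumes the APCu 5.x entry layout returned by apcu_cache_info() (fields 'info' for the key, 'ttl' and 'creation_time'), and the function names find_expired_keys and sweep_expired_entries are made up for illustration.

```php
<?php
// Sketch only: pick expired keys out of the 'cache_list' array returned by
// apcu_cache_info(). Assumes APCu 5.x entry fields 'info', 'ttl', 'creation_time'.
function find_expired_keys(array $cacheList, int $now, int $limit = 100): array
{
    $expired = [];
    foreach ($cacheList as $entry) {
        if (count($expired) >= $limit) {
            break; // bound the amount of work done per sweep
        }
        $ttl = (int) ($entry['ttl'] ?? 0);
        $created = (int) ($entry['creation_time'] ?? 0);
        // A ttl of 0 means no per-entry expiry, so those entries are never treated as stale here.
        if ($ttl > 0 && $created + $ttl < $now) {
            $expired[] = $entry['info'];
        }
    }
    return $expired;
}

// Hypothetical helper: run one bounded sweep against the live cache.
function sweep_expired_entries(int $limit = 100): int
{
    if (!function_exists('apcu_cache_info')) {
        return 0; // APCu extension not loaded
    }
    $info = apcu_cache_info();
    if ($info === false) {
        return 0;
    }
    $keys = find_expired_keys($info['cache_list'] ?? [], time(), $limit);
    foreach ($keys as $key) {
        apcu_delete($key);
    }
    return count($keys);
}
```

Two caveats: apcu_cache_info() walks the whole cache, so this is better suited to a periodic job than to every add or store; and a key could be re-stored between the scan and the apcu_delete() call, in which case a fresh entry would be dropped.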
Currently expired cache entries are removed if either:
- The cache has run full. Depending on configuration this will result in either a full reset, or an attempt to free expired cache entries.
- While storing a cache entry, an expired cache entry is encountered. The two entries don't need to have the same key, but they do need to have the same hash (otherwise the expired entry will not be encountered).
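For the first case, the setting involved appears to be apc.ttl: as far as I can tell from the PHP manual, a value of 0 means the cache is completely expunged when it runs out of memory, while a value greater than 0 means APCu attempts to remove expired entries instead. A small sketch to report which behaviour applies (full_cache_behaviour is a made-up name, and the mapping is my reading of the manual, not something confirmed in this thread):

```php
<?php
// Sketch: describe what is expected to happen when the cache runs full,
// based on the documented meaning of the apc.ttl ini setting.
function full_cache_behaviour(int $apcTtl): string
{
    return $apcTtl === 0
        ? 'full reset of the cache'
        : 'attempt to free expired entries';
}

$raw = ini_get('apc.ttl'); // false when the APCu extension is not loaded
if ($raw !== false) {
    echo 'When full: ', full_cache_behaviour((int) $raw), PHP_EOL;
}
```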
I think adding some form of GC might make sense, but it's not immediately clear which load profile it is supposed to improve. Seeing high fragmentation (esp. given the somewhat odd notion of "fragmentation" used by apc.php) is not in itself something problematic for performance.
I agree that this is not strictly a performance issue, and I am not too concerned about fragmentation, but it is disconcerting to see the cache used percentage slowly creeping up because the stale entries are counted as 'used'.
So I suppose my main point is that the lack of garbage collection makes it hard to monitor the real utilisation of the cache.
I started to hack the apc.php monitoring script to report the amount of memory used by expired entries, but soon realised that viewing the "User Cache Entries" "All" reads all the entries, and in fact performs (an inefficient) garbage collection. So you can see the effect that garbage collection would have on reported utilisation without changing any code.
apc.php uses apcu_cache_info(), which is a read-only operation. It will not GC any entries and also doesn't do any TTL checks, so expired entries will also be returned.
> apc.php uses apcu_cache_info(),
Are you sure that is true for the list of user cache entries?
Yeah, the whole script is based on essentially apcu_cache_info + apcu_sma_info. It uses apcu_fetch to fetch the values of user cache entries if requested, but that will not remove anything either.
Sorry about my misunderstanding, I checked again, and the expired items were in fact still in the cache.
That did let me do some analysis. On the server I am monitoring, the cache has been running for 24 hours and 3% of the entries are expired, but those 3% are using 35% of the used space. I will check again in a couple of days and see how the proportions change.
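For anyone wanting to run the same kind of analysis, a minimal sketch follows. It is not the exact code I used. It assumes the APCu 5.x entry fields 'ttl', 'creation_time' and 'mem_size' in the 'cache_list' array from apcu_cache_info(), and it measures "used space" as the sum of entry sizes rather than the allocator figures from apcu_sma_info(). The function name expired_usage is made up.

```php
<?php
// Sketch: share of entries, and of entry bytes, that belong to expired entries.
function expired_usage(array $cacheList, int $now): array
{
    $entries = 0;
    $bytes = 0;
    $expiredEntries = 0;
    $expiredBytes = 0;
    foreach ($cacheList as $entry) {
        $size = (int) ($entry['mem_size'] ?? 0);
        $ttl = (int) ($entry['ttl'] ?? 0);
        $created = (int) ($entry['creation_time'] ?? 0);
        $entries++;
        $bytes += $size;
        // Entries with ttl 0 have no per-entry expiry and are counted as live.
        if ($ttl > 0 && $created + $ttl < $now) {
            $expiredEntries++;
            $expiredBytes += $size;
        }
    }
    return [
        'expired_entry_pct' => $entries > 0 ? round(100 * $expiredEntries / $entries, 1) : 0.0,
        'expired_byte_pct' => $bytes > 0 ? round(100 * $expiredBytes / $bytes, 1) : 0.0,
    ];
}

// Only touch the live cache when the extension is available.
if (function_exists('apcu_cache_info')) {
    $info = apcu_cache_info();
    if ($info !== false) {
        print_r(expired_usage($info['cache_list'] ?? [], time()));
    }
}
```

Since apcu_cache_info() is read-only and does no TTL checks, running this does not change what is in the cache.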