Cache's compute misreports the removal cause of a collected entry
Guava Version
33.4.9-SNAPSHOT
Description
I tried to run the pre-release in Caffeine's test suite to verify the recent fix for cache computations (thanks @eamonnmcmanus). Other than different interpretations of stats this is pretty good, except that it uncovered a mistaken assumption about reference clean up. Here is the github action run.
After the compute locks it runs preWriteCleanup() which performs an amortized drain of the pending work. However it does not perform all that might have built up since it is on a caller's thread, e.g. a large cache might evacuate a lot of data. After this is performed, an existing entry may be found that is pending eviction but still present. In this test it returns null, so which leads to the mapping removal, and the listener reports this as an EXPLICIT removal cause with a null value.
/**
* Maximum number of entries to be drained in a single cleanup run. This applies independently to
* the cleanup queue and both reference queues.
*/
// TODO(fry): empirically optimize this
static final int DRAIN_MAX = 16;
@GuardedBy("this")
void drainValueReferenceQueue() {
Reference<? extends V> ref;
int i = 0;
while ((ref = valueReferenceQueue.poll()) != null) {
@SuppressWarnings("unchecked")
ValueReference<K, V> valueReference = (ValueReference<K, V>) ref;
map.reclaimValue(valueReference);
if (++i == DRAIN_MAX) {
break;
}
}
}
The null value in the removal notification should only occur for a COLLECTED cause, so the test suite notices this during the validation and Guava logs the rejection as Exception thrown by removal listener.
java.lang.NullPointerException
at java.base/java.util.Objects.requireNonNull(Objects.java:222)
at com.github.benmanes.caffeine.cache.testing.RemovalListeners.validate(RemovalListeners.java:53)
at com.github.benmanes.caffeine.cache.testing.RemovalListeners$ConsumingRemovalListener.onRemoval(RemovalListeners.java:90)
at com.github.benmanes.caffeine.cache.testing.GuavaCacheFromContext$GuavaRemovalListener.onRemoval(GuavaCacheFromContext.java:528)
at com.google.common.cache.LocalCache.processPendingNotifications(LocalCache.java:1841)
at com.google.common.cache.LocalCache$Segment.runUnlockedCleanup(LocalCache.java:3488)
at com.google.common.cache.LocalCache$Segment.postWriteCleanup(LocalCache.java:3464)
at com.google.common.cache.LocalCache$Segment.compute(LocalCache.java:2295)
at com.google.common.cache.LocalCache.compute(LocalCache.java:4230)
at com.github.benmanes.caffeine.cache.testing.GuavaCacheFromContext$GuavaCache$AsMapView.compute(GuavaCacheFromContext.java:334)
at com.github.benmanes.caffeine.cache.ReferenceTest.compute_nullValue(ReferenceTest.java:1033)
Example
@Test(dataProvider = "caches")
@CacheSpec(population = Population.FULL, values = {ReferenceType.WEAK}, implementation = Implementation.Guava, //, ReferenceType.SOFT},
expireAfterAccess = Expire.DISABLED, expireAfterWrite = Expire.DISABLED,
maximumSize = Maximum.DISABLED, weigher = CacheWeigher.DISABLED,
stats = Stats.ENABLED, removalListener = Listener.CONSUMING)
public void compute_nullValue(Map<Int, Int> map, CacheContext context) {
Int key = context.firstKey();
var collected = getExpectedAfterGc(context,
Maps.filterKeys(context.original(), not(equalTo(key))));
collected.add(new SimpleEntry<>(key, null));
context.clear();
GcFinalization.awaitFullGc();
assertThat(map.compute(key, (k, v) -> {
assertThat(v).isNull();
return null;
})).isNull();
assertThat(context.cache()).whenCleanedUp().isEmpty();
assertThat(context).notifications().withCause(COLLECTED)
.contains(collected).exclusively();
}
Expected Behavior
In other logic, like lockedGetOrLoad, after the pre-write cleanup the entry found in the hash table is validated before usage.
if (value == null) {
enqueueNotification(
entryKey, hash, value, valueReference.getWeight(), RemovalCause.COLLECTED);
} else if (map.isExpired(e, now)) {
// This is a duplicate check, as preWriteCleanup already purged expired
// entries, but let's accommodate an incorrect expiration queue.
enqueueNotification(
entryKey, hash, value, valueReference.getWeight(), RemovalCause.EXPIRED);
Actual Behavior
FAILED: com.github.benmanes.caffeine.cache.ReferenceTest.compute_nullValue(AsMapView, CacheContext{population=FULL, maximumSize=DISABLED, weigher=DISABLED, expiry=DISABLED, expiryTime=FOREVER, afterAccess=DISABLED, afterWrite=DISABLED, refreshAfterWrite=FOREVER, keyStrength=WEAK, valueStrength=WEAK, compute=SYNC, loader=NEGATIVE, isAsyncLoader=false, cacheExecutor=DIRECT, cacheScheduler=DISABLED, removalListener=CONSUMING, evictionListener=DISABLED, initialCapacity=DEFAULT, stats=ENABLED, implementation=Guava, startTime=-2126573393550924486})
value of : cacheContext.context.removalListener
missing : {COLLECTED=[1000=null [COLLECTED]]}
---
expected to contain at least: {COLLECTED=[null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], 1000=null [COLLECTED]]}
but was : {COLLECTED=[null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED], null=null [COLLECTED]]}
cacheContext was : CacheContext{population=FULL, maximumSize=DISABLED, weigher=DISABLED, expiry=DISABLED, expiryTime=FOREVER, afterAccess=DISABLED, afterWrite=DISABLED, refreshAfterWrite=FOREVER, keyStrength=WEAK, valueStrength=WEAK, compute=SYNC, loader=NEGATIVE, isAsyncLoader=false, cacheExecutor=DIRECT, cacheScheduler=DISABLED, removalListener=CONSUMING, evictionListener=DISABLED, initialCapacity=DEFAULT, stats=ENABLED, implementation=Guava, startTime=-2126573393550924486}
at com.github.benmanes.caffeine.cache.ReferenceTest.compute_nullValue(ReferenceTest.java:1039)
Packages
com.google.common.cache
Platforms
No response
Checklist
-
[x] I agree to follow the code of conduct.
-
[x] I can reproduce the bug with the latest version of Guava available.
Hi @chaoren I’d like to contribute this issue Is it okay if I take this issue and prepare an implementation?
The issue occurs in the compute() method in LocalCache.java. After preWriteCleanup() drains the reference queues (potentially evicting entries), an existing entry might still be present in the map but with a null value (due to garbage collection). The current logic doesn't validate the entry post-cleanup like lockedGetOrLoad() does, leading to a removal with RemovalCause.EXPLICIT instead of RemovalCause.COLLECTED when the computed value is null. Am i right with the observation ?
Would this approach of Adding entry validation in compute() after preWriteCleanup(), similar to the check in lockedGetOrLoad(). Specifically, if valueReference.get() == null and valueReference.isActive(), enqueue a collected notification and remove the entry before proceeding with the computation. This may ensure the removal cause is correct without changing broader logic. Is it suppose to be like this?
I was planning to prepare a PR using an approach like this (in compute())
V oldValue = valueReference.get();
RemovalCause cause;
if (oldValue == null && valueReference.isActive()) {
cause = RemovalCause.COLLECTED;
} else {
cause = RemovalCause.EXPLICIT;
}
removeEntry(e, hash, cause);
return null;
It ensures that if the entry was already garbage-collected but still active, the removal cause is correctly reported as COLLECTED instead of EXPLICIT.
Thanks
@bandalgomsu , Hey did you create a Pr for this?
Since there hasn’t been any feedback on the comment, I haven’t opened a PR yet.
Would it be okay if I create a PR?
@bandalgomsu Yes, you can if it passes all test cases and scenarios .