Kache
Kache copied to clipboard
InMemoryKache::getOrPut fails in high concurrency context
Hi :smile_cat:
I noticed I was getting NullPointerExceptions when using SHA256KeyHasher.
I could narrow it down to InMemoryKache - the following test fails:
@Test
fun concurrency() = runTest {
val cache = InMemoryKache<String, String>(maxSize = 1)
(0..100000).map(Int::toString).forEach {
launch { cache.getOrPut(it) { it }!! }
}
}
Stack trace:
java.lang.NullPointerException
at com.mayakapps.kache.InMemoryKacheTest$concurrency$1$2$1.invokeSuspend(InMemoryKacheTest.kt:36)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
at kotlinx.coroutines.test.TestDispatcher.processEvent$kotlinx_coroutines_test(TestDispatcher.kt:24)
at kotlinx.coroutines.test.TestCoroutineScheduler.tryRunNextTaskUnless$kotlinx_coroutines_test(TestCoroutineScheduler.kt:99)
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt$runTest$2$1$workRunner$1.invokeSuspend(TestBuilders.kt:322)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:104)
at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:277)
at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:95)
at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:69)
at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:48)
at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
at kotlinx.coroutines.test.TestBuildersJvmKt.createTestResult(TestBuildersJvm.kt:10)
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt.runTest-8Mi8wO0(TestBuilders.kt:310)
at kotlinx.coroutines.test.TestBuildersKt.runTest-8Mi8wO0(Unknown Source)
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt.runTest-8Mi8wO0(TestBuilders.kt:168)
at kotlinx.coroutines.test.TestBuildersKt.runTest-8Mi8wO0(Unknown Source)
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt.runTest-8Mi8wO0$default(TestBuilders.kt:160)
at kotlinx.coroutines.test.TestBuildersKt.runTest-8Mi8wO0$default(Unknown Source)
at com.mayakapps.kache.InMemoryKacheTest.concurrency(InMemoryKacheTest.kt:32)
We are seeing similar NPEs in this code.
Uncaught Kotlin exception: kotlin.NullPointerException
at 0 AppEcc 0x10587ad97 kfun:kotlin.Throwable#<init>(){} + 95
at 1 AppEcc 0x105874167 kfun:kotlin.Exception#<init>(){} + 87
at 2 AppEcc 0x105874387 kfun:kotlin.RuntimeException#<init>(){} + 87
at 3 AppEcc 0x1058745a7 kfun:kotlin.NullPointerException#<init>(){} + 87
at 4 AppEcc 0x1058af41f ThrowNullPointerException + 131
at 5 AppEcc 0x105d8f4d7 kfun:com.mayakapps.kache.SHA256KeyHasher.$transformCOROUTINE$0.invokeSuspend#internal + 783
at 6 AppEcc 0x1059afd1f kfun:kotlin.coroutines.native.internal.BaseContinuationImpl#invokeSuspend(kotlin.Result<kotlin.Any?>){}kotlin.Any?-trampoline + 67
SHA256KeyHasher has the following code:
override suspend fun transform(oldKey: String): String = hashedCache.getOrPut(oldKey) {
oldKey.encodeUtf8().sha256().hex().uppercase()
}!! // Since our creation function never returns null, we can use not-null assertion
Notice the comment at the end.
Now look at the following:
override suspend fun getOrPut(key: K, creationFunction: suspend (key: K) -> V?): V? {
get(key)?.let { return it }
creationMutex.withLock {
if (creationMap[key] == null && map[key] == null) {
@Suppress("DeferredResultUnused")
internalPutAsync(key, creationFunction) // <!--- ASYNC
}
}
return get(key) // <!-- DID ASYNC ABOVE FINISH?
}
I know nothing about coroutines in Kotlin, but I see an async call that I'm guessing is not guaranteed to finish before the return get call at the end of the method, resulting in a nasty race condition where sometimes you'll get a value and other times you will not. This seems extremely bad.
I was able to replicate the crash using the unit test provided above:
@Test fun concurrency() = runTest { val cache = InMemoryKache<String, String>(maxSize = 1) (0..100000).map(Int::toString).forEach { launch { cache.getOrPut(it) { it }!! } } }
I found that by replacing the second call to get(key) in the getOrPut(…) function with a new function I called unsafeGet(key: K): V, the problem goes away. This is my unsafeGet(…) implementation:
private suspend fun unsafeGet(key: K): V = (getFromCreation(key) ?: getIfAvailable(key))!!
Since unsafeGet(…) is a copy of get(…), but with evictExpired() removed (and a force unwrap added), this suggests to me that evictExpired() may be triggering a race condition. However, my Kotlin experience is limited, so my confidence in this analysis is not particularly high.
As an aside, by calling get(…) twice, evictExpired() is being called twice. Is this really necessary in a single call to getOrPut(…)? Calling evictExpired() twice seems superfluous and possibly a performance issue.
Thank you for reporting the issue and investigating it, and sorry for being late.
I would like to further investigate and try to fix the issue in the next few days.
Any update on this after several months?