kotlinx.coroutines icon indicating copy to clipboard operation
kotlinx.coroutines copied to clipboard

Wrong Dispatcher used when used with Live Edit

Open acleung opened this issue 1 year ago • 4 comments

Hello:

This is Alan from the Android Studio Live Edit team. I am looking at a reported issue with related coroutine and Live Edit.

Describe the bug

You can find more details in the bug itself but here is a summary:

We noticed that when a coroutine has been dispatched to a worker, the current coroutine is resumed on that worker as well.

Consider this code:

class ContextBug {
    companion object {
        @JvmStatic
        fun Launch() {
            runBlocking() {
                Log.d("DEBUG", ">>>> Should be main: ${Thread.currentThread().name} having context: ${currentCoroutineContext()}")
                withContext(Dispatchers.IO) {
                    Log.d("DEBUG", ">>> Should be DefaultDispatcher-worker: ${Thread.currentThread().name} having context: ${currentCoroutineContext()}")
                }
                Log.d("DEBUG", ">>>> Should be main: ${Thread.currentThread().name} having context: ${currentCoroutineContext()}")
            }
        }
    }
}

A normal execution would yield:

DEBUG                   com.example.myapplication            D  >>>> Should be main: main having context: [BlockingCoroutine{Active}@24fa393, BlockingEventLoop@6719fd0]
DEBUG                   com.example.myapplication            D  >>> Should be DefaultDispatcher-worker: DefaultDispatcher-worker-1 having context: [DispatchedCoroutine{Active}@7f29bc9, Dispatchers.IO]
DEBUG                   com.example.myapplication            D  >>>> Should be main: main having context: [BlockingCoroutine{Active}@24fa393, BlockingEventLoop@6719fd0]

While with Live Edit, it would yield:

DEBUG                   com.example.myapplication            D  >>>> Should be main: main having context: [BlockingCoroutine{Active}@d07c13e, BlockingEventLoop@d225b9f]
DEBUG                   com.example.myapplication            D  >>> Should be DefaultDispatcher-worker: DefaultDispatcher-worker-1 having context: [DispatchedCoroutine{Active}@b85a4ec, Dispatchers.IO]
DEBUG                   com.example.myapplication            D  >>>> Should be main: DefaultDispatcher-worker-1 having context: [BlockingCoroutine{Active}@d07c13e, BlockingEventLoop@d225b9f]

Noticed that Thread.currentThread() is the worker while the currentCoroutineContext() is still pointing the same one that was created for runBlocking().

Provide a Reproducer

  1. Download latest version of Android Studio Canary and install.
  2. Using the Android Studio project wizard, create an "Empty Activity". The default should be a Kotlin / Compose project being created.
  3. Set up an working Android emulator or connect an Android device.
  4. Enable Live Edit: File -> Settings -> Editor -> Live Edit (Push Edit Automatically)
  5. Create a new file ContextBug.kt and paste the above code into it
  6. In the MainActivity.kt of the project, add the following line to the Greeting() method: Redirect().redirect()
  7. Run the program, observe the logcat
  8. Perform any trivial Live Edit in Redirect.kt (adding empty spaces to any of the output strings should trigger live update)
  9. Observe the logcat again and note that even while the Context is still pointing at the runBlocking one, Thread.currentThread is still the worker.

acleung avatar Jan 24 '24 23:01 acleung

Thanks for the reproducer.

Unfortunately, I don't think we can do much here without a deep understanding of how live edit works, how and what exactly it reloads and how it is incorporated into the threading system, especially when live edit does not work along with the debugger.

I see two pieces of solid evidence it is a problem on the Live Edit side:

  • The bug would manifest itself in the environment without bytecode weaving, while in this reproducer it consistently reproduces every time LE is pushed and only then.
  • This snippet of code gives a direct suspect:
runBlocking() {
    Log.d("DEBUG", ">>>> F")
    Log.d("DEBUG", ">>>> Should be main: ${Thread.currentThread().name} having context: ${currentCoroutineContext()}")
    suspendCoroutineUninterceptedOrReturn<Unit> {
        Log.d("DEBUG", ">>> Proxy: ${it.intercepted()}")
        Unit
    }
    withContext(Executors.newCachedThreadPool().asCoroutineDispatcher()) {
        Log.d("DEBUG", ">>> Should be DefaultDispatcher-worker: ${Thread.currentThread().name} having context: ${currentCoroutineContext()}")
    }
    Log.d("DEBUG", ">>>> Should be main: ${Thread.currentThread().name} having context: ${currentCoroutineContext()}")
}

It prints >>> Proxy: com.android.tools.deploy.liveedit.ProxyClassHandler@8bd34b3 and kotlinx.coroutines is not prepared for situations where intercepted actual return type is changed from what our CoroutineDispatcher returned (which is an instance of DispatchedContinuation) to anything else. The default mode for kotlinx.coroutines to operate with alien intercepted contunations is "resume it directly in the same very thread as we have no other clues what to do"

qwwdfsad avatar Jan 25 '24 17:01 qwwdfsad

The best pointer I can give you to untie this problem further is to look at the simplified piece of the code:

runBlocking {
    withContext(Dispatchers.IO) {
        2
    }
    Unit
}

and debug step by step how the control is returned from withContext to outer coroutine. It should go through the AbstractCoroutine.resumeWith -> DispatcherCoroutine.afterResume -> intercepted.resumeCancellableWith where the typecheck happens.

It might shed some light on the machinery in order to understand what suitable replacements can be made in ProxyClassHandler

qwwdfsad avatar Jan 25 '24 17:01 qwwdfsad

Thanks for the pointers, Vsevolod. I feel like we are getting really close to the issue.

Would you be able to explain to me how DispatchedContinuation comes into play in the examples? I have a hard time rationalizing how a ProxyClassHandler could suddenly replace a DispatchedContinuation object which is a final object that is not in any class's hierarchy that Live Edit wants to take over.

In that example, ProxyClassHandler@8bd34b3 was a Proxy handler object created for a lambda of the function (ContextBug$Companion$Launch$1 in my compiled output)

Instances of ContextBug$Companion$Launch$1 should be been represented by our proxy object. It is set up with the SuspendLambda super class with all the interfaces set up by getProxyClass so all instance-of checks needed by the Kotlin Coroutine runtime should have been properly handled.

acleung avatar Jan 26 '24 01:01 acleung

So, long story short, DispatchedContinuation wraps the original continuation and intercepts its resume method -- instead of direct resumption right in the stack, it schedules its execution in the corresponding dispatcher and also handles cancellations, thread-locals and other machinery. Mostly because of historical reasons, we have a few places that directly type-check continuations into DispatchedContinuation (related: https://github.com/Kotlin/kotlinx.coroutines/issues/2439), but AFAIR it's rather a performance optimization than something else.

The following emulation of the reproducer works as expected:

runBlocking<Unit> {
    println(Thread.currentThread())
    withContext(object : ContinuationInterceptor {
        override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> {
            val intercepted = Dispatchers.IO.interceptContinuation(continuation)
            return Continuation(intercepted.context) {
                intercepted.resumeWith(it)
            }
        }

        override val key: CoroutineContext.Key<*>
            get() = CoroutineDispatcher
    }) {
        println(Thread.currentThread())
        2
    }
    println(Thread.currentThread())

    Unit
}

which narrows down the suspect even further to proxy classes, probably worth checking them out and see what they are compiled into. Maybe ProxyClass loses the context when wrapping the continuation or maybe it replaces the class that was an instance of ContinuationImpl but now is not and interception fails [1].

[1] -- see https://github.com/JetBrains/kotlin/blob/438c55756f49ad4fe3d02c7502a1b16255848359/libraries/stdlib/jvm/src/kotlin/coroutines/intrinsics/IntrinsicsJvm.kt#L182

qwwdfsad avatar Jan 26 '24 18:01 qwwdfsad

Closing as a third-party problem, feel free to ask any more questions

qwwdfsad avatar Apr 09 '24 08:04 qwwdfsad

No, I think we don't have any more question. We believe that we have identified the exact cause on our end with your help.

Thanks for the explanations.

acleung avatar Apr 09 '24 17:04 acleung