kotlinx.coroutines
Debug API for stepping through a coroutine execution
This PR adds an API intended to solve the problem of debugger stepping within a single coroutine.
The initial task: the debugger should be able to step inside one coroutine.
The problem: for every step / breakpoint inside the coroutine we need to determine whether we are about to stop in the same coroutine as the one we were in during the previous step. To achieve this we require an attribute that could be checked at every step.
What may serve as this attribute:
- Continuation: at the first breakpoint, we save a continuation instance corresponding to the current frame. Before stopping at the next breakpoint inside this coroutine, we check whether the continuation instance at the frame matches the one we saved. If it does, we stop at the breakpoint. In case of stepping into another suspend function we go up the continuation stack, e.g.
suspend fun foo() {
    inst1 // the first breakpoint
    bar()
    inst2
}
suspend fun bar() {
    inst1
    baz()
    inst3
}
suspend fun baz() {
    delay(1000)
    inst4 // RunToCursor here
}
At the first breakpoint we save the FooContinuation instance, and to determine whether we should stop at baz we'll have to unroll the continuation stack up to the foo frame and compare the continuation instance with the saved one: ((bazContinuation.completion as BarContinuation).completion as FooContinuation)
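The unrolling described above can be pictured with a toy model. This is only a sketch: `Frame` and `belongsTo` are hypothetical names introduced for illustration, not the compiler-generated continuation classes, whose `completion` field is not public API.

```kotlin
// Toy stand-in for a compiler-generated continuation: each frame points to
// the continuation of its caller through `completion` (null at the root).
class Frame(val name: String, val completion: Frame?)

// Walks up the completion chain and reports whether `saved` is reached.
fun belongsTo(current: Frame, saved: Frame): Boolean {
    var frame: Frame? = current
    while (frame != null) {
        if (frame === saved) return true
        frame = frame.completion
    }
    return false
}

fun main() {
    val foo = Frame("FooContinuation", completion = null)
    val bar = Frame("BarContinuation", completion = foo)
    val baz = Frame("BazContinuation", completion = bar)
    val unrelated = Frame("OtherContinuation", completion = null)

    println(belongsTo(baz, foo))        // true: baz -> bar -> foo
    println(belongsTo(unrelated, foo))  // false: a different chain
}
```

The cost of the check grows with the depth of the suspend call chain, which is the drawback the thread-local approach tries to avoid.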
- The solution from this PR - Thread local: use ThreadLocal to store the coroutine id on every probeCoroutineSuspended and probeCoroutineResumed. Store the id of the current coroutine in the breakpoint. When a breakpoint is reached, retrieve the coroutine id running on the current thread from the ThreadLocal and compare it with the saved id to determine if it's the same coroutine as the one at the previous step.
This solution avoids unrolling the continuation stack when stepping into the body of another suspend function. It also allows stepping into the body of a non-suspend function that was invoked from a suspend function; in that case no continuation instance is passed to the call.
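A minimal sketch of the thread-local idea, assuming a hypothetical `CoroutineIdTracker` object (the real probes live in DebugProbesImpl and are not reproduced here). It also shows why a plain function can still see the id even though it receives no continuation.

```kotlin
// Hypothetical holder for illustration only.
object CoroutineIdTracker {
    private val currentId = ThreadLocal<Long?>()

    // Would be called from the resume probe on the resuming thread.
    fun onResumed(id: Long) = currentId.set(id)

    // Would be called from the suspend probe.
    fun onSuspended() = currentId.remove()

    fun currentThreadCoroutineId(): Long? = currentId.get()
}

// A plain (non-suspend) function has no continuation parameter,
// but it can still read the id recorded for its thread.
fun plainHelper(): Long? = CoroutineIdTracker.currentThreadCoroutineId()

fun main() {
    CoroutineIdTracker.onResumed(7)
    val savedAtFirstBreakpoint = CoroutineIdTracker.currentThreadCoroutineId()
    println(plainHelper() == savedAtFirstBreakpoint) // true: same thread, same coroutine

    val other = Thread {
        // Another thread has its own slot, so no id is visible here.
        println(CoroutineIdTracker.currentThreadCoroutineId()) // null
    }
    other.start()
    other.join()

    CoroutineIdTracker.onSuspended()
    println(CoroutineIdTracker.currentThreadCoroutineId()) // null
}
```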
Issues with this solution:
- it's added as part of the old API that is being replaced, though it's ok to test the solution for now. In case it's used, then this API should become part of the stable API in the new kotlinx.coroutines.debugging package.
- it may be very slow to evaluate the value of ThreadLocal for every debug step. A thousand coroutines may run simultaneously, though we have to stop at the coroutine we need.
On the side of IDEA Debugger we're currently using this API to implement stepping inside one coroutine, as I've described above. The solution seems to work. Maybe we could add this API (as part of the "old" API for now), so that users could test an initial solution for stepping?
WDYT 👀
Here is the corresponding YT ticket: IDEA-338723
Here I want to clarify further why debug stepping (F8 or Run to Cursor) through the coroutine execution does not work now.
- Consider the following example:
import kotlinx.coroutines.*

fun main() = runBlocking<Unit> {
    launch(Dispatchers.Unconfined) {
        // Set the breakpoint at this line
        println("Started in thread ${Thread.currentThread().name}")
        delay(1000)
        // Try Run To Cursor to this line
        println("Completed in thread ${Thread.currentThread().name}")
    }
}
If we try Run To Cursor to the line after the suspension, the execution won't stop there and the process will end. (The same happens if we just step through the execution by pressing F8.)
Why does this happen:
Because of the Unconfined dispatcher, the coroutine starts in the main thread and after delay the continuation is scheduled to the default executor and is executed in another thread.
IDEA debugger will filter out all the breakpoints that are met in threads other than the one we were in at the previous breakpoint (main thread), hence the breakpoint after delay is skipped and the execution ends.
To step through the coroutine execution, we need to know at every step (breakpoint) whether we are about to stop in the same coroutine, and filter breakpoints not by thread but by some coroutine identifier (id or continuation).
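The difference between the two filters can be sketched as follows. `Hit`, `stopsByThread` and `stopsByCoroutine` are hypothetical names, and the thread names and ids are made up for the demo; this models the decision only, not the IDEA debugger's actual code.

```kotlin
// Hypothetical model of what the debugger knows when a breakpoint is hit.
data class Hit(val thread: String, val coroutineId: Long)

// Behaviour described above: only the thread is compared.
fun stopsByThread(previous: Hit, current: Hit): Boolean =
    previous.thread == current.thread

// Desired behaviour: compare a coroutine identifier instead.
fun stopsByCoroutine(previous: Hit, current: Hit): Boolean =
    previous.coroutineId == current.coroutineId

fun main() {
    val beforeDelay = Hit(thread = "main", coroutineId = 2)
    val afterDelay = Hit(thread = "another-thread", coroutineId = 2)

    println(stopsByThread(beforeDelay, afterDelay))    // false: the breakpoint is skipped
    println(stopsByCoroutine(beforeDelay, afterDelay)) // true: same coroutine, stop here
}
```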
- Here is another example with multiple coroutines:
import kotlinx.coroutines.*

fun main() = runBlocking {
    for (i in 1..20) {
        launch(Dispatchers.Default) {
            // Set breakpoint here
            println("Start $i Thread: ${Thread.currentThread()}")
            delay(1000 + i * 10L)
            // Try to Run-To-Cursor here
            println("End $i Thread: ${Thread.currentThread()}")
        }
    }
    println("Top Thread: ${Thread.currentThread()}")
}
In this example many coroutines are launched, and we want to step through one coroutine: we set the breakpoint before delay, and run to cursor after delay. But the execution keeps stopping only at the first breakpoint at every iteration and never stops at the next breakpoint.
The reason for that is the same: the debugger skips the breakpoint after delay, because it happens in another thread, and stops at the first breakpoint again, because the next coroutine is launched in the main thread like the previous one.
Ideally, the user can step through the coroutine execution as if it were linear. For that, we also need to make sure that we stop in the same coroutine.
To sum up my understanding: when "go to cursor," "step into," or anything like that is used, internally, the debugger places a breakpoint, and when some thread hits that breakpoint, the debugger checks if this is the moment we were interested in or some other unrelated thread hit that breakpoint. In the latter case, the breakpoint is skipped. Right now, the check of "are we interested in the breakpoint" is literally "is it the same thread that entered this before," and the goal is to write a more suitable check. Is everything correct so far?
A question: how is this new API going to be used? Is the check simply "interesting breakpoint == the coroutine stayed the same?"
This would work for println("from"); launch(Dispatchers.Default) { println("to") } and in the cases you outlined: we shouldn't allow stepping from "from" to "to", as the computation is clearly moved to another execution context.
But there are many cases when we move the computation to (formally speaking) another coroutine, but clearly don't consider this as spawning a new computation. What's going to happen in the following cases?
withContext, slow path:
// from here
println("from")
newSingleThreadContext("x").use {
    withContext(it) {
        // step to here
        println("to")
    }
}
// and then here
println("final")
withContext, semi-slow path:
// from here
println("from")
withContext(CoroutineName("new name")) {
    // step to here
    println("to")
}
// and then here
println("final")
coroutineScope:
// from here
println("from")
coroutineScope {
    // step to here
    println("to")
}
// and then here
println("final")
withTimeout:
// from here
println("from")
withTimeout(1.seconds) {
    // step to here
    println("to")
}
// and then here
println("final")
Yes, everything you said in the first paragraph is correct.
Is the check simply "interesting breakpoint == the coroutine stayed the same?"
Yes, for every breakpoint we evaluate this field currentThreadCoroutineId and compare the obtained id with the previous coroutine id.
I'll think about your examples now 👀. I'd say that we should be able to step through the execution like it's linear, when we do not spawn a new computation with a coroutine builder.
So I think we should be able to step linearly through the coroutine execution when a new coroutine is not created. I checked all the examples above with the current RunToCursor implementation that uses currentThreadCoroutineId: it works fine, the coroutine id is extracted at both points, the coroutine is the same -> the execution stops.
And when a coroutine builder is invoked (via launch or async), we just step over the body of the new coroutine for now.
And I'll also mention why this API helps: its main point is to be able to step to and from the synchronous (non-suspending) code that was invoked in the coroutine. Consider this example:
import kotlinx.coroutines.*

fun main() = runBlocking {
    // from here
    println("from")
    coroutineScope {
        delay(1000)
        bar(56)
    }
}
fun bar(i: Int) {
    // to here
    println("bar $i start")
    println("bar $i end")
}
A continuation instance is not available inside the non-suspending bar, but the coroutine id recorded for the current thread is still the same, and knowing it helps.
And another point: Stepping from async to async context may be done via comparing continuation instance, though it may require going up the long continuation stack up to the parent continuation that corresponded to the first breakpoint. And when we save a coroutine id for the current thread, we may avoid traversing the stack.
Playing with the solution, I found this edge case:
runBlocking(Dispatchers.Default) {
    println("Start: ${DebugProbesImpl.currentThreadCoroutineId}")
    launch(Dispatchers.Unconfined + CoroutineName("A new coroutine")) {
        println("In a coroutine: ${DebugProbesImpl.currentThreadCoroutineId}")
        delay(1.seconds)
        println("Still in a coroutine: ${DebugProbesImpl.currentThreadCoroutineId}")
    }
    println("End: ${DebugProbesImpl.currentThreadCoroutineId}")
}
prints
Start: 1
In a coroutine: 2
End: 2
Still in a coroutine: 2
Interestingly, this
runBlocking(Dispatchers.Default) {
    println("Start: ${DebugProbesImpl.currentThreadCoroutineId}")
    launch(CoroutineName("A new coroutine"), start = CoroutineStart.UNDISPATCHED) {
        println("In a coroutine: ${DebugProbesImpl.currentThreadCoroutineId}")
        delay(1.seconds)
        println("Still in a coroutine: ${DebugProbesImpl.currentThreadCoroutineId}")
    }
    println("End: ${DebugProbesImpl.currentThreadCoroutineId}")
}
prints
Start: 1
In a coroutine: 1
End: 2
Still in a coroutine: 2
Looks like the outer coroutine doesn't signal its suspension or resumption in this case, as it's not suspended or resumed, it just helps running another coroutine. Entering the debugger at the "In a coroutine" line in the first example, we see that two coroutines are RUNNING at the same time on the same thread, and in the second example, the outer coroutine is RUNNING, whereas the inner one is only CREATED.
I disagree conceptually with the way probes behave now: I'd say that, on the "In a coroutine" line, only the inner coroutine is RUNNING in both cases, and the outer is SUSPENDED: the forked-off computation is already running, and the original computation isn't. Unfortunately, I'm not certain that we can ensure the correct probes are emitted.
Right :(. Even if we update currentThreadCoroutineId on coroutine creation as well, then in both examples we'll get:
Start: 1
In a coroutine: 2 // updated by child coroutine in probeCreated
End: 2 // no suspend or resume happened, the parent coroutine just continues its execution
Still in a coroutine: 2
We should somehow track the boundary of the parent coroutine then 🤔
Of course, to step in these cases we can compare Continuation instances. But this won't solve the problem of stepping into the synchronous context.
The problem is, we'd like to get from Start to End, no?
Yes, I want to make RunToCursor from Start to End in runBlocking: and we won't stop at End now.
Or from In a coroutine to Still in a coroutine in launch.
We may implement the following solution: instead of saving one id of the current coroutine, we may save the stack of active ids, and update it like this:
coroutine created -> push(id)
coroutine suspend -> pop()
coroutine resumed -> push(id)
currentThreadCoroutineId will return the top element of the stack.
For the example above:
runBlocking(Dispatchers.Default) { // parent created -> [1]
    println("Start: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 1
    launch(Dispatchers.Unconfined + CoroutineName("A new coroutine")) { // child created -> [1, 2]
        println("In a coroutine: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 2
        delay(1.seconds) // child suspended -> [1]
        // child resumed -> [1, 2]
        println("Still in a coroutine: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 2
    }
    println("End: ${DebugProbesImpl.currentThreadCoroutineId}") // proceed after child suspend, [1] -> prints 1
}
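The bookkeeping rules above can be replayed in isolation. The sketch below is a hypothetical single-thread model (`ActiveCoroutineStack` is an invented name); it ignores the fact that a resumed coroutine may continue on a different thread with its own thread-local state.

```kotlin
// Hypothetical single-thread model of the proposed bookkeeping.
class ActiveCoroutineStack {
    private val ids = ArrayDeque<Long>()

    fun created(id: Long) = ids.addLast(id)   // coroutine created -> push(id)
    fun resumed(id: Long) = ids.addLast(id)   // coroutine resumed -> push(id)
    fun suspended() { ids.removeLastOrNull() } // coroutine suspend -> pop()

    // Mirrors currentThreadCoroutineId: the top of the stack, if any.
    val current: Long? get() = ids.lastOrNull()
}

fun main() {
    val stack = ActiveCoroutineStack()
    stack.created(1)        // parent created  -> [1]
    println(stack.current)  // 1 ("Start")
    stack.created(2)        // child created   -> [1, 2]
    println(stack.current)  // 2 ("In a coroutine")
    stack.suspended()       // child suspended -> [1]
    println(stack.current)  // 1 ("End")
    stack.resumed(2)        // child resumed   -> [1, 2]
    println(stack.current)  // 2 ("Still in a coroutine")
}
```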
In case no suspends and resumes happen, we need to define when one coroutine has completed and another coroutine proceeds with execution. For that we may add an extra probeCoroutineCompleted to stdlib. In the compiler it'll be invoked in BaseContinuationImpl#resumeWith, and in the debug agent it will pop(id).
E.g.
runBlocking(CoroutineName("Main parent")) { // parent created -> [1]
    println("0: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 1
    withContext(Dispatchers.Unconfined) {
        println("1: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 1 (still in the same coroutine)
        launch(CoroutineName("Child") + Dispatchers.Unconfined) { // child created -> [1, 2]
            println("2: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 2
            // coroutine completed -> [1]
        }
        println("3: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 1
    }
    println("4: ${DebugProbesImpl.currentThreadCoroutineId}") // prints 1
}
WDYT?
I'm not sure, if the issue of "2 coroutines RUNNING on the same thread simultaneously" (https://github.com/Kotlin/kotlinx.coroutines/pull/3987#issuecomment-1907882574) is critical for understanding of what's happening 👀
The parent coroutine in the example is actually not suspended, we just let the child coroutine start execution in the current thread till the first suspension.
For that we may add an extra probeCoroutineCompleted to stdlib
We do have probeCoroutineCompleted, only created by ourselves (in the same DebugProbesImpl.kt file). Some coroutines won't call it, though.
I have a completely alternative suggestion.
As you said, the only way the result of the new function is used is to compare the two values. So, exposing a Long is a bit dangerous. What if we decide later that a Long is not enough to describe the "current code under the cursor"? We won't be able to easily change the implementation. Also, we're not using the full power a Long gives us: there's no sense in doing arithmetic operations on the result.
A better API, I think, would be something like (naming is not that important yet)
sealed interface PositionInCode {
    fun canRunTo(other: PositionInCode): Boolean
}
// invoked as old.canRunTo(current)
DebugProbesImpl.currentPosition: PositionInCode
In any case, with this API, even if we're not satisfied with the results, we can change it on the side of the coroutines.
This also frees our thinking a bit: when we don't have to return a Long, what else can we return? Here's a thought: instead of a single identifier, maybe we could return all the identifiers?
How about something like this (writing without checking, treat as pseudocode)?
val currentPosition: PositionInCode get() = PositionInCodeImpl(
    capturedCoroutines.asSequence().mapNotNull { owner ->
        when {
            owner.isFinished() -> null
            owner.info.lastObservedThread() != Thread.currentThread() -> null
            else -> owner.info.sequenceNumber
        }
    }.toSet()
)
private class PositionInCodeImpl(private val coroutines: Set<Long>): PositionInCode {
    override fun canRunTo(other: PositionInCode): Boolean =
        other is PositionInCodeImpl && coroutines.intersect(other.coroutines).isNotEmpty()
}
I think it's not bad to allow people to step inside the new launch(Dispatchers.Unconfined) coroutine.
And if we end up finding out that no, iterating over all coroutines is too slow for this purpose, then we'll try introducing performance optimizations. For example, keeping a map from threads to coroutines, or something else.
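To make the set-intersection check concrete, here is a self-contained sketch. It follows the names from the pseudocode, but `CoroutineSetPosition` and its constructor taking ids directly are assumptions for the demo; the real implementation would compute the set from the captured coroutines.

```kotlin
sealed interface PositionInCode {
    fun canRunTo(other: PositionInCode): Boolean
}

// Hypothetical implementation: a position is the set of coroutine ids
// observed running on the thread at that moment.
class CoroutineSetPosition(private val coroutines: Set<Long>) : PositionInCode {
    override fun canRunTo(other: PositionInCode): Boolean =
        other is CoroutineSetPosition && coroutines.intersect(other.coroutines).isNotEmpty()
}

fun main() {
    // Parent 1 is running an undispatched child 2 on the same thread.
    val parentRunningChild = CoroutineSetPosition(setOf(1L, 2L))
    val childAfterDelay = CoroutineSetPosition(setOf(2L))
    val unrelated = CoroutineSetPosition(setOf(3L))

    println(parentRunningChild.canRunTo(childAfterDelay)) // true: coroutine 2 is shared
    println(parentRunningChild.canRunTo(unrelated))       // false: nothing in common
}
```

With sets, stepping from the parent into the unconfined child is permitted, which matches the behaviour suggested above.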
@dkhalanskyjb, can we just mark this implementation as internal and subject to change? I would like to implement just the same-coroutine solution and see how it works. For example, even calling ThreadLocal may be slow (evaluation takes time and we need to invoke it on every breakpoint). Then optimization may be needed: for example, implementing the mapping with an array and a hand-written hash function, so we can extract the information much faster without calling the evaluator. Anyway, the best is the enemy of the good, as we know. So some improvements may be done later, I think.
I'd like to implement the solution suggested by @dkhalanskyjb, it gives us more freedom, in case we want to improve performance or change the internal behaviour. And regarding the previous problem: I think too that we can allow stepping into an undispatched coroutine from its parent.
And on the side of debugger, it won't require a lot of changes to support the new API.
Question: is https://youtrack.jetbrains.com/issue/IDEA-300618 potentially in the scope of this PR?
@lppedd, No, there is another scope of problems. You can track this issue: https://youtrack.jetbrains.com/issue/KT-64309
@steelart,
- If you want to experiment with this PR, you can build and publish it locally and check the results;
- If you want us to publish a release with a solution we're not sure about, then no, just marking it as internal and subject to change won't help. This PR is mostly for IDEA's debugger; if IDEA ships a fix that assumes the presence of this function (currentThreadCoroutineId(): Long) in version X, we won't be able to update the function gracefully in a way that the debugger in version X supports; so, we'll have a situation where "stepping in coroutines A works in IDEA X and later, stepping in coroutines (A + c) works in IDEA (X + d)," introducing a tight coupling. If we spend a little time thinking about the API now, we can avoid that, and everyone wins.
Anyway, the best is the enemy of the good, as we know. So some improvements may be done later, I think.
That's a very solid approach when you work on a product in which it is easy to correct mistakes in a future release, like a website or a GUI app. Because of strict compatibility guarantees, in libraries, it's not so.
Not sure about canRunTo(other: LocationInCoroutine) as it seems to be called only with the current location. For debugger we'll have to do two calls - one to get the current, and another to call canRunTo. It is better to have one.
@gorrus, sorry, I don't understand what you mean; could you explain it?
With currentThreadCoroutineId:
var coroutineFromWhichWeAreStepping: Long = DebugProbesImpl.currentThreadCoroutineId
fun tryStep() {
    if (DebugProbesImpl.currentThreadCoroutineId == coroutineFromWhichWeAreStepping) {
        breakpoint()
    }
}
fun startStepping() {
    coroutineFromWhichWeAreStepping = DebugProbesImpl.currentThreadCoroutineId
    run()
}
with canRunTo:
var coroutineFromWhichWeAreStepping: PositionInCode = DebugProbesImpl.currentPosition()
fun tryStep() {
    if (coroutineFromWhichWeAreStepping.canRunTo(DebugProbesImpl.currentPosition())) {
        breakpoint()
    }
}
fun startStepping() {
    coroutineFromWhichWeAreStepping = DebugProbesImpl.currentPosition()
    run()
}
For this use case, there is no difference. When is there a difference, and why is it important?
I'm also wary about hardcoding Thread.currentThread() into currentPosition. This may be unnecessarily limiting. Maybe fun currentPositionInThread(thread: Thread) or something like that? How does the debugger even run anything in the current thread except, well, the code that must run in that thread?
I'm unfamiliar with how debuggers work on the JVM (so maybe that's okay there), and I'm also not sure that, if we decide to expand this API to Native, we can support the same interface there. Maybe it's better to accept a thread explicitly just to be safe. @mvicsokolova, WDYT?
@dkhalanskyjb, Running coroutineFromWhichWeAreStepping.canRunTo(DebugProbesImpl.currentPosition()) will lead to calling currentPosition and then calling canRunTo. Each such call on a breakpoint is expensive, so it is much better to combine them into one call with known parameters. If you want an abstraction (instead of just Long) over the continuation representation in the debugger, then I propose having some ContinuationRepresentaion class, a method DebugProbesImpl#getContinuationRepresentaion(): ContinuationRepresentaion, and a method ContinuationRepresentaion#canSuspendHere() (or DebugProbesImpl#canSuspendHere(from: ContinuationRepresentaion)). The last one should check that we are inside an appropriate coroutine: the same one, or Dispatchers.Unconfined, or a child, or the parent, whatever is decided to be a good strategy.
So, the overhead of calling a function is huge, but inside the function, everything runs with the normal performance, right?
What about passing parameters? Is PositionInCode.canRunTo(Thread) worse than PositionInCode.canRunToCurrentThread()?
Could you share some required reading, if any?
Regarding "calling DebugProbesImpl.currentPosition() twice": I think it may be reasonable to provide smth like the function below to avoid the second call to DebugProbesImpl.currentPosition() on the debugger side. At least, we don't have usage scenarios other than checking if we can step from the given position to the current one.
DebugProbesImpl.canRunToCurrentPosition(from: PositionInCode)
And I guess, we could pass a thread reference as a parameter as well:
DebugProbesImpl.canRunToCurrentPosition(from: PositionInCode, currentThread: Thread)
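The saving from a combined call can be illustrated with a counter. Everything here is hypothetical: `FakeProbes` stands in for the probes object, positions are modelled as plain sets of ids, and `evaluations` counts how many times the debugger would have to invoke a method inside the debuggee.

```kotlin
// Hypothetical stand-in for the probes object.
object FakeProbes {
    var evaluations = 0
    var coroutinesOnCurrentThread: Set<Long> = emptySet()

    fun currentPosition(): Set<Long> {
        evaluations++
        return coroutinesOnCurrentThread
    }

    fun canRunTo(from: Set<Long>, to: Set<Long>): Boolean {
        evaluations++
        return from.any { it in to }
    }

    // Combined variant: looks up the current position itself.
    fun canRunToCurrentPosition(from: Set<Long>): Boolean {
        evaluations++
        return from.any { it in coroutinesOnCurrentThread }
    }
}

fun main() {
    val saved = setOf(2L)
    FakeProbes.coroutinesOnCurrentThread = setOf(2L)

    FakeProbes.evaluations = 0
    val twoStep = FakeProbes.canRunTo(saved, FakeProbes.currentPosition())
    println("$twoStep after ${FakeProbes.evaluations} evaluations") // true after 2 evaluations

    FakeProbes.evaluations = 0
    val oneStep = FakeProbes.canRunToCurrentPosition(saved)
    println("$oneStep after ${FakeProbes.evaluations} evaluations") // true after 1 evaluations
}
```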
If I understand correctly, only the number of requests and reply packets sent between VM and debugger is critical. Am I right, @steelart?
And also, is there a chance we'll need to check stepping not only to the currentPosition but to some set of positions in the future? Seems, like canRunToCurrentPosition will be used the same way for regular stepping as for RunToCursor.
What about passing parameters? Is PositionInCode.canRunTo(Thread) worse than PositionInCode.canRunToCurrentThread()?
No, what is essential is that it is a single call.
@dkhalanskyjb.
So, the overhead of calling a function is huge, but inside the function, everything runs with the normal performance, right?
There is no exact specification (about JIT/interpretation in the case, for example), but, yes, the performance will be close enough.
@mvicsokolova
And also, is there a chance we'll need to check stepping not only to the currentPosition but to some set of positions in the future?
The same source location can be used from the independent coroutines. So it is important to check run-time information and it is impossible to do for the "future".
And I guess, we could pass a thread reference as a parameter as well
It would be easier to get the current thread on the coroutine side, because the debugger needs to prepare the "thread" parameter before passing it. I suspect it can be prepared easily enough, but I'm not sure. And I don't understand what the advantage of the additional parameter is. Note that the IDEA debugger and the coroutine code are in different processes, and any data transfer always requires some preparation.
If I understand correctly, only the number of requests and reply packets sent between VM and debugger is critical.
The most critical part is forcing the JVM to run some code while standing at a breakpoint. So it is always better to extract/change variables/fields without an actual call.
Seems, like canRunToCurrentPosition will be used the same way for regular stepping as for RunToCursor.
Yes, but it will be the same story: just a thread filter on the triggered stepping breakpoints. The same DebugProbesImpl.canRunToCurrentPosition(from: PositionInCode) call will do the work.
Alright, it makes sense, thanks!
One more question: is there a difference for the debugger between calling DebugProbesImpl.canRunToCurrentPosition(from: PositionInCode) and oldPosition.canRunToCurrentPosition()?
No, there should not be any difference
I've updated the API to avoid computing the set of running coroutines at the given location twice on the debugger side. Also added some tests.
The final API naming is in the process of discussion.