kotlinx.coroutines
Coroutine is (currently) hard to debug
I'm making extensive use of coroutines, replacing thread calls and using structured concurrency as suggested. However, I find coroutines very hard to debug. I'm using org.jetbrains.kotlinx:kotlinx-coroutines-debug:1.6.0 along with -jdk8 and -core, both at 1.6.0, which is the latest version as of writing.
- https://youtrack.jetbrains.com/issue/KTIJ-21056
- can we always turn on coroutine debug (if the overhead is minimal) for at least coroutine listing/stacktrace? a system that is not observable is almost non-debuggable
- provide a jvmti library for common OSes (linux/macos/...) that dumps a structured (with parent/child relationship) coroutine stacktrace when thread dump is requested. currently we have to use SIGTRAP but using SIGQUIT matches the convention. see https://github.com/openjdk/jdk8u/blob/8d5c7386c619a2602d9731c4adbbb1b01aeb449f/hotspot/src/share/vm/runtime/os.cpp#L320
- dumpCoroutinesInfo is not structured, no parent/child relation. and the printout doesn't end with line feed, which is an oversight maybe?
- there is Job.children but no Job.parent, is there a reason why? this makes constructing job graph very awkward
Hi,
can we always turn on coroutine debug (if the overhead is minimal) for at least coroutine listing/stacktrace
What counts as "minimal" or "acceptable" overhead is very application-specific: it depends on the application's SLA and various other factors, so I suggest you determine for yourself whether it is acceptable. On the other hand, we've optimized the debugging agent to a reasonable extent. The slowest part is the collection of creation stacktraces, which can (and for production environments probably even "should") be completely disabled, either programmatically or, if you are running with -javaagent, using a system property.
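As an illustration of the programmatic route, here is a minimal sketch. It assumes kotlinx-coroutines-debug is on the classpath; the function name trackedCoroutineCount and the "worker" coroutine are made up for the example. For the -javaagent route, the debug module's README names the system property kotlinx.coroutines.debug.enable.creation.stack.trace, which would be passed as -Dkotlinx.coroutines.debug.enable.creation.stack.trace=false; check this against the version you run.

```kotlin
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield
import kotlinx.coroutines.debug.DebugProbes

// Hypothetical helper: installs the probes without creation stack traces
// and reports how many coroutines the probes currently know about.
@OptIn(ExperimentalCoroutinesApi::class)
fun trackedCoroutineCount(): Int {
    // Do not capture a stack trace every time a coroutine is created.
    DebugProbes.enableCreationStackTraces = false
    DebugProbes.install()
    try {
        return runBlocking {
            val worker = launch(CoroutineName("worker")) { delay(60_000) }
            yield() // let the worker start and suspend in delay()
            val tracked = DebugProbes.dumpCoroutinesInfo().size
            worker.cancel()
            tracked
        }
    } finally {
        DebugProbes.uninstall()
    }
}

fun main() {
    println("coroutines tracked by the probes: ${trackedCoroutineCount()}")
}
```

Coroutine listing and last-observed stack traces remain available in this mode; only the creation stack traces are left out.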
provide a jvmti library
Could you please elaborate on why you need a JVMTI library and why the agent JAR is not enough? Both maintaining and shipping native libraries on the JVM is a real pain, so the reason to do so has to be significant and unachievable otherwise.
dumpCoroutinesInfo is not structured, no parent/child relation.
In typical systems there are hundreds to thousands of coroutines with arbitrary depths. Exposing the parent-child relationship (e.g. by properly nesting stacktraces) would render the dump unusable, given 10-100+ levels of nesting.
The better solution, IMO, is to provide proper, stable and convenient customization points, so that this is easily achievable manually if necessary.
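To make "achievable manually" concrete, here is a rough sketch of a nested dump built only on the public Job.children API from kotlinx-coroutines-core. The function renderJobTree and its maxDepth parameter are hypothetical names, not part of the library; the depth cap is one possible answer to the nesting concern above.

```kotlin
import kotlinx.coroutines.Job

// Hypothetical helper: renders a job and its descendants as an indented tree.
// Levels deeper than maxDepth are collapsed into a single "..." line.
fun renderJobTree(root: Job, maxDepth: Int = 8): String {
    val out = StringBuilder()

    fun visit(job: Job, depth: Int) {
        out.append("  ".repeat(depth)).append(job).append('\n')
        if (depth >= maxDepth) {
            // Mark that children exist below the cap without printing them.
            if (job.children.any()) {
                out.append("  ".repeat(depth + 1)).append("...\n")
            }
            return
        }
        for (child in job.children) visit(child, depth + 1)
    }

    visit(root, 0)
    return out.toString()
}

fun main() {
    val root = Job()
    val child = Job(root)
    Job(child) // grandchild
    Job(root)  // second child
    print(renderJobTree(root))
    root.cancel()
}
```

The result is a best-effort snapshot: jobs can complete or be added while the tree is being walked. Each line shows only Job.toString(); per-coroutine state or stack traces would have to come from the debug module.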
and the printout doesn't end with line feed, which is an oversight maybe?
I think so, not sure what was the original reason
there is Job.children but no Job.parent, is there a reason why? this makes constructing job graph very awkward
Because there wasn't any real demand for it before, and adding it originally was a non-trivial task. By now it seems to be already there, just not exposed in the public API. It would be nice if you could file a separate issue with a short explanation of your use case, so we can fix it separately from the debug agent.
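Until a parent accessor is publicly exposed, the reverse links can be derived by walking Job.children from a known root. The sketch below assumes such a root job is available; buildParentIndex is a hypothetical name. It uses an identity-based map so that lookups do not depend on how a Job implementation defines equals.

```kotlin
import java.util.IdentityHashMap
import kotlinx.coroutines.Job

// Hypothetical helper: maps every descendant of root to its direct parent.
// The root itself has no entry.
fun buildParentIndex(root: Job): Map<Job, Job> {
    val parentOf = IdentityHashMap<Job, Job>()
    val pending = ArrayDeque<Job>()
    pending.addLast(root)
    while (pending.isNotEmpty()) {
        val current = pending.removeFirst()
        for (child in current.children) {
            parentOf[child] = current
            pending.addLast(child)
        }
    }
    return parentOf
}

fun main() {
    val root = Job()
    val child = Job(root)
    val grandchild = Job(child)
    val parents = buildParentIndex(root)
    println(parents[grandchild] === child) // true
    println(parents[child] === root)       // true
    root.cancel()
}
```

This is the awkward part the question refers to: the index needs a full traversal and goes stale as soon as the hierarchy changes.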
@qwwdfsad thanks for the reply. JVMTI can monitor SIGQUIT (instead of the SIGTRAP used by the javaagent), so a coroutine dump can be produced automatically whenever a thread dump is requested, which lessens the users' learning curve. When coroutines are in use, thread dumps are less informative and a structured coroutine dump would be better. Currently we have to find the java process ID (not that easy if you have multiple instances running with the same command-line arguments) and use another ssh session to send a SIGTRAP.
Maintaining a native library is indeed a pain, and that is the reason why bundling it in the library would make things easier for the user. The JVMTI code, however, should be minimal: just set a flag that can be read inside the coroutine debug Java library, and then Java code produces the coroutine dump.
But you've made a point too: if this easy debug feature is not of much use, it could indeed burden the library. I'll be searching for other ways to monitor SIGQUIT in the meantime.
I'm closing this as it breaks down into multiple issues: some of them are fixed, some are being worked on separately (i.e. #3587), and JVMTI is unlikely to ever be implemented by us.