kamon-akka prevents ActorSystem from terminating
Under the following circumstances:
- kamon-bundle is on the classpath and Kamon.init() is called at the start of the application
- kanela-agent is used as a Java agent and kamon-akka is on the classpath
an ActorSystem within the application is somehow prevented from terminating correctly. In the second case, if kamon-akka is not on the classpath, there is no issue.
I have reproduced this in a Scastie. If the Scastie is run, it never terminates and times out. If the call to Kamon.init() is removed, the run terminates as expected.
I've also reproduced it running locally (with the kanela-agent), and with jvisualvm I can see that even though the call to terminate() succeeds, threads from the default-dispatcher and internal-dispatcher stick around. Since they are not daemon threads, the process won't exit.
I have a workaround which is to set akka.daemonic = true in application.conf. This makes the dispatcher threads into daemon threads, so the process exits as expected, but this doesn't feel great.
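For anyone who wants to try the same workaround, this is roughly what it looks like in application.conf. It is only a sketch and assumes the ActorSystem is created with the default configuration loading, so that application.conf is picked up:

```hocon
# application.conf
# Workaround only: makes the threads created by the ActorSystem daemon
# threads, so leftover dispatcher threads no longer keep the JVM alive.
# Akka's default for this setting is off.
akka {
  daemonic = on
}
```

This only hides the symptom: the dispatcher threads are still left behind after terminate(), they just stop blocking JVM exit.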
This doesn't seem like expected behavior, but let me know if I'm missing something. Thanks!
Oh, I was missing something. Adding Kamon.stop before the as.terminate() fixes things: https://scastie.scala-lang.org/47Gg8GcPTBOrrvXo257S3w. However, it only fixes the first case (with kamon-bundle), not the case with kanela-agent. And I would argue it's a bit surprising that even though you start Kamon first, you don't shut it down last.
@dvgica I looked into kamon-akka's reference.conf and tried disabling instrumentations to narrow down the root cause.
I'm using Kamon 2.5.0 (didn't want to switch the otel reporter implementation yet), and Akka 2.6.19. So far I've only tested early attachment using Kamon.init() (haven't tried with Kanela using -javaagent, but hopefully it'll be the same).
Turns out that disabling kamon.instrumentation.akka.instrumentations.SchedulerInstrumentation resolves the issue. I have no theory as to why, nor do I know what functionality I'll be missing without it (we generally avoid Akka's scheduler, but I assume it's used by Akka under the hood) -- so caution is advised for anyone who stumbles across this :)
I found two ways to disable SchedulerInstrumentation. The first is to override kanela.modules.akka.instrumentations with all but the problematic instrumentation (there's no way to "subtract" in conf):
kanela.modules.akka.instrumentations = [
"kamon.instrumentation.akka.instrumentations.EnvelopeInstrumentation",
"kamon.instrumentation.akka.instrumentations.SystemMessageInstrumentation",
"kamon.instrumentation.akka.instrumentations.RouterInstrumentation",
"kamon.instrumentation.akka.instrumentations.ActorInstrumentation",
"kamon.instrumentation.akka.instrumentations.ActorLoggingInstrumentation",
"kamon.instrumentation.akka.instrumentations.AskPatternInstrumentation",
"kamon.instrumentation.akka.instrumentations.EventStreamInstrumentation",
"kamon.instrumentation.akka.instrumentations.ActorRefInstrumentation",
"kamon.instrumentation.akka.instrumentations.akka_25.DispatcherInstrumentation",
"kamon.instrumentation.akka.instrumentations.akka_26.DispatcherInstrumentation",
"kamon.instrumentation.akka.instrumentations.akka_26.ActorMonitorInstrumentation",
// zombifies Akka, keep disabled until https://github.com/kamon-io/Kamon/issues/1176 is resolved
// "kamon.instrumentation.akka.instrumentations.SchedulerInstrumentation"
]
The second is to keep the instrumentation, but exclude the only Scheduler implementation that is of any consequence:
// avoids an instrumentation that zombifies Akka
// keep excluded until https://github.com/kamon-io/Kamon/issues/1176 is resolved
kanela.modules.akka.exclude += "^akka.actor.LightArrayRevolverScheduler$"
The latter prevents any instrumentation of the scheduler. It's nicer and looks like a smaller piece of tech debt to keep track of, but might be less precise.
@ivantopo This was probably introduced by #1135, so only Kamon 2.5.0 and later should be impacted. Scheduler.scheduleOnce wasn't instrumented before then. This answers my concern above -- since most of our code was developed before Kamon 2.5.0 we won't be missing anything by disabling Scheduler instrumentation for now.