Using the finalizer from `createDefaultComputeThreadPool` inside IO hangs subsequent `unsafeRunSync`s even on unrelated `IORuntime`s

Open neko-kai opened this issue 3 years ago • 3 comments

The following code always hangs:

import cats.effect.unsafe.{IORuntime, IORuntimeConfig, Scheduler}
import cats.effect.{IO, Resource, Sync}

object App extends App {
  locally {
    lazy val newRuntime: IORuntime = IORuntime.apply(cpuPool, cpuPool, Scheduler.createDefaultScheduler()._1, () => (), IORuntimeConfig())
    lazy val (cpuPool, finalizer) = IORuntime.createDefaultComputeThreadPool(newRuntime)

    IO(finalizer.apply())
      .unsafeRunSync()(IORuntime.global)

    IO.println("abc")
      .unsafeRunSync()(IORuntime.global)
  }
}

https://scastie.scala-lang.org/C5KdSm1xQOq9gkl9vrxNLQ

Note that the hang reproduces when the finalizer is called inside IO, as in IO(finalizer.apply()). If finalizer.apply() is moved outside IO, a different error emerges on the subsequent unsafeRunSync:

None.get
java.util.NoSuchElementException: None.get
	at scala.None$.get(Option.scala:627)
	at scala.None$.get(Option.scala:626)
	at cats.effect.IOPlatform.unsafeRunSync(IOPlatform.scala:42)

It is not immediately obvious whether cpuPool above and IORuntime.global are connected by global mutable state, but that is probably the case if executing the finalizer of a seemingly unrelated, freshly created pool causes execution rejection in IORuntime.global's pool.

Using IOApp.Simple does not change the result:

lazy val newRuntime: IORuntime = IORuntime.apply(cpuPool, cpuPool, Scheduler.createDefaultScheduler()._1, () => (), IORuntimeConfig())
lazy val (cpuPool, finalizer) = IORuntime.createDefaultComputeThreadPool(newRuntime)

new IOApp.Simple {
  override def run: IO[Unit] = IO(finalizer.apply())
}.main(Array())

new IOApp.Simple {
  override def run: IO[Unit] = IO.println("abc")
}.main(Array())

I could not find a workaround so far other than not calling the finalizer, or using an ordinary FixedThreadPool instead of WorkStealingThreadPool.
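For reference, the fixed-pool workaround mentioned above might look roughly like the sketch below. It is only an illustration under assumptions: the object name FixedPoolWorkaround and the pool size are hypothetical, and shutting down the executor stands in for the finalizer that createDefaultComputeThreadPool would otherwise return. It has not been verified against every affected version.

```scala
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

import cats.effect.IO
import cats.effect.unsafe.{IORuntime, IORuntimeConfig, Scheduler}

// Hypothetical name; a sketch of the workaround, not a verified fix.
object FixedPoolWorkaround {
  def main(args: Array[String]): Unit = {
    // An ordinary fixed thread pool instead of the work-stealing pool.
    // The pool size is an arbitrary choice for this sketch.
    val executor = Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors())
    val cpuPool = ExecutionContext.fromExecutorService(executor)

    // Same scheduler construction as in the original report.
    val (scheduler, shutdownScheduler) = Scheduler.createDefaultScheduler()

    // As in the report, the runtime is constructed but nothing is run on it.
    val newRuntime = IORuntime(cpuPool, cpuPool, scheduler, () => (), IORuntimeConfig())

    // Shutting down the executor plays the role of the finalizer here.
    IO(cpuPool.shutdown()).unsafeRunSync()(IORuntime.global)

    // The step that hangs in the original report when the work-stealing pool is used.
    IO.println("abc").unsafeRunSync()(IORuntime.global)

    shutdownScheduler()
  }
}
```

The only intended difference from the failing snippet is the pool type; everything else mirrors the reproduction so the two can be compared side by side.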

neko-kai avatar May 24 '22 02:05 neko-kai

Wow that's absolutely fascinating. Would you mind getting a thread dump of the hang? The only global mutable state that I can think of which would connect unrelated runtimes would be mbeans. Would you mind running with -Dcats.effect.tracing.mode=none just to see?

djspiewak avatar May 24 '22 03:05 djspiewak

@djspiewak Well, in the thread dump the only non-VM thread is the one waiting on unsafeRunSync. Disabling tracing had no effect. The only other global state I can think of is the MBean setup stuff inside createDefaultComputeThreadPool.

"main@1" prio=5 tid=0x1 nid=NA waiting
  java.lang.Thread.State: WAITING
	  at jdk.internal.misc.Unsafe.park(Unsafe.java:-1)
	  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
	  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
	  at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
	  at cats.effect.IOPlatform.$anonfun$unsafeRunTimed$2(IOPlatform.scala:80)
	  at cats.effect.IOPlatform$$Lambda$64.1920387277.apply(Unknown Source:-1)
	  at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:62)
	  at scala.concurrent.package$.blocking(package.scala:124)
	  at cats.effect.IOPlatform.unsafeRunTimed(IOPlatform.scala:80)
	  at cats.effect.IOPlatform.unsafeRunSync(IOPlatform.scala:42)
	  at example.App$.delayedEndpoint$example$App$1(App.scala:15)
	  at example.App$delayedInit$body.apply(App.scala:6)

neko-kai avatar May 24 '22 03:05 neko-kai

Disabling tracing should have disabled the mbeans. That's really interesting. Will investigate more…

djspiewak avatar May 24 '22 03:05 djspiewak

I was able to reproduce the hang as late as v3.3.14, but it no longer hangs as of v3.4.0-RC1, nor in the current release, v3.5.3.

armanbilge avatar Jan 17 '24 05:01 armanbilge