
Failed fibers do not log when run with `scala-cli`

Open seigert opened this issue 2 years ago • 6 comments

This script won't log the error when run with scala-cli:

```scala
//> using scala "3.2.1"
//> using option "-new-syntax"

//> using dep "org.typelevel::cats-effect:3.5.0"

import cats.effect.*

object Main extends IOApp.Simple:
  val run: IO[Unit] = IO
    .raiseError(new Exception("Boom!"))
    .background
    .use { out =>
      out.flatMap(IO.println(_))
    }
```

```console
$> scala-cli test.scala
Errored(java.lang.Exception: Boom!)
```

(Almost) the same code prints the error when run from the sbt console.

The attached file won't print a stack trace when run with scala-cli in a terminal. At the same time, `sbt console` prints it:

```console
scala> import cats.effect.*
scala> import cats.effect.unsafe.implicits.*
scala> IO.raiseError(new Exception("Boom!")).background.use(out => out.flatMap(IO.println(_)))
val res0: cats.effect.IO[Unit] = IO(...)

scala> res0.unsafeRunSync()
java.lang.Exception: Boom!
    at rs$line$3$.<clinit>(rs$line$3:1)
    at rs$line$3.res0(rs$line$3)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at dotty.tools.repl.Rendering.$anonfun$4(Rendering.scala:106)
    at scala.Option.flatMap(Option.scala:283)
    at dotty.tools.repl.Rendering.valueOf(Rendering.scala:106)
    at dotty.tools.repl.Rendering.renderVal(Rendering.scala:149)
    at dotty.tools.repl.ReplDriver.$anonfun$7(ReplDriver.scala:365)
Errored(java.lang.Exception: Boom!)
```

seigert avatar May 22 '23 19:05 seigert

Ok, so first of all, not logging is the correct behavior :) To see this, compare:

a program that sequences the join, thus handling the error (so no logging):

```scala
  val run = IO
    .raiseError(new Exception("Boom!"))
    .background
    .use { join =>
      join *> IO.sleep(1.second)
    }
```

with a program that ignores the join, so the error is unhandled (and indeed logged):

```scala
  val run = IO
    .raiseError(new Exception("Boom!"))
    .background
    .use { join =>
      IO.sleep(1.second)
    }
```

So this seems to be a bug related to unsafeRunSync() as used in the REPL. Note that the result of the unsafeRunSync() call is not assigned to a res1, which seems to indicate that it actually threw an error, which is what we are seeing. But I'm not entirely sure about that yet; I need to dig in a bit more.

armanbilge avatar May 23 '23 00:05 armanbilge

Ok, I figured it out: it's not related to scala-cli, REPLs, or unsafeRunSync().

It's a race condition. It comes down to whether IO.raiseError(...) completes before or after the listener is registered on its join. Unless the listener is registered, the error is considered unhandled.

If you change IO.raiseError(...).background to (IO.sleep(1.millis) *> IO.raiseError(...)).background, then that seems to fix it.
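
For illustration, here is a minimal sketch of the original script with that change applied; the `1.millis` delay and the extra duration import are the only differences, and the delay value is just an arbitrary small amount chosen so that `use` can register a listener on the fiber's join before the error fires:

```scala
//> using scala "3.2.1"
//> using dep "org.typelevel::cats-effect:3.5.0"

import cats.effect.*
import scala.concurrent.duration.*

object Main extends IOApp.Simple:
  val run: IO[Unit] =
    (IO.sleep(1.millis) *> IO.raiseError(new Exception("Boom!")))
      .background // the short sleep gives `use` time to register a listener on the fiber's join
      .use { out =>
        out.flatMap(IO.println(_)) // prints Errored(java.lang.Exception: Boom!); nothing is logged to stderr
      }
```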

armanbilge avatar May 23 '23 01:05 armanbilge

I see your point about a race condition, but now I wonder if there should be any race condition at all? Because none of the code examples have branches, the Fiber#join sequencing could in theory be determined statically.

Another example that counters yours :) but still sequences the join while also logging with scala-cli:

```scala
for
  f <- (IO.sleep(1.second) *> IO.raiseError(new Exception("Boom!"))).start
  _ <- IO.sleep(3.seconds)
  o <- f.join
  _ <- IO.println(o)
yield ()
```

Despite the clear fiber.join sequencing, we will still see the stack trace in stderr.

Now OFC there are (right now) no lifetimes for variables in Scala/JVM, but! maybe (maybe!) we could shift error logging from 'the end of fiber computation' to some point in time where 'it is clear that join is not sequenced'? For example:

  1. for .background and similar methods it would be the bracket release: we are canceling the fiber, and it is UB to access the bracket resource after release, so we could check whether anyone actually subscribed to .join during .use;
  2. for a 'regular' Fiber.start, at least on the JVM, logging could be moved to a Cleaner (which, sadly, requires JVM 9+) or the dreaded finalize: if the fiber is phantom reachable, then we can definitely determine whether .join was subscribed to or not (see the sketch after this list).
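
To make item 2 concrete, here is a standalone sketch of the Cleaner idea in plain user code; it is not how cats-effect's fibers are implemented, the names `UnobservedErrorSketch`, `FakeFiber`, `make`, and the `joined` flag are all hypothetical, and only java.lang.ref.Cleaner itself (JVM 9+) is a real API here:

```scala
import java.lang.ref.Cleaner
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical illustration only: remember whether anyone ever subscribed to `join`,
// and let a Cleaner report the error once the fiber-like handle becomes phantom
// reachable without such a subscription.
object UnobservedErrorSketch:
  private val cleaner = Cleaner.create()

  final class FakeFiber:
    private[UnobservedErrorSketch] val joined = new AtomicBoolean(false)
    def join(): Unit = joined.set(true) // a real fiber would register a listener here

  def make(error: Throwable): FakeFiber =
    val fiber  = new FakeFiber
    val joined = fiber.joined // capture only the flag, so the handle itself can be collected
    cleaner.register(fiber, () => if !joined.get() then error.printStackTrace())
    fiber

  def main(args: Array[String]): Unit =
    make(new Exception("Boom!")) // never joined: the error is reported once the handle is GC'd
    System.gc()                  // GC is not guaranteed, but usually enough for a demo
    Thread.sleep(1000)
```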

seigert avatar May 23 '23 11:05 seigert

> Another example that counters yours :) but still sequences the join while also logging with scala-cli: Despite the clear fiber.join sequencing, we will still see the stack trace in stderr.

Sorry, I'm not sure if I understood. That exactly supports what I said—the error is raised after 1 second, but the join is registered after 3 seconds. So at the time when the error is raised, there are no registered listeners, so it is considered unhandled, and the error is logged.

armanbilge avatar May 23 '23 12:05 armanbilge

> That exactly supports what I said—the error is raised after 1 second, but the join is registered after 3 seconds. So at the time when the error is raised, there are no registered listeners, so it is considered unhandled, and the error is logged.

Yes, that is how it happens, but if we look at the code, the actual join is absolutely deterministic -- it will happen every time, it is just delayed. I also think this argument works even with cancelation -- the fiber would have been joined every time if not for the cancel, so a leaked fiber error is the problem of the one who cancels, not the one who got canceled.

So, if we were able to produce some really simple AST for the example above, with the ability to 'look inside' .flatMaps, it would have looked like:

```
Bind(
  Start(RaiseError(..)),
  Bind(
    Join(),
    Println()
  )
)
```

and then we could have said: 'for every Start(..) wrapped into a Bind(..), let's check the right side for a Join(..) on the result of that Start(..)'.

But as we cannot see inside .flatMaps, I propose that we 'shift' the logging of the fiber's error stack trace to 'after' the .flatMap/.use parameter closure has produced a result, as this is the moment in time where we can actually determine whether the fiber was joined or not.

seigert avatar May 23 '23 14:05 seigert

Just linking #2962, as that seems to be the root cause.

durban avatar May 23 '23 15:05 durban