kotest Kotest >= 5.7.0: Jobs launched in beforeSpec can delay or hang the first test

With the following test class, kotest 5.7.x will execute the test, but hang before calling afterSpec while the test would complete as expected with kotest 5.6.2:

class KotestHangSpec : FunSpec(
  {
    lateinit var job: Job
    beforeSpec {
      println("Starting background job")
      val parentJob = checkNotNull(Job(currentCoroutineContext()[Job])) { "No parent job" }
      job = GlobalScope.launch(parentJob) { delay(Duration.INFINITE) }
    }

    afterSpec {
      println("Cancelling background job")
      job.cancel()
    }

    afterTest { (case, result) ->
      println("Test '${case.name.testName}' $result")
    }

    test("always pass") {
      println("Executing test")
    }
  }
)

This happens when a coroutine is launched in the GlobalScope by the beforeSpec function, passing the calling coroutine's job to the launch function, or passing the entire currentCoroutineContext() to the launch function.

Further investigation and understanding of coroutines APIs leads me to believe that providing the parent job is not necessary, and possibly just plain wrong, but hanging indefinitely is undesirable, even with unexpected inputs.

Nov 11 '23 05:11 wfhartford

I've created a repo with a full project to reproduce the issue here: https://github.com/wfhartford/kotest-hang-3759

Nov 11 '23 05:11 wfhartford

Thanks for reporting and the reproducer! What happens here is the following:

      val parentJob = checkNotNull(Job(currentCoroutineContext()[Job])) { "No parent job" }
      job = GlobalScope.launch(parentJob) { delay(Duration.INFINITE) }

This code seems to launch a coroutine in GlobalScope, but it does not, because it immediately attaches a parent Job from the current coroutine context. The code is equivalent to

job = CoroutineScope(coroutineContext).launch { delay(Duration.INFINITE) }

Passing a Job parameter to a coroutine builder is only possible for historical reasons, but strongly discouraged: Disallow Job passed as context to withContext/launch/async · Issue #3670 · Kotlin/kotlinx.coroutines

This case can be easily fixed by creating a job running independently via

job = GlobalScope.launch { delay(Duration.INFINITE) }

Then it would no longer hang.

But why does it hang? Your coroutine is launched as a child of the current coroutineContext. But this is not the spec-level context you'd expect. Rather, it is the context of the first test running in that spec. The cause is that a spec is instantiated lazily by createAndInitializeSpec, which says:

This avoids creating specs that do nothing other than scheduling tests for other specs to run in.

So here, you were creating a child job, which was then waited for by the coroutine scope surrounding your test. And the test scope could not complete without your child job completing.

This is a bug in Kotest which should be addressed:

When launching a coroutine on a spec level, it should have a spec-level coroutine context.
Ideally, the spec would provide a CoroutineScope which could be used to directly launch any companion coroutines which were needed for the spec.
Possibly, the invocation mechanism of the various before/after/wrapping interceptors should be cleaned up on this occasion as well.

Actually, wrapping interceptors are much better suited to coroutines and structured concurrency than the traditional before/after mechanisms. E.g., I'm frequently using one like this:

/**
 * A test spec and test case extension providing a coroutine context to a spec and each of its test cases.
 */
open class CoroutineContextTestSpecExtension(private val coroutineContext: CoroutineContext) :
    SpecExtension, TestCaseExtension {

    override suspend fun intercept(spec: Spec, execute: suspend (Spec) -> Unit) =
        withContext(coroutineContext) {
            execute(spec)
        }

    override suspend fun intercept(testCase: TestCase, execute: suspend (TestCase) -> TestResult): TestResult =
        withContext(coroutineContext) {
            execute(testCase)
        }
}

This should also be considered to provide users with a "safe by default" experience in Kotest.

Nov 11 '23 16:11 OliverO2

Thanks for the detailed explanation. I didn't realize that the context's Job was so closely linked to the scope.

Ideally, the spec would provide a CoroutineScope which could be used to directly launch any companion coroutines which were needed for the spec.

This is what I was looking for when writing that code and settled for GlobalScope.

Nov 11 '23 16:11 wfhartford

Apparently a regression introduced by #3649, so possibly only affecting beforeSpec.

Nov 11 '23 16:11 OliverO2

@OliverO2 do you think we should launch all tests in a fresh child coroutine even if it's the first test of the spec?

Nov 12 '23 18:11 sksamuel

Running each test in its own coroutine makes the most sense to me (even if it is the first/only test in a spec), as any problems with non-terminating coroutines will surface immediately. We could even add some diagnostics like a timeout triggering a coroutine dump (by default: without aborting a test, as CI can be too slow to find a reasonable timeout).

And then, from a user's point of view, we should have a separate coroutine per spec, being the parent of everything at the next lower level (context or test). So the coroutine hierarchy should mirror the test hierarchy. That would be most in line with user expectations, I'd say.

Nov 12 '23 22:11 OliverO2

Running each test in its own coroutine makes the most sense to me (even if it is the first/only test in a spec), as any problems with non-terminating coroutines will surface immediately. We could even add some diagnostics like a timeout triggering a coroutine dump (by default: without aborting a test, as CI can be too slow to find a reasonable timeout).

And then, from a user's point of view, we should have a separate coroutine per spec, being the parent of everything at the next lower level (context or test). So the coroutine hierarchy should mirror the test hierarchy. That would be most in line with user expectations, I'd say.

We should do this for 6.0 then I think.

Mar 10 '24 03:03 sksamuel