SCollectionMatcher issue when targeting Java 17+

Open f-loris opened this issue 5 months ago • 2 comments

I recently upgraded a project from Java 8 to Java 21, and my tests fail when targeting Java 17 or 21 bytecode. Specifically, some JobTests that use the matchers satisfy/satisfySingleValue started to fail.

I built a simple reproducer, which shows that something drops the local variable captured by my lambda expression: its value is null when the lambda is executed.

Scio: 0.14.7
Java: 21
Scala: 2.13.14
Scala compiler flags: -release 21 -Ydelambdafy:inline -Ymacro-annotations -language:higherKinds -language:implicitConversions

import com.spotify.scio.ContextAndArgs
import com.spotify.scio.io.TextIO
import com.spotify.scio.testing._

object TestPipeline {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)
    sc.read(TextIO(args("source")))(TextIO.ReadParam())
      .map(v => s"Hello $v")
      .write(TextIO(args("destination")))(TextIO.WriteParam())

    sc.run().waitUntilFinish()
  }
}

object IssueReproducer {
  val WORLD = "World"
}

class IssueReproducer extends PipelineSpec {

  // fails now
  it should "work with local value" in {
    val world = "World"
    JobTest[TestPipeline.type]
      .args("--source=source.csv", "--destination=destination.csv")
      .input(TextIO("source.csv"), Seq(world))
      .output(TextIO("destination.csv"))(
        _ should satisfySingleValue[String] { value =>
          println(s"Hello $world") // shows "Hello null"
          value == s"Hello $world"
        }
      )
      .run()
  }

  // works
  it should "work with global value" in {
    JobTest[TestPipeline.type]
      .args("--source=source.csv", "--destination=destination.csv")
      .input(TextIO("source.csv"), Seq(IssueReproducer.WORLD))
      .output(TextIO("destination.csv"))(
        _ should satisfySingleValue[String] {
          _ == s"Hello ${IssueReproducer.WORLD}"
        }
      )
      .run()
  }

}

The failing test prints "Hello null" because the variable world is null inside the lambda, so the assertion fails:

apply@{Matchers.scala:7291}:2/ParMultiDo(Anonymous).output: assertion failed
java.lang.AssertionError: apply@{Matchers.scala:7291}:2/ParMultiDo(Anonymous).output: assertion failed
	at org.apache.beam.sdk.testing.PAssert$PAssertionSite.capture(PAssert.java:175)
	at org.apache.beam.sdk.testing.PAssert.thatSingleton(PAssert.java:498)
	at org.apache.beam.sdk.testing.PAssert.thatSingleton(PAssert.java:490)
	at com.spotify.scio.testing.SCollectionMatchers$$anon$19$$anon$20.apply(SCollectionMatchers.scala:488)
	at com.spotify.scio.testing.SCollectionMatchers$$anon$19$$anon$20.apply(SCollectionMatchers.scala:482)
    ....

I already experimented with ClosureCleaner and Externalizer, but I couldn't reproduce the problem with them alone. When I change the Scala compiler release flag to 8 or even 11, the test passes, but it fails with 17 or 21. Isn't this supposed to work when targeting newer Java versions? Is there a general risk that targeting Java 21 breaks things like this in production pipelines too, and not only in tests? So far I haven't discovered any other issue.
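
For comparison, a closure capturing a local value can be checked outside of Scio with a minimal sketch like the one below. It uses only plain JDK serialization, not Scio's ClosureCleaner or Externalizer, so it is a baseline check under that assumption and not a reproduction of what the matchers do internally. The names ClosureRoundTrip and roundTrip are hypothetical and exist only for this illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical helper object, not part of Scio.
object ClosureRoundTrip {

  // Serialize an object with plain Java serialization and read it back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()

    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val world = "World" // local value captured by the lambda below
    val predicate: String => Boolean = value => value == s"Hello $world"

    // If the captured value survives the round trip, the restored
    // predicate should still accept "Hello World".
    val restored = roundTrip(predicate)
    println(restored("Hello World"))
  }
}
```

Compiling this once with -release 11 and once with -release 21 (same remaining flags as above) would show whether plain serialization alone is affected, or whether the captured value is only lost somewhere in the Scio test path.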

f-loris, Sep 12 '24 13:09