scala3 icon indicating copy to clipboard operation
scala3 copied to clipboard

Scala 3 lazy vals are not serialization-safe

Open lbialy opened this issue 1 year ago • 8 comments

Description

I'm reporting a vulnerability in Scala 3's lazy val implementation concerning Java serialization. JVM can serialize a lazy val field in the "Waiting" state should rhs evaluation take too long. However, because the thread that calls .countDown() isn't present on the recipient machine (the one deserializing the value), the lazy val remains set to Waiting forever, blocking all threads accessing the lazy val field on CountDownLatch#await() call:

"ClusterSystem-akka.actor.default-dispatcher-4" #31 [41987] prio=5 os_prio=31 cpu=52.91ms elapsed=15.05s tid=0x0000000120046a00 nid=41987 waiting on condition  [0x000000017399d000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x000000061ff1fcc0> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:221)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:754)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly([email protected]/AbstractQueuedSynchronizer.java:1099)
	at java.util.concurrent.CountDownLatch.await([email protected]/CountDownLatch.java:230)
	at ClusterApp$Message.bomb$lzyINIT1(ClusterApp.scala:25)
	at ClusterApp$Message.bomb(ClusterApp.scala:24)
	at ClusterApp$WorkerNode$.apply$$anonfun$2$$anonfun$1(ClusterApp.scala:52)
	at ClusterApp$WorkerNode$$$Lambda/0x00000088014197d8.apply(Unknown Source)
	at akka.actor.typed.internal.BehaviorImpl$ReceiveMessageBehavior.receive(BehaviorImpl.scala:152)
	at akka.actor.typed.Behavior$.interpret(Behavior.scala:282)
	at akka.actor.typed.Behavior$.interpretMessage(Behavior.scala:238)
	at akka.actor.typed.internal.InterceptorImpl$$anon$2.apply(InterceptorImpl.scala:57)
	at akka.actor.typed.internal.adapter.GuardianStopInterceptor.aroundReceive(GuardianStartupBehavior.scala:59)
	at akka.actor.typed.internal.InterceptorImpl.receive(InterceptorImpl.scala:85)
	at akka.actor.typed.Behavior$.interpret(Behavior.scala:282)
	at akka.actor.typed.Behavior$.interpretMessage(Behavior.scala:238)
	at akka.actor.typed.internal.adapter.ActorAdapter.handleMessage(ActorAdapter.scala:133)
	at akka.actor.typed.internal.adapter.ActorAdapter.aroundReceive(ActorAdapter.scala:109)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:577)
	at akka.actor.ActorCell.invoke(ActorCell.scala:545)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
	at akka.dispatch.Mailbox.run(Mailbox.scala:231)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
	at java.util.concurrent.ForkJoinTask.doExec([email protected]/ForkJoinTask.java:387)
	at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec([email protected]/ForkJoinPool.java:1312)
	at java.util.concurrent.ForkJoinPool.scan([email protected]/ForkJoinPool.java:1843)
	at java.util.concurrent.ForkJoinPool.runWorker([email protected]/ForkJoinPool.java:1808)
	at java.util.concurrent.ForkJoinWorkerThread.run([email protected]/ForkJoinWorkerThread.java:188)

Compiler version

Scala 3.3.1+, probably Scala 3.3.0 too

Minimized code

https://github.com/VirtusLab/scala-3-lazy-val-vs-java-serialization

Workaround

It is possible to resolve this by annotating the lazy val field with @transient.

Expected outcome

We think that it would be a good idea for the compiler to generate all lazy val fields with transient modifier as this implementation will always be prone to this problem, especially given that LazyValControlState extends Serializable since #16806.

lbialy avatar Jun 28 '24 12:06 lbialy

This is probably impossible to test correctly on our CI.

Once it is fixed, I suggest adding the reproducer repo to the community build. We can run two first steps from the readme with some predefined timeout, to test if the problem is fixed.

@WojciechMazur will this require significant changes to the OCB?

Kordyjan avatar Jun 28 '24 12:06 Kordyjan

This could be simplified further to serialize to binary file but it would still require two jvms. Would that fit to CI?

lbialy avatar Jun 28 '24 12:06 lbialy

will this require significant changes to the OCB?

No, it should be really straightforward. All we need is to include the reproducer repo in the OpenCB config https://github.com/VirtusLab/community-build3/blob/master/coordinator/configs/custom-projects.txt

WojciechMazur avatar Jun 28 '24 12:06 WojciechMazur

Having it just compile will test nothing and with the current shape of the code it just hangs on one thread, there are no assertions as it's just demonstrating the problem. I think it can be greatly simplified (no akka for starters) and incorporated into the test suite of the language instead of OCB.

lbialy avatar Jun 28 '24 12:06 lbialy

//> using scala 3.3.3
//> using jvm 21
//> using dep com.softwaremill.ox::core:0.2.2

import java.io.*
import ox.*
import scala.concurrent.duration.*

case class Message(content: String):
  lazy val bomb: String =
    sleep(200.millis)
    "BOMB: " + content

def serialize(obj: Message): Array[Byte] =
  val byteStream = ByteArrayOutputStream()
  val objectStream = ObjectOutputStream(byteStream)
  try
    objectStream.writeObject(obj)
    byteStream.toByteArray
  finally
    objectStream.close()
    byteStream.close()

def deserialize(bytes: Array[Byte]): Message =
  val byteStream = ByteArrayInputStream(bytes)
  val objectStream = ObjectInputStream(byteStream)
  try
    objectStream.readObject().asInstanceOf[Message]
  finally
    objectStream.close()
    byteStream.close()

@main def main =
  val bytes = supervised:
    val msg = Message("test")

    fork:
      msg.bomb // start evaluation before serialization

    sleep(50.millis) // give some time for the fork to start lazy val rhs eval

    serialize(msg) // serialize in the meantime so that we capture Waiting state

  val deserializedMsg = deserialize(bytes)
  unsupervised:
    @volatile var msg = ""
    val f = forkCancellable:
      msg = deserializedMsg.bomb

    sleep(1000.millis)
    if !msg.isBlank then println(s"succeeded: $msg")
    else
      f.cancel()
      println("failed to read bomb in 1s!")

lbialy avatar Jun 28 '24 15:06 lbialy

arguably can be done without ox and jdk 21 :D one second please

lbialy avatar Jun 28 '24 15:06 lbialy

//> using scala 3.3.3

import java.io.*

case class Message(content: String):
  lazy val bomb: String =
    Thread.sleep(200)
    "BOMB: " + content

def serialize(obj: Message): Array[Byte] =
  val byteStream = ByteArrayOutputStream()
  val objectStream = ObjectOutputStream(byteStream)
  try
    objectStream.writeObject(obj)
    byteStream.toByteArray
  finally
    objectStream.close()
    byteStream.close()

def deserialize(bytes: Array[Byte]): Message =
  val byteStream = ByteArrayInputStream(bytes)
  val objectStream = ObjectInputStream(byteStream)
  try
    objectStream.readObject().asInstanceOf[Message]
  finally
    objectStream.close()
    byteStream.close()

@main def main =
  val bytes = locally:
    val msg = Message("test")

    val touch = Thread(() => {
      msg.bomb // start evaluation before serialization
      ()
    })
    touch.start()

    Thread.sleep(50) // give some time for the fork to start lazy val rhs eval

    serialize(msg) // serialize in the meantime so that we capture Waiting state

  val deserializedMsg = deserialize(bytes)

  @volatile var msg = ""
  @volatile var started = false
  val read = Thread(() => {
    started = true
    msg = deserializedMsg.bomb
    ()
  })
  read.start()

  Thread.sleep(1000)
  if !started then throw Exception("wtf")

  if !msg.isBlank then println(s"succeeded: $msg")
  else
    read.interrupt()
    println("failed to read bomb in 1s!")

lbialy avatar Jun 28 '24 15:06 lbialy

This even shows the nice stack trace from CountDownLatch#await():

Exception in thread "Thread-1" java.lang.InterruptedException
	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1100)
	at java.base/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
	at Message.bomb$lzyINIT1(main.scala:8)
	at Message.bomb(main.scala:7)
	at main$package$.$anonfun$2(main.scala:49)
	at java.base/java.lang.Thread.run(Thread.java:1583)
failed to read bomb in 1s!

lbialy avatar Jun 28 '24 15:06 lbialy

We just got that issue in production, not with Java serialization but with Kryo serialization instead. Trying to call a lazy val on a case class that was deserialized by Kryo led to the thread being stuck on that stack:

   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x0000040c85642870> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:221)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:754)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly([email protected]/AbstractQueuedSynchronizer.java:1099)
	at java.util.concurrent.CountDownLatch.await([email protected]/CountDownLatch.java:230)
	at com.devsisters.ck.entities.AchievementCounter2.values$lzyINIT2(AchievementCounter.scala:65)

We use Scala version 3.6.3 so it should include the fix. Was the fix specific to Java serialization? Is there a way to fix it more generally?

ghostdogpr avatar Feb 28 '25 02:02 ghostdogpr

@ghostdogpr I recall we had a conversation about this via DMs, right? Were you able to find a more general solution for Kryo for this in the end? I think something like additional entries for Kryo's registry for these state-encoding types from LazyVals were required but I'm not sure and it would be good to leave a trace and maybe update docs. Since lazy vals are getting another makeover for 3.8.0 (due to sun.misc.Unsafe sunsetting), it sounds like a good moment to check up on this.

lbialy avatar Aug 02 '25 10:08 lbialy

@lbialy our fix was to add a custom serializer for scala.runtime.LazyVals.Waiting and scala.runtime.LazyVals.Evaluating.type. We contributed it to scala-kryo-serialization in https://github.com/altoo-ag/scala-kryo-serialization/pull/25. We also have a fix for Chill but the library is unmaintained (solution mentioned in this blog post, it's pretty similar). If the new solution in 3.8.0 requires the same kind of fix but with different classes, it will make it harder to add to libraries since those stay on the LTS and won't see these classes.

ghostdogpr avatar Aug 02 '25 10:08 ghostdogpr