Play json formats which serialize traits do not work with Lagom's persistence layer
Lagom Version: 1.3.x
API: Scala
Problem:
The process of selecting a potential serializer differs between Akka and Lagom. As a result instances of Format[T] which serve as serializers for traits/base classes are either not used or not found when registered with a JsonSerializerRegistry (for use in Lagom's persistence layer).
Details:
Given a trait and case class like this:
sealed trait UserEvent
case class UserUpdatedEvt(email: String,
passwordHash: String,
firstName: String,
lastName: String,
timestamp: Instant) extends UserEvent
// and more case classes which extend UserEvent...
and a Format[T] as well as a JsonSerializerRegistry:
object UserEvent {
  implicit val format: Format[UserEvent] = new Format[UserEvent] {
    private val updatedFormat = Json.format[UserUpdatedEvt]
    override def reads(json: JsValue): JsResult[UserEvent] = {
      // determine which case class we intend to deserialize by
      // inspecting the json and return JsSuccess
      (json \ "type").validate[String].flatMap {
        case "UserUpdatedEvt" => json.validate[UserUpdatedEvt](updatedFormat)
        // ...and more cases for the other case classes...
        case other            => JsError(s"Unknown event type: $other")
      }
    }
    override def writes(o: UserEvent): JsValue = {
      // use a match statement to determine which case class we are
      // serializing and add an extra field (e.g. 'type') to the resulting
      // JsObject (for use in identifying the correct type during deserialization)
      o match {
        case e: UserUpdatedEvt =>
          updatedFormat.writes(e) + ("type" -> JsString("UserUpdatedEvt"))
        // ...and more cases for the other case classes...
      }
    }
  }
}
object UserSerializerRegistry extends JsonSerializerRegistry {
  override def serializers: Seq[JsonSerializer[_]] = Seq(
    JsonSerializer[UserEvent]
  )
}
When attempting to persist an instance of UserUpdatedEvt:
If the akka.actor.serialization-bindings entry for "java.io.Serializable" (in application.conf) is left at its default, Akka will:
- find both a Java serializer as well as the provided PlayJsonSerializer
- produce the warnings below and use the Java serializer:
[warn] a.s.Serialization(akka://account-service-impl-application) - Multiple serializers found for class com.account.impl.UserUpdatedEvt, choosing first: Vector((interface java.io.Serializable,akka.serialization.JavaSerializer@36a7e601), (interface com.account.impl.UserEvent,com.lightbend.lagom.scaladsl.playjson.PlayJsonSerializer@73a57eb1
[warn] a.s.Serialization(akka://account-service-impl-application) - Using the default Java serializer for class [com.account.impl.UserUpdatedEvt] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
Note that even though it defaults to java serialization it does actually find the PlayJsonSerializer for the UserEvent trait. This is because Akka finds potential serializers in its registry by calling isAssignableFrom as seen here: Serialization.scala#L233
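To make that selection behavior concrete, here is a minimal, stdlib-only sketch (not Akka's actual code; the binding names are illustrative) showing that a case class matches both the java.io.Serializable binding and the registered trait binding via isAssignableFrom:

```scala
// Scala case classes extend java.io.Serializable (via Product with
// Serializable), so both bindings below match via isAssignableFrom.
sealed trait UserEvent
final case class UserUpdatedEvt(email: String) extends UserEvent

val bindings: List[(Class[_], String)] = List(
  classOf[java.io.Serializable] -> "JavaSerializer",
  classOf[UserEvent]            -> "PlayJsonSerializer"
)

val evt: AnyRef = UserUpdatedEvt("user@example.com")
val matching = bindings.collect {
  case (clazz, name) if clazz.isAssignableFrom(evt.getClass) => name
}
println(matching) // List(JavaSerializer, PlayJsonSerializer)
```

Both candidates match, so Akka warns about multiple serializers and picks the first one in the list, which here is the Java serializer.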
If the akka.actor.serialization-bindings entry for "java.io.Serializable" (in application.conf) is set to none, Akka will:
- find the PlayJsonSerializer and call toBinary, as seen here: Serialization.scala#L114
- PlayJsonSerializer will then attempt to use the manifestClassName to look up the corresponding serializer in a Map[String, Format[AnyRef]], as seen here: PlayJsonSerializer.scala#L48
- The lookup will fail because the manifestClassName is "UserUpdatedEvt", not "UserEvent"
- An exception is thrown, producing an error like:
java.lang.RuntimeException: Missing play-json serializer for [com.account.impl.UserUpdatedEvt]
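The failing lookup can be illustrated with a toy map (a simplification, not PlayJsonSerializer's real data structure; the class names come from the log above):

```scala
// The registry's formats are keyed by the *registered* class name (the
// trait), but the manifest is derived from the object's runtime class.
val formats: Map[String, String] = Map(
  "com.account.impl.UserEvent" -> "Format[UserEvent]"
)
val manifestClassName = "com.account.impl.UserUpdatedEvt" // o.getClass.getName
val found = formats.get(manifestClassName)
println(found) // None -- hence the "Missing play-json serializer" exception
```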
Potential fixes:
For the Java serialization default-selection issue:
- After fixing the lookup issue (see below), set the akka.actor.serialization-bindings entry for "java.io.Serializable" to none whenever a non-empty PlayJsonSerializer registry is specified in the LagomApplication (all-or-nothing approach)
- After fixing the lookup issue (see below), add documentation stating that if trait/base-class Format[T]s will be used in the persistence layer, then the "java.io.Serializable" binding should be manually set to none in application.conf
- Find some way to give PlayJsonSerializer higher priority (e.g. Jsonable :poop:)
For the PlayJsonSerializer serializer lookup issue:
- ~~Implement a similar selection process using isAssignableFrom in PlayJsonSerializer. The downside here is that we'd be repeating the work already done by Akka.~~
- ~~To avoid the repeated reflection of solution 1 we could instead:~~
  - ~~add a def toBinary(o: AnyRef, clazz: Class[_]): Array[Byte] method to the Serializer trait in Akka and have Akka call that instead, passing along the Class[_] which led to the selection of the serializer being called (in the example above this would be the UserEvent trait's Class[_]).~~
  - ~~Override this new method in PlayJsonSerializer and use the provided Class[_] during Format[T] lookup.~~
  - ~~For backwards compatibility the default implementation of this new method should be: def toBinary(o: AnyRef, clazz: Class[_]): Array[Byte] = toBinary(o)~~
I have yet to look at the deserialization side of this problem, so I'm not sure if there will be similar issues to resolve. I'm looking to start a discussion around this to see if anyone has ideas for a better solution.
Final Note: I ran into this problem while using play-json-derived-codecs for PersistentEntity serialization/deserialization (in an attempt to reduce boilerplate). I'm aware that this problem can be avoided by ditching the derived codecs and declaring a unique Format[T] for each case class. However, after figuring out the root cause, I'm inclined to say that Akka is more-or-less correctly assuming that Serializer.toBinary should be able to handle covariant arguments. After all, this is the same assumption we make when calling any function that takes some object interface as an argument. The rules of polymorphism are just a little convoluted here by the fact that we first register the serializer's "actual argument type" with Akka, and then later implement an interface which takes an AnyRef.
Update:
I think I found a better solution: Have the serializerDetailsFor method in JsonSerializerRegistry produce a sequence of SerializerDetails mapping each entityClass to a separate instance of PlayJsonSerializer:
def serializerDetailsFor(system: ExtendedActorSystem, registry: JsonSerializerRegistry): immutable.Seq[SerializerDetails] = {
  registry.serializers.map { serializer =>
    SerializerDetails(
      s"lagom-play-json.serialization.bindings.${serializer.entityClass.getName}",
      new PlayJsonSerializer(system, serializer, registry.migrations),
      Vector(serializer.entityClass)
    )
  }
}
Note: Entity class here refers to the persisted event's class - not the PersistentEntity class.
As you can see above, I also changed the constructor for PlayJsonSerializer to take a single serializer and all of the migrations. Next, in PlayJsonSerializer we no longer need to look up a format because there is only one available. We can assume that if toBinary is being called by Akka, it's because Akka has already determined that this serializer should be able to handle the incoming object:
override def toBinary(o: AnyRef): Array[Byte] = {
  // ...elided
  val format = serializer.format.asInstanceOf[Format[AnyRef]]
  val json = format.writes(o)
  val result = Json.stringify(json).getBytes(charset)
  // ...elided
  result
}
Finally, in fromBinary we can also trust Akka's selection process and just use the only available serializer.
I've made these changes locally and verified that they do solve the problem as stated above. However I could use some help understanding if/how this might impact the migration process. My initial thoughts are that migrations should not be affected because migration selection is still based on the object's manifestClassName. It would just be up to the user to make sure the Format[T] instance will actually be capable of deserializing the migrated/transformed JSON.
I'm new to the contribution process. I'd like to push up my early work for folks to see/comment. Should I link a pr in my fork of Lagom or should I create an upstream review-only PR?
Hi @crfeliz,
I have to say, I'm surprised that this didn't come to the surface before. I'm happy to have a look and try to reproduce it. Or, if you have some sample app around you can publish it on GitHub and put the link here.
With respect to PRs, the procedure is to fork the repo, push your changes to your fork and create a PR in GitHub from your fork to the master branch in Lagom.
We will probably need some iterations on reproducing the issue and discussing the PR.
Hello @rcavalcanti, thanks for the info.
I've created a [WIP] pr with the beginnings of a fix: #787 I've also created a sample project which reproduces the issue: https://github.com/crfeliz/lagom-issue-786-repro
Thanks @crfeliz. The sample will be very helpful. I will have a look on it tomorrow.
I'm running into an issue on the deserialization side with the approach in #787:
Each message is persisted in the log with an int serializer id. That serializer id is then used to look up the deserializer. These ids must be unique. Lagom configures a static serializer id for PlayJsonSerializer in the play-json project's reference.conf, following the pattern dictated by Akka: akka.actor.serialization-identifiers."FQCN" = ID. Because this PR creates multiple instances of PlayJsonSerializer, Akka ends up picking the first serializer rather than the correct one when deserializing. Overriding the serialization-identifiers in the user app doesn't work because the FQCN for each serializer is the same (in #787's implementation).
Some options:
1. Dynamically generate the ids (by overriding the identifier field of the Serializer trait). However, this id generation has to be stable, which would make declaring the serializers brittle (depending on what the id generation is based on).
2. Have the user declare static ids for each registered serializer, either as part of the JsonSerializerRegistry declaration or via a config pattern similar to the one used by Akka.
3. Make option 2 an optional mode of operation, i.e. by default use the static id configured by Lagom and a single unified serializer, but if the user declares the ids, switch to using the user-defined ids and separate serializers.
If there is a way to implement option 1 in a robust and non-obtrusive way, I'd favor that. I'm not sure it's possible, though. Option 2 could work, but it would make using JSON for serialization a little more work, since these ids would have to be declared. Option 3 would probably require too much explanation in the docs. Any better ideas?
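For reference, option 1 might look something like the sketch below (a hypothetical helper, not actual Lagom code; the comment notes exactly the brittleness concern raised above):

```scala
// Hypothetical sketch of option 1: derive the serializer id from the
// registered class's fully qualified name.
def serializerIdFor(entityClass: Class[_]): Int = {
  // String.hashCode is specified by the JLS, so it is stable across
  // JVMs -- but renaming or moving the class silently changes the id,
  // which is the brittleness mentioned in option 1.
  entityClass.getName.hashCode
}

val id = serializerIdFor(classOf[java.time.Instant])
println(id == "java.time.Instant".hashCode) // true
```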
Update: I found a fourth option to address the deserialization issue (details in #787's description). I think #787 works as a good proof of concept for fixing this. However because this approach changes the format of the manifest string produced by the serializers we have to be 100% sure that replay/migrations still work. What I have in the pr is intended to be backwards compatible but I definitely need more experienced eyes on this to know if there are any other touch points I'm missing.
Hi @crfeliz
First of all, thanks for this great investigative work and detailed report. You have found lots of important points here. I finally had the time to go through all of this. My apologies for the delay.
Building up from what you have found so far, I have the following in mind:
Ideally we should be able to give precedence to serialisers for the most specific type. I still need to check the Akka code on that, but I guess it's favouring the first on the list, which unfortunately is the JavaSerializer. The Akka team really recommends avoiding Java serialization, so we can escalate that issue to them.
The other point, which is more related to Lagom's philosophy, is that we must avoid pushing the management of serialiser IDs onto users. This is error-prone and dangerous. Data persisted with the wrong serialization strategy is an open door to hell in production systems.
Moreover, although we can all agree that disabling Java serialization is a good thing to do, we can't suppress that option. So, we should be able to have Java serialization enabled and have serialisers based on ADTs (sealed traits) at the same time.
All that to say that your work is of great help and we have lots of material to think about and evaluate. Unfortunately, we will need some more iterations on this and will need to evaluate other options as well.
I suggest that we keep this PR (#787) open for the moment and we keep discussing here. Eventually we open other parallel PRs or branches with some alternative implementations.
Thanks a lot!
Hmm, I found this comment in Akka's Serialization.scala:
// bindings are ordered from most specific to least specific
I will need to debug it to understand why it's picking two and not detecting that the sealed trait is more specific than java.io.Serializable
@rcavalcanti
What they mean by "more specific" is that subclasses will be selected before their base classes. But because arbitrary traits don't subclass Serializable, the ordering between them is not guaranteed (hence the second comment in the sort method in Serialization.scala). I mentioned this in the comments for #787. And cool, I'm in favor of more iterations. I haven't mentioned it here yet, but I actually added tests for various scenarios in the context of #787's changes (including tests for migrations and backwards compatibility). The only piece I haven't sorted out is how to cleanly handle the Java serialization problem. I'm beginning to think the only options are to change Akka's selection process (e.g. always prefer anything besides Java serialization) or dynamically (or manually) disable Java serialization when PlayJson serializers are available.
Would it help if I commented on the code in #787? I think the fundamental fix there is sound and worth consideration as the start of a final solution.
Regarding akka serializer selection here is the comment I referenced previously: Serialization.scala
Yes, I didn't have time to think or check it further, but I was suspecting that the problem was related to the fact that we need to match a trait.
I also saw your tests. Really good you have thought about covering all those cases.
We still have the problem of having many serialisers but only one serialiser id.
FYI, related stuff: https://github.com/akka/akka/issues/17252
Thanks for that info @rcavalcanti. I closed the previous PR and addressed the remaining issues:
- Reintroduced Jsonable until akka/akka#17252 is addressed
- One serializer for all (with one serializer id)
  - In the previous implementation I'd assumed that looking up the serializer by id was only ever used for deserialization. However, after you raised your concerns I did a little digging and found this isn't necessarily the case. Reverting to one-serializer-for-all alleviates this concern.
- Clarified terms to make all of it easier to follow (I hope 😃):
  - class registered with the serializer (may or may not be abstract): registeredClass
  - class of the runtime object being serialized/deserialized: actualClass
My primary concern with respect to this feature as a whole is its potential effect on migration performance. As you'll see in the PR, it's necessary to track rename migrations for both registeredClass and actualClass in order to properly support this feature. As a result:
- creating and parsing the manifest string requires more steps
- applying a migration requires the additional step of migrating the registeredClassName before looking up the Format[T]
- and (imo) the most displeasing change is the need to search for the serializer using isAssignableFrom during serialization -- doubling up on the search already performed by Akka internally.
After thinking about this for a while, I'm fairly certain that in order to have this feature we have to pick one of these three options:
- Run-time approach (implemented by my new PR): fully automated and mostly transparent to the user (with the exception of needing to extend Jsonable until the Akka issue is addressed), at the cost of a little more CPU-bound work in both serialization and deserialization, and more metadata in the messages themselves.
- Compile-time approach: generate serializer ids for separate specialized serializers (at the cost of making whatever the id is based on unchangeable -- 🚫 unless we want to have serializer id migrations as well).
- Manual approach: have the user manage serializer ids by having them passed into Lagom (I'm just being thorough in listing the options -- I'm very much not in favor of this).
Another thing to consider is what would happen when a user restructures a class hierarchy. I haven't actually tried running this scenario in a test yet (in the context of trait/base class serializers) but I have a hunch that this would lead to complicated migrations and potentially even near-impossible-to-migrate scenarios.
Having gone through the paces, I think there is a fourth option worth considering:
- Discourage and/or disallow this altogether:
  - Explain in the docs why this is not permitted
  - Use macros to assert that any class registered with JsonSerializer is final
These three tests keep failing with the same error java.lang.RuntimeException: barrier failed: cassandra-started:
[error] (persistence-cassandra-scaladsl/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] (kafka-broker-scaladsl/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] (persistence-cassandra-javadsl/test:test) sbt.TestsFailedException: Tests unsuccessful
However the tests pass when I run them locally. Are these tests known to be flaky or am I just missing something?
Yea definitely flaky tests. #792 passing now.
@rcavalcanti I'm thinking more and more that this feature is a bad idea. After playing around with this feature in an actual project and considering all the things that it "enables" -- I'm inclined to say that the right answer is to explicitly disallow the use of traits/base class serializers. There are probably many weird scenarios and migration puzzlers that can arise from enabling this:
case class MyEvent() extends BaseTraitA with BaseTraitB

val fmtA: Format[BaseTraitA] = ??? // code for Format[BaseTraitA]
val fmtB: Format[BaseTraitB] = ??? // code for Format[BaseTraitB]
// ...build JsonSerializerRegistry
// ...who serializes what??
or
trait GenericTrait[T]
case class MyEvent1() extends GenericTrait[String]
case class MyEvent2() extends GenericTrait[Int]

val serializerA = ??? // code for JsonSerializer with Format[GenericTrait[String]]
val serializerB = ??? // code for JsonSerializer with Format[GenericTrait[Int]]
// ...erasure happens
// ...build JsonSerializerRegistry
serializerA.entityClass == serializerB.entityClass // this will be true
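The erasure collision in that second example can be demonstrated directly with plain classOf comparisons (same hypothetical event names):

```scala
trait GenericTrait[T]
final case class MyEvent1() extends GenericTrait[String]
final case class MyEvent2() extends GenericTrait[Int]

// After erasure, both parameterizations share a single runtime Class,
// so two serializers registered for them would collide on entityClass.
val classA: Class[_] = classOf[GenericTrait[String]]
val classB: Class[_] = classOf[GenericTrait[Int]]
println(classA == classB) // true
```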
The list goes on...
- renaming in complicated hierarchies --> ?
- Changing type parameter Arity --> ?
- Changing type param arguments --> ?
The more I play with it the more I'm convinced it is a Pandora's Box 😅
@crfeliz It's true that there are many scenarios that will make migration very hard, however I still think we should provide some facilities for ADTs. Maybe not in the way we have thought / tried so far.
The Java API uses annotations and Jackson behind the scenes. Users don't have to declare the serializers by hand, because Commands and Events are already annotated. The result is that the final manifest is always the actual Command/Event and the serialization is resolved at runtime. So, in Java it's kind of safe, but with a runtime penalty.
For Scala, we could eventually have something that blocks users from shooting themselves in the foot, but maybe we should not go that far. If we provide some facilities to help with the declaration of ADTs, that may be enough, at least for the moment.
I have the following in mind:
sealed trait GenericEvent extends Serializable
case class SpecificEvent1(x: Int) extends GenericEvent
case class SpecificEvent2(s: String) extends GenericEvent
case class MigratedSpecificEvent(addedField: Int, newName: String) extends GenericEvent

case class UnrelatedEvent(y: Boolean)

def serializers = Seq(
  JsonSerializer[UnrelatedEvent]
) ++ DerivedJsonSerializer[GenericEvent]
The trick is to have a special one that we can use when we have one single Format for the whole ADT.
DerivedJsonSerializer should return a Seq[JsonSerializer]. The Seq would contain an individual JsonSerializer for each unique implementation of the passed sealed trait.
This would be very close to what the Java API offers and would also give the possibility to define migrations.
However, there is still the possibility to mess-up with everything, for instance:
def serializers = Seq(
  JsonSerializer[UnrelatedEvent],
  JsonSerializer[SpecificEvent1]
) ++ DerivedJsonSerializers[GenericEvent]
The above code would have the effect of declaring two JsonSerializer for SpecificEvent1.
@rcavalcanti
I really like that idea! In fact, the problem scenario you mentioned doesn't seem like too big of a deal, since it's possible to "mess up" that way even now using JsonSerializer.apply, which takes a Format[T] as an argument. Also, I'm wondering if that problem could be solved by having serializers be a Set and overriding JsonSerializer's equals and hashCode.
Clarification: changing serializers to a Set wouldn't prevent the user from trying to add two serializers for the same type (for example, one explicit serializer and another one generated by the macro). However, it would at the very least signify uniqueness in the type signature.
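A minimal sketch of what keying equality on the registered class could look like (KeyedSerializer is a hypothetical stand-in for Lagom's JsonSerializer, not the actual class):

```scala
// Equality and hashCode keyed only on the registered entity class.
final class KeyedSerializer(val entityClass: Class[_]) {
  override def equals(other: Any): Boolean = other match {
    case that: KeyedSerializer => that.entityClass == entityClass
    case _                     => false
  }
  override def hashCode: Int = entityClass.hashCode
}

// Duplicate registrations for the same type collapse to one entry --
// note this silently drops one rather than flagging the conflict.
val serializers: Set[KeyedSerializer] = Set(
  new KeyedSerializer(classOf[String]),
  new KeyedSerializer(classOf[Int]),
  new KeyedSerializer(classOf[String])
)
println(serializers.size) // 2
```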
@rcavalcanti
First, I propose the name ADTJsonSerializers rather than DerivedJsonSerializers, because "derived" is a reference to a specific library not included in Lagom. Moreover, the generic formats might not be derived -- they could be manually created.
Second, after thinking about your macro solution for a while, I believe the only thing it doesn't fix is the (rare) scenario where a concrete event class doesn't extend Serializable. In that case Akka might still choose the Java serializer. We can close this hole easily by adding a [T <: Serializable] type constraint to the JsonSerializer builder methods.
@crfeliz "Derived" is not a reference to play-json-derived-codecs as you might have thought. It refers to the technique of deriving a typeclass from an existing sealed trait, which is called "typeclass derivation". This technique is used by play-json-derived-codecs, hence that library's name.
It's also not referring to derived Formats, but to derived JsonSerializers. The Formats may have been generated or written by hand, but the macro does derive as many JsonSerializers as there are concrete implementations of the sealed trait.
With respect to Serializable, we intend to add a flag in Akka to solve it. However, since Lagom's Java DSL already has a Jsonable, it might be better to reuse that instead. The current problem is that it lives in Lagom's jackson module. I plan to move it to another package so it can be reused by the play-json module as well.
@rcavalcanti Ah ok. My mistake. Something I need to go study up on 😅 .
Regarding Serializable: when you say you want to bring back Jsonable, does that imply that JsonSerializerRegistry will require Jsonables, or do you mean that the trait will simply be available for users who (for whatever reason) want to use a regular class or object for their events?
I'm not sure yet what would be the best approach.
The Akka flag we are planning may not be enough. We could turn the flag on by default in Lagom; in Akka it must stay off.
The problem is that we might have users who have inadvertently serialized their Events using the Java serializer. That would happen if they don't implement the existing Jsonable trait in the Java DSL.
I think we are safe for the Scala DSL. As you have shown, it's not enough to implement Serializable or Jsonable when declaring the Format at the level of the ADT's base trait.
We can assume, I think, that all Scala users out there have declared all Formats one by one. In that case, it's not necessary to implement Jsonable, because Akka will pick the most specific serializer and therefore will never pick the JavaSerializer.
Therefore we might consider adding the upper bound type ([T <: Jsonable]) only for DerivedJsonSerializers.
If I'm not mistaken turning on the flag (as you described it in akka/akka#17252 ) would only affect newly written events. During event playback the serializer is looked up by serializer id - so those events which inadvertently used java serialization would be unaffected. Maybe a little unfortunate but at least nothing breaks.
Regarding the Scala DSL: I for one immediately went for derived Formats, and I'm not sure it's safe to assume others didn't do the same. Having the upper bound on serializers produced by DerivedJsonSerializers would protect users who are aware of this feature. However, for others who might have already declared a bunch of JsonSerializer[GenericTrait], or for those who don't know what DerivedJsonSerializers implies (like myself until a few moments ago, lol): not adding a corresponding restriction to JsonSerializer.apply would mean users could continue making this mistake.
So perhaps to prevent all inadvertent Java serialization we can:
- ensure serializers produced by DerivedJsonSerializers have the Jsonable upper bound (as you suggested)
- ensure any format passed to JsonSerializer.apply is a Format[ConcreteClass <: Jsonable] (is it possible to check if a class is "concrete" or final via macros?)
- turn on the new flag in Akka to ensure Java users are covered for all new writes
Yes, a macro can detect whether a class is concrete or final.
If I'm remembering correctly, in your investigation you showed that if someone declares the Format for a base trait and disables Java serialization in Akka, we get an error, because the PlayJsonSerializer will attempt to find a serializer for the concrete class-name manifest while we only have one for the base trait.
That means that JsonSerializer[T] without an upper bound may be OK. However, if we can turn it into a macro that checks for concrete types, that would be even better.
Actually, JsonSerializer will never work for traits. It must be a concrete type because we need the right manifest. A compile-time error explaining why and how to fix it would be a very good solution to that problem.
Yep - that's what I was thinking. And the same manifest lookup problem would occur if the serializer were for a concrete base class. This is why I thought ensuring that the class is 'final' might be good.
I've been running into this issue. It's unfortunate to not have problems in development and then run into issues in staging/prod!
I ended up just wrapping my Seq[Item] with a custom Items case class to work around this.
We ran into this issue. It seems the easiest workaround is to mark any traits we want to serialize in the persistence layer with a marker trait that is bound to jackson-json in the serialization-bindings config, and not bother with the SerializerRegistry for those types.
Anyone see any problems with that approach? It seems to solve it for us..
@blanchet4forte, are you using Lagom 1.6?
We have the same problem with Lagom 1.6. Our Lagom services are deployed in Docker containers in AWS ECS, and use Cassandra in AWS Keyspaces. The only workaround that works for us, as suggested by @blanchet4forte , is to bypass SerializerRegistry altogether and instead use Akka jackson-json serializers exclusively. This does require us to convert events and commands that are case objects into case classes.