Provide a way to handle upcoming remote wire incompatibility
As discussed in https://github.com/mdedetrich/akka-apache-project/discussions/28, pekko core will need to handle wire incompatibility issues arising from the change in the akka:// address prefix to pekko://. There may also be other wire incompatibility changes, for example if we decide to change the default 2552 port and anything else that we may have missed.
The solution will also need to play with the release strategy that we decide for pekko. Since the previous discussion I have done some thinking about it and at least to me the most amicable solution would be to make these fields configurable with typesafe config with different default values depend on the pekko release branch, i.e.
For pekko 1.0.x we would have
acceptPrefix = ["akka", "pekko"] // The prefix's to use when checking against incoming requests
sendPrefix = "pekko" // The prefix to use when making send requests
And for pekko 1.1.x (on the assumption that we don't want to accept migrating current Akka clusters to this Pekko branch by default)
acceptPrefix = ["pekko"] // The prefix's to use when checking against incoming requests
sendPrefix = "pekko" // The prefix to use when making send requests
(Note that the same strategy can also be applied to the port 2552`.
This means that when doing a rolling upgrade from an existing akka cluster to pekko 1.0.x you would temporarily change sendPrefix to "akka" (so that currently existing Akka clusters don't refuse connections from new pekko nodes) and once the rolling upgrade is complete then you would change sendPrefix to "pekko". After that point you are then free to upgrade to pekko 1.1.x with acceptPrefix as ["pekko"] without any problems and making this configurable with typesafe config means that if users have bespoke update requirements its relatively easy for them to handle it.
This also leaves the door open to first deprecating the acceptPrefix/sendPrefix settings later down the Pekko 1.1.x release cycle and then at some point in the future hardcoding it as pekko.
Can we deserialize protobuf messages sent by an Akka node with Pekko? I'm not too familiar with Protobuf, but akka-kryo-serialization defaults to FQCN for message serialization for example.
Can we deserialize protobuf messages sent by an Akka node with Pekko? I'm not too familiar with Protobuf, but akka-kryo-serialization defaults to FQCN for message serialization for example.
From my first impression this should work because the only thing that got changed in the https://github.com/apache/incubator-pekko/pull/58 PR is the option java_package in the *.proto files which isn't used by the wire protocol but rather in generating the typed sources from the proto definitions.
This obviously needs to be tested to be verified, it seems like the serialization related protobuff messages may need extra handling due to explicitly using FQCN.
Can we deserialize protobuf messages sent by an Akka node with Pekko? I'm not too familiar with Protobuf, but akka-kryo-serialization defaults to FQCN for message serialization for example.
Most (all?) of the internal serialization formats use short custom strings as manifests instead of FQCN, so many serializers should just keep working:
https://github.com/apache/incubator-pekko/blob/4ac0f00a477873965ee7d52e16faefb1de91fe3a/remote/src/main/scala/org/apache/pekko/remote/serialization/MiscMessageSerializer.scala#L313-L320
Just relaying what was stated in the mailing list at https://lists.apache.org/thread/nr7q9orzffolwo9tj753cvkrohr9875o, it appears that this issue is not as high priority as I originally anticipated so that means we don't have to worry about this for 1.0.x (or even at all)
I suppose this would be needed by every large scale live application that uses akka and can not afford a downtime. Most of the large gaming firms use akka .. and they can not afford a downtime. so remote wire compatibility looks like a must have
I suppose this would be needed by every large scale live application that uses akka and can not afford a downtime. Most of the large gaming firms use akka .. and they can not afford a downtime. so remote wire compatibility looks like a must have
To be clear we are not against this change and it is something that we planned for originally in Pekko 1.0.0, its just that we didn't have the capacity to do such a change especially considering that it needs to have extensive testing. Our priority was to get the Pekko modules out so the community would at least have something to use.
If the issue is really that critical then it makes sense for someone to own up to it and contribute it and we will do our best to support such an endeavour
Assigning this to myself as per https://github.com/apache/incubator-pekko/issues/146#issuecomment-1773766845. Also removing low priority because I don't think this is accurate (I know for a fact that there are companies that need rolling upgrade and this is blocking them from moving to Pekko).
I created a draft PR at https://github.com/apache/incubator-pekko/pull/765 which contains a skeleton of the desired implementation, still more to do
I wrote a demo app that shows the #765 do allow Pekko actors to send messages to Akka remote actor instances.
https://github.com/pjfanning/pekko-akka-compat
@mdedetrich @kerr I've added a cluster test to https://github.com/pjfanning/pekko-akka-compat (PekkoAkkaAeronCluster)
The simple remote examples work but the clustering code is more complicated.
The 1st issue that PekkoAkkaAeronCluster runs into is that akka.cluster.JoinConfigCompatCheckCluster checks the compatibility. I am no expert on akka/pekko cluster code but it looks like when akka/pekko nodes try to join a cluster they send a message with HOCON data. Our pekko nodes wrap this data with a pekko wrapper while akka nodes use an akka wrapper.
It seems like there could be 2 solutions.
- We add a further Pekko change so that the message it sends has an
akkawrapper instead of apekkowrapper.- we would also need to change the pekko.cluster.JoinConfigCompatCheckCluster to accept akka wrappers
- We create a new JoinConfigCompatChecker class that can be used in Pekko and we could release one for Akka
- users would need to set these classes in their configs
The HOCON message seems to contain at least these 2 configs (or their Akka equivalents).
private val DowningProviderPath = "pekko.cluster.downing-provider-class"
private val SbrStrategyPath = "pekko.cluster.split-brain-resolver.active-strategy"
I think one place to start is with pekko.cluster.SeedNodeProcess and adjusting the config in the InitJoin message. If the protocol-name config is set to akka then the config wrapper should be changed to akka and any class names that appear in config values should have org.apache.pekko changed to akka. It might be best to have the original config and the adjusted config sent together just in case we are talking to a seed node that is a Pekko node pretending to be an Akka node.
Thanks, ill have a look at it in a few days
I had a slightly closer look and am becoming more pessimistic about whether we can support mixed akka/pekko clusters.
Look at this line from the SplitBrainResolver.
context.system.eventStream.subscribe(self, classOf[ThisActorSystemQuarantinedEvent])
ThisActorSystemQuarantinedEvent is in package org.apache.pekko.remote.artery. There is also a similar Akka class but the Pekko SplitBrainResolver will not be listening for Akka ThisActorSystemQuarantinedEvent events - only Pekko ones.
I think if we look around, we'll find this subscribing to the eventStream is common in the cluster code with separate Akka and Pekko classes, I think nodes of different Akka/Pekko type will ignore event stream events of the other type.
ThisActorSystemQuarantinedEvent is in package org.apache.pekko.remote.artery. There is also a similar Akka class but the Pekko SplitBrainResolver will not be listening for Akka ThisActorSystemQuarantinedEvent events - only Pekko ones.
How are these events serialized between nodes? If they are serialized using protobuf, it may be transparent: the protobuf binary doesn't contain class names, only numeric id's - so as long as the object structure and the numeric ids are consistent a serialized Akka ThisActorSystemQuarantinedEvent might be deserialized into a Pekko one and vice-versa.
I have a branch where I have the #765 changes and also an extra change to pekko-cluster jar that hacks the config data that is sent by a cluster node when trying to join the cluster (to make it look like akka node data) - https://github.com/pjfanning/incubator-pekko/commits/remote-release
https://github.com/pjfanning/pekko-akka-compat (PekkoAkkaAeronCluster) gets a bit further when the pekko node tries to join an akka cluster. The latest issue now seems to be the 'version' - Pekko 1.0.1 vs Akka 2.6.21. It looks like I might need to add a new config that let's Pekko pretend that it is running a particular Akka version.
I added a further change to get the pekko node to include akka.version in its init-join message - a valid akka.version (not a pekko version). This seems to get the pekko cluster member to join the akka cluster.
@mdedetrich what do you think we need to do this move this on? With your #765 and the extra changes in https://github.com/pjfanning/incubator-pekko/commits/remote-release, it appears that we have some support. See note about PekkoAkkaAeronCluster above.
@pjfanning I haven't really been ontop if this lately (I am focusing on other stuff) but feel free to add commits onto the #765 branch (it should be open for maintainers) to close it off