akka.net
DistributedPubSub SendOneMessageToEachGroup issue after any subscriber becomes unreachable
- Akka.Net Version: 1.3.11
- Windows 7 / .NET Framework 4.6.x
- Detailed Description follows
I pushed test code onto https://github.com/joonhwan/akka.net-distribute-pubsub-test
The solution contains the following console programs:
- Seed
- JobRequester (a console pseudo job requester: any alphabetic string is sent to all JobHandlers, any numeric string is sent to a single JobHandler)
- JobHandler: subscribes twice as follows, once for 1-to-n delivery and once for 1-to-1 delivery using the sendOneMessageToEachGroup property of the Publish message.
    Mediator.Tell(new Subscribe("echo", Self, "handler")); // for 1-to-1
    Mediator.Tell(new Subscribe("echo", Self)); // for 1-to-n
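For context, the publishing side that matches these two subscriptions could look roughly like the sketch below. This is not the code from the linked repository: the class name JobRequesterActor and the digit check are assumptions made for illustration, and only the "echo" topic comes from the description above.

```csharp
using System.Linq;
using Akka.Actor;
using Akka.Cluster.Tools.PublishSubscribe;

// Hypothetical requester actor; the class name and the input check are
// assumptions, not code taken from the test repository.
public class JobRequesterActor : ReceiveActor
{
    private const string Topic = "echo";

    public JobRequesterActor()
    {
        // The mediator is the local entry point to cluster-wide pub/sub.
        var mediator = DistributedPubSub.Get(Context.System).Mediator;

        Receive<string>(text =>
        {
            // Numeric input: one message per group (1-to-1 within "handler").
            // Any other input: broadcast to every group-less subscriber (1-to-n).
            var oneToOne = text.Length > 0 && text.All(char.IsDigit);
            mediator.Tell(new Publish(Topic, text, oneToOne));
        });
    }
}
```

The handler needs both subscriptions because the group is part of the topic identifier: a Publish with sendOneMessageToEachGroup = true only reaches subscribers that registered with a group, and a Publish with the flag set to false only reaches subscribers that registered without one.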
When I run
- Seed
- JobRequester
- JobHandler (more than 2 instances; port numbers are allocated automatically)
everything seems to work. If I enter an alphabetic string, it is broadcast and every JobHandler receives it. If I enter a numeric string, it is sent round-robin and only one JobHandler receives it. No messages are dropped.
Issue 1
But if I close one of the JobHandler consoles, any 1-to-1 message that would have been routed to the closed JobHandler is dropped.
Issue 2
When I close all JobHandler instances and then start a new JobHandler that joins the cluster, that JobHandler does not receive any messages from JobRequester.
Any hint or guidance will be appreciated.
I ran into the same problem on Akka 1.3.18, using DData for state sync.
We create a short-lived actor that subscribes to a topic with a unique guid as the group name. Another actor on a different node publishes to this topic with the flag sendOneMessageToEachGroup = true. The first time this actor is created it works fine and receives messages. After a while it stops itself. When another instance is created later, it subscribes with a new guid and gets the subscription ack, but receives no messages. When a message is published to this topic, a NullReferenceException (NRE) is thrown in the DistributedPubSubMediator on the publishing node. Stacktrace:
System.NullReferenceException: Object reference not set to an instance of an object.
at Akka.Routing.Router.Send(Routee routee, Object message, IActorRef sender)
at Akka.Cluster.Tools.PublishSubscribe.DistributedPubSubMediator.PublishToEachGroup(String path, Object message)
at lambda_method(Closure , Object , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Object[] )
at Akka.Tools.MatchHandler.PartialHandlerArgumentsCapture`16.Handle(T value)
at Akka.Actor.ReceiveActor.ExecutePartialMessageHandler(Object message, PartialAction`1 partialAction)
at Akka.Actor.UntypedActor.Receive(Object message)
at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
at Akka.Actor.ActorCell.ReceiveMessage(Object message)
at Akka.Actor.ActorCell.Invoke(Envelope envelope)
When I tried to repro this in a unit test, I did not manage to. I did not use clustering or remoting; it was pub/sub in a single ActorSystem, so I think this only repros when the actors are on different nodes.
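A minimal sketch of the subscriber side of this scenario, for anyone attempting a multi-node repro. The class name ShortLivedSubscriber, the constructor parameters, and the use of a scheduled PoisonPill to end the actor are assumptions for illustration; the real actor is not shown in this thread.

```csharp
using System;
using Akka.Actor;
using Akka.Cluster.Tools.PublishSubscribe;

// Hypothetical short-lived subscriber; names and the lifetime mechanism
// are assumptions, not the actual production code.
public class ShortLivedSubscriber : ReceiveActor
{
    public ShortLivedSubscriber(string topic, TimeSpan lifetime)
    {
        var mediator = DistributedPubSub.Get(Context.System).Mediator;

        // A fresh guid per instance, so every instance forms its own group.
        var group = Guid.NewGuid().ToString();
        mediator.Tell(new Subscribe(topic, Self, group));

        Receive<SubscribeAck>(_ => Console.WriteLine($"subscribed as group {group}"));
        ReceiveAny(msg => Console.WriteLine($"group {group} received: {msg}"));

        // Stop after a fixed lifetime without unsubscribing explicitly,
        // so the mediator has to notice the termination on its own.
        Context.System.Scheduler.ScheduleTellOnce(lifetime, Self, PoisonPill.Instance, Self);
    }
}
```

On a second node, the publisher would send `new Publish(topic, message, true)` to its own mediator, first while one instance is alive and then again after that instance has stopped and a new one has subscribed.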
Thinking about it, this may be a BIG deal. If we ever stop a subscribed actor, does pub/sub become broken? We definitely subscribe from cluster singletons, so when the oldest node goes down and they move, could that lead to these errors? I never noticed this in particular, but in those cases the group stays the same (not a random guid as above), so maybe it only happens when the last actor of a group goes missing?
I also noticed there was a fix for pub/sub actor termination handling in v1.4, was that perhaps related and if so, can that be backported?
The problem was already resolved.
@Arkatufus I don't think this issue has been fully solved. We still get the following exception using sendOneMessageToEachGroup = true in Akka.NET v1.4.48:
System.NullReferenceException: Object reference not set to an instance of an object.
at Akka.Routing.Router.Send(Routee routee, Object message, IActorRef sender)
at Akka.Cluster.Tools.PublishSubscribe.DistributedPubSubMediator.PublishToEachGroup(String path, Publish publish)
at Akka.Cluster.Tools.PublishSubscribe.DistributedPubSubMediator.<.ctor>b__15_2(Publish publish)
at lambda_method43(Closure , Object , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Action`1 , Object[] )
at Akka.Tools.MatchHandler.PartialHandlerArgumentsCapture`16.Handle(T value)
at Akka.Actor.ReceiveActor.OnReceive(Object message)
at Akka.Actor.UntypedActor.Receive(Object message)
at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
at Akka.Actor.ActorCell.ReceiveMessage(Object message)
at Akka.Actor.ActorCell.Invoke(Envelope envelope)
I'll try to provide more details, and perhaps a new issue for it.