rabbitmq-stream-dotnet-client

ReliableProducer not reconnecting after leader goes down

Open ricsiLT opened this issue 2 years ago • 1 comments

Hi, we observe that ReliableProducer does not attempt to reconnect to the cluster after the leader goes down.

Our setup:

  • 3 node cluster (Node_A, Node_B, Node_C, let's say that Node_A is the leader)
  • no load balancer

When we restarted the leader, we observed the following in the logs:

INFO: Producer reference: X, stream: Y  disconnected, check if reconnection needed in 200 ms.
ERROR: Error during producer initialization: System.Net.Sockets.SocketException (10061): No connection could be made because the target machine actively refused it.
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
--- End of stack trace from previous location ---
   at RabbitMQ.Stream.Client.Connection.Create(EndPoint endpoint, Func`2 commandCallback, Func`2 closedCallBack, SslOption sslOption)
   at RabbitMQ.Stream.Client.Client.Create(ClientParameters parameters)
   at RabbitMQ.Stream.Client.StreamSystem.MayBeReconnectLocator()
   at RabbitMQ.Stream.Client.StreamSystem.CreateProducer(ProducerConfig producerConfig)
   at RabbitMQ.Stream.Client.Reliable.ReliableProducer.GetNewReliable(Boolean boot)
 ---> System.Net.Sockets.SocketException Data:
    Host : Node_A
    Port : 5551

After that, ReliableProducer made no further reconnection attempts.

ricsiLT avatar Jul 19 '22 07:07 ricsiLT

Thank you for reporting. That part has to be improved.

Gsantomaggio avatar Jul 19 '22 08:07 Gsantomaggio

OK, I can reproduce the issue with a three-node cluster and this configuration:

var config = new StreamSystemConfig()
{
    UserName = "test",
    Password = "test",
    Endpoints = new EndPoint[]
    {
        new DnsEndPoint("node0", 5552),
        new DnsEndPoint("node1", 5552),
        new DnsEndPoint("node2", 5552),
    }
};

Then stop the leader node. The reconnection attempt fails with:

   at RabbitMQ.Stream.Client.Client.Create(ClientParameters parameters) in /Users/gas/git/rabbitmq/rabbitmq-stream-dotnet-client/RabbitMQ.Stream.Client/Client.cs:line 190
   at RabbitMQ.Stream.Client.StreamSystem.MayBeReconnectLocator() in /Users/gas/git/rabbitmq/rabbitmq-stream-dotnet-client/RabbitMQ.Stream.Client/StreamSystem.cs:line 96
   at RabbitMQ.Stream.Client.StreamSystem.CreateProducer(ProducerConfig producerConfig) in /Users/gas/git/rabbitmq/rabbitmq-stream-dotnet-client/RabbitMQ.Stream.Client/StreamSystem.cs:line 123
   at RabbitMQ.Stream.Client.Reliable.ReliableProducer.GetNewReliable(Boolean boot) in /Users/gas/git/rabbitmq/rabbitmq-stream-dotnet-client/RabbitMQ.Stream.Client/Reliable/ReliableProducer.cs:line 81
 ---> System.Net.Sockets.SocketException Data:
    Host : node0
    Port : 5552
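
Both traces show the reconnect dialing a single host (Node_A, node0) and giving up after one SocketException. As a rough illustration of the behaviour one might expect instead, here is a minimal sketch that tries every configured endpoint and repeats the pass a few times before failing. EndpointFailover, ConnectToAnyAsync, maxRounds and delayBetweenRounds are hypothetical names made up for this sketch; they are not part of RabbitMQ.Stream.Client, and this is not how the library implements (or will implement) reconnection.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of RabbitMQ.Stream.Client.
public static class EndpointFailover
{
    // Tries each endpoint in order. If a full pass fails with SocketException,
    // waits delayBetweenRounds and starts another pass, up to maxRounds passes.
    public static async Task<T> ConnectToAnyAsync<T>(
        IReadOnlyList<EndPoint> endpoints,
        Func<EndPoint, Task<T>> connect,
        int maxRounds = 5,
        TimeSpan? delayBetweenRounds = null)
    {
        var delay = delayBetweenRounds ?? TimeSpan.FromMilliseconds(200);
        Exception last = null;

        for (var round = 0; round < maxRounds; round++)
        {
            foreach (var endpoint in endpoints)
            {
                try
                {
                    // The caller decides what "connect" means for a given endpoint.
                    return await connect(endpoint);
                }
                catch (SocketException e)
                {
                    // This node refused the connection; remember why and try the next one.
                    last = e;
                }
            }

            // Every node failed in this pass; back off before the next pass.
            if (round < maxRounds - 1)
            {
                await Task.Delay(delay);
            }
        }

        throw new InvalidOperationException("No endpoint accepted the connection.", last);
    }
}
```

The connect delegate is left to the caller on purpose, so the sketch makes no claim about the client's API. In an application it could wrap whatever call opens the connection for one endpoint; only SocketException is treated as "try the next node", and any other exception propagates immediately.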

Gsantomaggio avatar Sep 07 '22 09:09 Gsantomaggio