azure-service-bus-java
azure-service-bus-java copied to clipboard
MessageSender is unable to recover from network outage
I use an IMessageSender created by ClientFactory.createMessageSenderFromConnectionStringBuilder to feed a queue that I created according to the samples repository. It works well until I simulate a network outage. First, I simulated the network outage by turning off the WiFi on my laptop. I got send timeout errors for a while and when I turned the WiFi back, it kept giving timeout exceptions. I thought that turning off the WiFi is too destructive, it can cause problems in the networking stack so I went more subtle. Currently I simulate the network outage by dropping incoming packets for 45 seconds.
sudo iptables -I INPUT -i wlan0 -j DROP ; sleep 45; sudo iptables -D INPUT -i wlan0 -j DROP;
Networking is restored after 45 seconds, but I keep getting the timeouts seemingly indefinitely.
2018-11-24 19:04:20.794 TRACE 24344 --- [pool-1-thread-1] c.n.asrscenariomanager.RunQueueFeeder : Sending message to run queue for run 1dce6166-311b-421c-aa74-4c22371983ea
2018-11-24 19:04:50.796 WARN 24344 --- [pool-4-thread-1] c.m.a.s.primitives.CoreMessageSender : Delivery '22f5e3aa46124c0890d3fa02098caad6' to 'BasicQueue' did not receive ack from service. Throwing timeout.
2018-11-24 19:04:50.800 ERROR 24344 --- [pool-4-thread-1] c.m.a.s.primitives.CoreMessageSender : Send timed out
com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:04:50.797Z[Europe/London].
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.throwSenderTimeout(CoreMessageSender.java:887) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.access$500(CoreMessageSender.java:59) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender$3.run(CoreMessageSender.java:260) [azure-servicebus-1.2.8.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 19:04:50.803 ERROR 24344 --- [pool-3-thread-3] c.n.asrscenariomanager.RunQueueFeeder : Failed to process run queue for run 1dce6166-311b-421c-aa74-4c22371983ea: java.util.concurrent.CompletionException: com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:04:50.797Z[Europe/London]., errorContext[NS: nmisbtest.servicebus.windows.net, PATH: BasicQueue, REFERENCE_ID: 42e9c9_880ac5bed8fd483f86aa5944d05aed9b_G28, LINK_CREDIT: 998]
2018-11-24 19:04:50.997 TRACE 24344 --- [pool-1-thread-1] c.n.asrscenariomanager.RunQueueFeeder : Sending message to run queue for run 1dce6166-311b-421c-aa74-4c22371983ea
2018-11-24 19:05:20.997 WARN 24344 --- [pool-4-thread-3] c.m.a.s.primitives.CoreMessageSender : Delivery '6d9af9eb32cb4addb9205fc8d2df292e' to 'BasicQueue' did not receive ack from service. Throwing timeout.
2018-11-24 19:05:20.998 ERROR 24344 --- [pool-4-thread-3] c.m.a.s.primitives.CoreMessageSender : Send timed out
com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:05:20.998Z[Europe/London].
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.throwSenderTimeout(CoreMessageSender.java:887) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.access$500(CoreMessageSender.java:59) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender$3.run(CoreMessageSender.java:260) [azure-servicebus-1.2.8.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 19:05:20.999 ERROR 24344 --- [pool-3-thread-3] c.n.asrscenariomanager.RunQueueFeeder : Failed to process run queue for run 1dce6166-311b-421c-aa74-4c22371983ea: java.util.concurrent.CompletionException: com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:05:20.998Z[Europe/London]., errorContext[NS: nmisbtest.servicebus.windows.net, PATH: BasicQueue, REFERENCE_ID: 42e9c9_880ac5bed8fd483f86aa5944d05aed9b_G28, LINK_CREDIT: 997]
2018-11-24 19:05:21.160 TRACE 24344 --- [pool-1-thread-1] c.n.asrscenariomanager.RunQueueFeeder : Sending message to run queue for run 1dce6166-311b-421c-aa74-4c22371983ea
2018-11-24 19:05:51.161 WARN 24344 --- [pool-4-thread-4] c.m.a.s.primitives.CoreMessageSender : Delivery '1cee2726058d4a1d9b6966d87edc879e' to 'BasicQueue' did not receive ack from service. Throwing timeout.
2018-11-24 19:05:51.161 ERROR 24344 --- [pool-4-thread-4] c.m.a.s.primitives.CoreMessageSender : Send timed out
com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:05:51.161Z[Europe/London].
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.throwSenderTimeout(CoreMessageSender.java:887) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.access$500(CoreMessageSender.java:59) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender$3.run(CoreMessageSender.java:260) [azure-servicebus-1.2.8.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 19:05:51.162 ERROR 24344 --- [pool-3-thread-3] c.n.asrscenariomanager.RunQueueFeeder : Failed to process run queue for run 1dce6166-311b-421c-aa74-4c22371983ea: java.util.concurrent.CompletionException: com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:05:51.161Z[Europe/London]., errorContext[NS: nmisbtest.servicebus.windows.net, PATH: BasicQueue, REFERENCE_ID: 42e9c9_880ac5bed8fd483f86aa5944d05aed9b_G28, LINK_CREDIT: 996]
2018-11-24 19:05:51.297 TRACE 24344 --- [pool-1-thread-1] c.n.asrscenariomanager.RunQueueFeeder : Sending message to run queue for run 1dce6166-311b-421c-aa74-4c22371983ea
2018-11-24 19:06:21.298 WARN 24344 --- [pool-4-thread-2] c.m.a.s.primitives.CoreMessageSender : Delivery '8dff44d9d8b540b581df5a9309ffa1bd' to 'BasicQueue' did not receive ack from service. Throwing timeout.
2018-11-24 19:06:21.299 ERROR 24344 --- [pool-4-thread-2] c.m.a.s.primitives.CoreMessageSender : Send timed out
com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:06:21.299Z[Europe/London].
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.throwSenderTimeout(CoreMessageSender.java:887) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender.access$500(CoreMessageSender.java:59) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.CoreMessageSender$3.run(CoreMessageSender.java:260) [azure-servicebus-1.2.8.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_91]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 19:06:21.300 ERROR 24344 --- [pool-3-thread-3] c.n.asrscenariomanager.RunQueueFeeder : Failed to process run queue for run 1dce6166-311b-421c-aa74-4c22371983ea: java.util.concurrent.CompletionException: com.microsoft.azure.servicebus.primitives.TimeoutException: Send operation timed out at 2018-11-24T19:06:21.299Z[Europe/London]., errorContext[NS: nmisbtest.servicebus.windows.net, PATH: BasicQueue, REFERENCE_ID: 42e9c9_880ac5bed8fd483f86aa5944d05aed9b_G28, LINK_CREDIT: 995]
If I restart the application the message is sent right away.
Environment:
- OS: Ubuntu 18.04
- SDK Version 1.2.8
- Oracle JAVA 1.8.0_91
I did the network outage test again and let it running to see if something changes after a longer period of time. Unfortunately, my stdout buffer was not big enough to see all the logged messages, but I can see that there was a timeout exception at 19:19:24 and the timeout exceptions kept coming until 19:32:72 when I got the following:
2018-11-24 19:32:57.115 TRACE 25271 --- [pool-1-thread-1] c.n.asrscenariomanager.RunQueueFeeder : Sending message to run queue for run 1dce6166-311b-421c-aa74-4c22371983ea
2018-11-24 19:33:03.376 WARN 25271 --- [5f-b43b7618d20b] c.m.a.servicebus.amqp.ConnectionHandler : Connection.onTransportError: hostname:nmisbtest.servicebus.windows.net:5671, error:connection aborted
2018-11-24 19:33:03.376 ERROR 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.MessagingFactory : Connection error. 'Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}'
2018-11-24 19:33:03.376 INFO 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.MessagingFactory : Closing connection to host
2018-11-24 19:33:03.378 WARN 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.RequestResponseLink : Internal send link 'RequestResponseLink-Sender_195914_75d58a56b2254f9c870ca3d2b43b766e_G5' of requestresponselink to '$cbs' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}
at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.RequestResponseLink$InternalSender.onClose(RequestResponseLink.java:846) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.closeConnection(MessagingFactory.java:376) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.onConnectionError(MessagingFactory.java:301) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.ConnectionHandler.onTransportError(ConnectionHandler.java:84) [azure-servicebus-1.2.8.jar:na]
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:191) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:484) [azure-servicebus-1.2.8.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 19:33:03.379 WARN 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.RequestResponseLink : Completing all pending requests with exception in request response link to $cbs
2018-11-24 19:33:03.380 WARN 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.RequestResponseLink : Internal receive link 'RequestResponseLink-Receiver_0a5164_75d58a56b2254f9c870ca3d2b43b766e_G5' of requestresponselink to '$cbs' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}
at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.RequestResponseLink$InternalReceiver.onClose(RequestResponseLink.java:637) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.closeConnection(MessagingFactory.java:376) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.onConnectionError(MessagingFactory.java:301) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.ConnectionHandler.onTransportError(ConnectionHandler.java:84) [azure-servicebus-1.2.8.jar:na]
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:191) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:484) [azure-servicebus-1.2.8.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 19:33:03.380 WARN 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.RequestResponseLink : Completing all pending requests with exception in request response link to $cbs
2018-11-24 19:33:03.382 WARN 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.CoreMessageSender : Send link 'e9d8a6_75d58a56b2254f9c870ca3d2b43b766e_G5' to 'BasicQueue' closed. Will retry link creation after 'PT1S'.
2018-11-24 19:33:04.383 INFO 25271 --- [pool-4-thread-2] c.m.a.s.primitives.CoreMessageSender : Recreating send link to 'BasicQueue'
2018-11-24 19:33:04.384 INFO 25271 --- [pool-3-thread-4] c.m.a.s.primitives.RequestResponseLink : RequestResponseLink - recreating internal send and receive links to $cbs
2018-11-24 19:33:04.389 INFO 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.MessagingFactory : Creating connection to host 'nmisbtest.servicebus.windows.net:5671'
2018-11-24 19:33:04.615 INFO 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.MessagingFactory : Connection opened to host.
2018-11-24 19:33:04.622 INFO 25271 --- [pool-3-thread-5] c.m.a.s.primitives.RequestResponseLink : Recreated internal links to $cbs
2018-11-24 19:33:04.645 INFO 25271 --- [5f-b43b7618d20b] c.m.a.s.primitives.CoreMessageSender : Creating send link to 'BasicQueue'
After the connection was closed and recreated the message got sent to the queue.
The same problem affects the MessageReceiver created by ClientFactory.createMessageReceiverFromConnectionStringBuilder. I use the following code for the test.
public void receiveLoop() {
log.trace("Next cycle");
try {
IMessage message = this.getMessageReceiver().receive(Duration.ofSeconds(60));
if (message != null) {
log.trace("Message received");
this.messageReceiver.completeAsync(message.getLockToken());
Thread.sleep(5000); // Sleep so I have time to cut network connection
}
} catch (Throwable t) {
log.error( t.toString());
}
this.receiveLoop();
}
I block the incoming packets with iptables for 45 seconds again when loop is sleeping for 5 seconds.
sudo iptables -I INPUT -i wlan0 -j DROP ; sleep 45; sudo iptables -D INPUT -i wlan0 -j DROP;
The result shows that no new messages are received until the connection is aborted and reopened.
2018-11-24 20:48:51.748 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:48:51.775 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Message received
2018-11-24 20:48:56.776 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:48:56.801 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Message received
2018-11-24 20:49:01.802 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:50:01.803 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:51:01.810 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:52:01.811 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:53:01.813 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:54:01.818 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:55:01.819 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:56:01.820 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:57:01.821 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:58:01.822 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 20:59:01.823 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:00:01.824 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:01:01.826 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:02:01.827 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:03:01.828 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:04:01.829 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:05:01.830 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:05:10.928 WARN 28356 --- [b0-655f1c4cdbae] c.m.a.servicebus.amqp.ConnectionHandler : Connection.onTransportError: hostname:nmisbtest.servicebus.windows.net:5671, error:connection aborted
2018-11-24 21:05:10.929 ERROR 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.MessagingFactory : Connection error. 'Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}'
2018-11-24 21:05:10.930 INFO 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.MessagingFactory : Closing connection to host
2018-11-24 21:05:10.940 WARN 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.RequestResponseLink : Internal send link 'RequestResponseLink-Sender_fcbc96_540fd470e6814d10abd814c326e7e758_G24' of requestresponselink to '$cbs' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}
at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.RequestResponseLink$InternalSender.onClose(RequestResponseLink.java:846) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.closeConnection(MessagingFactory.java:376) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.onConnectionError(MessagingFactory.java:301) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.ConnectionHandler.onTransportError(ConnectionHandler.java:84) [azure-servicebus-1.2.8.jar:na]
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:191) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:484) [azure-servicebus-1.2.8.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 21:05:10.941 WARN 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.RequestResponseLink : Completing all pending requests with exception in request response link to $cbs
2018-11-24 21:05:10.941 WARN 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.RequestResponseLink : Internal receive link 'RequestResponseLink-Receiver_da0915_540fd470e6814d10abd814c326e7e758_G24' of requestresponselink to '$cbs' closed with error.
com.microsoft.azure.servicebus.primitives.ServiceBusException: Error{condition=amqp:connection:framing-error, description='connection aborted', info=null}
at com.microsoft.azure.servicebus.primitives.ExceptionUtil.toException(ExceptionUtil.java:113) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.RequestResponseLink$InternalReceiver.onClose(RequestResponseLink.java:637) ~[azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.BaseLinkHandler.processOnClose(BaseLinkHandler.java:68) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.closeConnection(MessagingFactory.java:376) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory.onConnectionError(MessagingFactory.java:301) [azure-servicebus-1.2.8.jar:na]
at com.microsoft.azure.servicebus.amqp.ConnectionHandler.onTransportError(ConnectionHandler.java:84) [azure-servicebus-1.2.8.jar:na]
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:191) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:309) [proton-j-0.22.0.jar:na]
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276) [proton-j-0.22.0.jar:na]
at com.microsoft.azure.servicebus.primitives.MessagingFactory$RunReactor.run(MessagingFactory.java:484) [azure-servicebus-1.2.8.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
2018-11-24 21:05:10.941 WARN 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.RequestResponseLink : Completing all pending requests with exception in request response link to $cbs
2018-11-24 21:05:11.945 INFO 28356 --- [ool-1-thread-35] c.m.a.s.primitives.RequestResponseLink : RequestResponseLink - recreating internal send and receive links to $cbs
2018-11-24 21:05:11.948 INFO 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.MessagingFactory : Creating connection to host 'nmisbtest.servicebus.windows.net:5671'
2018-11-24 21:05:12.178 INFO 28356 --- [b0-655f1c4cdbae] c.m.a.s.primitives.MessagingFactory : Connection opened to host.
2018-11-24 21:05:12.179 INFO 28356 --- [ool-1-thread-34] c.m.a.s.primitives.RequestResponseLink : Recreated internal links to $cbs
2018-11-24 21:06:01.831 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:06:01.859 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Message received
2018-11-24 21:06:06.860 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Next cycle
2018-11-24 21:06:06.886 TRACE 28356 --- [onPool-worker-1] com.nminet.asrscenariorunner.RunManager : Message received
Now I see the traces. As you said, you simulated network outage at 19:19:24 and the trace showing that the connection was aborted is logged at 19:33:03. Ideally network outage should immediately trigger a "connection aborted" event. The SDK will recreate an aborted connection automatically. I have tested this.
In your case, the time lag between network outage and the generated "connection aborted" event is little over 13 minutes. That's why you are seeing timeouts for all that 13 min interval. I really can't say why there was that lag. But we saw similar issue from another customer running their application inside a container. Are you running your application inside a container? If yes, could you try the scenario by running your application outside a container?
hi @yvgopal could you please tell me whether this "But we saw similar issue from another customer running their application inside a container" issue fixed? We have another customer with similar issue, customer using Java SDK run in Function App docker container .the connection disconnected for a few minutes but there isn't any exception in Service Bus and Java Exception. It cannot reconnect with Service Bus server and always waiting connect, until customer restart the Function App. Do you have any idea of it ? thank you