amqp-client Publisher disconnects from queue when AskTimeoutException encountered

Hi Fabrice,

Firstly, thanks for building such a useful library. We use it in a service that publishes email requests onto a queue. Normally we have no issues, but occasionally we hit the following exception:

akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://actor-ystem/user/$3j/$a#-1930803633]] after [5000 ms]

After we encounter this exception, a service restart is required to reconnect to the queue. We would like to be able to reconnect without restarting the service.

A simplified example is shown below:


val publisher = createPublisher()

def sendEmail(emailRequest: EmailRequest)(implicit ec: ExecutionContext): Future[Amqp.Publish] = {
  val props = // Build props from EmailRequest
  val queue = getQueue()
  // `?` is the Akka ask pattern; it needs an implicit Timeout in scope (the 5000 ms in the exception above)
  def publish(_publisher: ActorRef) = _publisher ? Publish("", queue, Array[Byte](), Option(props))
  // return the mapped future so callers can observe the result
  publish(publisher) map {
    case Ok(request, _) => request
    case Error(request, errorVal) => logging
    case s => throw new RuntimeException(s"$s")
  }
}

def createPublisher()(implicit ec: ExecutionContext) = {
  val publisher = ConnectionOwner.createChildActor(conn, ChannelOwner.props())
  Amqp.waitForConnection(system, publisher).await()
  info(s"Publisher connected on server $rabbitUri")
  DeclareQueue(QueueParameters("myQueue", passive = false, durable = true, exclusive = false, autodelete = false, args = queueArgs))
  // must wait for results so we don't send to queues before they're declared
  Await.result(delayQueueFutures, Timeout(5 seconds).duration)
  publisher
}

The sendEmail function is what encounters the disconnect. How would you recommend recovering and reconnecting? I considered recreating the publisher but am not sure whether that's a good idea.
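
The recreation idea could be sketched roughly as follows. This is only an illustration under assumptions, not something the library provides: `Recreating` is a hypothetical helper written with the Scala standard library alone, `create` stands in for a factory such as createPublisher, and the sketch relies on akka.pattern.AskTimeoutException being a subclass of java.util.concurrent.TimeoutException. It retries once with a fresh resource and does not stop the stale one, which a real version would need to do.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical helper, not part of amqp-client: holds a resource (for example
// the publisher ActorRef) and replaces it when an operation times out.
final class Recreating[A <: AnyRef](create: () => A) {
  private var current: A = create()

  private def get: A = synchronized(current)

  // Replace the resource only if `stale` is still the current one, so that
  // concurrent failures do not each create a new resource.
  private def replace(stale: A): A = synchronized {
    if (current eq stale) current = create()
    current
  }

  // Run `op` against the current resource; on a timeout, swap in a fresh
  // resource and retry exactly once. Other failures are passed through.
  def use[B](op: A => Future[B])(implicit ec: ExecutionContext): Future[B] = {
    val seen = get
    op(seen).recoverWith {
      case _: TimeoutException => op(replace(seen))
    }
  }
}
```

With something like this, sendEmail would call `recreating.use(p => publish(p))` instead of using the publisher directly. Whether recreating the actor is actually safe with this library is exactly the open question.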

May 24 '16 08:05 rohitmukherjee

Hi, I won't have time to look at this before the end of the week; is that OK with you? To be sure that I understand the problem: is there anything in the RabbitMQ log when it happens? That is, is it a failure to reconnect after a network/broker problem, or is it a bug in the library that makes it hang even though there is no network/broker issue? This library was designed to handle reconnection automatically: in most cases, if you stop and restart the broker, all channels should be renewed automatically. What happens to your service when you stop and then restart the broker?

Thanks

May 24 '16 20:05 sstone

Hi,

Thanks for the quick reply. Sure, you can take a look at the end of the week.

There is nothing unusual in the RabbitMQ logs; other services connect to it normally.
There are no network issues.
When we stop and restart the broker, the service recreates the publisher (reinitiates a connection to the queue) and the service works as expected.

May 26 '16 02:05 rohitmukherjee

Hi, any update on this issue?

Jun 06 '16 02:06 rohitmukherjee

Hi, sorry for the delay. I had a quick look at this over the weekend but could not find anything really useful. You could try using longer timeouts and see if there is an improvement. The only vaguely similar issue that I can remember was caused by a firewall that would block (but not close) long-lasting idle connections. One thing that could help would be to run a very basic test (a basic producer that sends a message every N seconds and a basic consumer that prints the message) that connects to the same broker from the same server, and see whether it fails when your application fails. Would that be OK on your side?
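
The producer half of such a test could be as small as the sketch below. It is deliberately independent of the library: `Heartbeat`, `send` and `onFailure` are hypothetical names, and `send` is a stand-in for whatever actually publishes the message to the broker. The point is only to get a record of when sends start failing, so it can be compared with the application's failures.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import java.util.concurrent.atomic.AtomicLong
import scala.util.{Failure, Success, Try}

// Hypothetical probe: calls `send` with an increasing sequence number at a
// fixed period and reports each failed send through `onFailure`.
final class Heartbeat(send: Long => Unit, onFailure: (Long, Throwable) => Unit) {
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor()
  private val counter = new AtomicLong(0)

  def start(periodMillis: Long): Unit = {
    val task = new Runnable {
      def run(): Unit = {
        val n = counter.incrementAndGet()
        // Catch non-fatal exceptions here: an exception escaping run()
        // would silently cancel all further scheduled executions.
        Try(send(n)) match {
          case Failure(e) => onFailure(n, e)
          case Success(_) => ()
        }
      }
    }
    scheduler.scheduleAtFixedRate(task, 0L, periodMillis, TimeUnit.MILLISECONDS)
    ()
  }

  def stop(): Unit = {
    scheduler.shutdownNow()
    ()
  }
}
```

In the real test, `send` would wrap the publish call and `onFailure` would log the sequence number with a timestamp; the consumer side would simply print what it receives.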

Jun 06 '16 12:06 sstone