Publisher disconnects from queue when AskTimeoutException encountered
Hi Fabrice,
Firstly, thanks for building such a useful library. We use it in a service that publishes email requests onto a queue. Normally we don't face any issues, but occasionally we hit the following exception:
akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://actor-ystem/user/$3j/$a#-1930803633]] after [5000 ms]
After we encounter this exception, a service restart is required to reconnect to the queue. We'd like to be able to reconnect without restarting the service.
A simplified example is shown below:
    val publisher = createPublisher()

    def sendEmail(emailRequest: EmailRequest)(implicit ec: ExecutionContext): Future[Amqp.Publish] = {
      val props = // Build props from EmailRequest
      val queue = getQueue()
      def publish(_publisher: ActorRef) = _publisher ? Publish("", queue, Array[Byte](), Option(props))
      val future = publish(publisher) map {
        case Ok(request, _) => request
        case Error(request, errorVal) => logging
        case s => throw new RuntimeException(s"$s")
      }
    }

    def createPublisher()(implicit ec: ExecutionContext) = {
      val publisher = ConnectionOwner.createChildActor(conn, ChannelOwner.props())
      Amqp.waitForConnection(system, publisher).await()
      info(s"Publisher connected on server $rabbitUri")
      DeclareQueue(QueueParameters("myQueue", passive = false, durable = true, exclusive = false, autodelete = false, args = queueArgs))
      // must wait for results so we don't send to queues before they're declared
      Await.result(delayQueueFutures, Timeout(5 seconds).duration)
      publisher
    }
The sendEmail function is where we encounter the disconnect. How would you recommend recovering and reconnecting? I considered recreating the publisher, but I'm not sure whether that's a good idea.
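A minimal sketch of the "recreate the publisher" idea, for discussion only. It is not part of amqp-client, and the names `RecreatingRef` and `withRetry` are hypothetical. It uses only the Scala and Java standard libraries and assumes that akka's `AskTimeoutException` is a subclass of `java.util.concurrent.TimeoutException`, so matching on the latter would also catch the former.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical helper: holds the current publisher (or any resource) and
// replaces it with a fresh one when an operation fails with a timeout.
final class RecreatingRef[P <: AnyRef](create: () => P) {
  private var current: P = create()

  // Returns the publisher currently in use.
  def get: P = synchronized(current)

  // Replaces the publisher only if `stale` is still the one in use, so that
  // several concurrent failures trigger a single recreation.
  private def replaceIfStale(stale: P): P = synchronized {
    if (current eq stale) current = create()
    current
  }

  // Runs `op` against the current publisher. On a TimeoutException the
  // publisher is recreated and `op` is retried exactly once.
  def withRetry[A](op: P => Future[A])(implicit ec: ExecutionContext): Future[A] = {
    val used = get
    op(used).recoverWith {
      case _: TimeoutException => op(replaceIfStale(used))
    }
  }
}
```

With the example above, usage would look roughly like `val publishers = new RecreatingRef[ActorRef](() => createPublisher())` followed by `publishers.withRetry(p => publish(p))`. Some caveats: `createPublisher()` blocks, so recreation would block a thread of the execution context; the stale actor is not stopped here; and a retry could publish a duplicate if the first message was actually delivered before the timeout.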
Hi, I won't have time to look at this before the end of the week; is that OK with you? To be sure that I understand the problem: is there anything in the RabbitMQ log when it happens? In other words, is it a failure to reconnect after a network/broker problem, or is it a bug in the library, which hangs even though there is no network/broker issue? This library was designed to handle reconnection automatically, i.e. in most cases, if you stop and restart the broker, all channels should automatically be renewed. What happens to your service when you stop and then restart the broker?
Thanks
Hi,
Thanks for the quick reply. Sure, you can take a look at the end of the week.
- There is nothing unusual in the RabbitMQ logs; other services connect to it normally
- No network issues
- When we stop and restart the broker, the service recreates the publisher (reinitiates a connection to the queue) and the service works as expected.
Hi, any update on this issue?
Hi, sorry for the delay. I had a quick look at this over the weekend but could not find anything really useful. You could try using longer timeouts and see if there is an improvement. The only vaguely similar issue that I can remember was caused by a firewall that would block (but not close) long-lasting idle connections. One thing that could help would be to run a very basic test (a basic producer that sends a message every N seconds and a basic consumer that prints the message) that connects to the same broker from the same server, and see whether it fails when your application fails. Would that be OK on your side?
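As a rough starting point for the producer side of such a test, here is a sketch of a periodic sender built only on the Java standard library. The `Heartbeat` object and its `start` method are hypothetical names, and the `send` function is a placeholder where the actual publish call (for example, sending a `Publish` message to the publisher actor) would go. The consumer that prints the messages is not shown.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

object Heartbeat {
  // Calls `send` with a numbered, timestamped message every `periodMillis`
  // milliseconds until the returned scheduler is shut down.
  def start(periodMillis: Long)(send: String => Unit): ScheduledExecutorService = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    val counter = new AtomicLong(0)
    val task = new Runnable {
      def run(): Unit =
        // Catch failures here: scheduleAtFixedRate cancels all later runs
        // if the task throws, which would hide the failures we want to see.
        try send(s"heartbeat ${counter.incrementAndGet()} at ${System.currentTimeMillis()}")
        catch {
          case e: Exception => System.err.println(s"heartbeat send failed: $e")
        }
    }
    scheduler.scheduleAtFixedRate(task, 0L, periodMillis, TimeUnit.MILLISECONDS)
    scheduler
  }
}
```

Running something like `Heartbeat.start(10000)(msg => /* publish msg */ ())` on the same server as the application, and comparing the timestamps of the last message received with the time of the application failure, should show whether the problem affects every connection from that host or only the application's publisher.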