Pub/Sub C++ client: How to prevent or reduce ack failures on an at-least-once delivery subscription
Client version: v2.39.0
MaxDeadline = seconds(0) (default value)
MaxDeadlineExtension = seconds(180)
MinDeadlineExtension = seconds(60)
MaxOutstandingMessages = 250
MaxConcurrency = 1
From time to time, acks fail. The client does not retry acks for at-least-once delivery. How can I reduce or prevent these failures? They happen 10-20 times a day.
Some of the logged errors, in the format [file_name] - [message]:
[NTPubSubLogger] /google-cloud-cpp/google/cloud/pubsub/internal/ack_handler_wrapper.cc:29 - error while trying to ack(), status=UNAVAILABLE: recvmsg:Connection reset by peer
[NTPubSubLogger] /google-cloud-cpp/google/cloud/pubsub/internal/ack_handler_wrapper.cc:29 - error while trying to ack(), status=UNAVAILABLE: 502:Bad Gateway
It sounds like a network issue, but I doubt that, because it happens 10-20 times every day. I can reproduce it with a 10-minute load test that consumes and processes 300 messages per second.
Messages are ordered by account ID in the system. When an ack fails, messages are stuck in Pub/Sub for 10-20 minutes because of the unacked message. The initial deadline is 60 seconds and the max deadline is 180 seconds. I examined the C++ client logs in debug mode: no ModifyAckDeadlineRequest is being made, yet the messages are still redelivered 10-20 minutes later, when they should arrive within a few minutes. (That is another issue and may not be related to the client.)
Diagnosing this will probably require some assistance on the backend to see where these network issues are originating. Please file a support ticket at https://cloud.google.com/support/ to get things going.
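In the meantime, because acks on an at-least-once subscription are best effort, a failed ack simply means the message is delivered again, so it can help to make the handler tolerate redelivery. The following is only a minimal sketch of that idea, not something the client library provides: RecentMessageIds and MarkIfNew are hypothetical names, and the bounded in-memory window is an assumption that would have to be sized for your traffic (and replaced with shared storage if several subscriber processes consume the same subscription).

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical helper, not part of google-cloud-cpp: remembers the most
// recently processed message IDs so a redelivered message can be recognized.
class RecentMessageIds {
 public:
  // `capacity` is the maximum number of IDs kept (an assumed tuning knob).
  explicit RecentMessageIds(std::size_t capacity) : capacity_(capacity) {}

  // Returns true the first time an ID is seen, false if it is still in the
  // window (that is, the message is a redelivery).
  bool MarkIfNew(std::string const& id) {
    std::lock_guard<std::mutex> lk(mu_);
    if (!seen_.insert(id).second) return false;
    order_.push_back(id);
    // Evict the oldest ID once the window is full.
    if (order_.size() > capacity_) {
      seen_.erase(order_.front());
      order_.pop_front();
    }
    return true;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lk(mu_);
    return seen_.size();
  }

 private:
  std::size_t capacity_;
  mutable std::mutex mu_;
  std::deque<std::string> order_;          // insertion order, oldest first
  std::unordered_set<std::string> seen_;   // fast membership check
};
```

In the subscription callback, one would call MarkIfNew() with the message ID before doing any work and, when it returns false, ack the message again without reprocessing it. This does not remove the UNAVAILABLE errors themselves; it only limits the effect of the resulting redeliveries.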