
[FR] Auto-retry FCM requests that failed due to an internal error (messaging/internal-error)

Open dkim-xeal opened this issue 1 year ago • 3 comments

Is your feature request related to a problem? Please describe.

Our Sentry has captured about 20,000 messaging/internal-error errors. This means that, due to an internal FCM error, about 20,000 push notifications were never sent to our users.

Describe the solution you'd like

The FCM documentation recommends retrying such requests: https://firebase.google.com/docs/cloud-messaging/send-message#admin

Currently, the library retries requests that fail with HTTP 503 but not those that fail with HTTP 500 (messaging/internal-error is returned with a 500): https://github.com/firebase/firebase-admin-node/blob/b5c4f5ae551249b64632baf2ece7b5c594a1965f/src/utils/api-request.ts#L202

It would be nice to change the default retry configuration or to let library users supply their own retry config. A solution seems to be implemented already in https://github.com/firebase/firebase-admin-node/pull/1739, but it has been stuck in review for 2 years.

Describe alternatives you've considered

I considered writing a custom retry wrapper for this FCM error, but it appears that FCM doesn't expose the Retry-After header, so it is unclear how long the wrapper should wait before making another request.

Additionally, copying and pasting the same wrapper into each project that uses FCM is a very frustrating experience.
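
For reference, the kind of wrapper described above might look roughly like the sketch below. It is only an illustration, not the library's behavior: the names `sendWithRetry`, `backoffDelayMs`, `isRetryable` and `RetryOptions` are hypothetical, and because no Retry-After value is available it assumes a capped exponential backoff with arbitrary default delays.

```typescript
// Error codes treated as transient. Only "messaging/internal-error" comes from
// this issue; add other codes at your own discretion.
const RETRYABLE_CODES = new Set<string>(["messaging/internal-error"]);

// Hypothetical options type for this sketch (not part of firebase-admin).
interface RetryOptions {
  maxRetries: number;   // retries after the first attempt
  baseDelayMs: number;  // delay before the first retry
  maxDelayMs: number;   // upper bound for any single delay
}

// Capped exponential backoff: base, 2*base, 4*base, ... up to maxDelayMs.
// This is an assumption standing in for the missing Retry-After header.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.baseDelayMs * 2 ** attempt, opts.maxDelayMs);
}

// True when the thrown value carries a `code` string listed in RETRYABLE_CODES.
function isRetryable(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === "string" && RETRYABLE_CODES.has(code);
}

// Calls `send` until it succeeds, fails with a non-retryable error,
// or the retry budget is exhausted. The last error is rethrown.
async function sendWithRetry<T>(
  send: () => Promise<T>,
  opts: RetryOptions = { maxRetries: 4, baseDelayMs: 1000, maxDelayMs: 60000 },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await send();
    } catch (err) {
      if (attempt >= opts.maxRetries || !isRetryable(err)) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

With the Admin SDK this would presumably be called as `sendWithRetry(() => getMessaging().send(message))`. The `sleep` parameter is injectable only so the backoff can be tested without real waiting.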

dkim-xeal avatar Jun 19 '24 11:06 dkim-xeal

I found a few problems with this issue:

  • I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.
  • This issue does not seem to follow the issue template. Make sure you provide all the required information.

google-oss-bot avatar Jun 19 '24 11:06 google-oss-bot

Hello, may I ask how you tracked that in Sentry?

Thanks!

pquerner avatar Jun 28 '24 21:06 pquerner

I noticed this today as well, for about 800 tokens. I don't have a system in place to retry later for these customers, so those notifications are currently lost. For our system it's "not so bad", so I haven't invested time in this topic.

pquerner avatar Jul 26 '24 15:07 pquerner