`runTransaction()`: `maxAttempts` not honored when `> MAX_RETRY_ATTEMPTS`
Issue
A call like:
await db.runTransaction(f, { maxAttempts: DB_TRANSACTION_MAX_ATTEMPTS })
is not honoring maxAttempts when maxAttempts > MAX_RETRY_ATTEMPTS (from ExponentialBackoff class).
So, setting maxAttempts: 20 retries only 10 times (which is the current defined value of MAX_RETRY_ATTEMPTS).
What is happening is that maybeBackoff():
https://github.com/googleapis/nodejs-firestore/blob/8118d37bf27b49d5e695bf18a9550a37ecc34ca2/dev/src/transaction.ts#L617
is throwing with just an error message when retries hit MAX_RETRY_ATTEMPTS:
https://github.com/googleapis/nodejs-firestore/blob/8118d37bf27b49d5e695bf18a9550a37ecc34ca2/dev/src/backoff.ts#L264
so isRetryableTransactionError(err) is NOT getting a GoogleError error, thus returning false and breaking the loop before maxAttempts is reached:
https://github.com/googleapis/nodejs-firestore/blob/8118d37bf27b49d5e695bf18a9550a37ecc34ca2/dev/src/transaction.ts#L623
Environment details
(Using on Cloud Run Function, gen2)
- OS: Ubuntu 22.04
- Node.js version: 20
- npm version: 10
@google-cloud/firestoreversion: 7.10.0
Sorry I can't provide a MRE. It just happens when our Firestore database is under heavy load, it's not something I can reproduce at will.
Thank you for the detailed description!
At first glance, this does appear to be a bug. Effectively, transactions are capped to 10 retry attempts. This limit does not exist in the Java SDK.
The limit was originally imposed as part of https://github.com/firebase/firebase-admin-node/issues/478. However, as a side effect, it also applies to transactions.
Before transaction options were introduced, transactions were hardcoded to maximum 5 retries, so the limit of 10 had no impact. Transaction options did not take into consideration that a secondary retry limit also existed.
Due to the backoff being shared by different components, a fix will require disentangling this a little bit. If we remove this limit from backoff, then we have to make sure components impose their own max.
I will add this to our backlog of bugs to fix. Since this is not a trivial fix, and this being a fairly benign bug, we will wait to fix this until someone on our team has capacity to fix this.
I will bring this to the attention of the rest of the Firestore SDK team. Expect to see updates here when we take on this work.
We would very much appreciate this fix as well!