retry-axios

Potential errors when computing exponential backoff and linear backoff

Open sk- opened this issue 4 years ago • 8 comments

The code to compute the linear backoff and exponential backoff has some issues, as it does not wait in the first attempt, this is because currentRetryAttempt is initially set to 0 instead of 1. See https://github.com/JustinBeckwith/retry-axios/blob/c2843108e7d9758097cd1af89e09cd78c99bcebf/src/index.ts#L183-L188.

It also does not consider retryDelay, which according to the readme is:

Milliseconds to delay at first. Defaults to 100.

Additionally, the formula for computing the exponential backoff should be something like:

Math.min((Math.pow(2, config.currentRetryAttempt) + Math.random()), MAX_DELAY) * 1000

See https://cloud.google.com/iot/docs/how-tos/exponential-backoff#example_algorithm

That would give retry delays like (in seconds):

1.234
2.314
4.012

Compare that to the delays obtained with the current algorithm (in seconds):

0
0.5
1.5
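The two sets of numbers above can be reproduced with a short sketch. This is an illustration only, not the retry-axios source: the function names are hypothetical, the value of MAX_DELAY_SECONDS is an assumption (the issue does not state one), and the random source is made injectable so the output is reproducible.

```typescript
// Hypothetical helpers for illustration; none of these names exist in retry-axios.
const MAX_DELAY_SECONDS = 32; // assumed cap, not taken from the issue

// Exponential delay as quoted in this thread for the current code, in milliseconds.
function currentExponentialDelayMs(attempt: number): number {
  return ((Math.pow(2, attempt) - 1) / 2) * 1000;
}

// Delay proposed in the issue, in milliseconds.
// `random` defaults to Math.random and can be replaced for deterministic runs.
function proposedExponentialDelayMs(
  attempt: number,
  random: () => number = Math.random
): number {
  return Math.min(Math.pow(2, attempt) + random(), MAX_DELAY_SECONDS) * 1000;
}

const attempts = [0, 1, 2];

// [0, 500, 1500]: no wait at all before the first retry.
const currentDelays = attempts.map((a) => currentExponentialDelayMs(a));

// With a fixed jitter of 0.25 this gives [1250, 2250, 4250].
const proposedDelays = attempts.map((a) =>
  proposedExponentialDelayMs(a, () => 0.25)
);
```

Note that neither formula in this sketch takes retryDelay into account; how that option should feed into the computation is left open here.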

sk- avatar Aug 19 '20 18:08 sk-

@JustinBeckwith would you accept PRs addressing this and other issues?

sk- avatar Aug 24 '20 22:08 sk-

👋 absolutely! Would love PRs, especially ones that come with tests so I can really understand what was going wrong :)

JustinBeckwith avatar Oct 23 '20 19:10 JustinBeckwith

@JustinBeckwith

about

so I can really understand what was going wrong :)

Have you read this? :)

it does not wait in the first attempt, this is because currentRetryAttempt is initially set to 0 instead of 1

Between the first attempt (attempt 0) and the first retry (attempt 1) there should be a non-zero delay. With the current code there isn't: there is no delay. That's one of two problems that @sk- was pointing out :). The expression ((Math.pow(2, config.currentRetryAttempt!) - 1) / 2) * 1000 evaluates to 0 when currentRetryAttempt is 0.

Same for the linear one: delay = config.currentRetryAttempt! * 1000; is 0 for ...
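To make the zero-delay case concrete, here is a minimal sketch of both quoted expressions next to a variant that counts retries from 1. The function names are hypothetical, and the shifted variants are only one possible way to get a non-zero first delay; they are an assumption for illustration, not a description of how retry-axios does or should fix this.

```typescript
// Expressions as quoted in this thread, with the attempt counter starting at 0.
function quotedLinearDelayMs(attempt: number): number {
  return attempt * 1000;
}

function quotedExponentialDelayMs(attempt: number): number {
  return ((Math.pow(2, attempt) - 1) / 2) * 1000;
}

// Hypothetical variants: treat the first retry as attempt 1 instead of 0.
function shiftedLinearDelayMs(attempt: number): number {
  return quotedLinearDelayMs(attempt + 1);
}

function shiftedExponentialDelayMs(attempt: number): number {
  return quotedExponentialDelayMs(attempt + 1);
}

// Delay before the first retry under each variant, in milliseconds.
const firstRetryDelays = {
  quotedLinear: quotedLinearDelayMs(0), // 0
  quotedExponential: quotedExponentialDelayMs(0), // 0
  shiftedLinear: shiftedLinearDelayMs(0), // 1000
  shiftedExponential: shiftedExponentialDelayMs(0), // 500
};
```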

jgehrcke avatar Dec 16 '20 10:12 jgehrcke

👋 Saying "have you read this" is kind of being a turd - please don't be a turd. There were some changes here recently which may have resolved the issue. If folks are interested, I'd be happy to take a PR (as mentioned above), but I'm unlikely to dig in here soon.

JustinBeckwith avatar Dec 16 '20 17:12 JustinBeckwith

Saying "have you read this" is kind of being a turd

Thanks for the feedback. I am sorry man, really didn't want to come across like that.

My addition to this ticket was meant in a neutral-friendly way, therefore also the ":)". It certainly was meant to be a productive contribution: it really appeared to me (based on the communication in here so far) as if you might have missed a specific problem description -- the one I have quoted.

Missing something happens to all of us. All I wanted was to ask and make sure that it's not just a simple misunderstanding. I got the impression that you missed it because you didn't comment on it and, based on your "so I can really understand", also suggested that you may not see/understand the specific problem(s) reported. Again: no blame, no stress -- all of this easily happens in a ticket like this (also because this bug report mixes two issues and does not have a precise title); and I wanted to help us align on a problem and/or acknowledge a problem description (which is one of the most important parts in my opinion for inviting contributors: define a problem to be solved rather well -- together).

jgehrcke avatar Dec 16 '20 19:12 jgehrcke

Loud and clear. Sorry if I was crass - I catch a lot of flack in issue trackers, and appreciate the clarification! I completely understand what you mean now, and apologize for being short.

On the issue itself - totally understand what folks are saying. What I was trying to get across is that I don't believe there are tests which dig into the specific timing of the retries. I'd like to avoid having a patch floated that "fixes" the issue without having a fairly in-depth suite of tests that specifically cover backoff expectations. If someone submitted a fix today for this, it's likely all the tests we have in place would just pass with no changes. After that, it's very likely that the next patch breaks it (or you know, I accidentally break it).
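As a rough idea of what tests that "specifically cover backoff expectations" could assert, the sketch below checks a delay function against three properties: every delay is positive, delays never decrease, and delays stay under a cap. Everything here is hypothetical: the checkBackoff helper, the chosen properties, and the 60000 ms cap are assumptions for illustration, and the delay functions are pure stand-ins for the expressions quoted in this thread rather than calls into retry-axios.

```typescript
type DelayFn = (attempt: number) => number;

// Returns a list of human-readable violations; an empty list means all checks passed.
function checkBackoff(
  delayMs: DelayFn,
  attempts: number,
  maxDelayMs: number
): string[] {
  const violations: string[] = [];
  let previous = 0;
  for (let attempt = 0; attempt < attempts; attempt++) {
    const delay = delayMs(attempt);
    if (delay <= 0) {
      violations.push(`attempt ${attempt}: delay must be positive, got ${delay}`);
    }
    if (delay < previous) {
      violations.push(`attempt ${attempt}: delay decreased from ${previous} to ${delay}`);
    }
    if (delay > maxDelayMs) {
      violations.push(`attempt ${attempt}: delay ${delay} exceeds cap ${maxDelayMs}`);
    }
    previous = delay;
  }
  return violations;
}

// Stand-ins for the expressions quoted in this thread (milliseconds).
const quotedExponential: DelayFn = (attempt) =>
  ((Math.pow(2, attempt) - 1) / 2) * 1000;
const quotedLinear: DelayFn = (attempt) => attempt * 1000;

// Both report exactly one violation: the zero delay at attempt 0.
const exponentialViolations = checkBackoff(quotedExponential, 4, 60000);
const linearViolations = checkBackoff(quotedLinear, 4, 60000);
```

A real test suite would more likely drive the interceptor with fake timers and measure the actual waits; checking a pure function as above only covers the arithmetic.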

Thanks for bearing with my being grumpy.

JustinBeckwith avatar Dec 16 '20 21:12 JustinBeckwith

Thank you for the kind words, Justin! Thanks for being vulnerable. I totally see where you're coming from with "I catch a lot of flack in issue trackers" -- and I am glad that we figured this out w/o feeling bad after all. Feels good that we talked through that; thanks for taking a moment to type up this response. I love open source also because of interactions like this.

jgehrcke avatar Dec 17 '20 11:12 jgehrcke

Between the first attempt (attempt 0) and the first retry (attempt 1) there should be a non-zero delay. With the current code there isn't: there is no delay.

Attempt to fix that in https://github.com/JustinBeckwith/retry-axios/pull/163

jgehrcke avatar Aug 16 '21 10:08 jgehrcke