pulumi-aws-native

Make a custom retry backoff with more aggressive retry behavior.

Open mjeffryes opened this issue 2 years ago • 3 comments

We've noticed that this retry backoff doesn't seem to max out at 30s as claimed; in practice it maxes out at ~1min.

Even if it worked as designed, the Terraform AWS provider's retry logic is much more aggressive.

Since backoff is not actually that complicated, we should just implement our own and tune it to poll a bit more aggressively.

mjeffryes • Dec 02 '23 00:12

internal discussion: https://pulumi.slack.com/archives/CK21FUGUA/p1701449167548889

Key excerpts: "I created two of the new ElastiCache serverless caches, one through AWS Native and one through the console. Both were available in the console in about 2-3 minutes (slightly over AWS’s marketing claims of about a minute) but the Pulumi update was still going on for at least another minute and a half before it was finished."

"Notably though - Terraform AWS provider defaults to much tighter settings:

  • Min: 10ms
  • Multiplier: 1.3x
  • Max: None?

Vs. Native which is:

  • Min: 1s
  • Multiplier: 2x
  • Max: 30s"
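
To make the two settings concrete, here is a minimal sketch in Python (not the provider's actual code) that generates the delay schedule for a given minimum, multiplier, and optional cap. The helper name `backoff_delays` is hypothetical, and the Terraform values are the ones quoted above, including the unconfirmed "Max: None?".

```python
from itertools import islice
from typing import Iterator, Optional


def backoff_delays(min_delay: float, multiplier: float,
                   max_delay: Optional[float] = None) -> Iterator[float]:
    """Yield successive wait times (seconds) for an exponential backoff.

    Hypothetical helper for illustration; not part of any provider.
    """
    delay = min_delay
    while True:
        # Apply the cap, if any, before handing the delay to the caller.
        yield delay if max_delay is None else min(delay, max_delay)
        delay *= multiplier


# Settings quoted in the thread for AWS Native: min 1s, 2x, max 30s.
native = list(islice(backoff_delays(1.0, 2.0, 30.0), 7))
print(native)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]

# Settings quoted for the Terraform AWS provider: min 10ms, 1.3x,
# no cap (the thread itself marks the missing cap as uncertain).
terraform = list(islice(backoff_delays(0.010, 1.3), 7))
print([round(d, 4) for d in terraform])
```

With the Native settings as documented, the schedule reaches the 30s cap on the sixth wait, so from then on a resource that becomes ready just after a poll goes unnoticed for up to 30s.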

mjeffryes • Dec 19 '23 20:12

logs.txt: logs from a slow update that indicate we're waiting longer than necessary for resource creation.

mjeffryes • Dec 19 '23 20:12

It could just be that the response is giving us a retry-after value 1 minute in the future. We can try ignoring that value and polling for updates on our own schedule anyway.
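
A rough sketch of that idea, in Python rather than the provider's own code: the loop below polls on its own backoff schedule and deliberately ignores whatever retry-after hint the status call returns. Everything here is an assumption made for illustration: the `wait_until_done` function, the `get_status` callback and its `(done, retry_after_hint)` return shape, and the choice to poll once immediately before the first wait.

```python
from typing import Callable, Optional, Tuple


def wait_until_done(
    get_status: Callable[[], Tuple[bool, Optional[float]]],
    sleep: Callable[[float], None],
    min_delay: float = 1.0,
    multiplier: float = 2.0,
    max_delay: float = 30.0,
    max_polls: int = 100,
) -> float:
    """Poll until get_status reports done; return total seconds waited.

    get_status is a hypothetical callback returning (done, retry_after_hint),
    where the hint is the server's suggested wait in seconds, or None.
    The hint is read but never used: only our own backoff sets the delay.
    """
    waited = 0.0
    delay = min_delay
    for _ in range(max_polls):
        done, _retry_after_hint = get_status()
        if done:
            return waited
        wait = min(delay, max_delay)
        sleep(wait)
        waited += wait
        delay *= multiplier
    raise TimeoutError("operation did not finish within max_polls")


# Simulated operation: ready 70s in, always hinting "retry in 60s".
clock = [0.0]


def fake_sleep(seconds: float) -> None:
    clock[0] += seconds  # advance the fake clock instead of really sleeping


def fake_status() -> Tuple[bool, Optional[float]]:
    return (clock[0] >= 70.0, 60.0)


print(wait_until_done(fake_status, fake_sleep))  # 91.0
```

In this simulation, a loop that honored the 60s hint on every poll would check at 0s, 60s, and 120s, so it would notice completion at 120s rather than 91s. Whether ignoring the hint is safe against throttling would need to be checked; capping the hint instead of discarding it would be a more conservative variant.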

mjeffryes • Dec 19 '23 20:12