aws-sdk-kotlin
aws-sdk-kotlin copied to clipboard
Waiters do not have a configurable retry / timeout strategy
Describe the bug
When using waiters the underlying operation may take considerable time that exceeds the hard-coded attempts/timeouts. This results in the waiter throwing a TooManyAttemptsException.
For example, RdsClient.waitUntilDBSnapshotAvailable has a hard-coded retry strategy:
val strategy = StandardRetryStrategy {
maxAttempts = 20
tokenBucket = InfiniteTokenBucket
delayProvider {
initialDelay = 30_000.milliseconds
scaleFactor = 1.5
jitter = 1.0
maxBackoff = 120_000.milliseconds
}
}
In a situation where there is a large snapshot that takes greater than the ~17m the retry strategy allows for it will fail; this is unexpected as a) the operation is still in progress (in our case, it completed successfully after ~30m) and b) waitUntilDBSnapshotAvailable isn't documented as having a timeout / doesn't offer any configurability on timeouts.
Regression Issue
- [ ] Select this option if this issue appears to be a regression.
Expected behavior
Expecting that waiters either wait indefinitely (which is the implied behaviour, though not ideal) or take a default-value parameter with a retry strategy (or any alternate approach that makes it clear there are timeouts and how to configure them).
The AWS Java SDK allow waiters to be configurable: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/waiters.html
Current behavior
As described above RdsClient.waitUntilDBSnapshotAvailable (and likely other Waiters) times out unexpectedly (well, unexpected if you haven't read the code).
Steps to Reproduce
Execute a Waiter whose operation takes more than the hard-coded retry/timeout limit.
Possible Solution
Ideally the API would be more up-front and configurable for waiters to allow configurability on the timeouts and clearly document that it isn't solely waiting for the primary condition (there's also timeouts in play).
As a workaround have cobbled together the below to allow coarse-grained configurability:
private suspend fun waitForSnapshot(snapshotId: String) {
var retries = 0
while (true) {
try {
rdsClient.waitUntilDBSnapshotAvailable(
DescribeDbSnapshotsRequest { this.dbSnapshotIdentifier = snapshotId }
)
return
} catch (e: TooManyAttemptsException) {
if (retries >= 3) {
throw e
}
retries++
}
}
}
Context
Using the SDK to initiate an RDS snapshot, and wait for it to complete before continuing on with upgrading the database.
AWS SDK for Kotlin version
1.3.42
Platform (JVM/JS/Native)
JVM
Operating system and version
MacOS / Linux