aws-sdk-go-v2
aws-sdk-go-v2 copied to clipboard
DynamoDB: Using `context.WithDeadline` to set a timeout hides helpful errors
Describe the bug
Following the "Retries and Timeouts" section of the DynamoDB client documentation, I have added a deadline to my call to GetItem
. This works great for making the call fail when the timeout is exceeded! However, the context deadline exceeded
error overwrites helpful error output that the client otherwise would have returned.
This makes troubleshooting difficult for "retryable" errors like forgetting to connect to the VPN before running integration tests, particularly for newer teammates who have less context for what might have gone wrong.
Expected Behavior
Run below code, get output like this:
$ go run main.go
Error: operation error DynamoDB: GetItem, exceeded maximum number of attempts, 1, https response error StatusCode: 0, RequestID: , request send failed, Post "http://localhost:45666/": dial tcp [::1]:45666: connect: connection refused
output: <nil>
dial tcp [::1]:45666: connect: connection refused
is a super helpful error message, presumably generated by the OS and/or go's networking code. It even tells us the port that we typoed!
Current Behavior
Run below code, get output like this:
$ go run main.go
Error: operation error DynamoDB: GetItem, request canceled, context deadline exceeded
output: <nil>
All we get is the operation we were performing and the fact that the deadline was exceeded, no hint as to why.
Reproduction Steps
Here's a go program that reproduces the issue, as referenced above.
package main
import (
"context"
"fmt"
"time"
"github.com/aws/aws-sdk-go-v2/aws"
"github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/credentials"
"github.com/aws/aws-sdk-go-v2/service/dynamodb"
"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)
const (
tableName = "some-table"
key = "some-key"
)
func main() {
conf, err := config.LoadDefaultConfig(context.Background(),
// Connect to localstack running on port 4566 for demonstration purposes
// see https://localstack.cloud/
// This program will run and repro the issue even if localstack isn't running
config.WithRegion("us-west-2"),
config.WithCredentialsProvider(credentials.NewStaticCredentialsProvider("test", "test", "test")),
config.WithEndpointResolverWithOptions(aws.EndpointResolverWithOptionsFunc(
func(service, region string, _ ...interface{}) (aws.Endpoint, error) {
return aws.Endpoint{
PartitionID: "aws",
SigningRegion: "us-west-2",
Source: aws.EndpointSourceCustom,
URL: "http://localhost:45666", // NOTE: typo in port
HostnameImmutable: true,
}, nil
},
)),
)
if err != nil {
panic(err)
}
ctx := context.Background()
// comment these two lines to get a helpful error
ctx, cancel := context.WithDeadline(ctx, time.Now().Add(600*time.Millisecond))
defer cancel()
dynamo := dynamodb.NewFromConfig(conf)
out, err := dynamo.GetItem(
ctx,
&dynamodb.GetItemInput{
TableName: aws.String(tableName),
Key: map[string]types.AttributeValue{
"id": &types.AttributeValueMemberS{Value: key},
},
},
// Uncomment this to get a helpful error
// func(o *dynamodb.Options) { o.RetryMaxAttempts = 1 },
)
if err != nil {
fmt.Printf("Error: %s\n", err.Error())
}
fmt.Printf("output: %+v\n", out)
}
Possible Solution
The error message knows that it's in DynamoDB: GetItem
, so clearly you all have some control over it. It would be great if it also listed the last retryable error, something like this:
$ go run main.go
Error: operation error DynamoDB: GetItem, request canceled, context deadline exceeded, last retryable error was: https response error StatusCode: 0, RequestID: , request send failed, Post "http://localhost:45666/": dial tcp [::1]:45666: connect: connection refused
output: <nil>
Additional Information/Context
Either removing the timeout, limiting the number of retries so it completes before the deadline, or increasing the timeout to be longer than the maximum number of retries will reveal the helpful error.
AWS Go SDK V2 Module Versions Used
go.mod
:
module github.com/droberts-sea/timeout-errors
go 1.18
require (
github.com/aws/aws-sdk-go v1.44.10
github.com/aws/aws-sdk-go-v2 v1.16.3
github.com/aws/aws-sdk-go-v2/config v1.15.5
github.com/aws/aws-sdk-go-v2/credentials v1.12.0
github.com/aws/aws-sdk-go-v2/service/dynamodb v1.15.4
)
require (
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.12.4 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.10 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.4 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.3.11 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.9.1 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/endpoint-discovery v1.7.4 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.4 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.11.4 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.16.4 // indirect
github.com/aws/smithy-go v1.11.2 // indirect
github.com/jmespath/go-jmespath v0.4.0 // indirect
)
Compiler and Version used
go version go1.18 darwin/amd64
Operating System and version
OSX 12.3.1
One of our ideas when we designed the SDK was to make the middleware.Metadata
available for error types. This type contains information about the operation's middleware execution flow including information about the errors that had occurred that were trigger retries. While this information is accessible on successful operation calls (for example see S3's PutObjectRetentionOutput), it is not currently accessible on errors. I think this would make a good feature request issue for us to track making this information available on errors as well, as it would help aide investigations into why a deadline was reached.