aws-sdk-go-v2 icon indicating copy to clipboard operation
aws-sdk-go-v2 copied to clipboard

imds errors should be compatible with the smithy.APIError interface

Open TopherGopher opened this issue 4 years ago • 8 comments

Confirm by changing [ ] to [x] below:

Describe the question I've gone through the error handling documentation, and I'm having some issue around detecting 403/404 status codes from the metadata service and S3. This is a snippet of the code from imds specifically - you should be able to replicate a 404 code by running this on a local environment without a mocked imds service:

	c := imds.NewFromConfig(awsConf)
	output, err := c.GetInstanceIdentityDocument(
		context.Background(), &imds.GetInstanceIdentityDocumentInput{}, func(options *imds.Options) {
			options.Retryer = retry.AddWithMaxAttempts(aws.NopRetryer{}, 1)
	if err != nil {
		var gaiErr *smithy.GenericAPIError
		var aerr smithy.APIError
		var rerr *awshttp.ResponseError
		var opErr *smithy.OperationError
		switch {
		case errors.As(err, &rerr) && rerr.HTTPStatusCode() == 404:
			// If the metadata service isn't available, then return
			// a dummy document. This is common when running
			// aws-runas in --ec2 mode
			awsMetadata = imds.InstanceIdentityDocument{
				Region:    awsConf.Region,
				AccountID: qaAWSAccountID,
			}
			return awsMetadata, nil
		case errors.As(err, &aerr):
			// If the metadata service isn't available, then return
			// a dummy document. This is common when running
			// aws-runas in --ec2 mode
			awsMetadata = imds.InstanceIdentityDocument{
				Region:    awsConf.Region,
				AccountID: qaAWSAccountID,
			}
			l.WithFields(logrus.Fields{
				"err":               err,
				"aerr.ErrorCode":    aerr.ErrorCode(),
				"aerr.ErrorMessage": aerr.ErrorMessage(),
				"aerr.ErrorFault":   aerr.ErrorFault(),
				"aerr.Error":        aerr.Error(),
				"aerr":              aerr,
				"errType":           "APIError",
			}).Error("Could not fetch instance identity document - APIError")
		case errors.As(err, &gaiErr):
			// If the metadata service isn't available, then return
			// a dummy document. This is common when running
			// aws-runas in --ec2 mode
			awsMetadata = imds.InstanceIdentityDocument{
				Region:    awsConf.Region,
				AccountID: qaAWSAccountID,
			}
			l.WithFields(logrus.Fields{
				"err":                 err,
				"gaiErr.ErrorCode":    gaiErr.ErrorCode(),
				"gaiErr.ErrorMessage": gaiErr.ErrorMessage(),
				"gaiErr/ErrorFault":   gaiErr.ErrorFault(),
				"gaiErr.Error":        gaiErr.Error(),
				"gaiErr":              gaiErr,
				"errType":             "GenericAPIError",
			}).Error("Could not fetch instance identity document")
		case errors.As(err, &opErr):
			// This is an unexpected error
			l.WithFields(logrus.Fields{
				"err":             err,
				"opErr.Operation": opErr.Operation(),
				"opErr.Service":   opErr.Service(),
				"opErr.Unwrap":    opErr.Unwrap(),
				"errType":         "OperationError",
			}).Error("Could not fetch instance identity document for an unanticipated reason as OperationError")
		default:
			// Not an operation error - may never hit default
			l.WithFields(logrus.Fields{
				"message": err,
				"note":    "We got an unexpected error",
			}).Error("Could not retrieve metadata about the EC2 instance")
		}
	}

This hits the OperationError block - Output:

level=error
msg="Could not fetch instance identity document for an unanticipated reason as OperationError"
err="operation error ec2imds: GetInstanceIdentityDocument, http response error StatusCode: 404, request to EC2 IMDS failed"
errType=OperationError
func=LoadAccountRelatedMetadata
opErr.Operation=GetInstanceIdentityDocument
opErr.Service=ec2imds
opErr.Unwrap="http response error StatusCode: 404, request to EC2 IMDS failed"

I might not be understanding how to work with errors here though. Do I check for OperationError, Unwrap, then use errors.As? Is there a concrete error type somewhere for checking for AccessDenied and NotFound errors?

TopherGopher avatar Jun 17 '21 19:06 TopherGopher

Thanks for reaching out @TopherGopher. The Handling Errors doc might help with this, and provide more details on how to handle errors with the SDK. There are areas that need to be filled out in those docs such as the specific error types that could be returned, and checked for.

The v2 SDK took a very different approach to error handing vs the v1 SDK. The errors returned by API operation calls are layered with the OperationError at the top. While you could use Unwrap at each layer to inspect the subsequent layer, it is not necessary. The errors.As utility will perform the walking automatically, and walk down each layer until it finds an error that matches the type, or interface specified. Generally you don't want to use errors.Is at all with SDK errors, as it checks for error values that are equal, not same type/interface.

For example, if you're only interested in if the error has a HTTP status code, you could check for either an error with a method of HTTPStatusCode() int, or check for the ResponseError type directly.

type StatusCodeError interface {
    HTTPStatusCode() int
}

var statusCodeErr StatusCodeError
if error.As(err, &statusCodeErr) {
    fmt.Println("got status code", statusCodeErr.HTTPStatusCode())
}
// import smithyhttp "github.com/aws/smithy-go/transport/http"

type respErr smithyhttp.ResponseError
if error.As(err, &respErr) {
    fmt.Println("got status code", respErr.HTTPStatusCode())
}

This pattern can be applied to any of the layered errors returned by the SDK. The biggest limitation to this pattern is that you cannot specify an interface that combines multiple error layers, e.g. HTTPStatusCode and OperationError's Operation name


If your application is interested in checking for API errors from a service response you can either check for the specific error or generic API error code and message via type, smithy.APIError. You'll probably never need to check against smithy.GenericAPIError directly. This type is a fallback type the SDK will deserialize the response error into for API errors returned by a service that the SDK does not know about, and does not have a generated type for.

jasdel avatar Jun 17 '21 22:06 jasdel

Hey @jasdel - Thanks for the response.

I linked the same error handling doc and included an implementation of catching the smithy.APIError and a different version of the ResponseError in my example code, so cool - we're on the same page with how to call them and my understanding of using Is/As is correct it sounds like. The key piece I was missing that you said in your response is that it's a smithyhttp.ResponseError, which wasn't in the documentation - I believe the one in the docs is github.com/aws/aws-sdk-go-v2/aws/transport/http as opposed to github.com/aws/smithy-go/transport/http It's confusing to me though that the imds 404 error would still not be recognized as either as a smithy.APIError or awshttp.ResponseError, and instead had to be the smithyhttp.ResponseError error, but I'll use this pattern moving forward.

Is there an error type for authentication failure specifically for all libraries? How does your team handle authentication errors?

TopherGopher avatar Jun 17 '21 23:06 TopherGopher

Thanks for the update.

The key piece I was missing that you said in your response is that it's a smithyhttp.ResponseError, which wasn't in the documentation - I believe the one in the docs is github.com/aws/aws-sdk-go-v2/aws/transport/http as opposed to github.com/aws/smithy-go/transport/http

Thanks for pointing that out. The aws/transport/http#ResponseError embeds the smithy-go version and should be used by the SDK all HTTP based response errors. The smithy-go version is supposed to be a generic ResponseError container, that could be use with other SDK clients not just AWS.

It's confusing to me though that the imds 404 error would still not be recognized as either as a smithy.APIError or awshttp.ResponseError, and instead had to be the smithyhttp.ResponseError error, but I'll use this pattern moving forward.

Thanks for pointing that out. I'd forgotten that the EC2 IMDS Client uses the smithy-go version of ResponseError. I think the intention here is that EC2 IMDS API does not send back a RequestID like the other AWS API do. Which was the reason for it being smithy-go version. Not having the RequestID is the distinction the SDK's version adds between two types.

I think we need to review if this should be returning the AWS version of ResponseError because, while the IMDS API does not provide a ResponseID, it still is probably within the scope of an AWS API client.

Is there an error type for authentication failure specifically for all libraries? How does your team handle authentication errors?

Authentication errors will generally be 403 HTTP status code, and no the SDK does not handle these. The SDK considers authentication errors to be terminal failures. This is for two reasons:

  • If the SDK starts out with invalid or no credentials there is no way it is able to refresh them.
  • If the SDK was unable to refresh the underlying credentials before the authentication error occurred, some failure in the system occurred that triggered the error. Applications may be able to retry these kinds of exceptions if they have knowledge of how the credentials are refreshed.

jasdel avatar Jun 18 '21 00:06 jasdel

Should I create a ticket to get the IMDS client to start supporting a smithy.APIError?

I'm also having additional issues trying to use type specific errors in SQS.

For example, I'm trying to do a SendMessage and confirm a queue does not exist:

	params := sqs.SendMessageInput{
		MessageBody: aws.String(body),
		QueueUrl:    aws.String("https://sqs.us-east-1.amazonaws.com/165297049243/TOPHER_I_DONT_EXIST"),
	}
	_, err := q.service.SendMessage(context.Background(), &params)
	if err != nil {
		var QueueDoesNotExist *types.QueueDoesNotExist
		var apiErr smithy.APIError
		if errors.As(err, &QueueDoesNotExist) {
                         // When a queue is not found, it should hit this block
			 fmt.Println("I'M THE CORRECT ERROR")
                          panic(QueueDoesNotExist)
		} else if errors.As(err, &apiErr) {
                         // But this error is considered still to be an APIError.
                         fmt.Println("I'M AN API ERROR")
			panic(apiErr)
		}

You'll see that there is a block for checking errors.As(err, &QueueDoesNotExist) against *types.QueueDoesNotExist, but it doesn't hit this block. Instead, it hits the generic APIError block and returns the following:

I'M AN API ERROR
panic: api error AWS.SimpleQueueService.NonExistentQueue: The specified queue does not exist for this wsdl version.

Obviously, I can catch the apiErr and check the apiErr.ErrorCode() for AWS.SimpleQueueService.NonExistentQueue, but it seems like unpacking into types.QueueDoesNotExist should work based on what I'm doing in other libraries.

Is this a bug in types.QueueDoesNotExist specifically or am I somehow misusing it?

TopherGopher avatar Jun 24 '21 20:06 TopherGopher

Thanks for the update @TopherGopher. Lets keep this issue for the IMDS issues.

Though, I think we should create a separate issue for the SQS issue you found. This issue probably will impact all AWS SDKs that generate types, and unmarshalers for modeled errors.

It looks like the SQS API model is incomplete, or incorrectly modeling the errors that are returned. The SDK only knows about the QueueDoesNotExist (api model) "error code", where as it looks like the services is returning a completely different error code, AWS.SimpleQueueService.NotExistentQueue that the SDKs have no knowledge of. The API model will need to be updated in order for the SDKs to have knowledge of this error, and be able to deserialize it.

jasdel avatar Jun 25 '21 17:06 jasdel

Agree - let's work IMDS issue here - tldr; the IMDS service returns errors that are not recognized as smithy.APIErrors and are not parsable into awsHttp.ResponseError (github.com/aws/aws-sdk-go-v2/aws/transport/http), but rather are only recognized as smithyHttp.ResponseError (github.com/aws/smithy-go/transport/http) In my ideal world, we should be able to catch the errors coming from the imds service as a generic APIError

For SQS, I created https://github.com/aws/aws-sdk-go-v2/issues/1319 to track those issues.

TopherGopher avatar Jun 28 '21 16:06 TopherGopher

We have noticed this issue has not received attention in 1 year. We will close this issue for now. If you think this is in error, please feel free to comment and reopen the issue.

github-actions[bot] avatar Jul 09 '22 00:07 github-actions[bot]

Please keep this issue open.

TopherGopher avatar Jul 09 '22 00:07 TopherGopher