
SQS SendMessageBatch: Wrong Error When Sending Very Long Message

Open zachajon-cisco opened this issue 9 months ago • 6 comments

Acknowledgements

  • [x] I have searched (https://github.com/aws/aws-sdk/issues?q=is%3Aissue) for past instances of this issue
  • [x] I have verified all of my SDK modules are up-to-date (you can perform a bulk update with go get -u github.com/aws/aws-sdk-go-v2/...)

Describe the bug

SendMessageBatch in the sqs package does not return the expected error when it is passed a message body of 1662853 characters or more.

Regression Issue

  • [ ] Select this option if this issue appears to be a regression.

Expected Behavior

Expected the error below:

operation error SQS: SendMessageBatch, https response error StatusCode: 400, RequestID: 38cbfd37-878c-5d18-84b3-9d769e600fe5, AWS.SimpleQueueService.BatchRequestTooLong: Batch requests cannot be longer than 262144 bytes. You have sent 1662853 bytes.

Current Behavior

Current Error:

operation error SQS: SendMessageBatch, https response error StatusCode: 413, RequestID: , deserialization failed, failed to decode response body, invalid character 'H' looking for beginning of value

Reproduction Steps

Set queueURL in main to the URL of your SQS queue, then run the program below.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/sqs"
	sqstypes "github.com/aws/aws-sdk-go-v2/service/sqs/types"
	"github.com/aws/aws-sdk-go/aws"
)

func generateLongString(length int) (longString string, size int) {
	for i := 0; i < length; i++ {
		longString += "a"
	}
	size = len(longString)
	return
}

func sendMessage(svc *sqs.Client, queueURL string, msgLen int) error {
	//Generate Long String
	fmt.Printf("Sending Message to SQS Queue with length: %d\n", msgLen)
	lm, size := generateLongString(msgLen)
	fmt.Printf("Confirm Long String Length: %d\n", size)

	//Send message
	input := &sqs.SendMessageBatchInput{
		Entries: []sqstypes.SendMessageBatchRequestEntry{
			{
				Id:          aws.String("Id"),
				MessageBody: aws.String(lm),
			},
		},
		QueueUrl: &queueURL,
	}
	_, err := svc.SendMessageBatch(context.TODO(), input)
	if err != nil {
		return err
	}

	return nil
}

func main() {
	//Queue URL
	queueURL := "<REDACTED>"
	//Setup SQS Client
	cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithRegion("us-east-1"))
	if err != nil {
		log.Fatal("failed to initialize SQS client")
	}
	svc := sqs.NewFromConfig(cfg)
	err = sendMessage(svc, queueURL, 1662852)
	if err != nil {
		fmt.Printf("Error: %+v\n", err)
	}

	err = sendMessage(svc, queueURL, 1662853)
	if err != nil {
		fmt.Printf("Error: %+v\n", err)
	}

}
//Sending Message to SQS Queue with length: 1662852
//Confirm Long String Length: 1662852
//Error: operation error SQS: SendMessageBatch, https response error StatusCode: 400, RequestID: 9fe01b67-c67f-5417-a9e0-7b26efc0d32c, AWS.SimpleQueueService.BatchRequestTooLong: Batch requests cannot be longer than 262144 bytes. You have sent 1662852 bytes.
//--------------------
//Sending Message to SQS Queue with length: 1662853
//Confirm Long String Length: 1662853
//Error: operation error SQS: SendMessageBatch, https response error StatusCode: 413, RequestID: , deserialization failed, failed to decode response body, invalid character 'H' looking for beginning of value
//--------------------

Possible Solution

No response

Additional Information/Context

No response

AWS Go SDK V2 Module Versions Used

module github.com/cisco-sbg/talos-dae-tap-code

go 1.24

toolchain go1.24.0

require (
	github.com/aws/aws-sdk-go v1.55.6
	github.com/aws/aws-sdk-go-v2 v1.36.3 // indirect
	github.com/aws/aws-sdk-go-v2/config v1.29.8
	github.com/aws/aws-sdk-go-v2/service/sqs v1.38.0
	github.com/aws/smithy-go v1.22.3 // indirect
)

require (
	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.12.3 // indirect
	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.29.0 // indirect
)

require (
	github.com/aws/aws-sdk-go-v2/credentials v1.17.61 // indirect
	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.30 // indirect
	github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.34 // indirect
	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.34 // indirect
	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.15 // indirect
	github.com/aws/aws-sdk-go-v2/service/sso v1.25.0 // indirect
	github.com/aws/aws-sdk-go-v2/service/sts v1.33.16 // indirect
)

Compiler and Version used

go version go1.24.0 darwin/arm64

Operating System and version

15.3.1 (24D70)

zachajon-cisco avatar Mar 04 '25 16:03 zachajon-cisco

First, note that you don't need to import "github.com/aws/aws-sdk-go/aws", which belongs to v1 of the Go SDK. The same aws helpers are available in v2 as "github.com/aws/aws-sdk-go-v2/aws".

However, even with that import, I can't reproduce this. After replacing the queue URL with mine, I get the expected error for both lengths:

$ go mod tidy && go run main.go

Sending Message to SQS Queue with length: 1662852
Confirm Long String Length: 1662852
Error: operation error SQS: SendMessageBatch, https response error StatusCode: 400, RequestID: aa19cfe4-2919-5891-a6d6-afa5b743715d, AWS.SimpleQueueService.BatchRequestTooLong: Batch requests cannot be longer than 262144 bytes. You have sent 1662852 bytes.
Sending Message to SQS Queue with length: 1662853
Confirm Long String Length: 1662853
Error: operation error SQS: SendMessageBatch, https response error StatusCode: 400, RequestID: bcaaa1d0-8c53-569e-867a-c19112e5c9cd, AWS.SimpleQueueService.BatchRequestTooLong: Batch requests cannot be longer than 262144 bytes. You have sent 1662853 bytes.

My go.mod

module example

go 1.22.12

require (
	github.com/aws/aws-sdk-go v1.55.6
	github.com/aws/aws-sdk-go-v2/config v1.29.9
	github.com/aws/aws-sdk-go-v2/service/sqs v1.38.1
)

require (
	github.com/aws/aws-sdk-go-v2 v1.36.3 // indirect
	github.com/aws/aws-sdk-go-v2/credentials v1.17.62 // indirect
	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.30 // indirect
	github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.34 // indirect
	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.34 // indirect
	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.12.3 // indirect
	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.15 // indirect
	github.com/aws/aws-sdk-go-v2/service/sso v1.25.1 // indirect
	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.29.1 // indirect
	github.com/aws/aws-sdk-go-v2/service/sts v1.33.17 // indirect
	github.com/aws/smithy-go v1.22.2 // indirect
)

Are you running this behind a proxy? That 413 may be coming from your proxy.

If this keeps happening, log the request and response from the HTTP client so we have more information to move forward:

//Setup SQS Client
cfg, err := config.LoadDefaultConfig(
	context.TODO(), config.WithRegion("us-east-1"), 
	config.WithClientLogMode(aws.LogRequest|aws.LogResponseWithBody))

Madrigal avatar Mar 05 '25 00:03 Madrigal

Hello @Madrigal!

Noted about using aws-sdk-go-v2 instead of v1.

Interesting. I edited my go.mod to match yours and added the extra logging you suggested. Below is the debug output:

Expected Error

SDK 2025/03/04 20:31:40 DEBUG Request
...
Content-Length: 1662976
...
SDK 2025/03/04 20:31:41 DEBUG Response
HTTP/1.1 400 Bad Request
Content-Length: 142
Connection: keep-alive
Content-Type: application/x-amz-json-1.0
Date: Wed, 05 Mar 2025 01:31:41 GMT
X-Amzn-Query-Error: AWS.SimpleQueueService.BatchRequestTooLong;Sender
X-Amzn-Requestid: b5a3d045-e072-5bd5-8ffa-b0044bce45de

Unexpected Error

SDK 2025/03/04 20:32:43 DEBUG Request
...
Content-Length: 1662977
...
SDK 2025/03/04 20:32:43 DEBUG Response
HTTP/1.1 413 Request Entity Too Large
Connection: close
Content-Length: 43
Content-Type: text; charset=utf-8
Date: Wed, 05 Mar 25 01:32:43 GMT

HTTP content length exceeded 1662976 bytes.
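
The plain-text body of the 413 response seems to explain the odd "invalid character 'H'" message: the body starts with "HTTP", and a JSON parser rejects that first character. Below is a minimal sketch of the effect using only the standard library. It is not the SDK's actual deserializer; decodeJSONBody is a hypothetical helper and the JSON body in the second call is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeJSONBody is a hypothetical stand-in for a JSON-protocol client's
// response decoding: it simply tries to parse the body as a JSON object.
func decodeJSONBody(body []byte) error {
	var v map[string]any
	return json.Unmarshal(body, &v)
}

func main() {
	// Body copied from the 413 response in the debug log above.
	plain := []byte("HTTP content length exceeded 1662976 bytes.")
	fmt.Println(decodeJSONBody(plain))
	// prints: invalid character 'H' looking for beginning of value

	// Illustrative JSON body (not taken from the logs); it parses cleanly.
	fmt.Println(decodeJSONBody([]byte(`{"message":"example"}`)))
	// prints: <nil>
}
```

So the 413 most likely never reaches the point where it could be mapped to a modeled SQS error, because its body is not JSON at all.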

zachajon-cisco avatar Mar 05 '25 01:03 zachajon-cisco

I was able to reproduce this by increasing the message size in your example (the different threshold is probably a side effect of the name of my queue), and now I get the same error.
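
The numbers in the debug logs are consistent with this. A 1662852-character body produced Content-Length 1662976, so the rest of the request took 124 bytes, and the cutoff appears to apply to the whole request body instead of the message alone. A rough sketch of the arithmetic follows, assuming one byte per character (the repro sends only "a") and that the 1662976-byte cutoff from the log applies to every request. The function names and the 10-byte shorter overhead are hypothetical.

```go
package main

import "fmt"

// maxContentLength is the cutoff reported in the 413 response body above.
const maxContentLength = 1662976

// requestOverhead returns the bytes of the request body that are not message
// payload (JSON framing, entry Id, queue URL, and so on).
func requestOverhead(contentLength, messageLen int) int {
	return contentLength - messageLen
}

// maxMessageLen returns the longest message that keeps the whole request at or
// under the cutoff for a given overhead.
func maxMessageLen(limit, overhead int) int {
	return limit - overhead
}

func main() {
	// Values from the "Expected Error" log: Content-Length 1662976 for a
	// 1662852-character message.
	overhead := requestOverhead(1662976, 1662852)
	fmt.Println(overhead) // 124

	// Longest message that still gets the 400 for the reporter's queue.
	fmt.Println(maxMessageLen(maxContentLength, overhead)) // 1662852

	// Hypothetical queue whose URL is 10 bytes shorter: the switch to 413
	// happens 10 characters later.
	fmt.Println(maxMessageLen(maxContentLength, overhead-10)) // 1662862
}
```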

While it's true that this is indeed an unexpected error, I'm not sure it matters much at the end of the day. Both the 413 error and the 400 error mean that the message is too large for SQS to handle.

Is there a specific part of your workflow that is impacted by this?

Madrigal avatar Mar 05 '25 19:03 Madrigal

In our codebase, we added a workaround. Instead of just checking:

var batchTooLongErr *sqstypes.BatchRequestTooLong
if errors.As(err, &batchTooLongErr) {
...
}

We now have a helper function that checks for both BatchRequestTooLong and HTTP 413:

func isBatchTooLongErr(err error) bool {
	var batchTooLongErr *sqstypes.BatchRequestTooLong
	var httpResponseErr *awshttp.ResponseError
	if errors.As(err, &batchTooLongErr) {
		return true
	} else if errors.As(err, &httpResponseErr) {
		//AWS Go SDK V2 Bug - Issue: https://github.com/aws/aws-sdk-go-v2/issues/3026
		llog.Info().Msgf("HTTP status code: %d", httpResponseErr.HTTPStatusCode())
		return httpResponseErr.HTTPStatusCode() == http.StatusRequestEntityTooLarge
	}
	return false
}

zachajon-cisco avatar Mar 05 '25 19:03 zachajon-cisco

> While it's true that this is indeed an unexpected error, I'm not sure it matters much at the end of the day. Both the 413 error and the 400 error mean that the message is too large for SQS to handle.
>
> Is there a specific part of your workflow that is impacted by this?

Given the workaround I posted above, we aren't "negatively" impacted by this. The point is that when the message we are sending is too long, we should be able to expect BatchRequestTooLong and not also have to catch an HTTP 413 error.

Based purely on how I think the package should work, the 413 case should be handled in the Go SDK and returned as BatchRequestTooLong. However, I do understand that a cost/benefit analysis is needed to determine whether doing this is worth it.
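
Until something like that exists, one way to avoid depending on either error is to check the batch size on the client before sending. The following is only a sketch: batchFits is a hypothetical helper, the limit is the 262144 bytes quoted in the BatchRequestTooLong message, and it counts message bodies only, which is an assumption. It does not account for anything else the service may include in the total.

```go
package main

import (
	"fmt"
	"strings"
)

// maxBatchBytes is the limit quoted in the BatchRequestTooLong error above.
const maxBatchBytes = 262144

// batchFits reports whether the combined byte length of the message bodies
// stays within maxBatchBytes. Assumption: only bodies are counted here.
func batchFits(bodies []string) bool {
	total := 0
	for _, b := range bodies {
		total += len(b) // len on a string counts bytes, not characters
		if total > maxBatchBytes {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(batchFits([]string{strings.Repeat("a", 262144)}))  // true
	fmt.Println(batchFits([]string{strings.Repeat("a", 1662853)})) // false
}
```

A caller could run this check before SendMessageBatch and split or reject the batch locally, keeping the errors.As check for BatchRequestTooLong as a fallback.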

zachajon-cisco avatar Mar 05 '25 19:03 zachajon-cisco

Got it, thanks for explaining your workaround.

I'll leave this open in case other people run into the same issue, and will file it as p3 on our end.

Madrigal avatar Mar 05 '25 19:03 Madrigal