Bug: sam local generate-event does not escape special characters in input

Open rjw57 opened this issue 3 years ago • 5 comments

Description:

sam local generate-event does not escape special characters when interpolating values into the output.

Note that this was previously raised as #2848 but I think it was closed due to a misunderstanding. Regardless of what the body string is, the output from sam local generate-event should still be valid JSON. The comment indicating this behaviour was by design seemed to mistake the issue for one about the body text being JSON.

In my reproduction below I'm not using a JSON message body.

I suspect that in general anything being inserted into an event template using chevron will have to be escaped.
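To illustrate the kind of escaping suggested above, here is a minimal sketch. It does not use chevron or the real SAM CLI templates: `TEMPLATE`, `render_naive`, and `render_escaped` are hypothetical names, and plain `str.replace` stands in for the template engine. The only point is that JSON-escaping a value before it is placed between the template's quotes keeps the output parseable.

```python
import json

# Hypothetical one-field template; the real SAM CLI templates are larger
# and are rendered with chevron rather than str.replace.
TEMPLATE = '{"Records": [{"body": "{{{body}}}"}]}'


def render_naive(body: str) -> str:
    """Insert the value verbatim, which mirrors the behaviour reported here."""
    return TEMPLATE.replace("{{{body}}}", body)


def render_escaped(body: str) -> str:
    """JSON-escape the value first, then insert it between the quotes."""
    # json.dumps returns a quoted JSON string; drop the outer quotes
    # because the template already supplies them.
    escaped = json.dumps(body)[1:-1]
    return TEMPLATE.replace("{{{body}}}", escaped)


body = 'This message is "plain text".\nIt is most certainly not JSON-formatted!'

try:
    json.loads(render_naive(body))
    naive_is_valid = True
except json.JSONDecodeError:
    naive_is_valid = False

event = json.loads(render_escaped(body))
print("naive output valid:", naive_is_valid)
print("escaped output round-trips:", event["Records"][0]["body"] == body)
```

Whether the escaping belongs in the templates, in the rendering step, or in the per-parameter handling is a design choice for the maintainers; this sketch takes no position on that.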

Steps to reproduce:

I want to generate a test SQS receive message event with the following body:

This message is "plain text".
It is most certainly not JSON-formatted!

And so I use sam local generate-event:

$ cat body.txt
This message is "plain text".
It is most certainly not JSON-formatted!
$ sam local generate-event sqs receive-message --body "$(cat body.txt)"
{
  "Records": [
    {
      "messageId": "19dd0b57-b21e-4ac1-bd88-01bbb068cb78",
      "receiptHandle": "MessageReceiptHandle",
      "body": "This message is "plain text".
It is most certainly not JSON-formatted!",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1523232000000",
        "SenderId": "123456789012",
        "ApproximateFirstReceiveTimestamp": "1523232000001"
      },
      "messageAttributes": {},
      "md5OfBody": "786457a8a924bfb2f34d6b7fc4e2e7da",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:MyQueue",
      "awsRegion": "us-east-1"
    }
  ]
}

Observed result:

The generated JSON is invalid:

$ sam local generate-event sqs receive-message --body "$(cat body.txt)" | python3 -m json.tool
Expecting ',' delimiter: line 6 column 33 (char 161)

To make it clear given the comments on #2848, this is nothing to do with the body being JSON. Irrespective of what body is used, the generated event should still be a valid JSON document.
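As a small sketch of why the output above fails to parse, the snippet below checks three hand-written JSON fragments with Python's standard `json` module. The fragments and the helper name `is_valid_json` are made up for illustration; they are not SAM CLI output. Both the unescaped double quote and the raw newline are enough, on their own, to make the document invalid.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# An unescaped double quote ends the string early.
unescaped_quote = '{"body": "This message is "plain text"."}'

# A literal newline character is not allowed inside a JSON string.
raw_newline = '{"body": "line one\nline two"}'

# The same content with \" and \n escape sequences is fine.
escaped = r'{"body": "This message is \"plain text\".\nline two"}'

for name, text in [
    ("unescaped quote", unescaped_quote),
    ("raw newline", raw_newline),
    ("escaped", escaped),
]:
    print(name, "->", "valid" if is_valid_json(text) else "invalid")
```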

Expected result:

I'd expect the following JSON event:

{
  "Records": [
    {
      "messageId": "19dd0b57-b21e-4ac1-bd88-01bbb068cb78",
      "receiptHandle": "MessageReceiptHandle",
      "body": "This message is \"plain text\".\nIt is most certainly not JSON-formatted!",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1523232000000",
        "SenderId": "123456789012",
        "ApproximateFirstReceiveTimestamp": "1523232000001"
      },
      "messageAttributes": {},
      "md5OfBody": "786457a8a924bfb2f34d6b7fc4e2e7da",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:MyQueue",
      "awsRegion": "us-east-1"
    }
  ]
}

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: OS X
  2. sam --version: SAM CLI, version 1.53.0
  3. AWS region: N/A

rjw57, Aug 04 '22 15:08

@rjw57 Thanks for the report. I am trying to figure out what is going on here and what the original intentions were. I believe we tried to match the APIs, which might require the encoding upfront, and therefore we mimicked the same. That would mean you would have to handle the encoding before sending it to SAM CLI.

jfuss, Aug 10 '22 16:08

Which would mean, you would have to handle the encoding before sending it to SAM CLI.

Fair enough if that's what's intended but the local generate-event tool quacks very much like a JSON templating tool and so not escaping the values when interpolating is, I think, surprising behaviour. If it is determined to be intended behaviour, it would be grand if the documentation page could note it explicitly as intended behaviour.

With respect to the possible intent being to mirror the quoting requirements of the underlying API, AIUI, most AWS APIs require the parameters be application/x-www-form-urlencoded. (Certainly the SQS POST API requires this.) So, if the intent was to match the encoding requirements of APIs then you'd expect the generate-event command to require something like the following for my example:

$ sam local generate-event sqs receive-message \
    --body "This%20message%20is%20%22plain%20text%22.%0AIt%20is%20most%20certainly%20not%20JSON-formatted%21"

Which, although it'd output valid JSON, wouldn't reflect the actual event which SQS would deliver to the lambda; SQS would, when the API is called, decode the URL-encoded body before inserting it into the event document sent to the lambda.
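For reference, the percent-encoded string in the command above can be reproduced with Python's standard `urllib.parse` module. This is only a sketch of the encoding round trip being described; it says nothing about what SAM CLI or SQS do internally.

```python
from urllib.parse import quote, unquote

body = 'This message is "plain text".\nIt is most certainly not JSON-formatted!'

# safe="" percent-encodes every reserved character, including "/".
encoded = quote(body, safe="")
print(encoded)

# Decoding gives back the original text, which is what a consumer of the
# message would expect to see in the event's body field.
decoded = unquote(encoded)
print(decoded == body)
```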

It's also surprising for a command-line tool since bash and other shells already have their own escaping behaviour for command-line parameters and so you'd be requiring an encoding which is unusual for CLI tools.

rjw57, Aug 10 '22 17:08

@rjw57 Let's back up. I think you are latching onto just one part of my comment.

As sam local generate-event is coded today (at least for sqs), the body property needs to be escaped and should be a string. For the most part, we take the input provided on the command line and substitute it into the JSON data. I think there might be a hidden assumption here that, because generate-event outputs JSON, the data passed in must be JSON. We do things like URL-encoding and base64 by default for some properties, but not escaping. I am not sure whether this was just overlooked or whether there was a specific reason we didn't do it (generate-event hasn't changed much over the years).

With that said, I think we could do something like https://github.com/aws/aws-sam-cli/blob/develop/samcli/lib/generated_sample_events/event-mapping.json#L134 maybe for this?

jfuss, Aug 10 '22 17:08

Shouldn't sam local generate-event sqs receive-message always generate valid JSON, regardless of what --body is passed?

Why should the following generate an invalid file that can't be used with sam local invoke?

sam local generate-event sqs receive-message --body '"'  > a.json
sam local invoke AFunction --event a.json


{"timestamp":"2025-07-29T20:41:14.000Z","message":"An error occurred during JSON parsing: java.lang.RuntimeException\njava.lang.RuntimeException: An error occurred during JSON parsing\nCaused by: java.io.UncheckedIOException: com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.JsonMappingException: Unexpected character ('\"' (code 34)): was expecting comma to separate Object entries\n at [Source: (ByteArrayInputStream); line: 6, column: 18] (through reference chain: com.amazonaws.services.lambda.runtime.events.SQSEvent[\"Records\"]->java.util.ArrayList[0])\n\tat com.amazonaws.services.lambda.runtime.serialization.factories.JacksonFactory$InternalSerializer.fromJson(JacksonFactory.java:176)\nCaused by: com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.JsonMappingException: Unexpected character ('\"' (code 34)): was expecting comma to separate Object entries\n at [Source: (ByteArrayInputStream); line: 6, column: 18] (through reference chain: com.amazonaws.services.lambda.runtime.events.SQSEvent[\"Records\"]->java.util.ArrayList[0])\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:402)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:373)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:375)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)\n\tat 
com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:314)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:177)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2105)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1481)\n\tat com.amazonaws.services.lambda.runtime.serialization.factories.JacksonFactory$InternalSerializer.fromJson(JacksonFactory.java:174)\nCaused by: com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\"' (code 34)): was expecting comma to separate Object entries\n at [Source: (ByteArrayInputStream); line: 6, column: 18]\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2418)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:749)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:673)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextFieldName(UTF8StreamJsonParser.java:1061)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:321)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:177)\n\tat com.amazonaws.lambda.thirdparty.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:359)\n\t... 
9 more\n\n","level":"ERROR","AWSRequestId":"d3504172-1b13-4d2d-b041-7dce10dc8411"}

saml, Jul 29 '25 20:07

Pretty crazy bug! My workaround is to generate the default SQS message and then to use jq to inject the correct body.

Example:

sam local generate-event sqs receive-message \
| jq --arg bucket "my-bucket" \
     --arg key "my-key" '
    .Records[0].body = ({bucket:$bucket, s3_key:$key} | tojson)
' > event.json

event.json is then valid JSON and I can read the body as expected:

cat event.json
{
  "Records": [
    {
      "messageId": "19dd0b57-b21e-4ac1-bd88-01bbb068cb78",
      "receiptHandle": "MessageReceiptHandle",
      "body": "{\"bucket\":\"my-bucket\",\"s3_key\":\"my-key\"}",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1523232000000",
        "SenderId": "123456789012",
        "ApproximateFirstReceiveTimestamp": "1523232000001"
      },
      "messageAttributes": {},
      "md5OfBody": "7b270e59b47ff90a553787216d55d91d",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:MyQueue",
      "awsRegion": "us-east-1"
    }
  ]
}
jq -r '.Records[0].body | fromjson | .s3_key' event.json
my-key
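The same workaround can be sketched in Python for anyone without jq. The function name `inject_body` and the trimmed `default_event` below are hypothetical stand-ins; in practice the input would be the text printed by sam local generate-event sqs receive-message. Unlike the jq version, this sketch also recomputes md5OfBody, on the assumption that the field is the MD5 hex digest of the UTF-8 body; drop that line if it is not wanted.

```python
import hashlib
import json


def inject_body(event_text: str, payload: dict) -> str:
    """Replace the first record's body with payload serialised as JSON."""
    event = json.loads(event_text)
    record = event["Records"][0]
    # Compact separators give the same string jq's tojson produces here.
    record["body"] = json.dumps(payload, separators=(",", ":"))
    # Assumption: md5OfBody is the MD5 hex digest of the UTF-8 body.
    record["md5OfBody"] = hashlib.md5(record["body"].encode("utf-8")).hexdigest()
    return json.dumps(event, indent=2)


# Trimmed, made-up stand-in for the default event printed by the CLI.
default_event = '{"Records": [{"body": "placeholder", "md5OfBody": ""}]}'

result = inject_body(default_event, {"bucket": "my-bucket", "s3_key": "my-key"})
print(result)

# Reading the body back mirrors the jq fromjson step above.
body = json.loads(json.loads(result)["Records"][0]["body"])
print(body["s3_key"])
```

Because the event is built with json.dumps rather than by string substitution, quotes and newlines in the payload are escaped automatically.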

PierreKiwi, Oct 14 '25 21:10