cloudformation-guard

[Enhancement] Support common text parsing / manipulation functions

Open iann0036 opened this issue 4 years ago • 9 comments

Is your feature request related to a problem? Please describe.

For many resources in CloudFormation, a property block can be defined as a JSON string, rather than a nested block. This makes it impossible to inspect a specific sub-property and enforce a ruleset against it.

Describe the solution you'd like

Add a set of in-built functions that could perform a text-based manipulation, which could include JSON parsing, URL encoding/decoding, and a Regex-style replacement.

when %my_variable is_string {
    let my_parsed_variable = jsonparse %my_variable
}
let my_parsed_variable = urldecode %my_variable
let my_parsed_variable = regexreplace /ab(cd)ef/ %my_variable 'zy$1vu'
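
For readers unfamiliar with the intent, here is a rough Python sketch of what the three proposed functions would do. This is not Guard syntax; the Python helper names simply mirror the names in the proposal, and the `$1`-style group reference is translated to Python's `\g<1>` form as an assumption about the intended semantics.

```python
import json
import re
from urllib.parse import unquote


def jsonparse(value):
    """Turn a JSON-encoded string into nested objects that can be inspected."""
    return json.loads(value)


def urldecode(value):
    """Decode percent-encoded characters such as %20 and %2F."""
    return unquote(value)


def regexreplace(pattern, value, replacement):
    """Replace matches of pattern in value, accepting $1-style group references."""
    # Assumption: $N in the proposal refers to capture group N.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(pattern, py_replacement, value)


parsed = jsonparse('{"Version": "2012-10-17"}')
decoded = urldecode("a%20b%2Fc")
replaced = regexreplace(r"ab(cd)ef", "xxabcdefxx", "zy$1vu")
```

Once the string is parsed, a rule could address a sub-property such as `Version` directly instead of pattern-matching against the raw text.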

Describe alternatives you've considered

Something out-of-band probably, or some complex Regex magic if I hated myself enough.

Additional context

iann0036 avatar May 20 '21 11:05 iann0036

Thanks for the feature request. jsonparse and urldecode make sense. @iann0036 can you elaborate a bit more on the regex usage? There are cases where capturing a regex match into a variable that can be referenced later seems useful. We are also working out the details of dynamically substituting variables when building a regex, in addition to custom messages. Would any of these help with this function?

dchakrav-github avatar May 20 '21 20:05 dchakrav-github

At least for public resource types, we also have plans to improve the modeling of the property types themselves, which should help for those

PatMyron avatar May 20 '21 20:05 PatMyron

@iann0036 do you know of any types where only JSON strings are supported? (I checked Step Functions and API Gateway.) A sub-optimal solution might be to force the use of YAML / JSON objects (which could be achieved by removing the quotes).

benbridts avatar May 20 '21 20:05 benbridts

@dchakrav-github There are a few use cases I can think of for regex capture / sub functionality. For example, matching an SQS Queue URL and ARN (i.e. capturing and comparing the name / account ID) within related resources.

@PatMyron @benbridts I don't know of any, but in the end developers can use string-encapsulated JSON when it makes sense to, and I wouldn't want to change that behaviour. Like you said, it's sub-optimal.

iann0036 avatar May 20 '21 22:05 iann0036

I just ran into a use case for this. Let's say I have this data file:

{
  "typeName" : "AWS::Amplify::Branch",
  "required" : [ "AppId", "BranchName" ]
}

And I want a rule to check that BranchName is not in the required list (and that XyzName is not in the list if typeName is Foo::Bar::Xyz).

Being able to write something like

# Following the syntax proposed by Ian; switching the argument order might be closer to how other rules are written.
let forbidden_name = regex_replace /^.*::(\w+)$/ typeName '$1Name'

rule ensure_type_name_not_required when required !empty {
    required[*] {
        this != %forbidden_name
    }
}
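
To make the intended check concrete, here is a minimal Python sketch of the same logic (not Guard syntax). The `regex_replace` helper is a hypothetical stand-in for the proposed function. Note that a substitution only rewrites the matched text, so the sketch anchors the pattern at the start of the string (`^.*::`) to drop the `AWS::Amplify` prefix; without that anchor the result would keep the prefix.

```python
import re


def regex_replace(pattern, value, replacement):
    """Hypothetical stand-in for the proposed function, with $N group references."""
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(pattern, py_replacement, value)


schema = {
    "typeName": "AWS::Amplify::Branch",
    "required": ["AppId", "BranchName"],
}

# Derive "<LastSegment>Name" from the type name, e.g. "BranchName".
forbidden_name = regex_replace(r"^.*::(\w+)$", schema["typeName"], "$1Name")

# Any required property equal to the forbidden name violates the rule.
violations = [name for name in schema["required"] if name == forbidden_name]

# Without the leading anchor, only the trailing "::Branch" is rewritten.
unanchored = regex_replace(r"::(\w+)$", schema["typeName"], "$1Name")
```

For the data file above, `violations` contains `BranchName`, so the rule would fail.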

benbridts avatar May 24 '21 13:05 benbridts

@iann0036 and @benbridts thanks for the clarification. Just confirming so that we are on the same page:

    regex_replace <regex> <query> <substitution-string>

What would be the semantics for

  1. query does not return any result? Would the value be empty? Would it rather fail?
  2. Can the substitution string itself be a regex? E.g. '/$Name/'

dchakrav-github avatar May 24 '21 16:05 dchakrav-github

regex_replace <regex> <query> <substitution-string>

Yes, although I think

    regex_replace <query> <regex> <substitution-string>

would also make sense

query does not return any result? Would the value be empty? Would it rather fail?

I think having it fail makes the most sense; assuming the error is clear, that will lead to the fewest surprises. If you're parsing an optional value, wrapping it in a when block would be a straightforward workaround:

when typeName exists {
  let forbidden_name = ...
}

Can the substitution string itself be a regex? E.g. '/$Name/'

I'd like to be able to write something like the below, but I think being able to use variables as part of a regex is a separate request.

# this only captures the name
let forbidden_name = regex_replace /^.*::(\w+)$/ typeName '$1'

rule ensure_type_name_not_required when required !empty {
    required[*] {
        # I have no idea what the syntax should be to not conflict with regex
        # but this would match both BranchName and Branchname for AWS::Amplify::Branch
        this != /{%forbidden_name}[nN]ame/
    }
}

benbridts avatar May 24 '21 16:05 benbridts

This would be a great feature, as we have JSON that is stored in a string and we need to parse it to check values:

            "after": {
                "policy": "{\r\n  \"Version\": \"2012-10-17\",\r\n  \"Id\": \"sqspolicy\",\r\n  \"Statement\": [\r\n    {\r\n        \"Sid\": \"2021-06-24-16-27-55-bpaaxt\",\r\n        \"Effect\": \"Allow\",\r\n        \"Principal\": {\r\n        \"AWS\": [\r\n            \"arn:aws:iam::123456789012:root\",\r\n            \"arn:aws:iam::123456789012:root\",\r\n            \"arn:aws:iam::593397909543:root\"\r\n        ]\r\n        },\r\n        \"Action\": [\r\n        \"SQS:GetQueueAttributes\",\r\n        \"SQS:GetQueueUrl\",\r\n        \"SQS:SendMessage\"\r\n        ],\r\n        \"Resource\": \"arn:aws:sqs:us-east-1:123456789012:test\"\r\n    },\r\n    {\r\n        \"Sid\": \"2021-06-24-16-27-55-bdxcen\",\r\n        \"Effect\": \"Allow\",\r\n        \"Principal\": {\r\n        \"AWS\": [\r\n            \"arn:aws:iam::123456789012:root\",\r\n            \"arn:aws:iam::123456789012:root\"\r\n        ]\r\n        },\r\n        \"Action\": [\r\n        \"SQS:GetQueueAttributes\",\r\n        \"SQS:GetQueueUrl\",\r\n        \"SQS:SendMessage\"\r\n        ],\r\n        \"Resource\": \"arn:aws:sqs:us-east-1:123456789012:test\"\r\n    }\r\n    ]\r\n}\r\n",
                "queue_url": "https://sqs.us-east-1.amazonaws.com/123456789012/test-queue"
            },

We need to check the account numbers used for cross-account access to make sure they are valid accounts in our org and that the settings are correct.

This is similar to the cross-account examples for CloudFormation in the examples folder.
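
As an illustration of the check being asked for, the following Python sketch (not Guard syntax) parses a string-encoded policy and collects the account IDs from its IAM principals. The `ALLOWED_ACCOUNTS` allow-list and the `principal_accounts` helper are hypothetical, and the policy is an abbreviated version of the one above.

```python
import json
import re

# Hypothetical allow-list of account IDs that belong to the organization.
ALLOWED_ACCOUNTS = {"123456789012"}


def principal_accounts(policy_text):
    """Return the set of account IDs named in Principal.AWS ARNs of a policy string."""
    policy = json.loads(policy_text)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    accounts = set()
    for statement in statements:
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "*"; a real check would likely flag this separately
        arns = principal.get("AWS", [])
        if isinstance(arns, str):
            arns = [arns]
        for arn in arns:
            match = re.match(r"arn:aws:iam::(\d{12}):", arn)
            if match:
                accounts.add(match.group(1))
    return accounts


policy_text = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": [
            "arn:aws:iam::123456789012:root",
            "arn:aws:iam::593397909543:root",
        ]},
        "Action": ["SQS:SendMessage"],
        "Resource": "arn:aws:sqs:us-east-1:123456789012:test",
    }],
})

unexpected = principal_accounts(policy_text) - ALLOWED_ACCOUNTS
```

Here `unexpected` holds every account that appears in the policy but is not on the allow-list, which is the set a rule would need to be empty.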

Thanks

pjshort22 avatar Jul 21 '21 10:07 pjshort22

Support for recursive and named rules is complete and we will release it in the next version. However, the same needs to be extended to functions, which will be released as a quick follow-up.

shreyasdamle avatar Nov 15 '21 23:11 shreyasdamle

Is there an update on using regex with variables? The use case is to assert on concatenated strings. Is there a workaround? Below is a snippet, reduced to keep it simple. What I need is a way to satisfy the assertion on the Resource attribute shown below.

rule check_insecure_connections {
  let config_resource_id = resourceId
  some supplementaryConfiguration.BucketPolicy.policyText.Statement[*] {
    Effect == 'Deny'
    Resource == /^arn:aws:s3:::{%config_resource_id}\/\*$/
  }
}
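
The desired assertion amounts to building a pattern from a variable. A minimal Python sketch of that behaviour (not Guard syntax; `resource_matches_bucket` is a hypothetical helper) looks like this. Escaping the interpolated value matters, because bucket names may contain characters such as `.` that are regex metacharacters.

```python
import re


def resource_matches_bucket(resource, bucket_name):
    """Check that resource is exactly arn:aws:s3:::<bucket_name>/*."""
    # re.escape keeps characters like "." in the bucket name literal.
    pattern = "arn:aws:s3:::" + re.escape(bucket_name) + r"/\*"
    return re.fullmatch(pattern, resource) is not None


ok = resource_matches_bucket("arn:aws:s3:::my-bucket/*", "my-bucket")
other = resource_matches_bucket("arn:aws:s3:::other-bucket/*", "my-bucket")
dotted = resource_matches_bucket("arn:aws:s3:::myxbucket/*", "my.bucket")
```

Because the pattern is fully literal once the variable is escaped, the same result could also be obtained with plain string concatenation and an equality check.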

fabiodouek avatar Mar 07 '23 17:03 fabiodouek

Hi @iann0036, with the release of Guard 3.0 we have added many functions like this, including JSON parsing, regex replace, substring, join, and more. Feel free to check out our release notes for a much more detailed description. I am going to go ahead and close out this issue.

Feel free to reopen this issue if need be.

Thanks,

joshfried-aws avatar Jul 10 '23 15:07 joshfried-aws

Where can I find documentation about the URL decode function?

lazize avatar Nov 09 '23 23:11 lazize

Hi @lazize you can find that information here: https://github.com/aws-cloudformation/cloudformation-guard/blob/main/docs/FUNCTIONS.md#url_decode.

Please let me know if you require any further assistance.

Thanks,

joshfried-aws avatar Nov 10 '23 14:11 joshfried-aws