aws-cdk-rfcs icon indicating copy to clipboard operation
aws-cdk-rfcs copied to clipboard

Defaults & configuration policy

Open eladb opened this issue 4 years ago • 49 comments

PR Champion
- @eladb

(See #163 for a previous attempt at an RFC.)

Description

Allow users to specify or install some code that controls the defaults of AWS resources or applies some policy or validation at the stack or app level.

Progress

  • [x] Tracking Issue Created
  • [ ] RFC PR Created
  • [x] Core Team Member Assigned
  • [ ] Initial Approval / Final Comment Period
  • [ ] Ready For Implementation
    • [ ] implementation issue 1
  • [ ] Resolved

eladb avatar Dec 08 '19 22:12 eladb

Design concept:

We add a method to Stack with the following implementation (name is meh):

public getPropsForResource(className: string, props: any): any {
  return props;
}

Then, we somehow enforce that all L2s will call this method when they are initialized and pass in the props that were passed to them, along with the full name of the class (i.e. aws-s3.Bucket).

This will allow users to override this method at the stack level (e.g. their base MyCompanyStack class) and implement policy around default values. For example, they can enforce that all Buckets are created with encryption:

public getPropsForResource(className: string, props: any): any {
  if (className === 'aws-s3.Bucket') {
    if (props.encryption && props.encryption !== ENCRYPTED) { throw new Error('buckets must be encrypted'); }
  props.encryption = ENCRYPTED;
  }

  return props; // passthrough
}

Something along those lines... Just an option.

eladb avatar Apr 06 '20 07:04 eladb

I have tried to create one example of enforcing user to follow organization defaults and if you want you can create new resources or modify existing resource using Aspects. https://github.com/nirvana124/aws-cdk-library-example

nirvana124 avatar Apr 30 '20 10:04 nirvana124

Isn't this already possible to be implemented using Aspects?

Dzhuneyt avatar Dec 07 '20 17:12 Dzhuneyt

@Dzhuneyt Aspects can (potentially) mutate constructs after they have been initialized but there are many initialization properties that cannot be modified after the instance is created.

eladb avatar Dec 08 '20 06:12 eladb

One thing I've experimented with is in my attempt to create an "organization standard library" is trying to add my own logic to the core.Construct.onValidate() method. I couldn't get it working in a way that was intuitive in the short amount of time I spent on it, but I feel like exposing some sort of API for adding validation functions to this method could be useful, or at the very least insightful.

JordanDeBeer avatar Dec 08 '20 18:12 JordanDeBeer

One thing I've experimented with is in my attempt to create an "organization standard library" is trying to add my own logic to the core.Construct.onValidate() method. I couldn't get it working in a way that was intuitive in the short amount of time I spent on it, but I feel like exposing some sort of API for adding validation functions to this method could be useful, or at the very least insightful.

Ha! We recently changed the api for validations and you should be able to just do this:

this.node.addValidation(...);

See: https://docs.aws.amazon.com/cdk/api/latest/docs/constructs.Node.html#addwbrvalidationvalidation

eladb avatar Dec 08 '20 20:12 eladb

would an upstream base validation aspect to support easily adding these validations to a given construct type be useful for this rfc?

I am thinking something like this (pseudo code-ish):

class ValidatorAspect implements core.IAspect {
  type: constructs.IConstructs;
  fn: constructs.IValidation
  constructor(type: constructs.IConstruct, fn: constructs.IValidation) {
    this.type = type
    this.fn = fn
  }

  public visit(node: constructs.IConstruct): void {
    const n = node as constructs.Node;
    if (node instanceof this.type) {
      n.addValidation({
        validate: this.fn
      });
    }
  }
}

it's more reactive compared to the getPropsForResource design, but i can see both being useful. I'd be happy to work on adding something like this if there's interest.

JordanDeBeer avatar Dec 09 '20 16:12 JordanDeBeer

@ericzbeard gave an awesome talk to my company a few months ago. He mentioned that companies keep trying to make a CDK L2.5 library. (That's what I wanted to do first too!) He mentioned that this was not recommended. This concept really seems to fix that issue.

In his blog post, he has a section about "Don't use constructs for compliance"

Another common pattern we have seen, particularly among enterprise customers, is creating a collection of construct libraries based on the L2 constructs included with the AWS CDK, with a 1-1 mapping of subclasses. For example, you might create a class called MyCompanyBucket that extends s3.Bucket and MyCompanyFunction that extends lambda.Function. Inside those subclasses, you implement your company’s security best practices like encrypting data at rest, or requiring the use of certain security policies. This pattern serves a need, but it comes with a significant drawback: as the ecosystem of AWS CDK constructs, such as AWS Solutions Constructs, grows and becomes more useful, your developer community will be effectively cut off from it if they can’t use code that instantiates the base L2s.

Investigate the usage of service control polices and permissions boundaries at the organization level to enforce your security guardrails. Use aspects or tools like CFN Guard to make assertions about the properties of infrastructure elements before deployment.

While, I totally agree that we should use tools like CFN Guard, there is still a need and it sounds like Aspects is not the full answer.

Aspects can (potentially) mutate constructs after they have been initialized but there are many initialization properties that cannot be modified after the instance is created.

I want to sell the idea that the current level of upvotes (8) does not reflect the actual need for this concept. From the blog post, this sounds like this might "serve a need" to a "common pattern" but doesn't have the drawbacks and headaches of a company specific wrapper.

msmjack avatar Apr 15 '21 22:04 msmjack

I can vouch for @maria-jackson's thoughts above. Currently in my organization, we are struggling to find a good balance. On the one hand, we want to provide the ability for engineers to create infrastructure which follows industry best practice. On the other hand, we want engineers to build infrastructure easily which is safe, secure, and adheres to our organization requirements. For example, one of our organization policies is that no S3 bucket should be public by default. Presently we really have 3 ways to accomplish this:

  1. Provide an "L2.5" construct library, which directly goes against @ericzbeard's guidance, and for good reason.
  2. Run proactive scans against deployed infrastructure in our AWS account, or use SCP's.
  3. Create a Cfn-Guard ruleset. Require engineers to scan their CDK output using it in their pipeline

While cfn-guard would potentially solve this, the downside is that it would likely get you feedback in the pipeline (unless you run it locally). Instead, if we were able to bake these checks into CDK synthing (or just into the native constructs via a hook like onValidate), we would likely get quicker feedback to the engineer writing the code.

AjkayAlan avatar Apr 19 '21 19:04 AjkayAlan

None of these is integral to the solution, but they would be nice to have. Take the pseudo code as an illustration of an idea, not a preferred implementation. (I do not understand CDK or Typescript enough)

Easy to Read Configuration Code

I'd really like the reusable library to be really easy to read and look a lot like how you pass CDK parameters. I want it to be very clear what people are getting and very clear to our Security folks what the defaults are.

aws-s3.Bucket.defaults({Encryption: KMS_MANAGED});
aws-ec2.Volume.defaults({encrypted: True});
my-company.AwesomeL3.defaults({Foo: BAR});

L1, L2, and L3 Default Configurations

As the CDK community expands aws-solutions-constructs, cdk-patterns, and their own L3 libraries, I see a need to provide defaults to L3, not just L2s.

For example, I use aws-apigateway-dynamodb and pretend that it defaults to no encryption. While I could pass the value of the decryption every time to the L3 construct, I don't want to. If the solution just provides L2 defaults, those would be overwritten every time.

  1. Code Provided L3
  2. Company Provided L3 default
  3. Library Provided L3 default
  4. Code Provided L2 default
  5. Company Provided L2 default
  6. aws-cdk provided L2 default
  7. Code Provided L1 default
  8. Company Provided L1 default
  9. CloudFormation Defaults

Multiple Default Configurations

I'm also wondering how multiple defaults would work. If my company had C-L2, my BU had B-L2, and my team had T-L2. I'd have things in my BU that I'd want to come before my company, but I'd also like the latest updates from my company.

aws-s3.Bucket.defaults({
	Encryption: KMS_MANAGED
}).applyAdditionalDefaults(C-L2);

Easy Translation from cfn-guard

It would be cool if your cfn-guard rules could generate your CDK defaults.

Like, this could translated to something very specific AWS::EC2::Volume Encrypted == false

But this could not AWS::EC2::Volume Size == 101 |OR| AWS::EC2::Volume Size == 99

msmjack avatar Apr 23 '21 18:04 msmjack

Thanks for the input, @maria-jackson !

ericzbeard avatar Apr 23 '21 18:04 ericzbeard

@maria-jackson How does the defaults approach help when using libraries like aws-solutions-constructs? You would have to create each resource manually and pass them into the L3 constructs which certainly loses some of the benefit of those libraries.

shawnsparks-work avatar May 11 '21 14:05 shawnsparks-work

The solution below meets "Allow users to specify or install some code that controls the defaults of AWS resources," but not "or applies some policy or validation at the stack or app level." It more meets the Best practices for building company default constructs #3235, but as that is considered a duplicate of this issue I thought it was appropriate to add the discussion here.

Looking at https://github.com/awslabs/aws-solutions-constructs, here's the pattern I'd like the aws-cdk maintainers thoughts on as an alternative to the L2.5 library for default properties for "Defaults & configuration polic[ies]" or "compliant constructs."

  1. Create a library with default properties, ex DefaultLogGroupProps()
  2. Create version of overrideProps()
  3. Pass the defaults (or your overrides) to override the other properties.
_logGroupProps = overrideProps(DefaultLogGroupProps(), logGroupProps);

const logGroup = new logs.LogGroup(scope, _logGroupId, _logGroupProps);

msmjack avatar May 25 '21 18:05 msmjack

I also like the AWS Solutions Constructs approach (thanks @maria-jackson for the great summary!).

The company could have its own libraries with the various defaults. The security team can own company_cdk_security, operations team (if the company has operations as a centralized function) can own company_cdk_operations, and so on. It would be great if CDK could generate these libraries from cfn-guard rules. Alternatively, the central teams can create simple collections, such as the example below (in Python). Then,cfn-guard rules will validate the eventual result as part of the deployment process.

I think that another important capability is to layer multiple defaults, and allow the application owners to reset some properties before creating the construct. For example, the defaults for public website might include enforce_ssl = True. For a specific use case, the application owner can get an exception to turn this off, while keeping the rest of the defaults.

Note: I added non-existing .update(other_props: object) and .from_props(props: object) methods to illustrate how the above might look like without using the hacky implementation that I experimented with :). The .from_props(props: object) method is only needed in Python, since the current implementation uses keyword arguments to pass properties.

Edit: If you want to experiment with that approach, I created a simple example: https://github.com/alexpulver/company-guardrails

company_cdk_security/aws_s3.py

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk


class BucketPropsCollection:
    public_access = s3.BucketProps(enforce_ssl=True, public_read_access=True)

    __expiration_lifecycle_rule = s3.LifecycleRule(expiration=cdk.Duration.days(30))
    retention_policy = s3.BucketProps(lifecycle_rules=[__expiration_lifecycle_rule])

app.py

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk

# import company_cdk_solutions
from company_cdk_security import aws_s3 as security_s3
# from company_cdk_operations import aws_s3 as operations_s3
# from company_cdk8s_security import deployment as security_deployment

app = cdk.App()

stack = cdk.Stack(app, "Website")

# I am using Python's dict.update([other]) semantics:
# https://docs.python.org/3/library/stdtypes.html#dict.update
bucket_props = security_s3.BucketPropsCollection.public_access
bucket_props.update(security_s3.BucketPropsCollection.retention_policy)
bucket_props.update(s3.BucketProps(versioned=True, enforce_ssl=False))
bucket = s3.Bucket.from_props(stack, "Bucket", bucket_props)

app.synth()

alexpulver avatar Aug 22 '21 06:08 alexpulver

I really like this discussion, and I agree with the general direction where this is going!

I have a few comments on the details of each of the proposed solutions.


@maria-jackson I think, instead of having an overrideProps() function, you can take advantage of the fact that DefaultLogGroupProps() is also a function (BTW, I think calling it defaultLogGroupProps() is more idiomatic in TypeScript/JavaScript). So, you can have defaultLogGroupProps() take in an optional LogGroupProps object. In case LogGroupProps has any required properties, you can use the Partial TypeScript type, as in function defaultLogGroupProps(props: Partial<LogGroupProps> = {}) { ....

So, the code becomes:

const logGroup = new logs.LogGroup(this, 'LogGroup', defaultLogGroupProps({
  retention: logs.RetentionDays.TWO_MONTHS, // we can override any property of defaultLogGroupProps() this way
});

Which I think looks great.


@alexpulver I think you want to make these functions as well - making them class fields can have weird side-effects, as they are mutable.

You don't have the nice Partial feature from TypeScript, but I believe you can emulate it using keyword arguments:

class BucketPropsCollection:
  def public_access_props(**kwargs):
    return s3.BucketProps(enforce_ssl=True, public_read_access=True, **kwargs)

And then, in your CDK app:

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk
# import company_cdk_solutions
from company_cdk_security import aws_s3 as security_s3


app = cdk.App()
stack = cdk.Stack(app, "Website")

bucket = s3.Bucket(stack, "Bucket", security_s3.BucketPropsCollection.public_access_props(
  versioned=True,
  enforce_ssl=False,
))

app.synth()

I'm sure we can get some nice types here too, but I'm not that familiar with Python to add them 🙂.

skinny85 avatar Aug 24 '21 21:08 skinny85

Thanks for the feedback! I changed to functions; it makes sense :).

In Python, when specifying the same parameter twice, interpreter bombs. For example, it can be enforce_ssl as part of **kwargs besides explicit use by BucketPropsCollection.public_access(). I couldn't think of an easy way for policy library vendors to work around that. Open to suggestions :).

As I mentioned above, I think we should support layered/chained construction of properties. For example, there could be methods to support compliance with FedRAMP (see example at the end from AWS Config conformance pack sample template), SOC, and other standards. If I need to comply with both FedRAMP and SOC, as a policy library consumer, I want to apply both, and trust the AWS CDK API to do the right/defined thing.

@skinny85 if I iterate a bit on the experience you suggested, it would be something like below - what do you think?

cdk.Props.merge is an imaginary new class and a static method :). It gets an arbitrary number of props objects. All objects should be of the same type, otherwise it is an error. The method reads props in the first and next instances, then creates a new instance with all values merged. If there is a same prop in both instances, the latter takes precedence. The same then happens for next instance and so on. I implemented an example of this approach in the cdk_props.py module. The reason I suggested cdk.Props namespace, is because I think that logic can apply to all props classes. That will also allow all props classes to apply whatever logic they have on the props arguments.

Another reason for having the merge method as part of AWS CDK API is that multiple vendors could produce policy libraries, and might not agree on the same API for merging prop values. I am thinking of something like Managed rules for AWS Web Application Firewall.

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk

from company_cdk_security import aws_s3 as security_s3

app = cdk.App()

stack = cdk.Stack(app, "Website")

bucket_props = cdk.Props.merge(
    security_s3.BucketPropsCollection.public_access(),
    security_s3.BucketPropsCollection.retention_policy(),
    s3.BucketProps(versioned=True, enforce_ssl=False),
)
bucket = s3.Bucket.from_props(stack, "Bucket", bucket_props)

app.synth()

This way, policy library providers would just define collections of policies, and let the clients to combine them in the desired order.

FedRAMP compliance controls example: image

alexpulver avatar Aug 25 '21 13:08 alexpulver

@alexpulver I think a policy library is a great idea. One thing to consider is what happens if two libraries have conflicting guidance. There needs to be some mechanism to set priority and alert users that there is a potential conflict.

As I mentioned above, I think we should support layered/chained construction of properties. For example, there could be methods to support compliance with FedRAMP (see example at the end from AWS Config conformance pack sample template), SOC, and other standards. If I need to comply with both FedRAMP and SOC, as a policy library consumer, I want to apply both, and trust the AWS CDK API to do the right/defined thing.

While not a 100% solution, that's where my team is trying to go with cdk-nag. By using the Rule Packs based on the Config Conformance Packs, users can be alerted when their applications aren't following a specific compliance or set of compliances. Aspects could be used for auto remediation, but it might be better to let users decide what exactly they want to follow or ignore

dontirun avatar Aug 25 '21 18:08 dontirun

One thing to consider is what happens if two libraries have conflicting guidance. There needs to be some mechanism to set priority and alert users that there is a potential conflict.

Yes, in my opinion, merge() should fail if there's a conflict between the provided props instances (meaning, two or more of them provide the same property with values that are not equal).

In the above example, imagine security_s3.BucketPropsCollection.public_access() had Versioned=True, but security_s3.BucketPropsCollection.retention_policy() had Versioned=False (yes, I understand it's unlikely in this specific case, but this is just an example 😜).

skinny85 avatar Aug 25 '21 19:08 skinny85

but it might be better to let users decide what exactly they want to follow or ignore

I agree with @dontirun on this one. Allowing the user to resolve conflicts instead of failing is important. There are always exceptions...

@skinny85 I see your point about "surprising" results if we just resolve conflicts by default. The default can be to fail unless the user explicitly asks for overrides. @skinny85 @dontirun what do you think about the following approach?

from company_cdk_security import aws_s3 as security_s3

bucket_props = cdk.Props.merge(
    props = [
        security_s3.BucketPropsCollection.public_access(),
        security_s3.BucketPropsCollection.fedramp_moderate(),
    ],
    overrides = s3.BucketProps(versioned=True, public_read_access=True),
)

Here I am asking to merge two policies - public access and FedRAMP Moderate. Let's assume that public access sets publicReadAccess=True and FedRAMP Moderate sets it to False (per the s3-bucket-public-read-prohibited AWS Config managed rule).

Let's say the website is a landing page that can have public access. Using the overrides parameter, I can explicitly specify what conflicts I want to have resolved and how. I can also specify additional props that are relevant for the website (e.g. versioned), that are not part of any policy. Any other conflict that doesn't have an override will cause merge() to fail, stating the open conflicts.

alexpulver avatar Aug 26 '21 06:08 alexpulver

@alexpulver I like it!

skinny85 avatar Aug 26 '21 15:08 skinny85

@alexpulver In addition to what you have, I would also suggest adding explicit overrides to the synthed CloudFormation metadata/CDK metadata warning. When my customers do security reviews, assessors find that information very valuable.

dontirun avatar Aug 26 '21 16:08 dontirun

I experimented with combining cdk-nag and the above approach. Would be glad to have your opinion.

The full implementation of the snippets I use below is in the example app I created for this thread: https://github.com/alexpulver/company-guardrails

Merge function signature: cdk.Props.merge(props: Sequence[Any], *, overrides: Optional[Any]) -> Any:

app.py

import os

from aws_cdk import core as cdk
from cdk_nag import NIST80053Checks

from deployment import LandingPageFrontend

app = cdk.App()

LandingPageFrontend(
    app,
    "LandingPageFrontend",
    env=cdk.Environment(
        account=os.environ["CDK_DEFAULT_ACCOUNT"],
        region=os.environ["CDK_DEFAULT_REGION"],
    ),
)

cdk.Aspects.of(app).add(NIST80053Checks())

app.synth()

website/infrastructure.py

from company_cdk import appsec
from company_cdk import fedramp

bucket_props = cdk.Props.merge(
    [
        appsec.PropsCollection.aws_s3_bucket_public_access(),
        fedramp.PropsCollection.aws_s3_bucket(),
    ],
    overrides=s3.BucketProps(versioned=True, public_read_access=True),
)

The intent in this case is to have my app NIST 800-53 compliant, while still ultimately let users decide what rules they want to implement or suppress. I will use cdk-nag to run checks on my application, and make sure synth fails if there are non-compliant rules that were not implemented or suppressed. Here, running cdk synth will lead to the following error:

$ npx cdk synth
[Error at /LandingPageFrontend/Website/Bucket/Resource] NIST.800.53-S3BucketLoggingEnabled: The S3 Bucket does not have server access logs enabled - (Control IDs: AC-2(g), AU-2(a)(d), AU-3, AU-12(a)(c)).
Found errors

The error provides the exact construct path, so I know what resource to look at. I decide to implement the rule guidance. First, I add a logs bucket. Assuming I secured it properly, I can suppress the same check for the logs bucket, because it is the one that keeps the logs for the website bucket:

logs_bucket = s3.Bucket(self, "LogsBucket")
cfn_logs_bucket = cast(s3.CfnBucket, logs_bucket.node.default_child)
cfn_logs_bucket.add_metadata(
    "cdk_nag",
    {
        "rules_to_suppress": [
            {
                "id": "NIST.800.53-S3BucketLoggingEnabled",
                "reason": "This is the logging bucket itself",
            },
        ],
    },
)

Next, I decide to apply the company policy to implement NIST 800-53 compliance on the website bucket:

from company_cdk import appsec
from company_cdk import fedramp
from company_cdk import nist80053

bucket_props = cdk.Props.merge(
    [
        appsec.PropsCollection.aws_s3_bucket_public_access(),
        fedramp.PropsCollection.aws_s3_bucket(),
        nist80053.PropsCollection.aws_s3_bucket(
            server_access_logs_bucket=logs_bucket
        ),
    ],
    overrides=s3.BucketProps(versioned=True, public_read_access=True),
)

No errors this time! :)

$ npx cdk synth

alexpulver avatar Aug 27 '21 06:08 alexpulver

I like that process! Provides two sets of tools for developers on projects. A way to notify them about properties that aren't set for a standard and an easy way to remediate it.

As a side note you will likely get a cdk-nag error on that setup once the last pull-request for the NIST 800-53 Pack is finished 😊

dontirun avatar Aug 27 '21 13:08 dontirun

I like cdk-nag approach, should be pretty easy to automate enforcement in a pipeline.

OperationalFallacy avatar Aug 29 '21 17:08 OperationalFallacy

Another reason for having the merge method as part of AWS CDK API is that multiple vendors could produce policy libraries

Vendors like AWS?

from aws_cdk_policy import appsec
from aws_cdk_policy import fedramp
from aws_cdk_policy import nist80053

msmjack avatar Aug 30 '21 15:08 msmjack

It could be AWS, but I also thought of vendors and solutions like Check Point CloudGuard (former Dome9 is part of it now)

alexpulver avatar Aug 30 '21 17:08 alexpulver

So, this solution

meets "Allow users to specify or install some code that controls the defaults of AWS resources," but not "or applies some policy ~~or validation~~ at the stack or app level."

I think there might be a few separate concepts to solve this issue:

  • "PropsCollections" organize and reuse properties (looking good)
  • "Merge" to combine and overwrite properties in a Collection (also looking good)
  • "Creation Hooks" allow us to overwrite library defaults for the entire application

In previous attempt (https://github.com/aws/aws-cdk-rfcs/pull/163) there was a concept of "Creation Hooks." I'd like to slightly adjust the meaning

  • Override L1, L2 or L3 defaults for a resource construct using the "creation hook" concept (ex. all S3's in your stack are encrypted)
  • Parameters passed on instance creation override anything in the "creation hook" (ex. within the code, you can overwrite the encryption property)
  • cfn-guard/nag check if you shouldn't have overwritten those props on instance creation (ex. if you really shouldn't have decrypted it due to your standards, you are warned and need to update the metadata on that bucket)

This combines with the "PropsCollections" either passing them by the "creation hook" or when creating a new instance.

msmjack avatar Aug 30 '21 17:08 msmjack

The concept of creation hooks sounds pretty interesting to me as a generic enforcement mechanism. I think my hope is that CDK application developers mostly don't even realize that they are building a compliant application. It just all happens as automatically as possible.

I think one thing that was attractive with tools like cfn-guard as that a security team could work on the policies outside of any one project and then we can enforce those policies in our pipelines. This way, individual applications do not need to upgrade/update their CDK applications just to ensure they have the latest policies. Maybe this there is a smart way to ensure an app has the latest policies?

polothy avatar Aug 30 '21 18:08 polothy

from aws_cdk_policy import appsec
from aws_cdk_policy import fedramp
from aws_cdk_policy import nist80053

If these policies could also produce AWS Config deployments so you can verify that you are compliant with the actual running AWS Resources in your account. This would be really helpful for audits and getting security certifications.

polothy avatar Aug 30 '21 18:08 polothy

A Security professional could

  • deploy an update to a Config Library
  • deploy an update to cfn-guard in house rules
  • deploy an update to "PropsCollections" library

They all do different things.

msmjack avatar Aug 30 '21 18:08 msmjack