aws-cdk-rfcs icon indicating copy to clipboard operation
aws-cdk-rfcs copied to clipboard

Global Name Prefix

Open PeterBengtson opened this issue 5 years ago • 27 comments

:rocket: Feature Request

Description

I'm working as an expert AWS consultant with a large internet bank on a high-security project using CDK. It's the very first project at this bank using CDK, and it has allowed us to accelerate development considerably. However, when trying to prepare and deploy to the new staging and production accounts, we were suddenly faced with a problematic integration issue: the fact that boundary policies require us to prefix many resource names (such as lambdas, security groups, and many more) with the team name, e.g. theTeam-my-lambda-function-72636364.

The reason these prefix rules exist is due to the fact that there are some accounts shared by multiple teams. Imposing a network of rules keyed on name prefixes is the only way of keeping teams within one account separate, as tags can't be used for policy conditionals with all resources, e.g. lambdas.

Shared accounts, while no longer best practice on AWS, do exist in many companies and will probably do so for quite some time, which means that our situation at this particular bank isn't unique. In order for CDK to gain widespread acceptance in brownfield situations, it would be good if there was an easy way – and above all a standardised way – of specifying prefixes for CDK.

CDK doesn't support resource name prefixing out of the box. In fact, CDK follows best practices as established by CloudFormation and assumes full control of all generated names. Postprocessing the CloudFormation templates isn't an option, as tokens and other abstract features of CDK make simple text substitutions very difficult if not impossible.

Elad Ben-Israel proposed overloading the CDK method for generating logical names. While this would work, it's somewhat of a hackish solution in that it can be done in several different ways. The security powers-that-be at this bank would prefer a standardised solution in order to be able to recommend CDK.

Proposed Solution

We propose a simple switch to the cdk synth command. It would be used as follows:

$ cdk synth --name-prefix='theTeam-'

This would add the prefix to all resource names, across all stacks. To provide even greater flexibility, the prefix could also be made part of the project configuration, obviating the need to type the proposed --name-prefix switch every time one synthesises the stacks.

Environment

  • CDK CLI Version: 1.8
  • OS: all
  • Language: all

Previously filed request

https://github.com/aws/aws-cdk/issues/3982

PeterBengtson avatar Sep 12 '19 14:09 PeterBengtson

Hi, I understand your need but I do not think that CDK could handle this as it delegates creation of names to Cloudformation if you do not specify a fixed name. So I think the best place for your request is the cfn roadmap to ask for cfn to support this.

hoegertn avatar Sep 12 '19 15:09 hoegertn

Please see case aws/aws-cdk#3982, linked to above your reply, to suggestions by Elad and Richard giving info which seems to indicate that it's indeed possible. Judging by their comments, what we're asking for seems eminently doable.

PeterBengtson avatar Sep 12 '19 15:09 PeterBengtson

I think you could accomplish this by creating a "naming Aspect" that manually sets the physical name for each resource it visits. Since the decision to apply the prefix to certain resources is based on your business needs, my opinion is that (1) this logic should live in your cdk application and (2) The CDK Team should better document how to do such a thing.

rhboyd avatar Sep 12 '19 15:09 rhboyd

Yes but this aspect would need to replicate the cfn name generation and would send fixed names to cfn with all the downsides. This has to be documented then.

hoegertn avatar Sep 12 '19 15:09 hoegertn

Isn't the situation conceptually akin to prefixing every single stack name with the "namespace" prefix? OurTeam-Workload, OurTeam-Pipeline, OurTeam-WhatHaveYou, etc? This doesn't create "hard" names, which of course is something we never ever want.

PeterBengtson avatar Sep 12 '19 15:09 PeterBengtson

@hoegertn Right, which is why the customers should implement their own Naming Aspect. I'm really curious how this security approach works when you need to replace a DynamoDB Table that you've manually named.

rhboyd avatar Sep 12 '19 15:09 rhboyd

Well, we don't want hard, fully specified names in any situation, as this makes it impossible to replace the resource.

PeterBengtson avatar Sep 12 '19 15:09 PeterBengtson

@rhboyd that is my main concern when creating names. And if CDK provides an out-of-the-box solution it will be misused at some point in time. If prefixes for the stack are ok this solves your request without any change to CDK. Be aware that not all resources use the stackname in the generated name (eg RDS)

hoegertn avatar Sep 12 '19 15:09 hoegertn

"Be aware that not all resources use the stackname in the generated name (eg RDS)"

... which then immediately makes that particular solution not viable. I'm sure we're not alone in this situation. After all, the name prefix solution is not uncommon.

PeterBengtson avatar Sep 12 '19 15:09 PeterBengtson

I'm sure there are plenty of teams that are using name prefixes for security, but I'm not sure that adding it to cdk core is the right solution. Suppose the CDK Team adds the feature, now they need to track and clearly document which resources it applies to and which ones it doesn't. Personally, I think CDK balances this tradeoff well. CDK enables developers to use this design pattern (via Aspects) even though it is discouraged, but doesn't force everybody who isn't using this discouraged pattern to be aware of it when designing their applications.

rhboyd avatar Sep 12 '19 15:09 rhboyd

Am I understanding you correctly that the Aspect solution and the Stack.allocateLogicalId solution both lead to fully specified resource names, i.e. "hard" names?

PeterBengtson avatar Sep 12 '19 15:09 PeterBengtson

They can both lead to fully specified LogicalIds, which CFN turns into physical names.

rhboyd avatar Sep 12 '19 16:09 rhboyd

I would add, Richard, that the proposed solution would apply the prefix to all resources, not just some, obviating the need for selective documentation. It would be completely binary and wouldn't break existing code in any way. True, it's not a purist approach, but it's a pragmatic one that could lead to greater adoption.

"They can lead to fully specified LogicalIds" – care to elaborate? It's the difference between a working concept and one that can never succeed.

PeterBengtson avatar Sep 12 '19 16:09 PeterBengtson

For resources, you can either specify a fixed name in CFN which then has several downsides or let CFN generate one. The logical id inside the template serves as a base if you do not specify a custom name. CFN generates a physical name by joining the stack name, the logical resource name, and a random string. For several resources, this is not true as they have length limitations and CFN generates a shorter name.

hoegertn avatar Sep 12 '19 16:09 hoegertn

@PeterBengtson Here is an example to do what you're asking

app.py

#!/usr/bin/env python3

from aws_cdk import (
    core
)

from hello.hello_stack import MyStack, LogicalIdSetter

app = core.App()
my_prefix = app.node.try_get_context("prefix")
stack = MyStack(app, "hello-cdk-1", env={'region': 'us-east-2'})
stack.node.apply_aspect(LogicalIdSetter(team_name="MyTeam"))

app.synth()

hello_stack.py

from aws_cdk import (
    aws_lambda as lambda_,
    aws_s3 as s3,
    core
)

from jsii._reference_map import _refs
from jsii._utils import Singleton
import jsii


@jsii.implements(core.IAspect)
class LogicalIdSetter:
    def __init__(self, team_name: str) -> None:
        self.team_name = team_name

    def visit(self, construct: core.IConstruct) -> None:
        kernel = Singleton._instances[jsii._kernel.Kernel]
        resolve = _refs.resolve(kernel, construct)

        def _walk(obj):
            if isinstance(obj, lambda_.Function):
                raw_fn: lambda_.CfnFunction = obj.node.default_child
                raw_fn.add_property_override(
                    property_path="FunctionName",
                    value=f"{self.team_name}-{obj.node.unique_id}"
                )
            if isinstance(obj, s3.Bucket):
                raw_fn: s3.Bucket = obj.node.default_child
                raw_fn.add_property_override(
                    property_path="BucketName",
                    value=f"{self.team_name}-{obj.node.unique_id}"
                )
        _walk(resolve)


class MyStack(core.Stack):

    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        core.Tag()

        my_cdk_function = lambda_.Function(
            self,
            "MyLambdaFunction",
            code=lambda_.Code.from_inline("def lambda_handler(event, context):\n\tprint(event)\n"),
            runtime=lambda_.Runtime.PYTHON_3_7,
            handler='index.lambda_handler'
        )

rhboyd avatar Sep 12 '19 18:09 rhboyd

This will walk every node in the cdk Construct and look for either Lambda Functions or S3 Buckets and give them a name like MyTeam-hellocdk1MyLambdaFunctionA23CA5A9

rhboyd avatar Sep 12 '19 18:09 rhboyd

Obviously, you would have the Aspect defined separately from the Stacks, I just consolidated them into fewer files to make it easier to share on github.

rhboyd avatar Sep 12 '19 18:09 rhboyd

Hey guys, sorry I am late for the discussion... @PeterBengtson I am still not sure I understand if we are talking about physical names or logical names.

If this is about physical names, we are limited to what we can do since CloudFormation generates physical names during deployment. In most cases, those names are derived from the logical names (and therefore sometimes remind people of the logical name) but in other cases they are completely opaque. Specifying explicit physical names in CFN templates is an anti-pattern and not a recommended practice. Resources that use physical names cannot be replaced during updates and cause all kinds of headaches when it comes to reusability and composability (i.e. you can't use the same construct twice).

If this is about logical names, the official way to customize logical names is to override allocateLogicalId and provide an alternative behavior. You can easily implement something like what you suggested using this approach and context:

export class BaseStack extends Stack {
  public allocateLogicalId(element: CfnElement) {
    const orig = super.allocateLogicalId(element);
    const prefix = this.node.tryGetContext('prefix');
    return prefix ? prefix + orig : orig;
   }
}

Then, you can specify a prefix through the CLI:

$ cdk synth -c prefix="MyTeam"

This will cause all logical names to have a prefix MyTeam.

I've submitted a pull requests to the aws-cdk-examples with a full example.

eladb avatar Sep 13 '19 11:09 eladb

Also, I was wondering, have you considered using tags? CDK has really nice support for tagging scopes and I believe it a much better way to manage groups of resources. It is also something you can define security policies around.

eladb avatar Sep 13 '19 11:09 eladb

Yes, I of course agree that tags would be infinitely better. No, of course we don’t want to specify physical names, for precisely the reasons you mention. However, as i’ve been very careful to point out and describe above, we don’t have a say in the matter since we have to adapt to already existing boundary policies at this bank, which is the largest internet bank of its kind - it’s not a ship I as an expert AWS consultant would even attempt to try to turn around singlehandedly, especially given the fact that there are hundreds of other teams using the system, some of them still sharing accounts - and yes, I’m fully aware that this is an anti pattern - but the policies use name conditionals since conditionals can’t yet use tags for many resources, such as for Lambdas.

Thus we’re trying to find a way for CDK to comply with these restrictions, in a shared environment. There aren’t many other ways these boundary policies can be written except to use names. Thus, this situation might also cause pain at other companies in a similar way. A general solution would be useful and would contribute to the adoption of CDK generally.

Having said this, there are other options, but they all include treating this workload as a special case, something which often is controversial in a large organization.

fre 13 sep. 2019 kl. 13:53 skrev Elad Ben-Israel [email protected]:

Also, I was wondering, have you considered using tags? CDK has really nice support for tagging https://docs.aws.amazon.com/cdk/latest/guide/tagging.html scopes and I believe it a much better way to manage groups of resources. It is also something you can define security policies around.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aws/aws-cdk/issues/4045?email_source=notifications&email_token=AAAA6OJ3OE5LFUAFV3VGFALQJN5NFA5CNFSM4IWFVSSKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6UZNIA#issuecomment-531207840, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAA6OLU3T5G5DLJHSPFOY3QJN5NFANCNFSM4IWFVSSA .

PeterBengtson avatar Sep 13 '19 13:09 PeterBengtson

Thanks for the example code, Elad! :)

On Fri, 13 Sep 2019 at 15:10, Peter Bengtson [email protected] wrote:

Yes, I of course agree that tags would be infinitely better. No, of course we don’t want to specify physical names, for precisely the reasons you mention. However, as i’ve been very careful to point out and describe above, we don’t have a say in the matter since we have to adapt to already existing boundary policies at this bank, which is the largest internet bank of its kind - it’s not a ship I as an expert AWS consultant would even attempt to try to turn around singlehandedly, especially given the fact that there are hundreds of other teams using the system, some of them still sharing accounts - and yes, I’m fully aware that this is an anti pattern - but the policies use name conditionals since conditionals can’t yet use tags for many resources, such as for Lambdas.

Thus we’re trying to find a way for CDK to comply with these restrictions, in a shared environment. There aren’t many other ways these boundary policies can be written except to use names. Thus, this situation might also cause pain at other companies in a similar way. A general solution would be useful and would contribute to the adoption of CDK generally.

Having said this, there are other options, but they all include treating this workload as a special case, something which often is controversial in a large organization.

fre 13 sep. 2019 kl. 13:53 skrev Elad Ben-Israel <[email protected]

:

Also, I was wondering, have you considered using tags? CDK has really nice support for tagging https://docs.aws.amazon.com/cdk/latest/guide/tagging.html scopes and I believe it a much better way to manage groups of resources. It is also something you can define security policies around.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aws/aws-cdk/issues/4045?email_source=notifications&email_token=AAAA6OJ3OE5LFUAFV3VGFALQJN5NFA5CNFSM4IWFVSSKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6UZNIA#issuecomment-531207840, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAA6OLU3T5G5DLJHSPFOY3QJN5NFANCNFSM4IWFVSSA .

PeterBengtson avatar Sep 13 '19 14:09 PeterBengtson

okay thanks for the clarification. I’ll think about this some more and see if there is something we can do to make this easier to implement.

eladb avatar Sep 13 '19 14:09 eladb

Thank you, Elad. :)

fre 13 sep. 2019 kl. 16:14 skrev Elad Ben-Israel [email protected]:

okay thanks for the clarification. I’ll think about this some more and see if there is something we can do to make this easier to implement.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aws/aws-cdk/issues/4045?email_source=notifications&email_token=AAAA6OIDZ45YAFNPDOB4YYTQJON4BA5CNFSM4IWFVSSKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6VESEI#issuecomment-531253521, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAA6OM3I6GVQSGKF3BJOQTQJON4BANCNFSM4IWFVSSA .

PeterBengtson avatar Sep 14 '19 10:09 PeterBengtson

The business around

            kernel = Singleton._instances[jsii._kernel.Kernel]  # The same object is available as: jsii.kernel
            resolve = _refs.resolve(kernel, construct_ref)

In aspects is no longer necessary. That arose from a bug in JSII which has been fixed. The code inside _walk() is the code that should go directly into visit()

rix0rrr avatar Nov 19 '19 17:11 rix0rrr

This is a very interesting idea. I am transferring this to the RFC repo. Please follow the RFC Repo README in order to submit this as an RFC.

eladb avatar Jan 23 '20 12:01 eladb

My thoughts about prefixes

  1. CDK adds stack id to the resource name. So when stack name is a bit long the final resource name will be truncated. When we deal with multiple lambdas (for example) a user able to distinguish lambda by just several symbols in the middle of name. Which is not well.
  2. CDK staty with "policy" like "we add hash to resource name for easy replacing". But there is no distinction between stateful and stateless resources. Why do I need to add hash to lambda function name when it can be easily redeployed/recreated? A stack name is unique so lambdas will always have a unique prefix associated with the stack. So in total for stateless resources there is no need for hash at the end of name (or not for all of them at least).
  3. If we have prefixes for all resources in the stack why we could not group all stacks into a single cloud application? Single prefix (always short) is easy for resource searching/looking without opening and investigation resource for tags, belonging to the CF stack etc. It is much easy and as it was mentioned better for security because I can limit all lambdas access (for example) with name prefix and all bucket with prefix - because all of them belongs to same cloud applciation.
  4. About stateless resources. Consider example of Glue data catalog. When I deploy a database - it is global for region in account. So when I deploy same database and different stack (scope) name I will go en error because I have to manual change the name. Using prefixing allow me to deploy same CDK application with different name and I got different database name. Like dev_database and prod_database. Pleae not that changing database name is much easer then autogenerated table naems. Here we got interesting situation when stateless resources have a hash while they can be easy redeployed but stateful do not have any prefix for resource separation and have to be manual managed.
  5. If use hashing is so good and prefixing is bad. Why do CDK team added qualifier for CDK bootstrap? Looks like prefixing is better for application managing, is not it? >:-) Seriously, bootstrapping is a good example when developer have to deal with multiple stacks/resources tied together and this example shows usthat prefixing must be added into cdk.

About prefix requirements - it should be short (4-5 symbols) and contains letters only because different was resources support (or not) different symbols like dash, lodash etc.

The code

export class BaseStack extends Stack {
  public allocateLogicalId(element: CfnElement) {
    const orig = super.allocateLogicalId(element);
    const prefix = this.node.tryGetContext('prefix');
    return prefix ? prefix + orig : orig;
   }
}

Does not always work as expected because it adds a prefix twice.

Consdier the following:

public class PrefixedStack extends Stack {
  public PrefixedStack(
      App scope,
      String id,
      StackProps props
    ) {
    super(scope, prefix(scope, id), props);
  }

  private static String prefix(App scope, String id) {
    var prefix = scope.getNode().tryGetContext("app.prefix").toString();
    return String.join(
        "-",
        prefix,
        id
    );
  }
}

This adds a prefix to Stack, then every resource (like lambda) which has a stack name prefix automatically has a prefix. But even with this, it is better to pass prefixed name directly because some resources like State machine does not have stack name and it is unclear how to fix this.

Closing this ticket as it has gone stale and does not reflect current priorities. We suggest to pursue experimentation in a separate package or a fork if needed.

awsmjs avatar Dec 15 '23 14:12 awsmjs