
Handling Pre-existing Endpoints in AWS CDK Constructs Using data.all

Open anandsumit2000 opened this issue 1 year ago • 3 comments

Overview:

The AWS CDK code used by data.all creates AWS VPC endpoints for services such as S3, DynamoDB, and others. In some cases, the VPCs I work with may already have some of these endpoints, either configured manually or created independently.

Key Points:

  • data.all uses the AWS CDK to create AWS service endpoints.
  • Some of these endpoints may already exist within the target VPCs.

Questions for Maintainers:

  1. What happens when endpoints that already exist within a VPC are encountered by data.all?
  2. Are there recommended practices or mechanisms within data.all or the AWS CDK to gracefully handle scenarios where endpoints are pre-existing in VPCs?
  3. How are conflicts or potential issues addressed by the AWS CDK when attempting to create resources that may overlap with existing configurations?

anandsumit2000 avatar Feb 20 '24 09:02 anandsumit2000

Hi @anandsumit2000, I believe you will run into issues if you try to create duplicates of the same VPC endpoints within the same AWS account and region.
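One way to check for pre-existing endpoints before deploying is to compare the services you expect to be created against what the EC2 `DescribeVpcEndpoints` API reports for the VPC. The following is only a minimal sketch, not part of data.all: the helper names (`existing_endpoint_services`, `find_conflicts`, `describe_endpoints`) are hypothetical, and it assumes boto3 is installed and credentials are configured for the live call.

```python
from typing import Any, Dict, List, Set

# Endpoint states that no longer represent a live endpoint.
INACTIVE_STATES = {"deleting", "deleted", "rejected", "failed", "expired"}


def existing_endpoint_services(response: Dict[str, Any]) -> Set[str]:
    """Return the service names of live endpoints in a DescribeVpcEndpoints response."""
    return {
        ep["ServiceName"]
        for ep in response.get("VpcEndpoints", [])
        if ep.get("State", "").lower() not in INACTIVE_STATES
    }


def find_conflicts(response: Dict[str, Any], wanted: List[str]) -> List[str]:
    """Return the wanted service names that already have an endpoint in the VPC."""
    existing = existing_endpoint_services(response)
    return sorted(name for name in wanted if name in existing)


def describe_endpoints(vpc_id: str) -> Dict[str, Any]:
    """Fetch endpoints for one VPC (hypothetical helper; needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the helpers above work offline

    ec2 = boto3.client("ec2")
    # Only the first page is read here; a large VPC may need NextToken handling.
    return ec2.describe_vpc_endpoints(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
```

For example, `find_conflicts(describe_endpoints("vpc-..."), ["com.amazonaws.eu-west-1.s3"])` would list the S3 endpoint if one is already present; the VPC ID and region in that call are placeholders.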

data.all does allow some customization of the deployed VPC resources, which may help with the above. Here is some more detailed information on how you can customize your VPC configuration when deploying data.all:

  • In data.all's cdk.json file, three optional parameters can be specified for each entry in DeploymentEnvironments:
    "DeploymentEnvironments": [
      {
        ...
        "vpc_id": "string_DEPLOY_WITHIN_AN_EXISTING_VPC|DEFAULT=None",
        "vpc_endpoints_sg": "string_DEPLOY_WITHIN_EXISTING_VPC_SG|DEFAULT=None",
        "vpc_restricted_nacl": "boolean_CREATE_CUSTOM_NACL|DEFAULT=false",
        ...
      }
    ]

By Default (nothing specified in cdk.json)

  • VPC Created (w/ 1 NAT Gateway, 2 Public Subnets, 2 Private Subnets, Flow Log)
  • Security Group Created
  • VPC Interface Endpoints and Gateway Endpoints Created (in private subnet using created SG)

If vpc_id specified in cdk.json

  • Omits creation of new VPC
  • Imports existing VPC specified by vpc_id and uses existing subnets
  • Still creates a new SG and VPC Endpoints unless the additional parameter vpc_endpoints_sg is also specified in cdk.json

If vpc_endpoints_sg specified in cdk.json

  • Omits creation of new security group and any VPC Endpoints
  • Imports Security Group from security group id specified in vpc_endpoints_sg
  • Assumes the required VPC Endpoints already exist and are associated with that SG

If vpc_restricted_nacl specified in cdk.json

  • Additional NACL restrictions applied to the VPC (only for data.all created VPCs, not pre-existing ones specified with vpc_id in cdk.json)
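
The rules above can be summarized as a small decision function. This is only an illustrative sketch of the behavior described in this comment, not the actual code in dataall/deploy/stacks/vpc.py; the function name `plan_vpc_resources` and the returned keys are hypothetical, and the sketch assumes the three parameters are evaluated independently.

```python
from typing import Any, Dict


def plan_vpc_resources(env: Dict[str, Any]) -> Dict[str, bool]:
    """Map one DeploymentEnvironments entry to the VPC resources that get created.

    Hypothetical helper that mirrors the rules described in the thread.
    """
    vpc_id = env.get("vpc_id")                 # default: None
    endpoints_sg = env.get("vpc_endpoints_sg")  # default: None
    restricted_nacl = bool(env.get("vpc_restricted_nacl", False))

    create_vpc = vpc_id is None
    # Supplying an existing SG skips both the new SG and the VPC endpoints.
    create_sg_and_endpoints = endpoints_sg is None

    return {
        "create_vpc": create_vpc,
        "create_security_group": create_sg_and_endpoints,
        "create_vpc_endpoints": create_sg_and_endpoints,
        # NACL restrictions only apply to VPCs created by data.all.
        "apply_restricted_nacl": restricted_nacl and create_vpc,
    }


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(plan_vpc_resources({"vpc_id": "vpc-example", "vpc_endpoints_sg": "sg-example"}))
```

Under these assumptions, reusing a VPC that already has endpoints means setting both vpc_id and vpc_endpoints_sg, since vpc_id alone still results in new endpoints being created.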

Feel free to read more in Step 6, "Configure the deployment options in the cdk.json file", of our Deploy to AWS documentation.

Also, you can take a deeper look at the code where VPC resources are created via CDK at dataall/deploy/stacks/vpc.py

Please let me know if you have any additional questions.

noah-paige avatar Feb 20 '24 17:02 noah-paige

Hello @noah-paige. Thank you for responding so quickly. I should have elaborated more on the query. The question was about the Tooling Account. However, your response mainly concerns the Deployment Account, because vpc_endpoints_sg is a property contained within the DeploymentEnvironments field.

anandsumit2000 avatar Feb 22 '24 07:02 anandsumit2000

I see the issue now. I think the easiest resolution to avoid any duplication of resources and/or potential errors would be to extend the logic we have for Deployment Envs to the Tooling Env. Please let me know if you agree.

I will label this issue as an enhancement and work with the team to add it to the backlog.

We welcome external contributions as well if you would like to contribute the fix sooner. Thank you for bringing it to our attention.

noah-paige avatar Feb 27 '24 21:02 noah-paige