dataall
Automating the environment account bootstrap step
Use case: Automating customer onboarding through the API
Before describing what we expect, we would like to understand why bootstrapping the environment account is a manual step, in which we run the following command with the AWS credentials of the environment account:
cdk bootstrap --trust <deployment-account-id> -c @aws-cdk/core:newStyleStackSynthesis=true --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess aws://<environment-account-id>/<environment-account-region>
We want to automate environment bootstrapping through an API.
Ideation:
Using an automation script: create an IAM user in the environment account. We decided to use an IAM user because it is created for a different use case.
- Store the credentials in the deployment account's SSM Parameter Store (dataall/<environment-account-id>)
In the bootstrapEnvironment API:
- To check whether the environment account is already bootstrapped, we can query the environment table for the account ID and region
- Get the environment account credentials and execute the bootstrap command
- Does the IAM user require any specific permissions if the cdk bootstrap command is used?
- Wait until stack creation is complete, then run the createEnvironment API
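The steps proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not data.all code: `bootstrap_environment`, `build_bootstrap_command`, and the `already_bootstrapped` callback (standing in for the environment table lookup) are hypothetical names, and the credentials are assumed to arrive as a mapping of the standard AWS environment variables. The command itself is the manual one quoted earlier in this issue.

```python
import os
import subprocess
from typing import Callable, List, Mapping

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def build_bootstrap_command(deployment_account_id: str,
                            environment_account_id: str,
                            region: str) -> List[str]:
    """Return the manual `cdk bootstrap` command from this issue as an argument list."""
    return [
        "cdk", "bootstrap",
        "--trust", deployment_account_id,
        "-c", "@aws-cdk/core:newStyleStackSynthesis=true",
        "--cloudformation-execution-policies", ADMIN_POLICY_ARN,
        f"aws://{environment_account_id}/{region}",
    ]


def bootstrap_environment(deployment_account_id: str,
                          environment_account_id: str,
                          region: str,
                          credentials: Mapping[str, str],
                          already_bootstrapped: Callable[[str, str], bool],
                          run: Callable[..., object] = subprocess.run) -> bool:
    """Hypothetical API handler: bootstrap only if the account/region pair is new.

    `already_bootstrapped` is an assumed stand-in for the environment table query.
    `credentials` is assumed to hold AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and,
    for temporary credentials, AWS_SESSION_TOKEN.
    Returns True if the bootstrap command was executed, False if it was skipped.
    """
    if already_bootstrapped(environment_account_id, region):
        return False
    # Pass the environment account credentials to the CLI via its environment.
    env = {**os.environ, **credentials}
    command = build_bootstrap_command(deployment_account_id, environment_account_id, region)
    # check=True raises CalledProcessError if the CLI exits with a non-zero code;
    # the call returns only after the CLI process has exited.
    run(command, env=env, check=True)
    return True
```

The `run` parameter is injectable so the flow can be exercised without the CDK CLI installed; the caller would invoke createEnvironment only after this function returns True (or False for an already bootstrapped account).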
Any suggestions on this approach?
Hi @Macklon, thanks for opening an issue. This is a very frequent question that we have also asked ourselves a couple of times.
Why bootstrapping?
Bootstrapping is not automated as part of environment creation because it gives us a separate step that requires a manual action to "approve" the trust between data.all and the onboarded account. We want to keep this explicit approval, which validates that an account can be accessed by data.all.
Having said that, customers typically automate the bootstrapping of their accounts through different processes. Because each process is very particular, the open-source repo keeps only the minimum implementation, which is the manual bootstrapping.
Ideation
We are open to ideas to make bootstrapping more streamlined. Reviewing your ideation proposal, the first thoughts that come to mind are:
- As a general guideline, it is better to use IAM roles than IAM users. If the credentials get compromised, the temporary credentials of a role reduce the blast radius.
- Who is going to store the SSM parameter? If an SSM parameter needs to be created every time, the admins of the environment AWS account will depend on the central data.all team to be able to link an environment.
- What are the IAM policies of the role going to look like, and how do we ensure that the central account does not get too much access? If the IAM user has broad permissions, or someone in the environment account makes a mistake and opens up the role permissions, data.all would have "unlimited" access to the account.
As a suggestion, I would avoid the security trouble of storing credentials, especially the permanent credentials of IAM users. If possible, I would also avoid having environment users create SSM parameters or any other resource in the deployment account, or depend on an admin team that manages SSM parameters for them, because that can become a bottleneck.
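To illustrate the role-based alternative to stored IAM user credentials, here is a minimal sketch that obtains short-lived credentials with the STS AssumeRole call. It assumes a role already exists in the environment account that trusts the deployment account; the function name, the role name passed by the caller, and the default session name are hypothetical. The client is passed in (for example `boto3.client("sts")`) so the sketch itself has no third-party imports.

```python
from typing import Any, Dict


def assume_environment_role(sts_client: Any,
                            environment_account_id: str,
                            role_name: str,
                            session_name: str = "dataall-bootstrap",
                            duration_seconds: int = 900) -> Dict[str, str]:
    """Return temporary credentials as AWS environment variables.

    `sts_client` is expected to behave like boto3's STS client.
    900 seconds is the shortest session duration AssumeRole accepts.
    """
    response = sts_client.assume_role(
        RoleArn=f"arn:aws:iam::{environment_account_id}:role/{role_name}",
        RoleSessionName=session_name,
        DurationSeconds=duration_seconds,
    )
    credentials = response["Credentials"]
    # These expire on their own, so nothing long-lived has to be stored.
    return {
        "AWS_ACCESS_KEY_ID": credentials["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": credentials["SecretAccessKey"],
        "AWS_SESSION_TOKEN": credentials["SessionToken"],
    }
```

Note that this only moves the question: someone still has to create that role and its trust policy in the environment account, and its permissions need the same scrutiny as the concerns listed above.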
Here are some alternatives that I could brainstorm:
- CI/CD pipeline with base infrastructure - most customers already have some sort of base infra deployed in accounts from a tooling account. They would add the CDKToolkit with the necessary parameters as part of their CI/CD base infra process.
- Automation script - This is a little outdated, because data.all onboarding has simplified a lot over the years. In early projects some customers used a script to run several actions on the onboarded account, including bootstrapping.
- AWS Organizations - I think it is not your use case, but a very cool feature in AWS Organizations is CloudFormation StackSets. We could use them to deploy the CDKToolkit to all accounts in the Organization.
- Let the customer enter temporary AWS credentials in the UI when they link the environment, and let the data.all backend execute the bootstrap API with those credentials. We need to evaluate whether that implies any security issue.
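For the AWS Organizations option, a rough sketch of what a StackSets deployment might look like with a boto3-style CloudFormation client is shown below. Everything here is an assumption to verify before use: the function name and the stack set name are hypothetical, the template body is assumed to come from `cdk bootstrap --show-template`, and the `TrustedAccounts` and `CloudFormationExecutionPolicies` parameter names should be checked against that template.

```python
from typing import Any, List

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def deploy_bootstrap_stack_set(cfn_client: Any,
                               template_body: str,
                               deployment_account_id: str,
                               organizational_unit_ids: List[str],
                               regions: List[str],
                               stack_set_name: str = "dataall-cdk-bootstrap") -> None:
    """Create a service-managed stack set and roll it out to the given OUs.

    `cfn_client` is expected to behave like boto3's CloudFormation client,
    called from the Organizations management or delegated admin account.
    """
    cfn_client.create_stack_set(
        StackSetName=stack_set_name,
        TemplateBody=template_body,
        Parameters=[
            # Assumed bootstrap template parameters; confirm against the template.
            {"ParameterKey": "TrustedAccounts",
             "ParameterValue": deployment_account_id},
            {"ParameterKey": "CloudFormationExecutionPolicies",
             "ParameterValue": ADMIN_POLICY_ARN},
        ],
        Capabilities=["CAPABILITY_NAMED_IAM"],
        PermissionModel="SERVICE_MANAGED",
        # Accounts that later join the target OUs get the stack automatically.
        AutoDeployment={"Enabled": True, "RetainStacksOnAccountRemoval": False},
    )
    cfn_client.create_stack_instances(
        StackSetName=stack_set_name,
        DeploymentTargets={"OrganizationalUnitIds": organizational_unit_ids},
        Regions=regions,
    )
```

With this approach the explicit "approval" step discussed earlier would effectively move to whoever controls which organizational units are targeted.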
We will think about more alternatives internally, to see if we can come up with solutions that make this process easier.
FYI, the environment accounts are created and owned by us when onboarding customers. After onboarding, data and access will be owned by the customer.
I think the StackSets solution is very neat if the user is already using AWS Organizations. It would also allow us to deploy other stacks in the accounts (if need be), so it's not only useful for bootstrapping. Having said that, I am not sure it's worth enforcing the use of Organizations in data.all just for this; we need to understand the implications better.
This issue will be closed soon due to inactivity