[Question] DRYness of after and before
Context
I'm trying to create a multi-region and multi-environment AWS infrastructure reference example with Terramate. You can find the example here: https://github.com/pbn4/terramate-aws-infrastructure-example
Below is a short explanation of the current state of the example:
Environments and regions:
- dev:
  - eu-central-1
- prod:
  - eu-west-1
  - us-east-1
For this I have 3 accounts in the same AWS organization:
- management
- dev
- prod
Responsibilities of accounts:
- management:
  - terraform state S3 bucket
  - DNS configuration
- dev|prod:
  - regional web application infrastructure
Currently there are only 3 modules in each regional infrastructure: KMS, an S3 VPC flow logs bucket, and a VPC.
I use management account credentials to run Terramate and obtain subaccount access by assuming the default OrganizationAccountAccessRole that is created when an account is added to the organization.
Question
Now the question is about the DRYness of `after` and `before`. I have shared code in `/modules` that defines e.g. a VPC (`modules/myproduct/environment/region/vpc`), and this VPC module is imported by all regional stacks that I have (3 currently: 1 dev and 2 prod); they follow the path pattern `stacks/myproduct/<env_name>/<region_name>/`. Now this VPC module has a dependency on the `s3-flow-logs` module. The problem is that I have to define this dependency in every stack file that I created, e.g. it is duplicated in these files:
- `stacks/myproduct/environments/prod/eu-west-1/vpc/vpc/stack.tm.hcl`
- `stacks/myproduct/environments/prod/us-east-1/vpc/vpc/stack.tm.hcl`
- `stacks/myproduct/environments/dev/eu-central-1/vpc/vpc/stack.tm.hcl`
Code of one of these files:
```hcl
stack {
  name = "vpc"
  after = [
    "../s3-flow-logs"
  ]
}

import {
  source = "/modules/myproduct/environment/region/vpc/vpc/stack.tm.hcl"
}
```
What can I do to get rid of this duplication? Or maybe my approach to using Terramate is completely wrong here and it should be done differently?
I have already searched the https://github.com/ashleymichaelwilliams/aws-sandbox/tree/master repository and couldn't find the answer.
Hi Michal,
thanks for sharing your thoughts.
I must agree that the DRYness can be improved here. There are some challenging ideas in our idea backlog about allowing `generate_hcl` blocks to define `tags` for stacks when the block is creating code. This idea could certainly be extended to also allow defining a given order of execution automatically.
One recommendation we can already give is to use tags instead of relative directories, to improve maintainability when restructuring the directory hierarchy in the future. This does not solve the DRYness part, though.
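For illustration, a minimal sketch of the tag-based ordering (the tag name is made up): the s3-flow-logs stack declares a tag, and the VPC stack orders itself after any stack carrying that tag instead of after a relative path.

```hcl
# In the s3-flow-logs stack: declare a tag (hypothetical name).
stack {
  name = "s3-flow-logs"
  tags = ["s3-flow-logs"]
}
```

```hcl
# In the vpc stack: order after any stack carrying the tag,
# independent of where it lives in the directory tree.
stack {
  name  = "vpc"
  after = ["tag:s3-flow-logs"]
}
```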
We also have an implicit dependency when using stacks-in-stacks: all child stacks are implicitly executed after the parent stack. If you could import the s3-flow-logs into the actual parent stack, things would become more DRY.
What would be your desired solution here for configuring this more easily? We are more than happy to get early feedback on what a desired feature could look like.
@mariux Thanks for answering
> One recommendation we can already give is to use tags instead of relative directories, to improve maintainability when restructuring the directory hierarchy in the future. This does not solve the DRYness part, though.
I have not used tags on purpose, to keep dependencies defined with the `after` parameter of the stack aligned with how I access the dependent Terraform stack's state via a `data "terraform_remote_state"` object. Example file `modules/myproduct/environment/region/vpc/vpc/stack.tm.hcl`:
```hcl
generate_hcl "main.tf" {
  content {
    data "terraform_remote_state" "s3_flow_logs" {
      backend = "s3"
      config = {
        bucket = "${global.terraform_state.bucket_name}"
        key    = "${terramate.stack.path.relative}/../s3-flow-logs/terraform.tfstate"
        region = "${global.terraform_state.region}"
      }
    }
  }
}
```
and the actual stack that imports this module, `stacks/myproduct/environments/dev/eu-central-1/vpc/vpc/stack.tm.hcl`:
```hcl
stack {
  name = "vpc"
  after = [
    "../s3-flow-logs"
  ]
}

import {
  source = "/modules/myproduct/environment/region/vpc/vpc/stack.tm.hcl"
}
```
As you can see, `../s3-flow-logs` is aligned with the remote state access key `${terramate.stack.path.relative}/../s3-flow-logs/terraform.tfstate`. If I used tags I would need two pieces of information, the tag and the relative path; in this example I only need to know the relative path. I understand that this is not ideal for refactoring.
> We also have an implicit dependency when using stacks-in-stacks: all child stacks are implicitly executed after the parent stack. If you could import the s3-flow-logs into the actual parent stack, things would become more DRY.
For me it's not an option, because I would like to group stacks by business domains, e.g. application -> service -> task definition. Imagine a service requires an ALB: I would put the ALB somewhere in the service directory, close to the service that is using it. But what if my ALB requires a security group (it surely does)? Then I need to put this SG a level higher, on the application level, but it does not really belong there in this type of resource grouping.
> What would be your desired solution here for configuring this more easily? We are more than happy to get early feedback on what a desired feature could look like.
Hard question.
Maybe let's first try to define what I'm trying to achieve: some sort of a "_base" template at some level, e.g. regional or environmental, from which I can define multiple regions/environments by simply copy-pasting a directory structure, calling it region "B", and eventually providing some overrides for the default input values provided in "_base". The idea is that the "live" directory must resemble the directory structure of "_base". You can actually see this in my example repository linked above: I have defined the "region" and "environment" directories in "modules" this way.
So what it translates to is: in some directory that I consider to be "live" (not "_base") I import some repeatable chunk of code, e.g. the VPC module, and this VPC has a dependency on the S3 flow logs bucket. I only see two options to create such a dependency: either I know the unique name of the stack (unique across the entire execution context), or I know exactly where this stack lives, e.g. "../s3-flow-logs". Tags could lead to ambiguous import targets.
In Terragrunt I can achieve this with a `dependency` block, as it can be defined inside a file imported via an `include` block. It not only allows importing repeatable code in a DRY way, it also handles the manual `data "terraform_remote_state"` definition, as outputs are provided as part of the `dependency` block. But this approach has a problem that leads to hard-to-read Terragrunt repositories: I include file A, and this A has a dependency B, and this dependency is defined by a relative path between A and B (because the only way to define a dependency is via a `config_path`) that must actually exist in the place where I include it. It requires a lot of expertise with Terragrunt to understand that kind of setup, and I really don't want to recreate it, hoping that Terramate could provide something simpler.
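To make the pattern concrete, here is a minimal Terragrunt sketch of what I mean (the output name `bucket_arn` and the paths are hypothetical):

```hcl
# Included file, e.g. _base/vpc/terragrunt.hcl: the dependency is defined
# by a relative config_path that must exist wherever this file is included.
dependency "s3_flow_logs" {
  config_path = "../s3-flow-logs"
}

# Outputs of the dependency are available directly; no manual
# terraform_remote_state data source is needed.
inputs = {
  flow_logs_bucket_arn = dependency.s3_flow_logs.outputs.bucket_arn
}
```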
But to be honest, I don't see a solution right now other than recreating this functionality from Terragrunt, though maybe improving it: e.g. instead of `config_path`, offer an alternative option like `stack_id`, but don't use unique uuid4 ids; let users handle the uniqueness of this id (it would be great to be able to interpolate `stack_id` so that one could define it based on e.g. the path from root plus some name). Also, this solution assumes that outputs are provided via this dependency functionality and can be used in module code, e.g. in a `generate_hcl` block; otherwise what would be the point?
Sorry that this post was a bit lengthy, I hope it is still easy to grasp. If you have any questions please let me know. I'm also curious what you think. Maybe there is a better approach to achieving what I want? I approached this in a very Terragrunt-like way.
Thanks for sharing... this actually helps. In the very beginning we dropped the pure Terraform approach and went with an HCL approach for code generation. The only real Terraform dependency we currently have is in our change detection, where we identify modules that are called locally and consider a change in a local Terraform module a trigger to mark the stacks including that module as changed.
Now that you explained the use-case a bit better, we actually could do a similar integration for remote state data sources in our order of execution.
I will discuss details with the team and see if this can be achieved in the future and get back to you.
The idea here would be: we scan all stacks, know the backend used, and align it with any remote state data sources as an implicit order of execution. For circular dependencies we could just drop the full circle.
This would match our plans to refactor order of execution to allow for better precedence definitions and implicit ordering.
On the "reference another stack" topic: the stack id can already be any string (so no UUID required), it just needs to be unique within the repository. We also want to share data, e.g. evaluated globals, on this level at some point in the future, but there are no concrete use-cases at the moment.
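For example, a sketch with a hand-picked ID (the naming scheme here is just an illustration):

```hcl
# Any unique string works as the stack ID; no UUID required.
stack {
  name = "vpc"
  id   = "myproduct-dev-eu-central-1-vpc"
}
```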
In general we prefer using data sources for the resources to get the current state in the cloud, and we only recommend using remote state when there is no other option.
Using a data source instead of remote state outputs has the benefit that you can postpone the lookup to apply time and create a PR that creates dependent stacks in a single merge.
Not sure if you noticed this and whether it could help you, but `terramate create` supports defining the order of execution and a stack ID when creating a `stack {}` configuration block in `stack.tm.hcl` for a new stack. This way you can script the creation of a full subtree with various stacks and dependencies, keeping it DRY by using a script that bootstraps it. You can define the following fields:
```
--id=STRING             ID of the stack, defaults to UUID
--name=STRING           Name of the stack, defaults to stack dir base name
--description=STRING    Description of the stack, defaults to the stack name
--import=IMPORT,...     Add import block for the given path on the stack
--after=AFTER,...       Add a stack as after
--before=BEFORE,...     Add a stack as before
```
Maybe this can help you automate it at least a bit until better options are available.
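A hypothetical bootstrap sketch using only the flags listed above (the paths mirror your example repository layout):

```sh
# Create the vpc stack with its import and ordering in one call.
terramate create \
  --name=vpc \
  --after=../s3-flow-logs \
  --import=/modules/myproduct/environment/region/vpc/vpc/stack.tm.hcl \
  stacks/myproduct/environments/dev/eu-central-1/vpc/vpc
```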
> I will discuss details with the team and see if this can be achieved in the future and get back to you.
Sure, thank you.
> Using a data source instead of remote state outputs has the benefit that you can postpone the lookup to apply time and create a PR that creates dependent stacks in a single merge.
Makes sense, thanks for explaining.
> Not sure if you noticed this and whether it could help you, but `terramate create` supports defining the order of execution and a stack ID when creating a `stack {}` configuration block in `stack.tm.hcl` for a new stack. This way you can script the creation of a full subtree with various stacks and dependencies, keeping it DRY by using a script that bootstraps it. You can define the following fields:
Thanks, this is an interesting idea, because recreating this tree structure was problematic too.
We are currently planning a feature called Terramate Projects which will allow us to tackle this and some other community-requested features in a better way.
Terramate Projects will allow setting alias names for stacks in addition to the `stack.id`. While the `stack.id` will need to be unique within a repository, the `stack.alias` will be unique within a project. Multiple projects will be possible per repository. Cloning/promoting a stack into another repository will generate a new `stack.id` but keep the `alias` stable. So it will be possible to reference an alias in `before` and `after`, in addition to the already supported `tags`, to target a single stack within a project.
So the combination of `tags` and `alias` support in `before` and `after` will allow us to keep the configuration more DRY and will require fewer adjustments when promoting stacks. Projects could represent environments, but the use cases are not limited to this.