[QUESTION] How to pass output from one stack to another stack?

Open MatthiasScholzTW opened this issue 2 years ago • 1 comments

Is your question related to a problem? Please describe. One of our common use cases is to pass outputs created by one stack as input into variables of a dependent stack. E.g. creating a load balancer in one stack and using the name (or id) of the load balancer in another stack to define the target groups and listeners.

Describe the solution you'd like Being able to reference the output variable name of the providing stack for the input variable.

Describe alternatives you've considered Trying to generate the data resource using global variables and internal knowledge of the providing stack's resource naming convention - which is flaky.

Additional context Being able to separate resources in smaller parts allows us to iterate faster and test more independently.

MatthiasScholzTW avatar Aug 11 '22 14:08 MatthiasScholzTW

Hi @MatthiasScholzTW, I think this discussion is related to your question.

katcipis avatar Aug 12 '22 13:08 katcipis

Thanks for your reply.

Unfortunately from an engineering perspective all mentioned choices in the referenced answer are suboptimal.

  1. Using globals for naming ties the resource naming inside the module to the way the global variables are used - hence there is implicit knowledge to carry around. This makes the setup brittle.
  2. Using the remote state would require sharing the S3 bucket, which is a security issue and awkward because of the cross-account access complexity.

Hence we will go with terragrunt or terraspace for now.

MatthiasScholzTW avatar Aug 16 '22 15:08 MatthiasScholzTW

Hi @MatthiasScholzTW

I see you already closed this question, but for the sake of clarity I still wanted to add some points.

When using outputs to share data from stack A to stack B:

When deploying stacks that depend on each other for the very first time, the outputs of stack A won't be available yet to be consumed by stack B. In fact, Terraform would fail as no state is available yet.

If you add a condition so the second stack is not planned before the first one is applied, or manually apply the stacks in order, you end up with the same process as using -target within a single stack to keep Terraform dependencies happy, just on a higher level: you need manual steps.

We mainly focus on CI/CD execution of Terramate and thus we need to find a way where we do not depend on such outputs.

Why didn't we provide a solution for this (so far) with Terramate? Terramate is not a wrapper around Terraform but more of an orchestrator and code generator, so it is not aware of most Terraform specifics.

In fact you can set up a workflow where first all outputs are generated and can then be read in a subsequent plan, e.g. by using file functions. This adds a lot of orchestration complexity for sure.
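
As a rough, untested sketch (the file name and the lb_name output are hypothetical): a pipeline step could run terraform output -json > stack-a-outputs.json inside stack A and copy the file into stack B, which can then read it during plan:

locals {
  # read the outputs a previous workflow step wrote to disk
  stack_a_outputs = jsondecode(file("${path.module}/stack-a-outputs.json"))

  # terraform output -json wraps every output in an object with a "value" key
  lb_name = local.stack_a_outputs.lb_name.value
}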

How we recommend doing it most of the time

Luckily Terraform has all we need here, as providers let you import information via data sources. It also comes with a handy null_resource that lets you postpone the initial read of a data source (for resources that do not exist yet), so you can actually plan a stack that completely depends on not-yet-existing resources that are to be deployed in a previous stack.

How Terramate can help here is to share the actual id, name, tags, labels, etc between those stacks. So you can attach the tags you need to resources you want to load via a data source in a second stack. As the second stack knows which tags to filter for, you can easily set up the data source.

Another benefit of using data sources is that you get the latest information as deployed in the cloud and not some outdated terraform state which allows for more flexibility in some cases.

Reading the terraform state as a data source

As you already mentioned, sharing the state for outputs is not the best solution, since you need to manage access and act on "private" data.

But if you run the two stacks with the same identity, that identity needs this access anyway to produce the outputs of stack A in the first place, and also needs it to update the states.

Being able to read the state via the terraform_remote_state data source is also a reason we did not implement a special solution (so far), as we consider it a valid solution in most cases.
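
For completeness, a minimal sketch of that approach (assuming an S3 backend; the bucket, key, region, and output name are placeholders):

data "terraform_remote_state" "stack_a" {
  backend = "s3"

  config = {
    bucket = "my-terraform-states"
    key    = "my-service/stackA/terraform.tfstate"
    region = "eu-central-1"
  }
}

locals {
  # any output defined in stack A is available under .outputs
  lb_name = data.terraform_remote_state.stack_a.outputs.lb_name
}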

How could Terramate help?

But I am very happy to understand more details about your use case and why this workflow is important to you, so we can see what we can add to Terramate to support it in the best way possible.

mariux avatar Aug 17 '22 10:08 mariux

@mariux Many thanks for your elaborate answer. I am not sure if I understood all the realities you described.

Using data sources and filters is exactly the approach I am looking for. The output of one stack would be the input for the filter of the next stack. My understanding is that this information is currently not shared among stacks, and hence this is not possible with Terramate.

Would you mind clarifying: "How Terramate can help here is to share the actual id, name, tags, labels, etc between those stacks." - How could this be done?

MatthiasScholzTW avatar Aug 17 '22 13:08 MatthiasScholzTW

This depends highly on your structure of course but let me try to give at least two examples:

(We are aware that this requires some rethinking compared to how Infrastructure as Code with Terraform has been used so far, since the options Terramate provides were not really available before.)

When having 2 stacks side by side:

my-service/stackA
my-service/stackB

We could set up globals on the my-service level e.g. in my-service/shared-data.tm.hcl

Let's create a VPC in AWS in stackA and use it in stackB

globals {
  vpc_config = {
    cidr_block = "10.0.0.0/16"

    tags = {
      Name = "main"
    }
  }
}

stackA

In stackA we create the VPC (among other resources). We can either generate the actual resource (or a VPC module block), as shown next, or create a set of locals to be used by Terraform (a sketch of the locals variant follows the generated-resource example).

generate_hcl "vpc.tf" {
  content {
    resource "aws_vpc" "main" {
      cidr_block = global.vpc_config.cidr_block
      tags       = global.vpc_config.tags
    }
  }
}
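
For reference, a rough sketch of the locals variant mentioned above (the generated file name is arbitrary):

generate_hcl "vpc-config.tf" {
  content {
    locals {
      vpc_cidr_block = global.vpc_config.cidr_block
      vpc_tags       = global.vpc_config.tags
    }
  }
}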

stackB

In stackB you now have access to the exact same globals to create a data source to get the actual VPC ID.

generate_hcl "import-vpc.tf" {
  content {
    data "aws_vpc" "main" {
      filter {
        # let's keep it easy and just reference the exact tag we already know
        # we could also use tm_dynamic to dynamically generate a filter 
        # for all the tags we have defined in global.vpc_config.tags
        name   = "tag:Name"
        values = [global.vpc_config.tags.Name]
      }
    }

    # make the VPC ID available for the rest of the stack 
    # so it can be used in e.g. plain terraform code or other generated code
    locals {
      vpc_id = data.aws_vpc.main.id
    }
  }
}
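
Plain (hand-written) Terraform code in stackB can then consume that local as usual, for example (the subnet is just illustrative):

resource "aws_subnet" "private" {
  vpc_id     = local.vpc_id
  cidr_block = "10.0.1.0/24"
}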

When using stacks in stacks:

my-service/stackA
my-service/stackA/stackB

In this scenario the code looks exactly the same as above, but we can define the vpc_config global in stackA instead of in the parent directory.

In addition, we can define the block generating import-vpc.tf inside of stackA with a condition to generate it in all sub-stacks.

generate_hcl "import-vpc.tf" {
  # the `/` at the end just triggers the generate block inside of sub stacks
  condition = tm_can(tm_regex("/my-service/stackA/", terramate.stack.path.absolute))

  content {
    # same as above
  }
}

To be able to plan both stacks on first apply

Before stackA has been applied for the first time, stackB cannot be planned, as the data source in stackB will fail due to the non-existing VPC.

The following can help, but is far from being a perfect solution. As mentioned, this is a Terraform issue that is also present in Terragrunt setups, where outputs do not exist yet.

generate_hcl "import-vpc.tf" {
  content {
    resource "null_resource" "initial_deploy_trigger" {}

    data "aws_vpc" "main" {
      # untested dynamic block ;) - just to give an idea of how to filter multiple tags
      tm_dynamic "filter" {
        for_each = global.vpc_config.tags

        content {
          name   = "tag:${filter.key}"
          values = [filter.value]
        }
      }

      depends_on = [null_resource.initial_deploy_trigger]
    }
    locals {
      vpc_id = data.aws_vpc.main.id
    }
  }
}

Summary

When sharing data with Terramate Globals, the globals should be used to configure the creation of the resource, and the same globals can then also be used to configure the import of the resource via a data source.

This is different from what we are used to doing: using hard-coded values to create the resource and then exporting the resource's data to share it. This case is also supported in Terramate, but only by reading the remote state via a data source.

To make use of the inheritance and global nature of Terramate Globals, we need to start to also configure resources with dynamic values.

Roadmap

On the roadmap we have another feature which will help to improve this even further.

In the first example, with stacks side by side, the globals need to be defined in a parent folder. This is not always the preferred way for sure.

So a feature coming soon (no ETA yet, sorry ;)) is to be able to read globals from other stacks and use them in the current stack.

This will allow us to share globals independent from the hierarchy and inheritance.

Terramate is a very young tool, but we use it on a daily basis with our customers. It solves the issues we had and will (soon) solve the issues we still have, and we wanted to share it as early as possible so other people can give feedback and maybe already start using it.

Please let me know if this makes it a bit clearer what the idea behind this is and how it might be used ;)

mariux avatar Aug 18 '22 09:08 mariux


@MatthiasScholzTW Nice to meet you! BTW, we're always happy to jump on a call with you to understand your use case further and to give you some insights on the future development of Terramate!

soerenmartius avatar Aug 18 '22 10:08 soerenmartius

@mariux thanks for those details. It might help place this in context by showing how TM tackles the different elements in this common use case:

https://blog.gruntwork.io/how-to-manage-multiple-environments-with-terraform-32c7bc5d692

taqtiqa-mark avatar Aug 21 '22 20:08 taqtiqa-mark

Many thanks again for the detailed answer with the code snippets - this is very much appreciated!

@mariux We implemented the approach you are sharing, with a subtle modification, and it works. 🥳

globals {
  vpc_config = {
    cidr_block = "10.0.0.0/16"

    tags = {
      vpc_identifier = "vpc_specfic_identifier"
    }
  }
}

Using a separate and specific tag allows more control, since it is very common for modules to write a custom name while also providing support for injecting custom tags.

Which leads me to my main concern: using this approach creates an implicit dependency on the way resources are created. In the example you have full control, since the resource implementation is owned by the resource user as well.

We are making extensive use of terraform modules and your example turns for us into this:

# stack A
generate_hcl "vpc.tf" {
  content {
    module "main_vpc" {
      source = "terraform-aws-modules/vpc/aws"

      cidr = global.vpc_config.cidr_block
      tags = global.vpc_config.tags

      ...
    }
  }
}

In this example we have limited control over how the input is used within the module. We can only check the implementation and track future changes of the module, which makes this setup a bit more brittle.

Aside from this, we might end up with a couple of these custom identifiers in the global configurations.

MatthiasScholzTW avatar Aug 29 '22 07:08 MatthiasScholzTW

We are making extensive use of terraform modules

This is also our recommendation: use modules as much as possible to avoid reinventing the wheel and to keep the setup maintainable.

In this example we have limited control how the input is used within the module. We can only check the implementation and track future changes of the module. Which makes this setup a bit more brittle.

I am not sure that I understand your concerns.

Module creators as well as provider creators can introduce breaking changes with every major release (and sometimes accidentally with minor or patch releases). This is why our recommendation is to always pin the versions used for modules as well as providers, and the Terraform version.

Module inputs have type checks and validations as well, and interfaces are unlikely to change in most cases. On top of that, module creators can add custom validations for inputs.

The upcoming Terraform 1.3 release (first beta was released 3 days ago) will finally introduce "Optional attributes for object type constraints" that will improve module input validation a lot.
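
For illustration (a hypothetical, untested variable definition), a module input could then be declared like this:

variable "vpc_config" {
  type = object({
    cidr_block = string
    # optional attribute with a default, available from Terraform 1.3 on
    tags       = optional(map(string), {})
  })
}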

So this highly depends on the quality of the module and if creators follow semantic versioning to guarantee low maintenance overhead when upgrading.

Aside of this we might end up with a couple of these custom identifiers in the global configurations.

The team is currently working on improving (reducing) the pollution of the global namespace. But the final set up highly depends on scale and architecture of your repository/repositories.

Terramate aims to make IaC management at scale more maintainable. Of course this demands some rethinking and architectural consideration of how to structure stacks to match the needs of each and every project. How to structure them depends on a lot of criteria, like code ownership (when working in larger and multiple teams), needs for self-service, blast radius, deployment frequency, etc.

mariux avatar Sep 03 '22 13:09 mariux