terraform-aws-components

EKS cluster bootstrap

[Open] iliazlobin opened this issue 3 years ago · 3 comments

Describe the Bug

Bootstrapping the EKS cluster doesn't seem to have been worked out. There are a few self-references in the deployed component. But how would you deploy the component with self-references without already having one? Where are the data sources going to pull information from?

Expected Behavior

When writing an atmos stack definition for the EKS component without any code modification and launching the stack deploy command for the first time, I expect it to set up an AWS EKS cluster without errors or complaints.

Steps to Reproduce

The code is taken from the following sources:

Write and deploy the fundamental atmos stacks (everything in the diagram below except the EKS component):

(diagram: the fundamental atmos stacks, excluding the EKS component)

Define an EKS stack:

    eks:
      vars:
        aws_assume_role_arn: "arn:aws:iam::314028178888:role/atmos-gbl-test-terraform" # identity role for module
        tfstate_existing_role_arn: "arn:aws:iam::101236733906:role/atmos-gbl-root-terraform"
        cluster_kubernetes_version: "1.19"
        region_availability_zones: ["us-east-1b", "us-east-1c", "us-east-1d"]
        public_access_cidrs: ["72.107.0.0/24"]
        managed_node_groups_enabled: true

Run the following command to plan the stack:

atmos terraform plan eks -s gbl-test

It ends up triggering errors about the missing remote state of the following components (a rough sketch of the failing lookup follows the list):

  • workers_role
  • eks
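For context, this is a minimal sketch, not the component's actual code, of the kind of self-referential remote-state lookup that breaks on a first deploy. The bucket name and state key are hypothetical; the role ARN is taken from the stack vars above.

    data "terraform_remote_state" "eks" {
      backend = "s3"

      config = {
        bucket   = "acme-gbl-test-tfstate"   # hypothetical bucket name
        key      = "eks/terraform.tfstate"   # hypothetical state key
        region   = "us-east-1"
        role_arn = "arn:aws:iam::101236733906:role/atmos-gbl-root-terraform"
      }

      # Without a `defaults` argument, a reference such as
      # data.terraform_remote_state.eks.outputs.eks_cluster_id fails the plan
      # when the component has never been applied, because the output simply
      # does not exist yet.
    }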

Commenting out the relevant pieces of code and the related variables/locals seems to resolve the issue, but it adds so much overhead to bootstrapping the component. Imagine how hard it would be to replicate the deployment across dozens of other environments.

iliazlobin commented May 25 '21

@iliazlobin thanks for reporting the issue.

Can you post the specific errors so we know where/how to help?

Unfortunately, these components are being pushed up without any automated tests, unlike our other modules. So while the components work in their native environments, it is entirely plausible (likely, even) that an incompatibility was introduced, since there is no testing to catch it.

osterman commented May 27 '21

It'd be time-consuming to reproduce because I have hundreds of lines of bash commands that would have to be run in order to put the solution into a specific state. This only relates to the bootstrapping procedure, when we don't yet have an EKS cluster: the remote-state lookup doesn't find it in the S3 bucket, and that causes the error. I think that pretty much defines the concern.

Also, a related issue has been created recently.

iliazlobin commented May 27 '21

My apologies @iliazlobin. The self-referential nature of the EKS component is intentional and cannot be avoided. The problem, as discussed in #323 and #324, is that the published module uses the obsolete tfstate.tf pattern for retrieving the remote state rather than the currently maintained remote-state.tf pattern. We are working on getting these modules updated, but it is quite a large task. Please be patient.
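For anyone hitting this before the update lands, here is a rough sketch of what the currently maintained remote-state.tf pattern looks like, assuming the Cloud Posse remote-state module; the version pin and stack path are illustrative, not taken from the component.

    # remote-state.tf (currently maintained pattern): look the component up
    # through the stack configuration rather than hard-coding the S3 backend.
    module "eks" {
      source  = "cloudposse/stack-config/yaml//modules/remote-state"
      version = "0.17.0"                            # illustrative version pin

      stack_config_local_path = "../../../stacks"   # illustrative path
      component               = "eks"

      # standard Cloud Posse context from the null-label `this` module,
      # assumed to be present in the component
      context = module.this.context
    }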

To answer your main question:

"But how would you deploy the component with self-references without yet having one?"

The self-references are to "remote state", and the data source for remote state provides the ability to specify default values for when the remote state cannot be accessed. We use that to solve the bootstrap problem.
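Concretely, this is a minimal sketch of that mechanism, assuming the plain terraform_remote_state data source: the same kind of lookup as in the earlier sketch, but with the `defaults` argument filled in. The bucket, key, and output names are hypothetical.

    data "terraform_remote_state" "eks" {
      backend = "s3"

      config = {
        bucket = "acme-gbl-test-tfstate"   # hypothetical bucket name
        key    = "eks/terraform.tfstate"   # hypothetical state key
        region = "us-east-1"
      }

      # On the first (bootstrap) run, the EKS outputs do not exist yet, so
      # these placeholder values are used instead of failing the plan.
      defaults = {
        eks_cluster_id       = ""
        eks_cluster_endpoint = ""
      }
    }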

Nuru commented Jun 24 '21

We expect this has been fixed by #497, so I am closing this. If it remains a problem, please open a new issue with current details.

Nuru commented Oct 11 '22