pulumi-eks

EKS Build Fails with nodeGroupOptions Errors

Open qdzlug opened this issue 3 years ago • 4 comments

Hello!

  • Vote on this issue by adding a 👍 reaction
  • To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already)

Issue details

We have code that has worked for months but is now sporadically throwing this error: details = "Setting nodeGroupOptions, and any set of singular node group option(s) on the cluster, is mutually exclusive. Choose a single approach." We hit a number of failures on Friday; it began working again over the weekend and is now failing again.

Our code in question is here:

  • https://github.com/nginxinc/kic-reference-architectures/blob/master/pulumi/aws/eks/main.py#L61-L65
  • https://github.com/nginxinc/kic-reference-architectures/blob/master/pulumi/aws/eks/main.py#L73-L83

There is nothing being set outside of the node options that should cause this error (based on the logic in https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/cluster.ts#L361-L375).

The output looks like this:

  • https://gist.github.com/qdzlug/931c07928dca1c3ad6097d89227edc16

This started after the issues with AWS, but I don't know if it's related. This is under Pulumi v3.19.0 and all current versions of the Pulumi Python modules.

Steps to reproduce

  1. Clone the kic-reference-architecture repo, being sure to init the submodule.
  2. Configure the project; there are steps in the getting-started guide that can be followed.
  3. Run the start_all.sh script.

Expected: The EKS step should stand up an EKS cluster, and the rest of the build should continue.

Actual: Errors are thrown as shown in the above gist.

qdzlug avatar Dec 14 '21 01:12 qdzlug

One other note: this nearly always results in most of the config.yaml file being deleted for the stack. See https://github.com/nginxinc/kic-reference-architectures/issues/71 for details.

qdzlug avatar Dec 14 '21 15:12 qdzlug

I'm running into this issue as well. I'm using Go, and the code block for the EKS cluster currently looks like this:

// Create an EKS cluster
cluster, err := eks.NewCluster(ctx, "Test", &eks.ClusterArgs{
    VpcId: pulumi.String(vpcid),
    PrivateSubnetIds: pulumi.StringArray{
        pulumi.String(private[0]),
        pulumi.String(private[1]),
        pulumi.String(private[2]),
    },
    PublicSubnetIds: pulumi.StringArray{
        pulumi.String(public[0]),
        pulumi.String(public[1]),
        pulumi.String(public[2]),
    },
    ClusterSecurityGroup:  sg,
    EndpointPrivateAccess: pulumi.Bool(true),
    EndpointPublicAccess:  pulumi.Bool(false),
    NodeGroupOptions: &eks.ClusterNodeGroupOptionsArgs{
        InstanceType:                 pulumi.String("t3a.medium"),
        NodeAssociatePublicIpAddress: pulumi.Bool(false),
        ExtraNodeSecurityGroups: ec2.SecurityGroupArray{
            xtrasg,
        },
    },
})
if err != nil {
    return err
}

This results in:

Diagnostics:
  pulumi:pulumi:Stack (EKS-EKS-test):
    Error: Setting nodeGroupOptions, and any set of singular node group option(s) on the cluster, is mutually exclusive. Choose a single approach.: Error: Setting nodeGroupOptions, and any set of singular node group option(s) on the cluster, is mutually exclusive. Choose a single approach.
        at createCore (/home/ubuntu/.pulumi/plugins/resource-eks-v0.36.0/node_modules/@pulumi/cluster.ts:374:15)
        at new Cluster (/home/ubuntu/.pulumi/plugins/resource-eks-v0.36.0/node_modules/@pulumi/cluster.ts:1405:22)
        at Object.construct (/home/ubuntu/.pulumi/plugins/resource-eks-v0.36.0/node_modules/@pulumi/cmd/provider/cluster.ts:21:29)
        at Provider.construct (/home/ubuntu/.pulumi/plugins/resource-eks-v0.36.0/node_modules/@pulumi/cmd/provider/index.ts:124:24)
        at Server.<anonymous> (/home/ubuntu/.pulumi/plugins/resource-eks-v0.36.0/node_modules/@pulumi/provider/server.ts:322:48)
        at Generator.next (<anonymous>)
        at fulfilled (/home/ubuntu/.pulumi/plugins/resource-eks-v0.36.0/node_modules/@pulumi/pulumi/provider/server.js:18:58)
        at processTicksAndRejections (node:internal/process/task_queues:96:5)
 
    error: program failed: waiting for RPCs: rpc error: code = Unknown desc = Setting nodeGroupOptions, and any set of singular node group option(s) on the cluster, is mutually exclusive. Choose a single approach.
    exit status 1
 
    error: an unhandled error occurred: program exited with non-zero exit code: 1

My guess is that defaults set some node group options in the ClusterArgs, and I need to find them all and move them to the ClusterNodeGroupOptionsArgs.

As you can see from the code, setting all the options in ClusterArgs won't work for me, because I want to set ExtraNodeSecurityGroups, which doesn't appear in ClusterArgs.

I'll see what I can achieve and report back here if I succeed.

M-JobPixel avatar Jan 20 '22 18:01 M-JobPixel

I'm facing a similar issue even when I try to launch a simple EKS cluster with a NodeGroup.

cluster_name = f"{stack}-cluster"
eks_cluster = pulumi_eks.Cluster(
    cluster_name,
    name=cluster_name,
    cluster_security_group=eks_sg,
    node_group_options=pulumi_eks.ClusterNodeGroupOptionsArgs(
        node_subnet_ids=subnet_ids,
        node_associate_public_ip_address=False,
        instance_type="t3a.medium",
        node_public_key="public_key_here",
        node_root_volume_size=8,
        node_user_data="""
                #!/bin/bash
                echo "Running custom user data script"
                """,
        min_size=1,
        max_size=1,
        desired_capacity=1,
        gpu=False,
        auto_scaling_group_tags={"Name": node_group, "Stack": stack},
        node_security_group=eks_node_group_sg,
    ),
)

Traceback

      File "PATH/k8s-peks/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 605, in do_register
        resp = await asyncio.get_event_loop().run_in_executor(None, do_rpc_call)
      File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "PATH/k8s-peks/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 602, in do_rpc_call
        handle_grpc_error(exn)
      File "PATH/k8s-peks/venv/lib/python3.8/site-packages/pulumi/runtime/settings.py", line 268, in handle_grpc_error
        raise grpc_error_to_exception(exn)
    Exception: Setting nodeGroupOptions, and any set of singular node group option(s) on the cluster, is mutually exclusive. Choose a single approach.
    error: an unhandled error occurred: Program exited with non-zero exit code: 1

sushantkumar-amagi avatar Feb 12 '22 20:02 sushantkumar-amagi

I'm using Python as the description language, and there is a small error in the Cluster value initialization; see cluster.py:

            if node_root_volume_size is None:
                node_root_volume_size = 20

The easy workaround, which seems to work, is to add eks.Cluster(..., node_root_volume_size=False, ...).

tma-unwire avatar Feb 14 '22 08:02 tma-unwire

To summarise the core issue here: the schema for the SDKs specifies the default value, which then generates code that sets the default value if no value is provided. However, the provider (Node.js code) expects to receive no value and to set the default itself, if required.

These default values should be removed from the SDKs (schema) as these are defaults set by EKS itself.
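
The interaction described in this thread can be sketched as a small stand-alone simulation. This is not the pulumi-eks code: sdk_build_args and provider_validate are hypothetical stand-ins, and the sketch assumes the provider tests the singular options by truthiness. It only illustrates why a client-side default of 20 can trip the mutual-exclusion check, and why passing False might sidestep it.

```python
from typing import Any, Dict, Optional

MUTEX_MESSAGE = (
    "Setting nodeGroupOptions, and any set of singular node group option(s) "
    "on the cluster, is mutually exclusive. Choose a single approach."
)


def sdk_build_args(
    node_group_options: Optional[Dict[str, Any]] = None,
    node_root_volume_size: Any = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the generated SDK constructor."""
    # Mirrors the cluster.py snippet quoted above: the schema default is
    # applied on the client when the caller passes nothing.
    if node_root_volume_size is None:
        node_root_volume_size = 20
    return {
        "nodeGroupOptions": node_group_options,
        "nodeRootVolumeSize": node_root_volume_size,
    }


def provider_validate(args: Dict[str, Any]) -> None:
    """Hypothetical stand-in for the provider-side check.

    Assumption: the singular option is tested by truthiness, so a falsy
    value such as False counts as "not set".
    """
    if args.get("nodeGroupOptions") and args.get("nodeRootVolumeSize"):
        raise ValueError(MUTEX_MESSAGE)


def try_create(**kwargs: Any) -> str:
    """Run the simulated SDK and provider steps and report the outcome."""
    try:
        provider_validate(sdk_build_args(**kwargs))
        return "ok"
    except ValueError:
        return "mutually exclusive"


if __name__ == "__main__":
    options = {"instanceType": "t3a.medium"}
    # The caller sets only node_group_options, yet the injected default collides.
    print(try_create(node_group_options=options))
    # The workaround from this thread: a falsy value skips the default.
    print(try_create(node_group_options=options, node_root_volume_size=False))
```

Under these assumptions, the first call fails even though the caller never set a singular option, which matches the reports above. Removing the default from the schema would make the None case pass without needing the False workaround.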

danielrbradley avatar Nov 15 '22 11:11 danielrbradley