FailedVariableLookup: Couldn't resolve lookups in variable `Cluster`

Open jamietsao opened this issue 7 years ago • 8 comments

We recently tried upgrading stacker and got an error:

[2018-05-15T15:28:22] ERROR production-bot-refresh-service stacker.plan:92(_run_once): Got unexpected keyword argument 'retries'
Traceback (most recent call last):
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/plan.py", line 88, in _run_once
    status = self.fn(self.stack, status=self.status)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/actions/diff.py", line 216, in _diff_stack
    provider_stack = self.provider.get_stack(stack.fqn)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 489, in get_stack
    return self.cloudformation.describe_stacks(
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/providers/aws/default.py", line 478, in cloudformation
    max_attempts=MAX_ATTEMPTS
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/botocore/config.py", line 94, in __init__
    args, kwargs)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/botocore/config.py", line 119, in _record_user_provided_options
    'Got unexpected keyword argument \'%s\'' % key)
TypeError: Got unexpected keyword argument 'retries'
[2018-05-15T15:28:22] DEBUG production-bot-refresh-service stacker.plan:148(set_status): Setting bot-refresh-service state to failed.
[2018-05-15T15:28:22] INFO production-bot-refresh-service stacker.ui:36(info): production-bot-refresh-service: failed (Got unexpected keyword argument 'retries')
[2018-05-15T15:28:22] ERROR MainThread stacker.actions.base:198(execute): The following stacks failed: production-bot-refresh-service

We were on version 1.0.4 previously and tried upgrading to 1.2.* (I initially tried going to 1.3.* but got the same error and narrowed it down to the 1.2.* upgrade).

I haven't changed anything else (e.g. blueprints, stack config, etc.) so it's very possible that it's a backwards compatibility issue. It's just hard to debug since the error is not very helpful.

Any insight would be appreciated. I'd like to upgrade to the latest stacker (and also troposphere) so that we can add support for some new AWS properties.

jamietsao avatar May 15 '18 22:05 jamietsao

@jamietsao by the looks of it, stacker is somehow picking up an old version of botocore. Can you post which versions of boto3 and botocore are being used? This error shouldn't show up if you're using botocore>=1.6.0.
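
A minimal sketch of one way to check which versions the interpreter actually imports. The helper names `parse_version` and `meets_minimum` are made up for illustration and are not part of stacker or boto3. The 1.6.0 threshold is the one mentioned above.

```python
def parse_version(version):
    """Turn a string such as '1.10.4' into a tuple of ints, (1, 10, 4).

    Parsing stops at the first piece that does not start with a digit,
    so suffixes such as 'rc1' are ignored rather than raising.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum="1.6.0"):
    """Return True if `version` is at least `minimum` (hypothetical helper)."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import boto3
        import botocore
    except ImportError as exc:
        print("boto3/botocore not importable from this interpreter: %s" % exc)
    else:
        # __file__ shows which site-packages directory the module came from,
        # which helps when several Python environments are installed.
        print("boto3    %s" % boto3.__version__)
        print("botocore %s (%s)" % (botocore.__version__, botocore.__file__))
        print("botocore new enough: %s" % meets_minimum(botocore.__version__))
```

Running this with the same interpreter that runs stacker (here, the pyenv 2.7.11 one from the traceback) matters, since a different environment may report a different version.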

ejholmes avatar May 15 '18 23:05 ejholmes

@ejholmes - Thank you. That worked but now I'm getting a new error:

[2018-05-15T17:30:45] ERROR bot-service stacker.plan:91(_run_once): Couldn't resolve lookups in variable `Cluster`. 'NoneType' object has no attribute '__getitem__'
Traceback (most recent call last):
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/plan.py", line 89, in _run_once
    status = self.fn(self.stack, status=self.status)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/actions/diff.py", line 228, in _diff_stack
    stack.resolve(self.context, provider)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/stack.py", line 186, in resolve
    resolve_variables(self.variables, context, provider)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/variables.py", line 77, in resolve_variables
    variable.resolve(context, provider)
  File "/Users/jamie/.pyenv/versions/2.7.11/lib/python2.7/site-packages/stacker/variables.py", line 146, in resolve
    raise FailedVariableLookup(self.name, e)
FailedVariableLookup: Couldn't resolve lookups in variable `Cluster`. 'NoneType' object has no attribute '__getitem__'
[2018-05-15T17:30:45] DEBUG bot-service stacker.plan:143(set_status): Setting bot-service state to failed.
[2018-05-15T17:30:45] INFO bot-service stacker.ui:37(info): bot-service: failed (Couldn't resolve lookups in variable `Cluster`. 'NoneType' object has no attribute '__getitem__')
[2018-05-15T17:30:45] ERROR MainThread stacker.actions.base:188(execute): The following steps failed: bot-service

This looks specific to stacker itself. Is my config just not compatible with the latest version of stacker?

BTW, this is running the diff command:

> stacker diff -i -v -r us-west-1 --stacks bot-service --stacks internal-cluster --force bot-service --stacks internal-cluster-alb conf/production.env conf/production.yml

jamietsao avatar May 16 '18 00:05 jamietsao

Interestingly, if I run the build command for the exact same stack, it runs fine and completes with no change (expected because I didn't modify the stack):

> stacker build -i -v -r us-west-1 --stacks bot-service --stacks internal-cluster --force bot-service --stacks internal-cluster-alb conf/production.env conf/production.yml

....

[2018-05-15T17:38:10] DEBUG bot-service stacker.plan:143(set_status): Setting bot-service state to skipped.
[2018-05-15T17:38:10] INFO bot-service stacker.ui:37(info): bot-service: skipped (nochange)

jamietsao avatar May 16 '18 00:05 jamietsao

I think the diff exception is probably a legitimate bug, but glad to know that upgrading boto3 fixed the original problem.
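
For context on the message itself, here is a simplified sketch of how this kind of error arises. It is not stacker's actual code: `lookup_output` and `resolve` are hypothetical, and the assumption that some value is `None` where a dict was expected is only a guess based on the error text. The one part taken from the traceback is that the original exception gets wrapped in `FailedVariableLookup(name, e)`.

```python
class FailedVariableLookup(Exception):
    """Simplified stand-in for stacker's exception of the same name."""

    def __init__(self, variable_name, error):
        self.variable_name = variable_name
        self.error = error
        message = "Couldn't resolve lookups in variable `%s`. %s" % (
            variable_name, error)
        super(FailedVariableLookup, self).__init__(message)


def lookup_output(outputs, output_name):
    # Hypothetical helper: subscripting None raises TypeError.
    # Python 2 words it "'NoneType' object has no attribute '__getitem__'";
    # Python 3 uses a different message for the same mistake.
    return outputs[output_name]


def resolve(variable_name, outputs, output_name):
    """Resolve one variable, wrapping any failure as in the traceback."""
    try:
        return lookup_output(outputs, output_name)
    except Exception as e:
        raise FailedVariableLookup(variable_name, e)


if __name__ == "__main__":
    # Works when outputs are available.
    print(resolve("Cluster", {"ECSCluster": "example-cluster"}, "ECSCluster"))
    # Fails with the wrapped error when outputs are None (assumed scenario).
    try:
        resolve("Cluster", None, "ECSCluster")
    except FailedVariableLookup as err:
        print(err)
```

The wrapping explains why the log names the variable (`Cluster`) but not the code path that produced the `None`; finding that path requires the inner traceback.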

ejholmes avatar May 17 '18 00:05 ejholmes

@ejholmes - I'll update the title of this issue to reflect the current problem. Thanks!

jamietsao avatar May 17 '18 06:05 jamietsao

@jamietsao any chance you could share an example config for this error? I'm curious what lookups you're using in your variables, as that might have something to do with it. Thanks!

phobologic avatar Jul 01 '18 16:07 phobologic

Ping @jamietsao

phobologic avatar Jul 08 '18 00:07 phobologic

@phobologic - Apologies for the delay. Here are the relevant snippets from our config:

common_vars: &common_vars
  VPC: vpc-7703cf13
  VpcCidrBlock: 10.0.0.0/16
  ThreatStackEnabled: true

internal_cluster_params: &internal_cluster_params
  <<: *common_vars
  Subnets: subnet-5e54f806,subnet-7819861c

internal_service_params: &internal_service_params
  VPC: vpc-7703cf13
  Cluster: ${output internal-cluster::ECSCluster}
  TaskRole: ecsTaskRole

stacks:
  - name: internal-cluster
    locked: true
    class_path: cf_templates.ecs_cluster.ECSCluster
    variables:
      <<: *internal_cluster_params
      ClusterName: ${namespace}-internal-cluster
      SpotMinSize: '0'
      SpotMaxSize: '0'
      OnDemandMinSize: '6'
      InstanceType: c5.2xlarge
      SpotPrice: '0.25'

  - name: bot-service
    locked: true
    class_path: cf_templates.ecs_service.ECSService
    variables:
      <<: *internal_service_params
      Command: './bot'
      DesiredCount: '1'
      Cpu: '256'
      Memory: '128'
      MemoryReservation: ''
      Image: 'quay.io/gametime/bot:master'
      ContainerEnvironment:
        ENVIRONMENT: ${namespace}
        SERVICE: bot
      Role: ''
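
To see which lookups a config like this uses, a rough sketch along these lines can list them. The regular expression and both function names are assumptions written for this example, not stacker's own parser. It only recognizes the `${type input}` form seen above and deliberately skips plain references such as `${namespace}`.

```python
import re

# Matches "${<type> <input>}", e.g. "${output internal-cluster::ECSCluster}".
# A plain "${namespace}" has no space-separated input, so it is not matched.
LOOKUP_PATTERN = re.compile(r"\$\{(?P<type>[\w-]+)\s+(?P<input>[^}]+)\}")


def find_lookups(text):
    """Return (lookup_type, lookup_input) pairs found in the config text."""
    return [
        (match.group("type"), match.group("input").strip())
        for match in LOOKUP_PATTERN.finditer(text)
    ]


def split_output_lookup(lookup_input):
    """Split 'stack::OutputName' into (stack, output_name)."""
    stack_name, _, output_name = lookup_input.partition("::")
    return stack_name, output_name


if __name__ == "__main__":
    sample = (
        "Cluster: ${output internal-cluster::ECSCluster}\n"
        "ClusterName: ${namespace}-internal-cluster\n"
    )
    for lookup_type, lookup_input in find_lookups(sample):
        print(lookup_type, split_output_lookup(lookup_input))
```

In the snippets above, the only lookup is the `output` lookup on `Cluster`, which reads the `ECSCluster` output of the `internal-cluster` stack.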

Let me know if there's anything else you need. Thanks!

jamietsao avatar Jul 17 '18 21:07 jamietsao