All tasks in a job receive the same Consul task identity token

Open t-davies opened this issue 10 months ago • 0 comments

Nomad version

1.7.5+ent

Operating system and Environment details

Amazon Linux 2023 on EC2, amd64

Issue

See also, support case: 146116

When running a job with multiple tasks that use Consul tokens from workload identity, Nomad requests a separate Consul token for each task, but the token retrieved for the final task in the job is provided to all tasks in the job rather than each task receiving its own.

I've had a quick look and think it may be related to this.
https://github.com/hashicorp/nomad/blob/1e500907767acf325390f15cdb8f452ca22d0210/client/allocrunner/consul_hook.go#L160

My interpretation is that clusterName will always be "default" (or whatever the cluster name is set to) and widName will always be "consul_{clusterName}"? Therefore we're just overwriting the token each time? Happy to be mistaken though!
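
To make the suspected overwrite concrete, here is a minimal Go sketch of the pattern described above. It is not Nomad's actual code: `tokenStore`, `storeTokens`, and the `token-for-...` values are hypothetical stand-ins, and the only assumption carried over from the linked line is that tokens are keyed by cluster name and identity name with no task component.

```go
package main

import "fmt"

// tokenStore is a hypothetical stand-in for a token map keyed by
// cluster name, then by workload identity name. It is not Nomad's type.
type tokenStore map[string]map[string]string

// set stores a token under (cluster, widName), replacing any existing entry.
func (s tokenStore) set(cluster, widName, token string) {
	if s[cluster] == nil {
		s[cluster] = map[string]string{}
	}
	s[cluster][widName] = token
}

// storeTokens simulates deriving one token per task and saving each one
// under the same (cluster, widName) key, as hypothesised in this issue.
func storeTokens(tasks []string, cluster string) tokenStore {
	store := tokenStore{}
	widName := "consul_" + cluster // identical for every task
	for _, task := range tasks {
		store.set(cluster, widName, "token-for-"+task)
	}
	return store
}

func main() {
	store := storeTokens([]string{"controller", "watch-configuration"}, "default")
	// Only one entry survives, and it belongs to the last task processed.
	fmt.Println(len(store["default"]), store["default"]["consul_default"])
	// prints: 1 token-for-watch-configuration
}
```

In this sketch, adding the task name to the key would keep the entries apart. Whether that matches how Nomad should fix it is for the maintainers to confirm.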

Reproduction steps

  1. Set up Nomad cluster and Consul cluster, enable workload identity integration.
  2. Configure Consul auth method binding rules as below.
    {
        "Description": "auth/nomad-workloads (task, task-level)",
        "AuthMethod": "jwt-nomad",
        "Selector": "\"nomad_service\" not in value",
        "BindType": "role",
        "BindName": "nomad-${value.nomad_job_id}-${value.nomad_task}",
        "Partition": "default",
        "Namespace": "default"
    },
    {
        "Description": "auth/nomad-workloads (task, job-level)",
        "AuthMethod": "jwt-nomad",
        "Selector": "\"nomad_service\" not in value",
        "BindType": "role",
        "BindName": "nomad-${value.nomad_job_id}",
        "Partition": "default",
        "Namespace": "default"
    },
    {
        "Description": "auth/nomad-workloads (service identity)",
        "AuthMethod": "jwt-nomad",
        "Selector": "\"nomad_service\" in value and value.nomad_service!=\"ext-terminating-gateway\"",
        "BindType": "service",
        "BindName": "${value.nomad_service}",
        "Partition": "default",
        "Namespace": "default"
    }
  3. Deploy a job with multiple tasks that use workload identity for Consul tokens, e.g.
job "traefikee" {
  name        = "traefikee"
  namespace   = "default"
  region      = "xxx"
  datacenters = ["*"]
  type        = "service"

  group "controllers" {
    count = 1

    task "controller" {
      identity {
        name = "consul_default"
        aud  = ["consul.xxx.xxx"]
        ttl  = "1h"
      }

      driver = "docker"

      config {
        [...]
      }

      template {
        destination = "secrets/traefikee.env"
        env         = true

        data = <<EOF
CONSUL_HTTP_TOKEN={{ env "CONSUL_TOKEN" }}
EOF
      }
    }

    task "setup-cluster" {
      [...no Consul identity used]
    }

    task "sync-join-tokens" {
      [...no Consul identity used]
    }

    task "watch-configuration" {
      lifecycle {
        hook    = "poststart"
        sidecar = false
      }

      identity {
        name = "consul_default"
        aud  = ["consul.xxx.xxx"]
        ttl  = "1h"
      }
      
      driver = "docker"

      config {
        image      = "traefik/traefikee:v2.11.0-ubi"
        entrypoint = ["/bin/bash", "${NOMAD_TASK_DIR}/watch_configuration.sh"]
      }

      template {
        change_mode = "noop"
        source      = "${NOMAD_TASK_DIR}/static.toml.ctmpl"
        destination = "${NOMAD_TASK_DIR}/config/static.toml"
      }
      
      template {
        change_mode = "noop"
        source      = "${NOMAD_TASK_DIR}/dynamic.toml.ctmpl"
        destination = "${NOMAD_TASK_DIR}/config/dynamic.toml"
      }
    }
  }

  group "proxies" {
    count = 3
   
    [...]
  }
}
  4. Log into Consul and observe that Nomad has correctly requested tokens for each of the tasks and that they have been issued with the correct roles and policies attached. [image]

  5. Exec into each of the tasks running in the job and observe that they are all using the same token, seemingly the token of the final task in the job. [image]
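
For reference, the task-level binding rule in step 2 should resolve to a different role name for each task, which is why distinct tokens are expected. The Go sketch below renders that `BindName` template for the two tasks that declare a Consul identity. `renderBindName` is a hypothetical helper, and `os.Expand` is only a stand-in for Consul's own interpolation; the claim values are taken from the job above.

```go
package main

import (
	"fmt"
	"os"
)

// renderBindName expands ${...} placeholders in a BindName template using
// the supplied claims. Hypothetical helper: os.Expand merely approximates
// Consul's interpolation for this simple case.
func renderBindName(tmpl string, claims map[string]string) string {
	return os.Expand(tmpl, func(key string) string { return claims[key] })
}

func main() {
	const tmpl = "nomad-${value.nomad_job_id}-${value.nomad_task}"
	for _, task := range []string{"controller", "watch-configuration"} {
		claims := map[string]string{
			"value.nomad_job_id": "traefikee",
			"value.nomad_task":   task,
		}
		fmt.Println(renderBindName(tmpl, claims))
	}
	// prints:
	// nomad-traefikee-controller
	// nomad-traefikee-watch-configuration
}
```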

Expected Result

Each task in the job receives its own Consul token.

Actual Result

Each task in the job receives the same Consul token, that of the final task in the job.

Job file (if appropriate)

Full job definition in support case.

t-davies, Apr 12 '24 12:04