Allocation Addresses are derived from resources

Open Smithx10 opened this issue 5 years ago • 5 comments

Nomad version

Nomad v0.9.1 (4b2bdbd9ab68a27b10c2ee781cceaaf62e114399)

Operating system and Environment details

Linux and Triton Plugin

Issue

While writing a plugin, I noticed that even though I return a *drivers.DriverNetwork from StartTask, nomad status shows an empty value for Addresses.

arch@archlinux ~/g/s/g/S/nomad-driver-triton ❯❯❯ nomad status cb0d2442
ID                  = cb0d2442
Eval ID             = 3bfa9835
Name                = redis.redis[0]
Node ID             = 10a275c4
Job ID              = redis
Job Version         = 10
Client Status       = pending
Client Description  = No tasks have started
Desired Status      = run
Desired Description = <none>
Created             = 12s ago
Modified            = 12s ago
Deployment ID       = 25ed363a
Deployment Health   = unset

Task "redis" is "pending"
Task Resources
CPU     Memory  Disk     Addresses
20 MHz  10 MiB  300 MiB

Task Events:
Started At     = N/A
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type        Description
2019-06-20T12:45:03-04:00  Task Setup  Building Task Directory
2019-06-20T12:45:03-04:00  Received    Task received by client

I believe this is because the address and port that are stored in the Allocation are derived from the Resources.NetworkResource type. My Driver registers the proper address and port into Consul via the following Service{} stanza.

          name         = "${TASKGROUP}-redis"
          tags         = ["global", "cache"]
          port         = 6379
          address_mode = "driver"

          check {
            name         = "alive"
            type         = "tcp"
            interval     = "10s"
            timeout      = "2s"
            address_mode = "driver"

            check_restart {
              limit           = 3
              grace           = "90s"
              ignore_warnings = false
            }
          }

But this requires nothing to be entered in Resources.Networks{}, which I believe is why the value is empty.

Job file (if appropriate)

  job "redis" {
    datacenters = ["dc1"]
    type        = "service"

    update {
      canary            = 1
      max_parallel      = 1
      healthy_deadline  = "8m"
      progress_deadline = "10m"
    }

    group "redis" {
      count = 1

      task "redis" {
        driver = "triton"

        resources {
          cpu    = 20
          memory = 10
        }

        service {
          name         = "${TASKGROUP}-redis"
          tags         = ["global", "cache"]
          port         = 6379
          address_mode = "driver"

          check {
            name         = "alive"
            type         = "tcp"
            interval     = "10s"
            timeout      = "2s"
            address_mode = "driver"

            check_restart {
              limit           = 3
              grace           = "90s"
              ignore_warnings = false
            }
          }
        }

        config {
          api_type = "docker_api"

          docker_api {
            public_network = "sdc_nat"

            private_network = "My-Fabric-Network"

            labels {
              group         = "webservice-cache"
              bob.bill.john = "label"
              test          = "test"
            }

            ports {
              tcp = [
                6379,
              ]
            }

            image {
              name      = "redis"
              tag       = "latest"
              auto_pull = true
            }
          }

          package {
            name = "sample-512M"
          }

          fwenabled = false

          cns = [
            "redis",
          ]

          tags = {
            redis = "true"
          }

          fwrules {
            anytoredis = "FROM any TO tag redis ALLOW tcp PORT 6379"
            redistcp   = "FROM tag redis TO tag redis ALLOW tcp PORT all"
            redisudp   = "FROM tag redis TO tag redis ALLOW udp PORT all"
          }
        }

        env {
          envtest = "test"
        }

        meta {
          my-key = "my-value"
        }
      }
    }
  }

Smithx10 avatar Jun 20 '19 16:06 Smithx10

@Smithx10 The CLI output reads from that field, as you mention above. The field is populated by the scheduler during a network assignment step, which only happens when the task has a network stanza inside its resources stanza.

Your job spec uses the driver address mode, so the address is not allocated by the scheduler and hence Resources.Networks is not populated. The address is known to the client running the alloc and is persisted to its local state, but it is not known to the server/scheduler.
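
For comparison, here is a minimal sketch of the kind of task fragment the explanation above refers to: a network stanza inside resources, which asks the scheduler to assign an address and port. The port label "db" and the service name are illustrative assumptions, and whether a given third-party driver (such as the Triton plugin) honors scheduler-assigned ports is not confirmed in this thread.

```hcl
# Fragment of a task stanza (Nomad 0.9-style syntax).
resources {
  cpu    = 20
  memory = 10

  # With a network stanza present, the scheduler performs network
  # assignment and records the result in the allocation's resources.
  network {
    mbits = 10

    # "db" is a hypothetical port label; an empty block requests a dynamic port.
    port "db" {}
  }
}

service {
  name = "redis"

  # Refers to the port label above instead of a literal port number,
  # and leaves address_mode at its default rather than "driver".
  port = "db"
}
```

With address_mode = "driver" and a numeric port, as in the job file from the original report, no such stanza is needed, which is why the scheduler has nothing to record.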

I see how the UX can be confusing though, will discuss some options internally to see what we can do to make this better.

preetapan avatar Jun 21 '19 02:06 preetapan

@preetapan I think that issue https://github.com/hashicorp/nomad/issues/5863 can perhaps help solve the UX. If a TaskDriver author could store a map on the allocation struct, a user could see the info returned during "nomad status ". I thought that was what DriverAttributes were for, but I didn't see anywhere for a user to query them. https://github.com/hashicorp/nomad/blob/master/drivers/docker/driver.go#L1173

Smithx10 avatar Jun 21 '19 02:06 Smithx10

@Smithx10 Discussed with the team internally, and we see parallels with upcoming work on Consul Connect integration that could also use this feature (i.e., exposing driver-specific address details via the API). We will address this when we make those changes for Consul Connect integration.

preetapan avatar Jun 24 '19 19:06 preetapan

@preetapan I'd hope that these modifications to the API don't have any dependencies on a service mesh and are just being grouped into a bigger pile of work. Currently the task driver can already advertise the proper address in Consul.

Glad to see the allocation API get some attention :)

Smithx10 avatar Jun 24 '19 23:06 Smithx10

It's the same with the docker driver if someone uses a network_mode other than bridge (in my case, macvlan). The UI and the CLI don't know anything about the address in use, but the service is registered in Consul with the right address. So Nomad basically knows the address used inside the container but doesn't display it.
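
A setup of the kind described above might look roughly like the following sketch. The network name "my-macvlan" and the service name are hypothetical; the point is only that the address comes from the driver rather than from a network stanza in resources.

```hcl
task "redis" {
  driver = "docker"

  config {
    image = "redis:latest"

    # Hypothetical user-defined Docker network (e.g. created with the
    # macvlan driver); the container gets its IP from this network.
    network_mode = "my-macvlan"
  }

  service {
    name = "redis"

    # Numeric port plus driver address mode: the address registered in
    # Consul is the one reported by the driver, not one the scheduler assigned.
    port         = 6379
    address_mode = "driver"
  }
}
```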

@preetapan is this something which is addressed with 0.10.x?

MorphBonehunter avatar Sep 24 '19 16:09 MorphBonehunter