Allocation Addresses are derived from resources
Nomad version
Nomad v0.9.1 (4b2bdbd9ab68a27b10c2ee781cceaaf62e114399)
Operating system and Environment details
Linux and Triton Plugin
Issue
While writing a plugin, I noticed that even though I return a *drivers.DriverNetwork in StartTask, the Addresses column in the nomad status output is still empty:
$ nomad status cb0d2442
ID = cb0d2442
Eval ID = 3bfa9835
Name = redis.redis[0]
Node ID = 10a275c4
Job ID = redis
Job Version = 10
Client Status = pending
Client Description = No tasks have started
Desired Status = run
Desired Description = <none>
Created = 12s ago
Modified = 12s ago
Deployment ID = 25ed363a
Deployment Health = unset
Task "redis" is "pending"
Task Resources
CPU Memory Disk Addresses
20 MHz 10 MiB 300 MiB
Task Events:
Started At = N/A
Finished At = N/A
Total Restarts = 0
Last Restart = N/A
Recent Events:
Time Type Description
2019-06-20T12:45:03-04:00 Task Setup Building Task Directory
2019-06-20T12:45:03-04:00 Received Task received by client
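For context, the shape of what the driver returns can be sketched as follows. This is a minimal, self-contained stand-in: the `DriverNetwork` struct here only mirrors the field names of Nomad's `drivers.DriverNetwork`, and the IP and port values are hypothetical.

```go
package main

import "fmt"

// DriverNetwork is a local stand-in mirroring the shape of Nomad's
// drivers.DriverNetwork struct, so this sketch runs without the
// Nomad plugin dependency.
type DriverNetwork struct {
	IP            string
	PortMap       map[string]int
	AutoAdvertise bool
}

// startTask sketches the network half of a driver's StartTask return
// value (the task handle is elided). The address here is a hypothetical
// one assigned by the remote Triton instance, not by the scheduler.
func startTask() *DriverNetwork {
	return &DriverNetwork{
		IP:            "10.0.0.42", // hypothetical driver-assigned address
		PortMap:       map[string]int{"redis": 6379},
		AutoAdvertise: true,
	}
}

func main() {
	net := startTask()
	fmt.Printf("driver network: %s port=%d\n", net.IP, net.PortMap["redis"])
}
```

The point of the issue is that this driver-known address never makes it into the Allocation's resources, so the CLI has nothing to display.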
I believe this is because the address and port stored in the Allocation are derived from the Resources.NetworkResource type. My driver registers the proper address and port in Consul via the following service stanza:
service {
  name         = "${TASKGROUP}-redis"
  tags         = ["global", "cache"]
  port         = 6379
  address_mode = "driver"

  check {
    name         = "alive"
    type         = "tcp"
    interval     = "10s"
    timeout      = "2s"
    address_mode = "driver"

    check_restart {
      limit           = 3
      grace           = "90s"
      ignore_warnings = false
    }
  }
}
But it requires nothing to be entered into Resources.Networks, which I believe results in an empty value there.
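To illustrate why the Addresses column comes out empty, here is a self-contained sketch of the derivation. The `Network` and `Port` types are local approximations of Nomad's `structs.NetworkResource`, and `addresses` reproduces only the gist of how the status output builds its "label: ip:port" entries; it is not the actual CLI code.

```go
package main

import "fmt"

// Port approximates a reserved/dynamic port entry on a network resource.
type Port struct {
	Label string
	Value int
}

// Network approximates Nomad's structs.NetworkResource for this sketch.
type Network struct {
	IP            string
	ReservedPorts []Port
}

// addresses builds one "label: ip:port" string per port on each network,
// roughly what the Addresses column shows. With no networks (the
// address_mode = "driver" case), the result is empty.
func addresses(networks []Network) []string {
	var out []string
	for _, n := range networks {
		for _, p := range n.ReservedPorts {
			out = append(out, fmt.Sprintf("%s: %s:%d", p.Label, n.IP, p.Value))
		}
	}
	return out
}

func main() {
	// Scheduler-assigned network: the Addresses column gets populated.
	withNet := []Network{{IP: "192.168.1.10", ReservedPorts: []Port{{"db", 6379}}}}
	fmt.Println(addresses(withNet)) // [db: 192.168.1.10:6379]

	// No network stanza, address known only to the driver: column is empty.
	fmt.Println(addresses(nil)) // []
}
```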
Job file (if appropriate)
job "redis" {
  datacenters = ["dc1"]
  type        = "service"

  update {
    canary            = 1
    max_parallel      = 1
    healthy_deadline  = "8m"
    progress_deadline = "10m"
  }

  group "redis" {
    count = 1

    task "redis" {
      driver = "triton"

      resources {
        cpu    = 20
        memory = 10
      }

      service {
        name         = "${TASKGROUP}-redis"
        tags         = ["global", "cache"]
        port         = 6379
        address_mode = "driver"

        check {
          name         = "alive"
          type         = "tcp"
          interval     = "10s"
          timeout      = "2s"
          address_mode = "driver"

          check_restart {
            limit           = 3
            grace           = "90s"
            ignore_warnings = false
          }
        }
      }

      config {
        api_type = "docker_api"

        docker_api {
          public_network  = "sdc_nat"
          private_network = "My-Fabric-Network"

          labels {
            group         = "webservice-cache"
            bob.bill.john = "label"
            test          = "test"
          }

          ports {
            tcp = [
              6379,
            ]
          }

          image {
            name      = "redis"
            tag       = "latest"
            auto_pull = true
          }
        }

        package {
          name = "sample-512M"
        }

        fwenabled = false

        cns = [
          "redis",
        ]

        tags = {
          redis = "true"
        }

        fwrules {
          anytoredis = "FROM any TO tag redis ALLOW tcp PORT 6379"
          redistcp   = "FROM tag redis TO tag redis ALLOW tcp PORT all"
          redisudp   = "FROM tag redis TO tag redis ALLOW udp PORT all"
        }
      }

      env {
        envtest = "test"
      }

      meta {
        my-key = "my-value"
      }
    }
  }
}
@Smithx10 The CLI output reads from that field, as you mention above. It is populated by the scheduler during a network assignment step, which happens when the task has a network stanza inside its resources stanza.
Your job spec is using the driver address mode so the address is not allocated by the scheduler and hence Resource.Networks is not populated. The address is known to the client running the alloc and is persisted to its local state, but not known to the server/scheduler.
I see how the UX can be confusing though, will discuss some options internally to see what we can do to make this better.
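For reference, a minimal sketch of the alternative the maintainer describes: with the pre-0.10 syntax, placing a network stanza inside resources makes the scheduler allocate an address, which then appears in nomad status (at the cost of no longer using the driver-assigned address). The port label "db" is an arbitrary example.

```hcl
resources {
  cpu    = 20
  memory = 10

  # A network stanza here causes the scheduler to populate
  # Resources.Networks, so the Addresses column is filled in.
  network {
    port "db" {
      static = 6379
    }
  }
}
```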
@preetapan I think that issue https://github.com/hashicorp/nomad/issues/5863 can perhaps help solve the UX. If a task driver author could store a map on the Allocation struct, a user could see the info returned during "nomad status".
@Smithx10 Discussed with the team internally, and we see parallels with upcoming work on the Consul Connect integration that could also use this feature (i.e., exposing driver-specific address details via the API). We will address this when we make those changes for the Consul Connect integration.
@preetapan I'd hope that these modifications to the API don't have any dependencies on a service mesh and are just being grouped into a bigger pile of work. Currently, the task driver can advertise the proper address in Consul.
Glad to see the allocation API get some attention :)
It's the same with the Docker driver when someone uses a network_mode other than bridge (in my case, a macvlan). Neither the UI nor the CLI knows anything about the address in use, yet the service is registered in Consul with the right address. So Nomad basically knows the address used inside the container but doesn't display it.
@preetapan is this something that is addressed in 0.10.x?