nomad
Tasks in allocation fail to start with last event "no message" when using bridge-mode and 'hostname = "${node.unique.name}"'
Nomad version
1.8.1
Operating system and Environment details
Ubuntu 20.04
Issue
Tasks in allocation fail to start with last event "no message" when using bridge-mode and 'hostname = "${node.unique.name}"'
Reproduction steps
group "test" {
  network {
    mode     = "bridge"
    hostname = "${node.unique.name}"
  }
}
Expected Result
Tasks in the allocation start and the deployment succeeds.
Actual Result
Tasks fail to start with "no message", and the deployment gets stuck in a restart loop.
Job file (if appropriate)
group "test" {
  network {
    mode     = "bridge"
    hostname = "${node.unique.name}"
  }
}
Hi @SCORP111 and thanks for raising this issue. Could you provide more information about the job, such as the driver it is using? The Nomad client logs would also be useful here. Thanks.
Hi @jrasell, thanks for the quick response!
The job consists of 3 tasks and is using the following drivers:
- prestart-task-1: raw_exec
- prestart-task-2: docker
- service: docker
When I try to get the logs of the tasks, I get the messages below and cannot pull the logs. The web UI doesn't show any logs either.
me@NOMADS-001:~$ nomad logs -namespace=* 55122a21
Failed to validate task: Allocation "55122a21" is running the following tasks:
* generate-config
* nginx-proxy
* service
Please specify the task.
me@NOMADS-001:~$ nomad logs -namespace=* -task service 55122a21
Failed to read stdout file: error reading file: Unexpected response code: 404 (Unknown allocation "55122a21-6336-a1e6-7d4c-da4832732809")
me@NOMADS-001:~$ nomad logs -namespace=* -task generate-config 55122a21
Failed to read stdout file: error reading file: Unexpected response code: 404 (Unknown allocation "55122a21-6336-a1e6-7d4c-da4832732809")
me@NOMADS-001:~$ nomad logs -namespace=* -task nginx-proxy 55122a21
Failed to read stdout file: error reading file: Unexpected response code: 404 (Unknown allocation "55122a21-6336-a1e6-7d4c-da4832732809")
Hey @SCORP111, can you provide some more details? Does the job fail to start? Can you show the output of nomad job status $jobname? Perhaps nomad alloc status $allocID? nomad eval list? It's kinda hard to understand what's happening.
Hey,
Sorry for being a little unclear in my description. I think I narrowed down the problem: it seems to be the combination of using the raw_exec driver and setting hostname in the group-level network block.
This is working fine:
job "test" {
  datacenters = ["datacenter"]
  type        = "service"

  group "raw_exec" {
    count = 1

    restart {
      attempts = 3
      interval = "2m"
      delay    = "15s"
      mode     = "fail"
    }

    network {
      mode = "bridge"
    }

    task "example" {
      driver = "raw_exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "while true; do echo \"I'm working fine\"; sleep 2; done"]
      }
    }
  }
}
If I now add, for example, hostname = "test", the allocation fails to start:
job "test" {
  datacenters = ["datacenter"]
  type        = "service"

  group "raw_exec" {
    count = 1

    restart {
      attempts = 3
      interval = "2m"
      delay    = "15s"
      mode     = "fail"
    }

    network {
      mode     = "bridge"
      hostname = "test"
    }

    task "example" {
      driver = "raw_exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "while true; do echo \"I'm working fine\"; sleep 2; done"]
      }
    }
  }
}
Okay, after running nomad alloc status, it seems that hostname is simply not supported with raw_exec:
Client Description = Unable to add allocation due to error: failed to configure network manager: hostname is not currently supported on driver raw_exec
Oh and it seems it also says so in the documentation: https://developer.hashicorp.com/nomad/docs/job-specification/network#hostname - ...currently only supported using the Docker driver...
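For comparison, here is a minimal sketch of a group where the group-level hostname should be accepted, because every task in it uses the Docker driver. The job, group, and task names, the image, and the command are placeholders I made up for illustration and are not taken from the real job:

```hcl
# Sketch only: job, group, and task names and the image are placeholders.
job "hostname-docker-only" {
  datacenters = ["datacenter"]
  type        = "service"

  group "docker_only" {
    network {
      mode = "bridge"

      # Group-level hostname. Per the documentation quoted above, this is
      # currently only supported with the Docker driver, so the group must
      # not contain any raw_exec tasks.
      hostname = "test"
    }

    task "service" {
      driver = "docker"

      config {
        image   = "busybox:1"
        command = "/bin/sh"
        args    = ["-c", "while true; do hostname; sleep 2; done"]
      }
    }
  }
}
```

Adding a raw_exec task to this group is what triggers the "hostname is not currently supported on driver raw_exec" error shown above.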
Is there any workaround to keep setting the hostname for docker-drivers, when using bridge mode and also having a prestart-task that's using raw_exec?
Anyway, sorry for the confusion!
Kind regards
Doing a little issue triage cleanup and saw this one.
Oh and it seems it also says so in the documentation
Right, because to set the hostname we need to give the task an /etc/hostname file that's been written elsewhere and then bind-mounted to that location. That can only work for task drivers that have a mount namespace, like Docker.
Is there any workaround to keep setting the hostname for docker-drivers, when using bridge mode and also having a prestart-task that's using raw_exec?
Networks are defined at the group level, so all the tasks in a group share a network namespace, and we can't give the tasks different hostnames without causing a mess. But if you don't want them in the same network namespace, you can override the Docker networking configuration with network_mode and hostname in the task's config block. That looks like this:
job "example" {
  group "group" {
    network {
      mode = "bridge"

      port "www" {
        to = 8001
      }
    }

    task "docker" {
      driver = "docker"

      config {
        image        = "busybox:1"
        command      = "httpd"
        args         = ["-vv", "-f", "-p", "8001", "-h", "/local"]
        network_mode = "bridge"
        hostname     = "example.local"
        ports        = ["www"]
      }
    }

    task "raw" {
      driver = "raw_exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "while true; do echo \"I'm working fine\"; sleep 2; done"]
      }
    }
  }
}
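Applied to the layout described earlier in this thread (a raw_exec prestart task plus a Docker service), the same approach might look roughly like the sketch below. The task names follow the ones in the nomad logs output; the image and commands are placeholders, and using "${node.unique.name}" inside the Docker config block assumes that node attribute interpolation works there the way it does elsewhere in task configuration:

```hcl
# Sketch only: image and commands are placeholders, not the reporter's job.
job "example" {
  group "group" {
    network {
      # No group-level hostname, because the group contains a raw_exec task.
      mode = "bridge"
    }

    task "generate-config" {
      driver = "raw_exec"

      # Run to completion before the main task starts.
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      config {
        command = "/bin/sh"
        args    = ["-c", "echo \"config generated\""]
      }
    }

    task "service" {
      driver = "docker"

      config {
        image   = "busybox:1"
        command = "/bin/sh"
        args    = ["-c", "while true; do hostname; sleep 2; done"]

        # Task-level override: Docker's own bridge network and hostname.
        # Assumption: node attributes are interpolated in this block.
        network_mode = "bridge"
        hostname     = "${node.unique.name}"
      }
    }
  }
}
```

The trade-off is the one mentioned above: with the task-level override, the Docker task no longer shares the group's network namespace with the other tasks.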
Otherwise it looks like we've got this issue resolved, so I'm going to close it out.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.