API/Inspect loses ConsulConnect params
Nomad version
1.0.6
Operating system and Environment details
NAME="Red Hat Enterprise Linux" VERSION="8.2 (Ootpa)" ID="rhel" ID_LIKE="fedora" VERSION_ID="8.2" PLATFORM_ID="platform:el8"
Issue
Some parameters of a Nomad job with Connect integration are lost after the job is deployed, when you use the UI/API with JSON to make changes.
Reproduction steps
Deploy a job that exposes something like this:
connect {
  sidecar_service {
    proxy {
      expose {
        path {
          path            = "/metrics"
          protocol        = "http"
          local_path_port = 8085
          listener_port   = "api_expose_metrics"
        }
      }
    }
  }
}
Once the job is running correctly, go to the job's definition in the UI, click Edit, and plan without making any changes; you will receive a message with changes to apply:
+/- Job: "my_job"
+/- Task Group: "my_group" (1 in-place update)
  +/- Service {
        AddressMode: "auto"
        EnableTagOverride: "false"
        Name: "openidm-my_job-service"
        PortLabel: "api"
        TaskName: "my_task"
        Tags {
          Tags: "metrics"
        }
    +/- ConsulConnect {
          Native: "false"
      +/- SidecarService {
            Port: ""
        +/- ConsulProxy {
              LocalServiceAddress: ""
              LocalServicePort: "0"
            }
          }
        }
      }
      Task: "connect-proxy-my_job-service"
      Task: "my_task"
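For reference, here is a rough approximation of the same round-trip without the UI, using the HTTP API from a small Go program. This is a sketch, not part of the original report: the agent address and job name are assumptions, and the UI may not go through exactly the same encode/decode path, so the resulting diff may differ.

// round_trip.go: fetch a running job over the HTTP API and plan the unmodified
// payload back against the cluster, roughly what the UI's edit-and-plan flow does.
// Assumes a local agent on the default address and the job name used above.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	const addr = "http://127.0.0.1:4646"

	// Fetch the job definition, as the UI's edit view does.
	resp, err := http.Get(addr + "/v1/job/my_job")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var job map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&job); err != nil {
		log.Fatal(err)
	}

	// Post the unmodified JSON back to the plan endpoint with Diff enabled.
	body, err := json.Marshal(map[string]interface{}{"Job": job, "Diff": true})
	if err != nil {
		log.Fatal(err)
	}
	planResp, err := http.Post(addr+"/v1/job/my_job/plan", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer planResp.Body.Close()

	var plan map[string]interface{}
	if err := json.NewDecoder(planResp.Body).Decode(&plan); err != nil {
		log.Fatal(err)
	}

	// With the bug present, the diff reports ConsulConnect/ConsulProxy changes
	// even though nothing was edited.
	pretty, _ := json.MarshalIndent(plan["Diff"], "", "  ")
	fmt.Println(string(pretty))
}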
Expected Result
If you use the definition in the UI, click Edit, and run a plan, you should receive a message like:
Job: "my_job"
Task Group: "my_group" (1 in-place update)
  Task: "connect-proxy-my_job-service"
  Task: "my_task"

Scheduler dry-run:
- All tasks successfully allocated.
Actual Result
+/- Job: "my_job"
+/- Task Group: "my_group" (1 in-place update)
  +/- Service {
        AddressMode: "auto"
        EnableTagOverride: "false"
        Name: "openidm-my_job-service"
        PortLabel: "api"
        TaskName: "my_task"
        Tags {
          Tags: "metrics"
        }
    +/- ConsulConnect {
          Native: "false"
      +/- SidecarService {
            Port: ""
        +/- ConsulProxy {
              LocalServiceAddress: ""
              LocalServicePort: "0"
            }
          }
        }
      }
      Task: "connect-proxy-my_job-service"
      Task: "my_task"
Thanks for logging @pserranoa, I am experiencing the exact same issue.
Versions where we are experiencing it:
OS: Ubuntu 18.04 nomad.version: 1.1.5 consul.version: 1.10.3 envoy.version: 1.18.4
Losing our Prometheus metrics every time we make a change to the job config in the UI is annoying. After testing a bit, I believe it's related to the service check expose setting...
I've not included all of the HCL and JSON for brevity; the ... indicates excluded content.
This is a sample of our original job definition:
job "job1" {
group "job1" {
...
service {
name = "job1"
port = 8080
check {
type = "http"
path = "/ping"
interval = "10s"
timeout = "2s"
expose = true
}
connect {
sidecar_service {
proxy {
expose {
path {
path = "/metrics"
protocol = "http"
local_path_port = 8080
listener_port = "expose"
}
}
}
}
}
}
...
}
}
From this, Nomad automatically generates the Services.Connect.SidecarService.Proxy.Expose entries in the JSON:
...
"Checks": [
  {
    "Name": "service: \"transactions\" check",
    "Type": "http",
    "Command": "",
    "Args": null,
    "Path": "/ping",
    "Protocol": "",
    "PortLabel": "svc_job1_ck_22267b",
    ...
  }
],
"Connect": {
  "Native": false,
  "SidecarService": {
    "Tags": null,
    "Port": "",
    "Proxy": {
      ...
      "Expose": {
        "Paths": [
          {
            "Path": "/metrics",
            "Protocol": "http",
            "LocalPathPort": 8080,
            "ListenerPort": "expose"
          },
          {
            "Path": "/ping",
            "Protocol": "",
            "LocalPathPort": 8080,
            "ListenerPort": "svc_job1_ck_22267b"
          }
        ]
      },
      ...
When we make a change from the UI (in the JSON), the result afterwards shows the /metrics expose path is missing:
...
"Expose": {
  "Paths": [
    {
      "Path": "/ping",
      "Protocol": "",
      "LocalPathPort": 8080,
      "ListenerPort": "svc_job_ck_22267b"
    }
  ]
},
...
If I disable the expose by setting Services.Checks.Expose = false, then the exposed paths disappear entirely, even though the JSON had the list (Services.Connect.SidecarService.Proxy.Expose):
...
"Expose": null,
...
The workaround is to update the job using the HCL that we originally used to create the job, but this is annoying as we can't make changes to our job in the UI.
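For what it's worth, re-running the original jobspec with nomad job run is the CLI form of that workaround; the equivalent through the Go API client looks roughly like the sketch below. The job1.hcl path is a placeholder, and using the server-side parse endpoint for this jobspec is an assumption.

// workaround.go: re-register the job from the original HCL so the expose
// paths come back. Sketch only.
package main

import (
	"log"
	"os"

	"github.com/hashicorp/nomad/api"
)

func main() {
	// Read the original jobspec (placeholder path).
	src, err := os.ReadFile("job1.hcl")
	if err != nil {
		log.Fatal(err)
	}

	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Parse the HCL via the agent and re-register the result, overwriting the
	// definition that the JSON round-trip mangled.
	job, err := client.Jobs().ParseHCL(string(src), true)
	if err != nil {
		log.Fatal(err)
	}
	if _, _, err := client.Jobs().Register(job, nil); err != nil {
		log.Fatal(err)
	}
	log.Println("job re-registered from original HCL")
}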
Hello @tgross and Happy New Year. Any progress on this issue?
Hi @pserranoa and thanks for raising this issue as well as @grahambrown11 and @CarelvanHeerden for the additional context. I will raise this internally to see if we can get this roadmapped for a fix in the short-term.
Hi @jrasell. Thanks for your support
Hi @jrasell, any chance of progress being made on this?
Hi @jrasell, any plans on fixing this? I think I can try to fix it if there's no movement from HashiCorp directly. This seems to be the same as https://github.com/hashicorp/nomad/issues/9379
Hi again,
Changing this (and all references to those fields) fixes the issue for me.
diff --git a/nomad/structs/services.go b/nomad/structs/services.go
index 19e78b08b7..db790979d9 100644
--- a/nomad/structs/services.go
+++ b/nomad/structs/services.go
@@ -1358,9 +1358,7 @@ type ConsulProxy struct {
 	// Expose configures the consul proxy.expose stanza to "open up" endpoints
 	// used by task-group level service checks using HTTP or gRPC protocols.
-	//
-	// Use json tag to match with field name in api/
-	Expose *ConsulExposeConfig `json:"ExposeConfig"`
+	ExposeConfig *ConsulExposeConfig
 	// Config is a proxy configuration. It is opaque to Nomad and passed
 	// directly to Consul.
@@ -1501,7 +1499,7 @@ func upstreamsEquals(a, b []ConsulUpstream) bool {
 // ConsulExposeConfig represents a Consul Connect expose jobspec stanza.
 type ConsulExposeConfig struct {
 	// Use json tag to match with field name in api/
-	Paths []ConsulExposePath `json:"Path"`
+	Path []ConsulExposePath
 }
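To make the failure mode concrete, here is a minimal, self-contained Go illustration using hypothetical types (not the real Nomad structs): when the JSON key that one layer emits does not match the key the other layer decodes, the field is silently dropped on the round-trip, which is the kind of mismatch the tag changes above are meant to remove.

// tagdrop.go: hypothetical illustration only, not the actual Nomad code path.
package main

import (
	"encoding/json"
	"fmt"
)

// ExposeConfig stands in for the expose path list.
type ExposeConfig struct {
	Paths []string
}

// Emitter plays the layer that serializes the job and writes the key "Expose".
type Emitter struct {
	Expose *ExposeConfig `json:"Expose"`
}

// Receiver plays the layer that decodes the submitted JSON but expects the
// key "ExposeConfig".
type Receiver struct {
	Expose *ExposeConfig `json:"ExposeConfig"`
}

func main() {
	out, _ := json.Marshal(Emitter{Expose: &ExposeConfig{Paths: []string{"/metrics"}}})
	fmt.Println(string(out)) // {"Expose":{"Paths":["/metrics"]}}

	var in Receiver
	_ = json.Unmarshal(out, &in)
	fmt.Println(in.Expose) // <nil>: the expose configuration is silently dropped
}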
Hi @martinffx and @jorgemarey; this issue is on our backlog but has not yet been picked up. I can't currently give an estimate of when it will be started, and my own current workload and stream mean I don't have the time to look. If you do have a suggested fix, a PR would be grand and we can help review and work through any items.
Hi @jrasell. I made a PR in https://github.com/hashicorp/nomad/pull/15541.
+1 for this. We've run into it as well, lost a fair bit of time debugging it, and it's a pretty big issue for us, as our new metrics setup relies on this behaviour. Nomad surreptitiously dropping bits from an active job configuration seems like something that should get a bit more attention, so we'd really appreciate some traction on this if possible, as it's been sitting around for over a year now.