raise an error if group has Consul namespace set for non-Enterprise cluster
Nomad version
Output from nomad version
1.3.5
Operating system and Environment details
Ubuntu 22
Issue
If you have a Group with the following definition:
group {
  consul { namespace = var.some_namespace }

  service "generator-plugin" {}
  service "generator-execution-sidecar" {}
}
I would expect every instance of this group to be registered in a different Consul namespace (at least on Enterprise).
However, my cluster isn't Enterprise yet, and Nomad happily accepts this job and then dumps all of these services into the only (default) namespace.
This causes an unexpected round-robin situation.
The Consul CLI gives you a pretty harsh error if you try to do anything namespace-related; I'd hope that Nomad's interface with Consul would do the same.
consul catalog services -namespace default
Error listing services: Unexpected response code: 400 (Bad request: Invalid query parameter: "ns" - Namespaces are a Consul Enterprise feature)
Reproduction steps
Deploy the same group multiple times with different consul { namespace = } values on a Nomad cluster with non-Enterprise Consul.
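A minimal reproduction jobspec might look something like this (untested sketch; the job, group, service, and namespace names are all placeholders):

```hcl
job "repro" {
  datacenters = ["dc1"]

  # Two groups that each claim a different Consul namespace.
  group "a" {
    consul { namespace = "team-a" }

    service { name = "repro-svc" }

    task "sleep" {
      driver = "docker"
      config {
        image   = "busybox:1.36"
        command = "sleep"
        args    = ["infinity"]
      }
    }
  }

  group "b" {
    consul { namespace = "team-b" }

    service { name = "repro-svc" }

    task "sleep" {
      driver = "docker"
      config {
        image   = "busybox:1.36"
        command = "sleep"
        args    = ["infinity"]
      }
    }
  }
}
```

Against non-Enterprise Consul, both instances of repro-svc should end up registered in the single default namespace, producing the round-robin described above.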
Expected Result
Explicit failure - refusal to deploy the job, perhaps?
Actual Result
They all get silently dumped into the default namespace.
Job file (if appropriate)
I wouldn't suggest trying this one yourself (it has way too many prerequisites), but I discovered this with https://github.com/grapl-security/grapl/pull/2026/files#diff-47f0314c3995de007f9d705f4ea0b1f681b482df1f5fa3618ac8a11613599a19
Hi @wimax-grapl! I did some diving into the code and I think I see why it's doing this currently. The client agent is what's talking to Consul here, and so while the client knows whether it's talking to Consul Enterprise or not, the server doesn't. So it would be challenging to surface this information to the server to pick it up at job submit time. That being said, it should be possible to do this at allocation placement time, but the allocation would fail and then be rescheduled until it runs out of reschedules. That's not a great user experience either.
But I'm going to mark this as an enhancement for further discussion and roadmapping. Thanks for opening the issue!
Yep, totally makes sense that it'd be hard to surface to users. If there were perhaps a way to surface Consul Enterprise as a Resource that a cluster needs - like disk space or mem or something - that could be a reasonable way to expose it to the customer.
If there were perhaps a way to surface Consul Enterprise as a Resource that a cluster needs
I think this is already possible, if you were to create a constraint on the attribute ${attr.consul.sku}, e.g.:
➜ nomad node status -self -verbose | grep consul\.sku
consul.sku = oss
(and the Enterprise version would be ent)
So something like this (haven't tested):

constraint {
  attribute = "${attr.consul.sku}"
  value     = "ent"
}
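Applied to the group from the original report, that could look like the sketch below (untested; it mirrors the original group verbatim and only adds the constraint, which should keep the job from being placed on clients whose fingerprinted Consul is not Enterprise):

```hcl
group {
  # Only place this group on nodes whose Consul build is Enterprise,
  # since the consul namespace setting is an Enterprise feature.
  constraint {
    attribute = "${attr.consul.sku}"
    value     = "ent"
  }

  consul { namespace = var.some_namespace }

  service "generator-plugin" {}
  service "generator-execution-sidecar" {}
}
```

With this in place, submitting the job against a cluster whose clients all run OSS Consul should leave the allocations unplaceable rather than silently registering the services in the default namespace.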