Default cgroup_parent path no longer respects cgroup version
Nomad version
Nomad 1.7.7 (but issue also exists at tip)
Operating system and Environment details
Unix
Issue
The default value for the cgroup_parent config used to depend on the cgroup version detected by Nomad. In 1.7.7, however, it appears to have become a hardcoded value, which makes the default incorrect on cgroup v1 systems.
The issue seems to be introduced by this commit: https://github.com/hashicorp/nomad/commit/a4cc76bd3e4c7d4f7e623721caa8a716b5a0151f#diff-30de430fcaf9ccc69c864d5e78db52b7c2741787c3382839547d60d0c9000267L798-R797
I think reverting this to the original behavior should be straightforward just by using defaultParent in place of the hardcoded value:
https://github.com/hashicorp/nomad/blob/67009373031481e403adf722e5e181832274a974/client/lib/cgroupslib/switch_linux.go#L20
Though, we might need to teach defaultParent to handle the "off" case better (right now it falls through to the default case, which assumes cgroup v2).
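As a rough sketch of what that could look like: the type, constant, and function names below are made up for illustration and are not Nomad's actual identifiers, and returning an empty string for the "off" case is only my assumption about what a sensible behavior might be. The two non-empty values are the ones from this report (/nomad for cgroup v1, nomad.slice for cgroup v2).

```go
package main

import "fmt"

// cgroupMode is a hypothetical stand-in for the mode type in cgroupslib.
// These names are illustrative only, not Nomad's real identifiers.
type cgroupMode int

const (
	modeOff cgroupMode = iota // cgroups unavailable or disabled
	modeCG1                   // cgroup v1
	modeCG2                   // cgroup v2
)

// defaultParentFor returns a default cgroup parent for the given mode.
// The "off" case is handled explicitly instead of falling through to
// the cgroup v2 value.
func defaultParentFor(m cgroupMode) string {
	switch m {
	case modeCG1:
		return "/nomad"
	case modeCG2:
		return "nomad.slice"
	default:
		// Assumption: with cgroups off there is no meaningful parent.
		return ""
	}
}

func main() {
	for _, m := range []cgroupMode{modeOff, modeCG1, modeCG2} {
		fmt.Printf("mode %d -> %q\n", m, defaultParentFor(m))
	}
}
```

The point is only that the switch has an explicit branch per mode, so an unexpected mode cannot silently pick up the v2 default.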
I don't quite know how to reason about the broader impact of this change, so I'm just reporting the issue for now.
Reproduction steps
On a cgroup v1 system:
# Start the agent
$ nomad agent -dev
# Show that the default value is incorrect
$ nomad node status -json -self | jq .CgroupParent
"nomad.slice"
Expected Result
Default value should be /nomad
Actual Result
Default value is nomad.slice
I tried to work around this by setting the cgroup_parent field in the agent config, but that field doesn't seem to be honored either (at least, I don't see it being merged in convertClientConfig):
https://github.com/hashicorp/nomad/blob/67009373031481e403adf722e5e181832274a974/command/agent/agent.go#L710
I'm happy to file a separate issue for this too, if you prefer.
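For the config part, the behavior I would expect is roughly the following. This is a minimal sketch, not the actual agent code: resolveCgroupParent is a hypothetical helper, and I'm assuming an unset cgroup_parent shows up as an empty string.

```go
package main

import "fmt"

// resolveCgroupParent is a hypothetical helper, not part of Nomad.
// It prefers the value set in the agent config and falls back to the
// mode-dependent default only when nothing was configured.
func resolveCgroupParent(configured, fallback string) string {
	if configured != "" {
		return configured
	}
	return fallback
}

func main() {
	// No cgroup_parent in the agent config: the default is used.
	fmt.Println(resolveCgroupParent("", "/nomad"))
	// cgroup_parent set in the agent config: the configured value wins.
	fmt.Println(resolveCgroupParent("/custom", "/nomad"))
}
```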
Hi @marvinchin! Thank you for reporting this issue. It looks like a regression introduced with NUMA support, and we will take a look at it.
Ref: https://hashicorp.atlassian.net/browse/NET-12139