procfs
userHz is hard-coded to 100
Running node_exporter inside an LX zone on Joyent's SmartOS (or their cloud platform, Triton) reports incorrect CPU stats. SmartOS is based on Solaris, and LX zones are containers that enable running Linux applications on Solaris.
LX zones report a USER_HZ value of 1000, which results in incorrect CPU stats being reported.
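To make the size of the error concrete, here is a minimal sketch of the tick-to-seconds conversion. The function name `cpuSeconds` and the sample counter value are made up for illustration and are not taken from procfs; only the two rates (100 hard-coded, 1000 reported by LX) come from this issue.

```go
package main

import "fmt"

// cpuSeconds converts a CPU tick counter (as found in /proc/stat) into
// seconds, given the number of ticks per second (USER_HZ).
// Hypothetical helper, not the procfs implementation.
func cpuSeconds(ticks, userHZ uint64) float64 {
	return float64(ticks) / float64(userHZ)
}

func main() {
	// Hypothetical counter: 60000 ticks from a kernel that counts
	// 1000 ticks per second, i.e. 60 seconds of CPU time.
	const ticks = 60000

	fmt.Println("with USER_HZ=1000:", cpuSeconds(ticks, 1000)) // the real value, 60 s
	fmt.Println("with USER_HZ=100: ", cpuSeconds(ticks, 100))  // 600 s, ten times too high
}
```

So if the platform really counts at 1000 Hz and the library divides by 100, every CPU counter comes out ten times too large.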
While it may be argued that SmartOS is incorrectly emulating the USER_HZ value (i.e., it should report 100), I feel that procfs should query the value rather than hard-code it, to maintain compatibility across multiple platforms.
(And yes, I know that procfs did originally query for the value, but it was replaced with a constant for "reasons")
(And yes, I know that procfs did originally query for the value, but it was replaced with a constant for "reasons")
When this was last looked at, all supported platforms used 100.
The node_exporter was not designed to run in containers; it is a host-system-level tool.
So we removed it, as the necessary cgo dependency forced too much overhead in cross-builds. Could we come up with a simple solution here, like defining the constant in per-architecture files and using build flags?
We could probably generate the constant in x/sys/unix and then not worry about it. It would be easier to push it upstream since the build infrastructure is there.
@SuperQ LX zones should be considered similar to full VMs rather than Linux containers. We're not talking about LXC or Docker here.
Could we come up with a simple solution here, like defining the constant in per-architecture files and using build flags?
The setup here is sufficiently weird that the setting may not be the one we expect for the architecture.
Which architecture is this?
I should have said per-operating-system files. It seems SmartOS will just be recognized as Solaris in Go, and I don't know whether all Solaris variants use USER_HZ=1000.
We could re-introduce C code for the syscall in something like user_hz_solaris.go and continue to use the hard-coded value in user_hz.go.
@zegelin Do you know how to query for the USER_HZ value in Solaris?
USER_HZ is a Linux thing; Solaris doesn't have it.
Oh, now I see the issue I guess. From the perspective of the program running in the LX zone, the OS will be Linux.
I'm very hesitant to introduce a C dependency just for that exception. @zegelin Does LX provide any hint about the environment we could query for without a system call?
@brian-brazil
Which architecture is this?
x86_64
@grobie Exactly. Processes running in an LX zone think they are running on Linux even though the host is actually Solaris.
Does LX provide any hint about the environment we could query for without a system call?
There is a sysctl that returns:
kernel.osrelease = BrandZ virtual linux
LX emulates /proc:
# cat /proc/sys/kernel/osrelease
BrandZ virtual linux
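A rough sketch of how that hint could be used without cgo or an extra system call: read the emulated osrelease file and pick the tick rate from it. The names `isLXZone` and `userHZFor` are hypothetical, the exact-match check assumes every LX zone reports precisely the string shown above, and 1000 is simply the value reported in this issue.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lxOSRelease is the osrelease string shown in this thread for LX zones.
const lxOSRelease = "BrandZ virtual linux"

// isLXZone reports whether the given osrelease contents identify an LX zone.
// Assumption: the string is matched exactly after trimming whitespace.
func isLXZone(osrelease string) bool {
	return strings.TrimSpace(osrelease) == lxOSRelease
}

// userHZFor returns 1000 for LX zones (the value reported in this issue)
// and the existing hard-coded 100 for everything else.
func userHZFor(osrelease string) int {
	if isLXZone(osrelease) {
		return 1000
	}
	return 100
}

func main() {
	raw, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Println("cannot read osrelease, keeping USER_HZ=100:", err)
		return
	}
	fmt.Println("USER_HZ:", userHZFor(string(raw)))
}
```

This keeps the constant for regular Linux hosts and only special-cases the one environment known to differ, at the cost of one extra file read.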