procfs icon indicating copy to clipboard operation
procfs copied to clipboard

Limits returns 64-bit max for VSS on all platforms

Open SuperQ opened this issue 4 years ago • 9 comments

In https://github.com/prometheus/procfs/pull/340 we fixed the Limits value for "unlimited" to be max uint64. But this value would be incorrect for 32-bit systems, and could depend on kernel/user VM split configuration.

Now I'm also wondering if it's actually valid that 64-bits is the right value for VSS max for Intel 64bit, given there is still probably some kernel reserved address space in 64-bit mode.

SuperQ avatar Apr 19 '21 18:04 SuperQ

I've been digging into this a bit, and there's a bunch of issues with the current values for process_virtual_memory_max_bytes.

When we detect "unlimited" we output uint64 max. This is very wrong for 32-bit systems, and partly wrong even for 64-bit systems.

For example, it depends on kernel compile flags for 32-bit systems (CONFIG_VMSPLIT_3G etc). On 64-bit, it's architecture dependent, and kernel version dependent.

  • https://lwn.net/Articles/717293/
  • https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details
  • https://www.kernel.org/doc/html/latest/x86/x86_64/mm.html

I'm considering reverting the patch, or changing it to return nil in the case of "unlimited".

I'm also wondering what to do in the unlimited case in client_golang. When ulimit is unlimited I have found no way to determine what the value should be.

CC: @beorn7 @discordianfish

SuperQ avatar Apr 22 '21 17:04 SuperQ

How about +Inf in that case? It looks we have no way of determining the true value. We cannot use the string "unlimited" as the value of the metric, but +Inf seems to match it quite well.

beorn7 avatar Apr 22 '21 17:04 beorn7

Yea, +Inf is also an option. The down side is it's also not exactly true, since the real value is there.

Perhaps we should not expose the metric in client_golang if the value is +Inf.

SuperQ avatar Apr 23 '21 07:04 SuperQ

The function currently returns a uint64, so there's no +Inf option.

SuperQ avatar Apr 23 '21 15:04 SuperQ

How many fields in the ProcLimits could have that "unlimited" string in their source field? If that's common, perhaps change the value from uint64 to a struct {uint64, bool} where the bool says if it is set to unlimited?

We could document in the help string of the proc metrics that a Prometheus floating point value of +Inf means that the limit is set to unlimited.

beorn7 avatar Apr 26 '21 17:04 beorn7

Disclaimer: I might totally misunderstand the context here. (But just not exposing a metric sounds weird.)

beorn7 avatar Apr 26 '21 17:04 beorn7

The problem is that "unlimited" doesn't actually mean the value in bytes is "+Inf". There are hard limits imposed by the platform and kernel, and configuration combinations. This is mostly not an issue on 64-bit platforms, but can be very limiting on 32-bit.

So, exposing the value as +Inf is not true.

SuperQ avatar Apr 27 '21 10:04 SuperQ

I got that. That's why I said it has to be documented.

However, if I understand you correctly, it's highly non-trivial to find out what the true value is if what we get is unlimited.

Not exposing the metric means the metric doesn't exist. It does not communicate "The value is reported as unlimited by the kernel. That doesn't mean it is actually unlimited. It means there is a limit which we cannot easily determine."

However, we could document that a value of +Inf means the following: "The value is reported as unlimited by the kernel. That doesn't mean it is actually unlimited. It means there is a limit which we cannot easily determine."

beorn7 avatar Apr 27 '21 17:04 beorn7

Of course, if there is a way to find out the true value, then let's just use it.

beorn7 avatar Apr 27 '21 17:04 beorn7