
System memory metrics

Open fabcmartins opened this issue 1 year ago • 6 comments

The class JvmMemoryMetrics offers memory metrics, but those are all related to heap and non-heap values and do not expose the actual memory limits of the machine/container on which the application is running.

It would be an improvement to offer some additional metrics in this class, such as system.memory.max and system.memory.used.

The metrics are available in the OperatingSystemMXBean classes in the Oracle and IBM JREs, and are already used by Micrometer to expose CPU metrics in the class io.micrometer.core.instrument.binder.system.ProcessorMetrics.
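For illustration, here is a minimal sketch of reading those values with the JDK alone, assuming a JVM that provides the com.sun.management.OperatingSystemMXBean extension. The class name SystemMemorySnapshot and the helper usedPercent are hypothetical and not part of Micrometer. Whether the reported total reflects a container limit depends on the JDK version and its container support, so that should be checked on the target runtime.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Micrometer.
final class SystemMemorySnapshot {

    /** Used share of physical memory in percent, or NaN when the total is unknown. */
    static double usedPercent(long totalBytes, long freeBytes) {
        if (totalBytes <= 0) {
            return Double.NaN;
        }
        return 100.0 * (totalBytes - freeBytes) / totalBytes;
    }

    public static void main(String[] args) {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();

        // The memory getters live on the com.sun.management extension interface,
        // which is not guaranteed to be present on every JVM.
        if (!(os instanceof com.sun.management.OperatingSystemMXBean)) {
            System.out.println("System memory values are not available on this JVM");
            return;
        }
        com.sun.management.OperatingSystemMXBean ext =
                (com.sun.management.OperatingSystemMXBean) os;

        // Both getters are deprecated since JDK 14 in favor of getTotalMemorySize()
        // and getFreeMemorySize(); they are used here to match the names in this issue.
        long total = ext.getTotalPhysicalMemorySize();
        long free = ext.getFreePhysicalMemorySize();

        System.out.printf("total=%d bytes, free=%d bytes, used=%.1f%%%n",
                total, free, usedPercent(total, free));
    }
}
```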

My use case would be to measure my application running inside a container in Kubernetes. With these metrics I would be able to know the percentage of available memory being used by the application.

The io.micrometer.core.instrument.binder.jvm.JvmMemoryMetrics is not enough to do it. For instance, by default Metaspace size is unlimited and Micrometer returns -1 when trying to read it.
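The -1 comes from the JDK itself: MemoryUsage.getMax() returns -1 when a pool has no defined maximum. A small sketch that lists every memory pool with its maximum makes this visible; the class name PoolMaxDemo and the helper describeMax are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical demo class, not part of Micrometer.
final class PoolMaxDemo {

    /** MemoryUsage.getMax() reports -1 when the maximum is undefined. */
    static String describeMax(long maxBytes) {
        return maxBytes < 0 ? "undefined" : maxBytes + " bytes";
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-40s used=%d max=%s%n",
                    pool.getName(), usage.getUsed(), describeMax(usage.getMax()));
        }
    }
}
```

A pool with an undefined maximum cannot serve as the denominator of a usage percentage, which is why a system-level total is needed.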

The most important missing metric is getTotalPhysicalMemorySize. Without it, there's no way to build a Grafana panel that shows the percentage of memory being used by an application.

Jun 27 '24 15:06 fabcmartins

Thanks for opening the issue. It's surprising no one else has asked for this yet (as far as I can remember). I suppose it isn't application metrics but rather system metrics and a solution could be to get that from outside of your application somehow. I think it'd be best to add this in a separate binder from JvmMemoryMetrics, perhaps in a SystemMemoryMetrics class. Though from the JavaDocs of the JMX methods, it sounds like they're trying to move away from the "system" naming. Would you be interested in contributing a pull request for this?

Jul 02 '24 08:07 shakuzen

I haven't looked into the details of what the JDK-specific implementations return, but for the purpose of retrieving system-level process metrics I came up with https://github.com/mweirauch/micrometer-jvm-extras, which uses procfs (Linux only) to retrieve these metrics (vss, rss, swap).

I started a draft to also read cgroup memory (limit) metrics, but haven't finished the implementation.

Perhaps that is useful.

Jul 02 '24 08:07 mweirauch

> Would you be interested in contributing a pull request for this?

Sure, I'm writing the code. I'm planning on exposing the 5 available metrics under the following properties:

  • system.virtualmemory.commited
  • system.swap.total
  • system.swap.free
  • system.memory.free
  • system.memory.total

As you already mentioned, Oracle seems to be moving from the term system to environment, so maybe I should use a different prefix.
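As a rough sketch of how the five names listed above could be backed by the JDK, the snippet below pairs each one with a getter on com.sun.management.OperatingSystemMXBean. The pairing is an assumption made for illustration and is not taken from the pull request; the class SystemMemoryGauges is hypothetical. It uses only the JDK, so in a real binder each supplier would be registered as a Micrometer gauge instead of being printed.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical class: the name-to-getter pairing below is assumed, not confirmed.
final class SystemMemoryGauges {

    /** Maps each proposed metric name to a supplier that reads it (values in bytes). */
    static Map<String, LongSupplier> gauges(com.sun.management.OperatingSystemMXBean os) {
        Map<String, LongSupplier> gauges = new LinkedHashMap<>();
        // Metric names are copied verbatim from the list in this comment.
        gauges.put("system.virtualmemory.commited", os::getCommittedVirtualMemorySize);
        gauges.put("system.swap.total", os::getTotalSwapSpaceSize);
        gauges.put("system.swap.free", os::getFreeSwapSpaceSize);
        // The two getters below are deprecated since JDK 14 in favor of
        // getFreeMemorySize() and getTotalMemorySize().
        gauges.put("system.memory.free", os::getFreePhysicalMemorySize);
        gauges.put("system.memory.total", os::getTotalPhysicalMemorySize);
        return gauges;
    }

    public static void main(String[] args) {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        if (!(os instanceof com.sun.management.OperatingSystemMXBean)) {
            System.out.println("Extension MXBean not available on this JVM");
            return;
        }
        gauges((com.sun.management.OperatingSystemMXBean) os)
                .forEach((name, value) -> System.out.println(name + " = " + value.getAsLong()));
    }
}
```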

Jul 16 '24 19:07 fabcmartins

> It's surprising no one else has asked for this yet (as far as I can remember). I suppose it isn't application metrics but rather system metrics and a solution could be to get that from outside of your application somehow.

There has been a bit of debate for us about where the responsibility of exposing the system-level metrics lies. Should individual processes report these, or should the host (probably some host agent) be responsible for exposing this information? There are arguments in favor of both sides.

Jul 22 '24 05:07 lenin-jaganathan

I am really happy about this contribution. Was looking for something similar.

However, I am not sure about the discussion regarding the responsibility of exposing the system-level metrics. This decision lies with whoever uses the specific meter binder, or am I missing the point?

We already have system-level metrics next to process-level metrics in ProcessorMetrics:

  • system.cpu.usage / process.cpu.usage
  • system.load.average.1m
  • process.cpu.time

I think the current version of MemoryMetrics in #5308 looks good. Are there open issues we might need to address to accept this contribution? Is there something I can help with?

Just for completeness, the ProcessMemoryMetrics mentioned by @mweirauch earlier look good, even without cgroup memory (limit) metrics. On Linux, we would have the following information from /proc/self/status:

| Metric | Field | Content |
| --- | --- | --- |
| process.memory.vss | VmSize | total program size |
| process.memory.rss | VmRSS | size of memory portions (sum of resident anonymous memory + file mappings + shmem memory) |
| process.memory.swap | VmSwap | amount of swap used by anonymous private data (shmem swap usage is not included) |
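The three fields above can be read with a few lines of Java. This is a minimal, Linux-only sketch and not the implementation used by micrometer-jvm-extras; the class ProcStatusReader and its method parseBytes are hypothetical names. It assumes the usual "Field: value kB" line layout of /proc/self/status.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader, not the code from micrometer-jvm-extras.
final class ProcStatusReader {

    /** Extracts VmSize, VmRSS and VmSwap from status lines and converts kB to bytes. */
    static Map<String, Long> parseBytes(List<String> lines) {
        Map<String, Long> bytes = new LinkedHashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String field = line.substring(0, colon);
            if (!field.equals("VmSize") && !field.equals("VmRSS") && !field.equals("VmSwap")) {
                continue;
            }
            // Expected remainder: whitespace, a number, then the unit "kB".
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            if (parts.length == 2 && parts[1].equals("kB")) {
                bytes.put(field, Long.parseLong(parts[0]) * 1024L);
            }
        }
        return bytes;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("procfs is not available on this platform");
            return;
        }
        parseBytes(Files.readAllLines(status))
                .forEach((field, value) -> System.out.println(field + " = " + value + " bytes"));
    }
}
```

Keeping the parsing separate from the file access means it can be exercised with sample lines on any platform, while only main touches procfs.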

Apr 03 '25 09:04 kariem

Any news about this feature? It would make it much easier to get container memory information. In my case, for example, the application and Kubernetes metrics are in different data sources, which makes it difficult to build a single view such as a 4GS dashboard.

Jul 09 '25 08:07 elton-alves