apm-agent-dotnet
APM Agent on K8s takes system memory from host
Describe the bug When using the APM agent on a Kubernetes cluster, the memory reported in system.process.memory.size is incorrect.
We had the support of @tanquetav in identifying this situation.
To Reproduce Steps to reproduce the behavior:
- In Kibana, search for your service's metrics:
service.name : your-service-name and processor.event : metric
this value is all memory of host.
Expected behavior The correct memory value for the service.
This part of the agent code is fairly straightforward: all we do is query Process.GetCurrentProcess().VirtualMemorySize64
and report that. So the value comes from the framework itself.
@hramos8 could you give me some more info?
- What .NET version do you use?
- You say "this value is all memory of host." - to make sure we are all on the same page: does this mean that the value you see in
system.process.memory.size
is all the available memory of the host? Or all the used memory of the host? What exactly is the value you see in that field, and what would you expect there?
When we run containers on k8s, they run as Linux environments. The code to get memory is available here: https://github.com/elastic/apm-agent-dotnet/blob/master/src/Elastic.Apm/Metrics/MetricsProvider/FreeAndTotalMemoryProvider.cs#L58
It reads memory from /proc/meminfo. Unfortunately, this information is not cgroup-aware and reflects the host environment rather than the container. The cgroup information is available under /sys/fs/cgroup/
[george@casa02 apm-agent-dotnet]$ docker run -it --rm -m 500m ubuntu:latest /bin/bash
root@0b787d6c9417:/# cat /proc/meminfo |grep Mem
MemTotal: 16269404 kB
MemFree: 282024 kB
MemAvailable: 4454260 kB
root@0b787d6c9417:/# cat /sys/fs/cgroup/memory/memory.limit_in_bytes
524288000
The -m flag tells Docker to limit the container to 500 MiB (524288000 bytes), but /proc/meminfo shows the host's total of roughly 16 GB. The correct information is available in the cgroup filesystem.
I hope this helps to fix the issue.
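The lookup described above can be sketched roughly as follows. This is a minimal Python illustration, not the agent's implementation: the function names are hypothetical, the cgroup v1 path is the one shown in the session above, and the cgroup v2 path (/sys/fs/cgroup/memory.max) is an assumption about hosts that use the unified hierarchy. When no cgroup limit applies, it falls back to MemTotal from /proc/meminfo.

```python
from pathlib import Path
from typing import Optional

# Paths relative to the filesystem root, so the root can be swapped in tests.
CGROUP_V2_LIMIT = "sys/fs/cgroup/memory.max"  # assumption: cgroup v2 layout
CGROUP_V1_LIMIT = "sys/fs/cgroup/memory/memory.limit_in_bytes"
MEMINFO = "proc/meminfo"


def parse_meminfo_total(text: str) -> Optional[int]:
    """Return MemTotal in bytes from /proc/meminfo content, or None."""
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like "MemTotal: 16269404 kB"; the kernel's kB is 1024 bytes.
            return int(line.split()[1]) * 1024
    return None


def parse_cgroup_limit(text: str) -> Optional[int]:
    """Return the cgroup limit in bytes, or None when it is not numeric (e.g. "max")."""
    value = text.strip()
    return int(value) if value.isdigit() else None


def effective_memory_total(root: Path = Path("/")) -> Optional[int]:
    """Prefer a cgroup limit below the host total; otherwise report the host total."""
    host_total = None
    meminfo = root / MEMINFO
    if meminfo.is_file():
        host_total = parse_meminfo_total(meminfo.read_text())

    for relative in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        path = root / relative
        if path.is_file():
            limit = parse_cgroup_limit(path.read_text())
            # An unlimited cgroup v1 reports a huge number, so ignore limits
            # that are not below the host total.
            if limit is not None and (host_total is None or limit < host_total):
                return limit
    return host_total


if __name__ == "__main__":
    print(effective_memory_total())
```

With the values from the session above (MemTotal of 16269404 kB and a cgroup limit of 524288000), this sketch would report 524288000 rather than the host total.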
I can confirm that this is also true for our applications in k8s.
Resource usage by our pods:
Pod definition:
{
  "kind": "Pod",
  "metadata": {
    "name": "##################-66d6b86979-8x5rj"
  },
  "spec": {
    "containers": [
      {
        "resources": {
          "limits": {
            "cpu": "250m",
            "memory": "256Mi"
          },
          "requests": {
            "cpu": "100m",
            "memory": "128Mi"
          }
        }
      }
    ]
  }
}
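To compare the pod definition above with a byte value reported in a metric, the memory quantities need to be converted to bytes. A small Python sketch, assuming only plain integer quantities with the common binary (Ki, Mi, Gi, Ti) and decimal (k, M, G, T) suffixes; the helper name quantity_to_bytes is hypothetical and this is not a full Kubernetes quantity parser:

```python
import re

# Binary (power-of-two) and decimal (power-of-ten) suffix multipliers.
_BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}
_DECIMAL = {"": 1, "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" to a number of bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti|k|M|G|T)?", quantity.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number = int(match.group(1))
    suffix = match.group(2) or ""
    factor = _BINARY.get(suffix) or _DECIMAL[suffix]
    return number * factor


if __name__ == "__main__":
    print(quantity_to_bytes("256Mi"))  # limit from the pod definition: 268435456
    print(quantity_to_bytes("128Mi"))  # request from the pod definition: 134217728
```

A container-aware total for this pod would therefore be 268435456 bytes, the 256Mi limit.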
Metrics reported by APM agent:
Memory information inside a pod:
The metric name is system.memory
which means that memory values of the system are reported. The real question is which values should be used in the case of k8s: the memory values of the host, or the memory values available to the container. For me, the second is more valuable than the memory values of the host.