apm
Add support for cgroup-based CPU metrics
Is your feature request related to a problem? Please describe.
CPU metrics reported from containerised services are inaccurate.
Describe the solution you'd like
So far we only support cgroup-based metrics for memory; we need to do the same for CPU.
Steps
- [ ] Define the related data model and intake. @elastic/apm-server your assistance with that, based on Metricbeat and/or stack monitoring, would be great
- [ ] Define how and when UI will be using those
- [ ] Extend the APM agents spec accordingly
Describe alternatives you've considered
Not supporting those and keeping the CPU metrics inaccurate 🙂
Additional context
<TBD - @elastic/apm-server if you can provide evidence of such inaccuracies observed in ESS, that would be great>
The libbeat/APM Server monitoring reports the following metrics:
- cgroup.cpu.id
- cgroup.cpu.cfs.period.us
- cgroup.cpu.cfs.quota.us
- cgroup.cpu.stats.periods
- cgroup.cpu.stats.throttled.periods
- cgroup.cpu.stats.throttled.ns
- cgroup.cpuacct.id
- cgroup.cpuacct.total.ns
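To illustrate why these fields matter for containerised services, here is a minimal sketch of how a consumer might derive a CPU limit, a throttling ratio, and a limit-relative utilisation from them. The function names are hypothetical and the interpretation (a non-positive quota means "no limit", utilisation is computed from two `cgroup.cpuacct.total.ns` samples) is an assumption, not something defined in this issue or in the agent spec.

```python
from typing import Optional


def cpu_limit_cores(quota_us: int, period_us: int) -> Optional[float]:
    """CPU limit in cores from cgroup.cpu.cfs.quota.us / cgroup.cpu.cfs.period.us.

    Assumption: a non-positive quota means no CFS limit is configured.
    """
    if quota_us <= 0 or period_us <= 0:
        return None
    return quota_us / period_us


def throttled_ratio(periods: int, throttled_periods: int) -> float:
    """Fraction of CFS periods in which the cgroup was throttled."""
    if periods <= 0:
        return 0.0
    return throttled_periods / periods


def cpu_utilization(prev_total_ns: int, now_total_ns: int,
                    interval_ns: int, limit_cores: float) -> float:
    """Utilisation relative to the cgroup limit between two samples of
    cgroup.cpuacct.total.ns taken interval_ns apart."""
    used_cores = (now_total_ns - prev_total_ns) / interval_ns
    return used_cores / limit_cores


if __name__ == "__main__":
    limit = cpu_limit_cores(50_000, 100_000)
    print("limit (cores):", limit)
    print("throttled ratio:", throttled_ratio(200, 50))
    print("utilisation:", cpu_utilization(0, 250_000_000, 1_000_000_000, limit))
```

With a quota of half a core, a container that burned 0.25 CPU-seconds in one second is at 50% of its limit, even though a host-relative figure on a many-core machine would look far lower.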
You can find the implementation for
- the metrics reporting, implemented in beats/pull#21113
- the cgroup.cpu metrics
- the cgroup.cpuacct metrics
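For readers unfamiliar with where the throttling counters come from, the following is a rough sketch of parsing a cgroup v1 `cpu.stat` file into the `cgroup.cpu.stats.*` names listed above. It is not the Beats implementation linked here; the function name and the mapping from kernel field names to metric names are assumptions made for illustration.

```python
def parse_cpu_stat(text: str) -> dict:
    """Parse the contents of a cgroup v1 cpu.stat file.

    The file holds "key value" pairs: nr_periods, nr_throttled and
    throttled_time (nanoseconds). The target metric names below are an
    assumed mapping, not taken from the Beats source.
    """
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines instead of failing.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        fields[parts[0]] = int(parts[1])
    return {
        "cgroup.cpu.stats.periods": fields.get("nr_periods", 0),
        "cgroup.cpu.stats.throttled.periods": fields.get("nr_throttled", 0),
        "cgroup.cpu.stats.throttled.ns": fields.get("throttled_time", 0),
    }


if __name__ == "__main__":
    sample = "nr_periods 200\nnr_throttled 50\nthrottled_time 1500000000\n"
    print(parse_cpu_stat(sample))
```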
APM Server issue for defining the Elasticsearch mapping: https://github.com/elastic/apm-server/issues/4433.
Hey folks -- apm-server is going to start indexing these fields with this PR: https://github.com/elastic/apm-server/pull/4956
are there any other cgroup cpu metrics that are going to be sent that aren't accounted for here?
@stuartnelson3 Thanks for picking this up!
> are there any other cgroup cpu metrics that are going to be sent that aren't accounted for here?
AFAIK there is no agent reference implementation yet, so I'm not sure anyone has really looked into this. Is there a plan in the server team to implement that in the Go agent?
There isn't currently an issue in the repo; maybe @simitt knows something, though.
@axw do you have an overview of when this is planned to be added?
@simitt no, sorry