
Replace metrics gathering infrastructure with something more flexible

Open Qard opened this issue 6 years ago • 3 comments

The current metrics-gathering components, built on measured, lack flexibility in a few ways that make them painful for us:

  • lack of proper support for async gathering makes some things awkward and hacky
  • bulk gathering of multiple metrics from one async source is extra awkward
  • little global control over what the registry tracks, so limits are hard to apply and overhead is hard to bound
  • reporting has too much indirection around it, making it difficult to reason about and control reliably
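To make the first three points concrete, here is a minimal sketch of what async, bulk-friendly gathering with a global limit could look like. Everything in it is hypothetical: `MetricCollector`, `addSource`, `collect`, and the `limit` option are illustrative names only and do not exist in the agent or in measured.

```javascript
// Hypothetical sketch: none of these names exist in apm-agent-nodejs or measured.
class MetricCollector {
  constructor({ limit = 1000 } = {}) {
    this.limit = limit // global cap on distinct metric names per snapshot
    this.sources = []
  }

  // A source is a function (sync or async) that yields an object of
  // { metricName: number }, so one source can report many metrics at once.
  addSource(gather) {
    this.sources.push(gather)
  }

  // Run all sources concurrently and merge their results into one snapshot.
  async collect() {
    const results = await Promise.allSettled(
      this.sources.map(async (gather) => gather())
    )
    const snapshot = new Map()
    for (const result of results) {
      // A failing source is skipped so it cannot break the others.
      if (result.status !== 'fulfilled') continue
      for (const [name, value] of Object.entries(result.value)) {
        // Enforce the global limit: new names past the cap are ignored.
        if (!snapshot.has(name) && snapshot.size >= this.limit) continue
        snapshot.set(name, value)
      }
    }
    return snapshot
  }
}
```

With this shape, reporting would be a single awaited `collect()` call per interval, which keeps the control flow from source to report short and easy to follow.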

I think we should start by discussing our needs for metrics gathering and see if we can come up with a high-level description of how the data model and control flow need to fit together, for our use-case.

Qard • Aug 09 '19 09:08

https://github.com/elastic/apm-agent-nodejs/pull/1273 added an option to limit the number of recorded metrics. When the limit is reached, the oldest metric is removed. This is fine (although divergent from the breakdown metrics spec), but we should at least record a log message when this occurs (see the spec again). So we can put this down as one of the needs.
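A minimal sketch of the behavior described here (evict the oldest metric at the limit and log when that happens). This is not the code from the PR; `BoundedRegistry` and its `log` parameter are hypothetical names used only for illustration, and the sketch assumes a limit of at least 1.

```javascript
// Hypothetical sketch; not the implementation from the PR. Assumes limit >= 1.
class BoundedRegistry {
  constructor(limit, log = console.warn) {
    this.limit = limit
    this.log = log
    // A Map iterates in insertion order, so its first key is the oldest metric.
    this.metrics = new Map()
  }

  set(name, value) {
    // Only a new name can push the registry over the limit.
    if (!this.metrics.has(name) && this.metrics.size >= this.limit) {
      const oldest = this.metrics.keys().next().value
      this.metrics.delete(oldest)
      // Record the eviction so dropped metrics are visible to the operator.
      this.log(`metric limit of ${this.limit} reached; dropped oldest metric "${oldest}"`)
    }
    this.metrics.set(name, value)
  }
}
```

Updating an already-registered metric does not count against the limit, so only a genuinely new name triggers an eviction and a log line.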

axw • Aug 19 '19 04:08

This really deserves some attention. I'm currently migrating our custom metrics collector from a competing APM solution and the lack of async gathering is very limiting.

MartinKolarik • Dec 11 '24 17:12

Another problem is that the callback must always return a number, otherwise the agent throws. The agent should be able to handle missing values by not reporting them.

UPDATE: ~~although not documented, it seems returning NaN works fine.~~ UPDATE 2: No. Returning NaN doesn't cause any errors in the agent, but the server then drops the whole set of metrics.
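To illustrate what "not reporting them" could mean, here is a minimal sketch that filters out missing or non-finite samples before a set of metrics is sent, so one bad value does not cost the whole set. `dropMissingSamples` is a hypothetical helper, not something the agent provides, and the flat `{ name: value }` shape is an assumption.

```javascript
// Hypothetical helper, not part of the agent: keep only samples that are
// finite numbers, so NaN, Infinity, null, and undefined are never reported.
function dropMissingSamples(samples) {
  const kept = {}
  for (const [name, value] of Object.entries(samples)) {
    if (typeof value === 'number' && Number.isFinite(value)) {
      kept[name] = value
    }
  }
  return kept
}
```

A callback could then signal "no value right now" by returning `undefined` or `NaN`, and only the remaining samples would be serialized.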

MartinKolarik • Dec 13 '24 13:12