Tracking: OpenTelemetry for metrics and/or tracing
Right now we use Prometheus for metric capture and exposure.
OpenTelemetry is a project to watch as an OSS standard for both metrics and tracing data, with the ability to send that data to a variety of backends.
The project recently reached 1.0 on the tracing specification, but has yet to reach a stable specification for metrics.
Some resources to dig into further: OpenTelemetry Metrics Roadmap, Mar 2021 OpenTelemetry Projects, with latest timelines
Highlighting a few timeline points from the above article that may be most relevant (I'm sure they will be subject to change):
- 31 May, 2021 - Release an “Experimental” metrics API/SDK specification which we can recommend to language client owners to implement a metrics preview release - At this point we may want to review it and/or provide feedback if we see any blocking issues.
- 30 Sept, 2021 - Metrics API/SDK specification reaches “Feature-freeze” - this seems like a good point to start exploring a cutover from Prometheus to OpenTelemetry, and to assess whether the benefits outweigh the costs.
- 30 Nov, 2021 - Metrics API/SDK specification reaches “Stable”. Together with the stable version of the specification, we should expect release candidates from multiple language clients, similar to what we had for tracing.
Recent profiling has shown that the current Prometheus metrics are a bit expensive right now (~24 ms), so if we moved to OpenMetrics we could see a significant reduction in that time.
https://github.com/prometheus/client_rust
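To compare the ~24 ms figure before and after any change, it may help to have a small, repeatable timing harness around the metrics rendering step. The sketch below is only an illustration using the Rust standard library: `Counter`, `render`, and `mean_duration` are hypothetical names, not the API of the `prometheus` crate, `client_rust`, or OpenTelemetry, and the hand-written encoder is a stand-in for whichever real encoder is being measured.

```rust
use std::fmt::Write;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a real metric; not the API of any client library.
pub struct Counter {
    name: &'static str,
    help: &'static str,
    value: AtomicU64,
}

impl Counter {
    pub fn new(name: &'static str, help: &'static str) -> Self {
        Self { name, help, value: AtomicU64::new(0) }
    }

    /// Increment the counter by one.
    pub fn inc(&self) {
        self.value.fetch_add(1, Ordering::Relaxed);
    }
}

/// Render counters in the Prometheus text exposition format.
/// In a real comparison this would be replaced by the encoder under test.
pub fn render(counters: &[Counter]) -> String {
    let mut out = String::new();
    for c in counters {
        // Writing to a String cannot fail, so the results are ignored.
        let _ = writeln!(out, "# HELP {} {}", c.name, c.help);
        let _ = writeln!(out, "# TYPE {} counter", c.name);
        let _ = writeln!(out, "{} {}", c.name, c.value.load(Ordering::Relaxed));
    }
    out
}

/// Run `f` `iterations` times and return the mean wall-clock time per call.
pub fn mean_duration<T>(iterations: u32, mut f: impl FnMut() -> T) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the optimizer from discarding the unused result.
        std::hint::black_box(f());
    }
    start.elapsed() / iterations
}
```

Calling something like `mean_duration(1_000, || render(&counters))` with the current encoder, and again with a candidate replacement, would give two comparable numbers. A mean over many iterations is used here because a single call is too noisy to compare against a figure in the tens of milliseconds.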