opendal icon indicating copy to clipboard operation
opendal copied to clipboard

Unable to reuse the PrometheusLayer with the same registry

Open evenyag opened this issue 1 year ago • 4 comments

As mentioned in https://github.com/GreptimeTeam/greptimedb/pull/4277#issuecomment-2208390871, the PrometheusLayer will create a new PrometheusMetrics each time we register it to an OperatorBuilder. https://github.com/apache/opendal/blob/174bda53f79123cd114d2409189423a0a4cf6bf3/core/src/layers/prometheus.rs#L154-L163

However, the registry panics if we create PrometheusMetrics with the same registry more than once. https://github.com/apache/opendal/blob/174bda53f79123cd114d2409189423a0a4cf6bf3/core/src/layers/prometheus.rs#L181-L198

So users are unable to create multiple operators with the same layer that collects metrics to the default global registry of rust-prometheus.

One approach to work around this without breaking change is to allow users to set the PrometheusMetrics.

pub struct PrometheusLayer {
    registry: Registry,
    requests_duration_seconds_buckets: Vec<f64>,
    bytes_total_buckets: Vec<f64>,
    path_label_level: usize,
    metrics: Option<Arc<PrometheusMetrics>>,
}

If users set the metrics explicitly, then the layer uses the provided metrics. So we can pre-build the PrometheusMetrics and reuse it without constructing a new one each time.

evenyag avatar Jul 04 '24 09:07 evenyag

I think we can just remove this unwrap https://github.com/apache/opendal/blob/174bda53f79123cd114d2409189423a0a4cf6bf3/core/src/layers/prometheus.rs#L199C9-L199C19 by checking the error type: https://github.com/tikv/rust-prometheus/blob/master/src/registry.rs#L48 and ignore this AlreadyReg

sunng87 avatar Jul 04 '24 13:07 sunng87

I think we can just remove this unwrap https://github.com/apache/opendal/blob/174bda53f79123cd114d2409189423a0a4cf6bf3/core/src/layers/prometheus.rs#L199C9-L199C19 by checking the error type: https://github.com/tikv/rust-prometheus/blob/master/src/registry.rs#L48 and ignore this AlreadyReg

Thanks, I think this can be addressed in a separate PR. Would you like to help with that?

Xuanwo avatar Jul 04 '24 13:07 Xuanwo

Oh, I just see why it's impossible because we need that register call to return the handle. If there is an error we won't be able to get the handle. We need to go upstream for this.

sunng87 avatar Jul 04 '24 13:07 sunng87

Oh, I just see why it's impossible because we need that register call to return the handle. If there is an error we won't be able to get the handle. We need to go upstream for this.

We need a way to get the registered metric https://github.com/tikv/rust-prometheus/issues/248

evenyag avatar Jul 04 '24 13:07 evenyag

After our refactor, this should now be possible. CC @evenyag and @sunng87, would you like to evaluate it again?

Xuanwo avatar Sep 09 '24 14:09 Xuanwo

Thank you for notification. We will switch to upstream implementation during next upgrade

sunng87 avatar Sep 09 '24 19:09 sunng87