Feature Request: Add Prometheus metrics for target-loader health (e.g. Consul connectivity)

Open soggycactus opened this issue 4 months ago • 4 comments

gNMIc currently exports metrics for targets (gnmic_target_up) and subscriptions, but not for target loaders (Consul, http, file, etc). If Consul is unreachable, gNMIc logs errors, but there’s no Prometheus metric to alert on. Consul itself can be healthy and online yet unreachable from gNMIc because of a network partition, so it would be great to be able to trigger an alert whenever gNMIc is unable to discover targets.

I want to propose we add loader-level metrics such as:

gnmic_loader_up{loader="consul"} — 1 if loader healthy, 0 if not.
gnmic_loader_errors_total{loader="consul"} — counter of loader failures.
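
To make the intended semantics of the two proposed metrics concrete, here is a minimal, dependency-free sketch in Go. It is not gNMIc code: `HealthMetrics`, `Observe`, `Render` and `ErrUnreachable` are hypothetical names, and a real change would presumably register a gauge and a counter with the Prometheus Go client rather than render text by hand. The sketch only shows that the gauge reflects the most recent check while the counter accumulates failures.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// ErrUnreachable is a hypothetical sentinel for a failed loader health check.
var ErrUnreachable = errors.New("loader unreachable")

// HealthMetrics is a hypothetical stand-in for a Prometheus gauge/counter
// pair keyed by component name (for example "consul").
type HealthMetrics struct {
	mu     sync.Mutex
	prefix string            // metric name prefix, e.g. "gnmic_loader"
	label  string            // label key, e.g. "loader"
	up     map[string]int    // 1 if the last check succeeded, 0 otherwise
	errs   map[string]uint64 // cumulative number of failed checks
}

// NewHealthMetrics returns an empty metric set for the given prefix and label key.
func NewHealthMetrics(prefix, label string) *HealthMetrics {
	return &HealthMetrics{
		prefix: prefix,
		label:  label,
		up:     map[string]int{},
		errs:   map[string]uint64{},
	}
}

// Observe records the outcome of one health check for the named component.
// A nil error sets the gauge to 1; any error sets it to 0 and bumps the counter.
func (h *HealthMetrics) Observe(name string, err error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if err != nil {
		h.up[name] = 0
		h.errs[name]++
		return
	}
	h.up[name] = 1
}

// Render writes the current values in Prometheus text exposition style,
// sorted by component name so the output is deterministic.
func (h *HealthMetrics) Render() string {
	h.mu.Lock()
	defer h.mu.Unlock()

	names := make([]string, 0, len(h.up))
	for n := range h.up {
		names = append(names, n)
	}
	sort.Strings(names)

	var b strings.Builder
	fmt.Fprintf(&b, "# TYPE %s_up gauge\n", h.prefix)
	for _, n := range names {
		fmt.Fprintf(&b, "%s_up{%s=%q} %d\n", h.prefix, h.label, n, h.up[n])
	}
	fmt.Fprintf(&b, "# TYPE %s_errors_total counter\n", h.prefix)
	for _, n := range names {
		fmt.Fprintf(&b, "%s_errors_total{%s=%q} %d\n", h.prefix, h.label, n, h.errs[n])
	}
	return b.String()
}
```

Calling `Observe("consul", nil)` followed by `Observe("consul", ErrUnreachable)` on `NewHealthMetrics("gnmic_loader", "loader")` leaves the gauge at 0 and the counter at 1. An alert on the gauge catches an ongoing outage, while the rate of the counter catches flapping that a gauge sampled at scrape time could miss.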

I'd be happy to submit a PR to implement this if you think it's a good idea.

soggycactus avatar Sep 11 '25 01:09 soggycactus

Ahh I see now there are Consul-specific metrics that can be exported if I enable metrics in the loader configuration. In our use case we are using Consul for both locking and loading - so perhaps it might be nice to add metrics to the Consul locker?

soggycactus avatar Sep 11 '25 05:09 soggycactus

Sure, what kind of metrics are you thinking about? Maybe just a gnmic_cluster_locker_up{locker="consul"}
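
One way the value of a gauge like gnmic_cluster_locker_up{locker="consul"} could be derived is sketched below, under the assumption that the locker is probed with a deadline so that a network partition (a call that hangs rather than fails) still reads as down. `ProbeUp`, `LockerUp`, `probeTimeout` and the three stand-in checks are hypothetical; nothing here reflects how gNMIc or its Consul locker actually performs health checks.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// probeTimeout is an assumed deadline for a single health probe.
const probeTimeout = 50 * time.Millisecond

// ProbeUp runs check with a deadline and maps the outcome to a gauge value:
// 1 if check returns nil in time, 0 if it fails or the deadline expires.
func ProbeUp(ctx context.Context, timeout time.Duration, check func(context.Context) error) float64 {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	// Buffered so the goroutine can finish even if nobody reads the result.
	done := make(chan error, 1)
	go func() { done <- check(ctx) }()

	select {
	case err := <-done:
		if err != nil {
			return 0
		}
		return 1
	case <-ctx.Done():
		return 0
	}
}

// LockerUp probes a locker with the default deadline.
func LockerUp(check func(context.Context) error) float64 {
	return ProbeUp(context.Background(), probeTimeout, check)
}

// The checks below are hypothetical stand-ins for a real call to the locker.

// reachable simulates a locker that answers immediately.
func reachable(context.Context) error { return nil }

// refused simulates a locker that fails fast.
func refused(context.Context) error { return errors.New("connection refused") }

// partitioned simulates a call that hangs until the deadline cancels it.
func partitioned(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}
```

With these stand-ins, `LockerUp(reachable)` yields 1, while both `LockerUp(refused)` and `LockerUp(partitioned)` yield 0. Treating a timeout the same as an error is a design choice: it keeps the gauge binary, at the cost of not distinguishing a slow locker from an unreachable one.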

karimra avatar Sep 12 '25 05:09 karimra

@karimra gnmic_cluster_locker_up{locker="consul"} sounds good! Happy to PR if needed

soggycactus avatar Sep 12 '25 16:09 soggycactus

Go for it, thanks!

karimra avatar Sep 12 '25 17:09 karimra