Feature Request Inquiry: Add outage threshold

Open · csp197 opened this issue 4 years ago · 1 comment

Hello!

Currently, if there are 10 replicas on the Vigil status page and 3 of them are dead, Vigil declares a "Partial Service Outage". I would like to ask whether an outage threshold could be implemented: the minimum ratio of dead replicas needed before Vigil declares a "Service Outage".

This value could be set in the config.cfg file:

outage_threshold = 0.5 # or 50?

This value would represent the minimum ratio of dead replicas to total replicas needed to declare a state of "Service Outage".
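
To make the proposal concrete, here is a minimal sketch of the check in Python. It is not Vigil's actual code: the function name `classify` and the "Healthy" label are assumptions made for illustration, and only `outage_threshold` and the two outage labels come from this issue.

```python
def classify(dead: int, total: int, outage_threshold: float = 0.5) -> str:
    """Hypothetical helper: map replica counts to a status label.

    `outage_threshold` is the proposed minimum ratio of dead replicas
    to total replicas needed to declare a "Service Outage".
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= dead <= total:
        raise ValueError("dead must be between 0 and total")

    if dead == 0:
        return "Healthy"  # assumed label, not taken from the issue

    # Ratio of dead replicas to total replicas, as described above.
    ratio = dead / total
    if ratio >= outage_threshold:
        return "Service Outage"
    return "Partial Service Outage"
```

With the example from this issue, `classify(3, 10)` stays at "Partial Service Outage" because 0.3 is below the 0.5 threshold, while `classify(5, 10)` would be reported as "Service Outage".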

csp197 · Feb 18 '21 22:02

Thanks, nice idea! I would also suggest letting some nodes carry a stronger "downtime" weight than others, e.g. database servers or any single point of failure (SPOF) node.
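
One possible reading of the weighting idea, sketched in Python under stated assumptions: each node gets a numeric weight, and the ratio compared against the threshold is the dead weight divided by the total weight. The function name `weighted_dead_ratio` and the `(weight, is_dead)` tuple layout are hypothetical and are not part of Vigil.

```python
from typing import Iterable, Tuple


def weighted_dead_ratio(nodes: Iterable[Tuple[float, bool]]) -> float:
    """Hypothetical helper: share of total weight held by dead nodes.

    Each item is (weight, is_dead). A heavier weight makes a node's
    downtime count for more, e.g. a database server or a SPOF node.
    """
    nodes = list(nodes)
    total_weight = sum(weight for weight, _ in nodes)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")

    dead_weight = sum(weight for weight, is_dead in nodes if is_dead)
    return dead_weight / total_weight


# Nine healthy replicas of weight 1 plus one dead database of weight 9:
# the single dead node accounts for 9 / 18 of the total weight.
ratio = weighted_dead_ratio([(1.0, False)] * 9 + [(9.0, True)])
```

With equal weights this reduces to the plain dead-to-total replica ratio, so the same threshold value could be applied in both cases.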

valeriansaliou · Feb 19 '21 08:02