vigil
Feature Request Inquiry: Add outage threshold
Hello!
Currently, if there are 10 replicas on the Vigil status page and 3 of them are dead, Vigil declares a "Partial Service Outage". I would like to ask whether an outage threshold could be implemented: the minimum ratio of dead replicas required before a "Service Outage" is declared.
This value could be set in the config.cfg file:
outage_threshold = 0.5 # or 50?
This value would represent the minimum ratio of dead replicas to total replicas needed to declare a state of "Service Outage".
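To make the rule concrete, here is a rough sketch of the logic I have in mind. It is written in Python purely for illustration and is not based on Vigil's actual code: the `classify` function and the "Healthy" label are placeholders I made up, and I am assuming the threshold is expressed as a ratio between 0 and 1 and that reaching it exactly counts as an outage.

```python
def classify(dead: int, total: int, outage_threshold: float = 0.5) -> str:
    """Return a status label from the ratio of dead replicas (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= dead <= total:
        raise ValueError("dead must be between 0 and total")
    if dead == 0:
        return "Healthy"  # placeholder label, not taken from Vigil
    # Assumption: the threshold is a ratio in [0, 1], compared inclusively.
    if dead / total >= outage_threshold:
        return "Service Outage"
    return "Partial Service Outage"
```

With a threshold of 0.5, 3 dead replicas out of 10 would still be reported as a "Partial Service Outage", while 5 out of 10 would be reported as a "Service Outage".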
Thanks, nice idea! I could also suggest that some nodes carry a stronger "downtime" weight than others, e.g. database servers, or any SPOF node.
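A minimal sketch of how such weights might feed into a ratio, assuming each node is given a numeric weight in the configuration. The function name and the (weight, is_dead) pairs are hypothetical and only illustrate the idea; nothing like this exists in Vigil today.

```python
from typing import Iterable, Tuple


def weighted_dead_ratio(nodes: Iterable[Tuple[float, bool]]) -> float:
    """Return dead weight divided by total weight for (weight, is_dead) pairs."""
    nodes = list(nodes)
    total = sum(weight for weight, _ in nodes)
    if total <= 0:
        raise ValueError("total weight must be positive")
    dead = sum(weight for weight, is_dead in nodes if is_dead)
    return dead / total
```

For example, one dead database node with weight 5 alongside five healthy replicas with weight 1 each gives a ratio of 0.5, even though only 1 of the 6 nodes is down. That ratio could then be compared against the proposed threshold.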