Support fail-close mode in Lua healthcheck functions
- Program: Authoritative
- Issue type: Feature request
Short description
Allow the Lua healthcheck helpers (ifportup, ifurlup, ifurlextup) to support a fail-close mode via the backupSelector parameter, so that no records are returned when all targets are down. This would be in addition to the currently supported backup strategies.
Usecase
Some deployments require consistency during a complete outage of a backend set. Returning any "fallback" addresses when all checked endpoints are unhealthy can lead to errors (e.g., writes to unhealthy shards, or traffic bypassing a maintenance window). In these environments, operators want the name to resolve to nothing when every health check fails, so upstream resolvers and clients can fail fast and honor negative caching.
Description
Currently, when every checked endpoint fails, backupSelector always returns at least one address (pickclosest, random and hashed) or the entire set (all). This request is to add an extra mode that returns no RRs, so that the query is answered with a NODATA response.
Example:
ifurlup(
  { "https://node1/health", "https://node2/health" },
  { backupSelector = "empty" }
)
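To make the requested semantics concrete, here is a minimal model of the selection logic, written in Python purely for illustration. It is not PowerDNS code: the function name `pick`, the `client_key` parameter and the CRC32-based hashing are assumptions made for this sketch, the value "empty" is the proposed mode rather than an existing one, and `addresses` is assumed to be non-empty. Normal selection and pickclosest are both simplified to a random pick, since geo lookups are not modeled.

```python
import random
import zlib


def pick(addresses, healthy, backup_selector="random", client_key="192.0.2.1"):
    """Model which addresses a healthcheck helper would answer with.

    addresses: non-empty list of candidate addresses (assumption).
    healthy: set of addresses whose health check currently passes.
    backup_selector: strategy used only when no address is healthy.
    client_key: hypothetical stand-in for the client identity used by "hashed".
    """
    up = [a for a in addresses if a in healthy]
    if up:
        # At least one target is up: normal selection (simplified to random).
        return [random.choice(up)]

    # Everything is down: the backup selector decides what is returned.
    if backup_selector == "empty":
        # Proposed fail-close mode: no RRs, so the answer is NODATA.
        return []
    if backup_selector == "all":
        return list(addresses)
    if backup_selector == "hashed":
        # Deterministic pick per client; CRC32 is an arbitrary choice here.
        index = zlib.crc32(client_key.encode()) % len(addresses)
        return [addresses[index]]
    # "random" and "pickclosest" both yield exactly one address in this model.
    return [random.choice(addresses)]
```

With every target down, `pick(["192.0.2.1", "192.0.2.2"], set(), "empty")` returns an empty list, while every other backup strategy returns one or more addresses. As long as at least one target is healthy, the backup selector has no effect.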
I agree that always returning an answer even when none of the tested addresses are up can be quite confusing.
This would make it easier for us to work around the undesired behavior of backupSelector we explained in https://github.com/PowerDNS/pdns/issues/12789. We're currently working around it by checking whether the number of returned results is larger than the size of any individual array of input IPs, which indicates that backupSelector='all' was applied.