kv: cluster unavailability in proximity to node drain
On the DRT cluster, we're seeing a brief period of complete unavailability around the time a node is drained. The recently introduced chaos script periodically drains, kills, or disk-stalls nodes and then recovers them. In a recent run, we saw half of all ranges become unavailable:
At the same time, the foreground workload drops to zero momentarily:
This is being discussed further here.