go-dqlite

Dqlite client's cluster membership (cluster.yaml) is out of sync (by 1 second) with Dqlite

Open louiseschmidtgen opened this issue 5 months ago • 0 comments

During Canonical Kubernetes clustering tests, I discovered that the Dqlite client's membership file (cluster.yaml) is only updated once per second. This leaves a race window during membership operations: a snap refresh, node restart, or crash inside that window prevents the new membership configuration from being written to the client, even after the join has been successfully committed to the Dqlite write-ahead log (WAL).

The scenario below shows how we hit this race window: the node was successfully added in Dqlite, but Node A was refreshed before it updated its Dqlite client (cluster.yaml). After the refresh, Node A's client knows only about itself and cannot find the leader, because the leader is not in its membership file. Restarting Node B forced Node A's Dqlite client to sync, which recovered us from this state.

Image
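
To make the window concrete, here is a minimal, stdlib-only Go sketch of the race. It does not use the go-dqlite API: `Cluster`, `ClientStore`, `JoinThenRestart`, and the addresses are all hypothetical names invented for illustration. `Cluster` stands in for the membership committed in Dqlite, `ClientStore` for the persisted cluster.yaml, and the `syncBeforeRestart` flag models whether the periodic sync fired before the node was refreshed.

```go
package main

// Node is a hypothetical stand-in for one cluster member.
type Node struct {
	ID      uint64
	Address string
}

// Cluster models the membership committed in Dqlite (the source of truth).
type Cluster struct {
	members []Node
}

// Join commits a new member to the authoritative configuration.
func (c *Cluster) Join(n Node) {
	c.members = append(c.members, n)
}

// Members returns a copy of the committed membership.
func (c *Cluster) Members() []Node {
	return append([]Node(nil), c.members...)
}

// ClientStore models the persisted client view (cluster.yaml in the issue).
// Only what has been synced to onDisk survives a restart.
type ClientStore struct {
	onDisk []Node
}

// Sync copies the committed membership into the persisted view.
// In the issue this happens on a periodic tick, not at join time.
func (s *ClientStore) Sync(c *Cluster) {
	s.onDisk = c.Members()
}

// Knows reports whether the persisted view contains the given address.
func (s *ClientStore) Knows(addr string) bool {
	for _, n := range s.onDisk {
		if n.Address == addr {
			return true
		}
	}
	return false
}

// JoinThenRestart simulates Node B joining, followed by a restart of Node A.
// When syncBeforeRestart is false, the restart lands inside the race window.
// It returns whether Node A's persisted view knows about Node B afterwards.
func JoinThenRestart(syncBeforeRestart bool) bool {
	a := Node{ID: 1, Address: "10.0.0.1:9001"} // hypothetical addresses
	b := Node{ID: 2, Address: "10.0.0.2:9001"}

	cluster := &Cluster{}
	cluster.Join(a)
	store := &ClientStore{}
	store.Sync(cluster) // bootstrap: Node A knows only itself

	cluster.Join(b) // the join is committed on the Dqlite side
	if syncBeforeRestart {
		store.Sync(cluster) // the periodic tick fired in time
	}

	// Restart of Node A: in-memory state is gone, only onDisk remains.
	return store.Knows(b.Address)
}
```

In this model, `JoinThenRestart(false)` leaves Node A with a view that contains only itself, which mirrors the stuck state described above, while `JoinThenRestart(true)` corresponds to the case where the sync completed before the refresh.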

louiseschmidtgen avatar Aug 01 '25 06:08 louiseschmidtgen