Deflake total sum of full synchronizations

Open naglera opened this issue 1 year ago • 3 comments

Stabilize test PSYNC2: total sum of full synchronizations at least 4

Explanation: During the previous test, "PSYNC2: generate load while killing replication links", load is generated on the master while the replica's connection is repeatedly terminated. On a busy machine, this load can force a replica to perform a full sync, because there is no guarantee that the replicas will find the necessary bytes in the COB (client output buffer).
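To illustrate the failure mode, here is a minimal sketch of the idea, not Valkey's actual code. It assumes a simplified model in which the primary keeps only the most recent `size` bytes of the replication stream, and a reconnecting replica can resume only if the next byte it needs is still retained. The names `Backlog`, `feed`, `can_partial_resync` and `count_full_syncs` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Backlog:
    """Toy model of the bytes a primary retains for partial resync.

    Hypothetical structure for illustration; not Valkey's implementation.
    """
    size: int            # maximum number of bytes retained
    end_offset: int = 0  # replication offset of the last byte written
    histlen: int = 0     # number of bytes currently retained

    def feed(self, nbytes: int) -> None:
        """Record nbytes of new writes, discarding the oldest data if full."""
        self.end_offset += nbytes
        self.histlen = min(self.size, self.histlen + nbytes)

    @property
    def start_offset(self) -> int:
        """Replication offset of the oldest byte still retained."""
        return self.end_offset - self.histlen + 1

    def can_partial_resync(self, psync_offset: int) -> bool:
        """True if the byte the replica asks for next is still available."""
        return self.start_offset <= psync_offset <= self.end_offset + 1


def count_full_syncs(backlog_size: int, bytes_written_while_down: list[int]) -> int:
    """Count full syncs for one replica whose link is killed repeatedly.

    Each list entry is the number of bytes written to the primary while
    the replica was disconnected (an assumed workload, not measured data).
    """
    backlog = Backlog(size=backlog_size)
    replica_offset = 0
    full_syncs = 0
    for nbytes in bytes_written_while_down:
        backlog.feed(nbytes)
        # The replica asks for the byte right after the last one it received.
        if not backlog.can_partial_resync(replica_offset + 1):
            full_syncs += 1
        # After either kind of sync the replica is caught up again.
        replica_offset = backlog.end_offset
    return full_syncs
```

In this model, `count_full_syncs(100, [10, 50, 100])` needs no full sync, while a single burst larger than the retained window is enough to force one. This is why the number of full syncs in the following test can exceed the minimum when the machine is busy.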

naglera avatar Jun 10 '24 17:06 naglera

Have you seen this fail recently in Daily? If yes, then this fix is good to include in 8.0.

zuiderkwast avatar Aug 31 '24 10:08 zuiderkwast

I haven't seen this test fail in a long, long time. (It was hardened a long time ago by me and oran)

enjoy-binbin avatar Aug 31 '24 11:08 enjoy-binbin

I don't recall if I saw this specific test failing locally or in a PR GitHub workflow. However, the scenario where the primary does not have the necessary replication data for PSYNC is a valid possibility, especially under high load and when the connection between the primary and the replica is killed frequently. Do we have a reason to believe that the replica will never stay disconnected long enough for this to happen in this scenario?

naglera avatar Sep 01 '24 14:09 naglera

I've never seen this fail, so I think we can leave this as is.

madolson avatar Sep 05 '25 18:09 madolson