
[CRASH] Killing a replica during multi-master replication might cause other replicas to die

Open vmax opened this issue 3 years ago • 3 comments

Crash report

Killing a replica during multi-master replication might cause other replica(s) to crash. For the attached logs, I repeatedly killed replica0 until replica1 crashed:

keydb0.log keydb1.log keydb2.log

Additional information

  1. OS distribution and version

I'm using the eqalpha/keydb:x86_64_v6.3.1 image from Docker Hub.

  2. Steps to reproduce (if any)
     1. Create a 3-replica deployment with multi-master replication: kubectl apply -f keydb.yml; manifest is attached: keydb.yml.txt
     2. Monitor logs of all replicas (in separate terminals): kubectl logs -f my-keydb-X where X is 0..2
     3. Kill a random replica: kubectl delete pod/my-keydb-X where X is 0..2
     4. Observe other replicas crashing
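
For anyone who wants to script the steps above rather than run them by hand, here is a minimal sketch in Python. It only builds and runs the same kubectl commands quoted in the steps. The pod names (my-keydb-0 to my-keydb-2) come from the report; the helper names and the injectable `runner` parameter are my own assumptions, added so the helper can be dry-run without a cluster.

```python
import random
import subprocess

# Names taken from the report: pods are my-keydb-0 .. my-keydb-2.
STATEFULSET = "my-keydb"
REPLICAS = 3


def pod_name(index):
    """Return the pod name for a replica index, e.g. my-keydb-1."""
    return f"{STATEFULSET}-{index}"


def logs_command(index):
    """Step 2: command that follows the logs of one replica."""
    return ["kubectl", "logs", "-f", pod_name(index)]


def delete_command(index):
    """Step 3: command that kills one replica."""
    return ["kubectl", "delete", f"pod/{pod_name(index)}"]


def kill_random_replica(runner=subprocess.run, rng=random):
    """Delete one randomly chosen replica and return its index.

    `runner` is a hypothetical injection point (not part of kubectl or
    KeyDB) so the helper can be exercised without touching a cluster.
    """
    index = rng.randrange(REPLICAS)
    runner(delete_command(index), check=True)
    return index
```

Calling kill_random_replica() repeatedly, with a pause long enough for the deleted pod to come back and resync, should approximate the manual procedure. How long that pause needs to be is not stated in the report, so it is left to the caller.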

vmax avatar Jul 29 '22 15:07 vmax

I've managed to find a few other bugs while trying to repro this, but no luck on your specific assert yet. I noticed our active-rep tests that randomly restart servers are limited to two servers so this may be why.

JohnSully avatar Aug 01 '22 07:08 JohnSully

Any news on this? We are experiencing the exact same issue and can reliably reproduce it with a few GB of data in the cluster.

svenhakvoort avatar Mar 16 '23 08:03 svenhakvoort

I can also reproduce it, running a 3-node KeyDB install in a multi-master setup orchestrated by Nomad. The issue is triggered even with a totally empty database.

dani avatar Jun 01 '23 21:06 dani