[FAQ]: Single replica failure mode
Are you reporting an issue with existing content?
This would be an omission in the FAQ.
Are you proposing new content, or a change to the existing documentation layout or structure?
I'd propose a new section be added outlining what one should expect after setting the replication count to 1 for a volume. The storage class parameters page says: "If set to one then the volume does not tolerate any node failure." but that doesn't really describe what will happen. If the node hosting the volume were to go offline, the disk would (obviously) go offline too, but:
- Would another node in the cluster create a new, empty volume that would then be "usable"?
- If the original node came back online, would the (now stale) volume be wiped?
- Or is the volume simply unavailable for as long as its host node is offline, and does that cause references to the volume to be removed from the cluster?
- If/when the node comes back online, does the volume reappear and everything continue as though nothing had happened?
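For context, single-replica volumes come from the `repl` storage class parameter. A minimal sketch of such a class, following the pattern on the storage class parameters page (the class name here is illustrative, and the provisioner string should be checked against your install):

```shell
# Define a StorageClass whose volumes carry exactly one replica.
# With repl: "1" the data lives on a single node, so losing that node
# means losing access to (and potentially the contents of) the volume.
kubectl apply -f - <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: mayastor-single-replica   # illustrative name
parameters:
  repl: "1"
  protocol: "nvmf"
provisioner: io.openebs.csi-mayastor   # verify against your deployment
EOF
```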
I'm sure a single-replica volume isn't recommended because of the potential for data loss, but there are certain use cases where data loss might be okay, and I'm curious about the ramifications of a node failure.
Tested this out on my cluster:
- Drained node, rebooted node
- Volume goes to `Unknown` state
- No new volume on another node is created
- Pod using volume crashes due to IO error
- PV stays around
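The test amounted to roughly the following; the commands are illustrative, and the volume inspection command assumes the Mayastor kubectl plugin is installed:

```shell
# Drain the node hosting the single replica, then reboot it.
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# While the node is down, watch the volume and its consumers.
kubectl mayastor get volumes   # volume state shows Unknown
kubectl get pv                 # the PV is not deleted
kubectl get pods -w            # the consuming pod crashes with IO errors
```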
Once the node came back online, our pod remained in an error state and had to be force-terminated. The volume didn't seem to want to re-mount, and at that point we aborted the test and cleaned things up.
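(The force termination was just the standard kubectl override; the pod name is a placeholder:)

```shell
# Skip the graceful shutdown and remove the stuck pod object immediately.
kubectl delete pod <pod-name> --grace-period=0 --force
```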
Did you manage to collect any logs from the CSI node plugin?
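(i.e., something along these lines; the namespace, label, and container name are assumptions based on a typical Mayastor install, so adjust them to match your deployment:)

```shell
# Pull recent logs from the CSI node plugin pods.
kubectl logs -n mayastor -l app=mayastor-csi -c mayastor-csi --tail=500
```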
I have logs, but I don't think it was a Mayastor issue; I think Kubernetes was just freaking out.
@datacore-tilangovan can you please have this FAQ entry added to the documentation, describing the current behavior?
@datacore-tilangovan has this been done, or is it still pending?