seaweedfs-operator Discussion: Handling adding/removing volume servers

Users commonly want to add or remove volume servers on demand. Currently, the operator only changes the replica count of the StatefulSet, but several problems need to be addressed:

It is documented that

Adding/Removing servers does not cause any data re-balancing.

So I think we should send an admin command to the master server to rebalance data for better performance.

Removing servers may lose volume replicas and render some virtual volumes read-only.

In this situation, we need to fix the replication. A safer way to remove a volume server is to first add an extra replica, on another server, of each virtual volume it hosts, and only then take the volume server down. In addition, when a user wants to remove more than one volume server at once, it is better to remove them one at a time and stop the removal process if the cluster becomes unhealthy.
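The one-at-a-time removal described above could be sketched roughly as follows. This is only an illustration: the `Cluster` interface, its methods, and `fakeCluster` are hypothetical names invented for this sketch, not existing SeaweedFS or operator APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// Cluster is a hypothetical abstraction over the admin operations the
// operator would need. None of these methods are real SeaweedFS APIs.
type Cluster interface {
	// Evacuate re-replicates or moves all volumes off the given server.
	Evacuate(server string) error
	// Healthy reports whether the cluster is still in a normal state.
	Healthy() (bool, error)
	// Remove takes the (now empty) server out of the cluster.
	Remove(server string) error
}

// ErrClusterUnhealthy signals that the scale-down was stopped early.
var ErrClusterUnhealthy = errors.New("cluster unhealthy, scale-down aborted")

// ScaleDown removes servers strictly one at a time and stops at the first
// problem. It returns the servers that were actually removed.
func ScaleDown(c Cluster, servers []string) ([]string, error) {
	var removed []string
	for _, s := range servers {
		if err := c.Evacuate(s); err != nil {
			return removed, fmt.Errorf("evacuate %s: %w", s, err)
		}
		ok, err := c.Healthy()
		if err != nil {
			return removed, fmt.Errorf("health check after evacuating %s: %w", s, err)
		}
		if !ok {
			return removed, fmt.Errorf("after evacuating %s: %w", s, ErrClusterUnhealthy)
		}
		if err := c.Remove(s); err != nil {
			return removed, fmt.Errorf("remove %s: %w", s, err)
		}
		removed = append(removed, s)
	}
	return removed, nil
}

// fakeCluster is an in-memory stand-in used only to exercise ScaleDown.
type fakeCluster struct {
	breakOn string // evacuating this server leaves the cluster unhealthy
	broken  bool
}

func (f *fakeCluster) Evacuate(s string) error {
	if s == f.breakOn {
		f.broken = true
	}
	return nil
}
func (f *fakeCluster) Healthy() (bool, error) { return !f.broken, nil }
func (f *fakeCluster) Remove(s string) error  { return nil }

func main() {
	c := &fakeCluster{breakOn: "volume-1"}
	removed, err := ScaleDown(c, []string{"volume-2", "volume-1", "volume-0"})
	fmt.Println("removed:", removed, "error:", err)
}
```

Since a StatefulSet scales down from the highest ordinal, the server list would normally be ordered the same way, and the operator would lower the replica count only after each server has been evacuated successfully.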

We should forbid removing volume servers when the number of healthy volume servers is less than or equal to the number of replicas the cluster needs. Otherwise, the cluster will not be writable at all. To enforce this, the operator needs to know the minimum number of volume servers required for the cluster to be functional.
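A minimal sketch of such a guard, assuming the minimum server count can be derived from the three-digit SeaweedFS replication string (for example `001`), where the total number of copies is one plus the sum of the digits. The function names `CopyCount` and `CanRemoveOne` are made up for this example. The check is only a lower bound, because it ignores how servers are spread across racks and data centers.

```go
package main

import "fmt"

// CopyCount returns the total number of copies implied by a replication
// string such as "001" or "110": one original plus the sum of the digits.
func CopyCount(replication string) (int, error) {
	if len(replication) != 3 {
		return 0, fmt.Errorf("replication %q: want exactly 3 digits", replication)
	}
	n := 1
	for _, r := range replication {
		if r < '0' || r > '9' {
			return 0, fmt.Errorf("replication %q: unexpected character %q", replication, r)
		}
		n += int(r - '0')
	}
	return n, nil
}

// CanRemoveOne reports whether one volume server may be removed. Removal is
// refused when the healthy count is less than or equal to the copy count,
// because the cluster could then no longer place all replicas.
func CanRemoveOne(healthy int, replication string) (bool, error) {
	copies, err := CopyCount(replication)
	if err != nil {
		return false, err
	}
	return healthy > copies, nil
}

func main() {
	for _, healthy := range []int{2, 3} {
		ok, err := CanRemoveOne(healthy, "001")
		fmt.Printf("healthy=%d replication=001 removable=%v err=%v\n", healthy, ok, err)
	}
}
```

With replication `001` (two copies), two healthy servers are not enough to allow a removal, while three are.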

Is there an HTTP client or gRPC client that can communicate with the masters of a SeaweedFS cluster and execute the needed admin commands?

Oct 28 '20 08:10 howardlau1999

weed shell has all the commands: rebalance, evacuate a volume server, etc. By default, the rebalancing is executed every 15 minutes as a customizable script in master.toml.
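As a rough sketch, an operator could drive weed shell by piping commands to its standard input. The master address below is a placeholder, and the exact command names and flags (`lock`, `volume.balance -force`, `unlock`) are assumptions that should be checked against `help` in the installed weed shell version. `Script` and `RunWeedShell` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// Script joins shell commands into the newline-separated text fed to stdin.
func Script(cmds ...string) string {
	return strings.Join(cmds, "\n") + "\n"
}

// RunWeedShell pipes script into `weed shell` and returns everything the
// process printed. The output is plain text, not a structured result.
func RunWeedShell(ctx context.Context, master, script string) (string, error) {
	cmd := exec.CommandContext(ctx, "weed", "shell", "-master="+master)
	cmd.Stdin = strings.NewReader(script)
	var out bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &out
	err := cmd.Run()
	return out.String(), err
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	// Assumed commands; the master address is a placeholder.
	script := Script("lock", "volume.balance -force", "unlock")
	out, err := RunWeedShell(ctx, "seaweedfs-master:9333", script)
	fmt.Printf("output:\n%s\nerror: %v\n", out, err)
}
```

The drawback of this approach is that success or failure has to be inferred from the exit status and the text output.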

Not sure how to intervene in the resizing process. It seems to be controlled entirely by K8s.

Oct 28 '20 08:10 chrislusf

It would be better if the operator could accept these customizable scripts.

btw: https://github.com/chrislusf/seaweedfs/tree/master/k8s has a common SeaweedFS K8s setup for reference.

Oct 28 '20 08:10 chrislusf

@chrislusf I wonder if there is a more programmatic way to execute commands. Otherwise we have to parse the plain text output of the shell.

Oct 28 '20 13:10 howardlau1999

For example, in the operator code, we can invoke this function directly, https://github.com/chrislusf/seaweedfs/blob/master/weed/shell/command_volume_balance.go#L63

We can refactor the command execution code to use typed errors.
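To illustrate what typed errors would give the operator, here is a minimal sketch. `InsufficientServersError`, `balance`, and `shortfall` are hypothetical and do not exist in the SeaweedFS code base; `balance` merely stands in for a refactored shell command.

```go
package main

import (
	"errors"
	"fmt"
)

// InsufficientServersError is a hypothetical typed error that carries the
// details a caller needs, instead of burying them in a message string.
type InsufficientServersError struct {
	Healthy  int
	Required int
}

func (e *InsufficientServersError) Error() string {
	return fmt.Sprintf("only %d healthy volume servers, need %d", e.Healthy, e.Required)
}

// balance stands in for a refactored shell command. It is not the real
// volume.balance implementation.
func balance(healthy, required int) error {
	if healthy < required {
		return fmt.Errorf("volume.balance: %w",
			&InsufficientServersError{Healthy: healthy, Required: required})
	}
	return nil
}

// shortfall extracts the number of missing servers from err, if err wraps
// an InsufficientServersError.
func shortfall(err error) (int, bool) {
	var ise *InsufficientServersError
	if errors.As(err, &ise) {
		return ise.Required - ise.Healthy, true
	}
	return 0, false
}

func main() {
	if n, ok := shortfall(balance(1, 3)); ok {
		fmt.Printf("need %d more volume server(s)\n", n)
	}
}
```

The caller branches on the error type with `errors.As` rather than matching substrings of the shell output.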

Oct 28 '20 15:10 chrislusf

It seems that some information is written directly to stdout. Should that be refactored too?

Oct 29 '20 03:10 howardlau1999

If the operator needs to report better error details, yes.

Actually, the command outputs are all written to an io.Writer, which can be replaced with an in-memory writer.
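A small sketch of that idea, using `bytes.Buffer` as the in-memory writer. `listVolumes` is a hypothetical stand-in for any command that reports through an `io.Writer`, and `capture` is a helper invented for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// listVolumes stands in for a shell command that writes its report to w.
// It is not a real SeaweedFS function.
func listVolumes(w io.Writer) error {
	_, err := fmt.Fprintln(w, "volume 1 on volume-0")
	return err
}

// capture runs a command against an in-memory buffer and returns what the
// command wrote, so the caller can inspect it instead of printing it.
func capture(run func(io.Writer) error) (string, error) {
	var buf bytes.Buffer
	err := run(&buf)
	return buf.String(), err
}

func main() {
	out, err := capture(listVolumes)
	fmt.Printf("captured %q, err=%v\n", out, err)
}
```

Because `*bytes.Buffer` satisfies `io.Writer`, the command itself needs no changes to be captured this way.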

Oct 29 '20 05:10 chrislusf
