
Operator not scaling up cluster

yywandb opened this issue · 1 comment

Thanks for opening an issue for the M3DB Operator! We'd love to help you, but we need the following information included with any issue:

  • What version of the operator are you running? Please include the docker tag. If using master, please include the git SHA logged when the operator first starts.

v0.10.0

  • What version of Kubernetes are you running? Please include the output of kubectl version.
❯ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.5", GitCommit:"e6503f8d8f769ace2f338794c914a96fc335df0f", GitTreeState:"clean", BuildDate:"2020-06-27T00:38:11Z", GoVersion:"go1.14.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.12", GitCommit:"17c50ce2d686f4346924935063e3a431360e0db7", GitTreeState:"clean", BuildDate:"2020-06-26T03:33:27Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • What are you trying to do?

Increase the instances per isolation group of our m3db cluster by 1, i.e. adding 3 nodes to the cluster, one for each replica.
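Concretely, the change was bumping numInstances on each isolation group in the M3DBCluster spec. A rough sketch of the edit (the group names and counts here are illustrative, not our real values):

apiVersion: operator.m3db.io/v1alpha1
kind: M3DBCluster
metadata:
  name: m3db
  namespace: m3
spec:
  replicationFactor: 3
  isolationGroups:
    - name: group1
      numInstances: 6   # was 5
    - name: group2
      numInstances: 6   # was 5
    - name: group3
      numInstances: 6   # was 5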

  • What did you expect to happen?

The operator to detect the spec change and begin adding the nodes.

  • What happened?

The operator doesn't scale up the cluster. We see logs that look like this:

{"level":"info","ts":"2021-01-29T19:05:39.751Z","msg":"statefulset already exists","controller":"m3db-cluster-controller","name":"m3db-rep0"}
{"level":"info","ts":"2021-01-29T19:05:39.751Z","msg":"successfully synced item","controller":"m3db-cluster-controller","key":"m3/m3db"}
{"level":"info","ts":"2021-01-29T19:05:40.254Z","msg":"processing pod","controller":"m3db-cluster-controller","pod.namespace":"m3","pod.name":"m3db-rep2-1"}
{"level":"info","ts":"2021-01-29T19:05:40.254Z","msg":"processing pod","controller":"m3db-cluster-controller","pod.namespace":"m3","pod.name":"m3db-rep0-2"}
{"level":"info","ts":"2021-01-29T19:05:40.254Z","msg":"processing pod","controller":"m3db-cluster-controller","pod.namespace":"m3","pod.name":"m3db-rep2-7"}
{"level":"info","ts":"2021-01-29T19:05:40.254Z","msg":"processing pod","controller":"m3db-cluster-controller","pod.namespace":"m3","pod.name":"m3db-rep0-16"}
{"level":"info","ts":"2021-01-29T19:05:40.254Z","msg":"processing pod","controller":"m3db-cluster-controller","pod.namespace":"m3","pod.name":"m3db-rep2-4"}

We previously saw the same issue when using v0.7.0 of the operator, with these logs:

{"level":"error","ts":"2021-01-28T21:18:58.717Z","msg":"statefulsets.apps \"m3db-rep0\" already exists","controller":"m3db-cluster-controller"}
E0128 21:18:58.717342       1 controller.go:319] error syncing cluster 'm3/m3db': statefulsets.apps "m3db-rep0" already exists

At that time, we chatted with @robskillington, who suggested we upgrade to v0.8.0 or newer, which has better state syncing in large k8s clusters and might reduce issues caused by stale views of objects, such as a statefulset not being seen as existing.

We thought upgrading to v0.10.0 might resolve it, but the same issue appears to persist, though the "statefulset already exists" log is now at info level rather than error.

We're trying to understand how "statefulset already exists" might relate to the operator not beginning to scale up the cluster. We're still unsure whether this is an issue on our k8s cluster's side or a bug in the operator.
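One way we've been sanity-checking whether the operator is acting on a stale view is to compare the live StatefulSet's replica count against the cluster spec (the names below are from our cluster; the jsonpath queries are illustrative):

❯ kubectl -n m3 get statefulset m3db-rep0 -o jsonpath='{.spec.replicas}'
❯ kubectl -n m3 get m3dbcluster m3db -o jsonpath='{.spec.isolationGroups[0].numInstances}'

If the two disagree while the operator keeps logging "statefulset already exists" without updating replicas, the spec change isn't being applied.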

Other things we've tried:

  • [didn't work] edit the m3dbcluster back to the original number of instances, then restart the operator, then edit the m3dbcluster back up to the desired number of instances
  • [worked] delete the m3db-rep0 statefulset (the operator doesn't recreate the sts yet), then restart the operator; after that, the operator created the new statefulset with the desired number of instances and started scaling up the cluster (rough commands sketched below)
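For anyone else hitting this, the working sequence looks roughly like the following (assuming the operator runs as a single-replica StatefulSet named m3db-operator; adjust names and namespaces to your install):

# delete the stuck StatefulSet; the operator won't recreate it until restarted
❯ kubectl -n m3 delete statefulset m3db-rep0
# restart the operator so it resyncs and recreates the StatefulSet at the new size
❯ kubectl delete pod m3db-operator-0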

yywandb · Jan 29 '21 19:01

Hi @yywandb! Sorry for the delay in following up on this issue. Based on your description, it seems like the operator doesn't become aware that the cluster spec has been updated unless it's restarted. Does that sound right?

If so, this issue might be similar to a previous one we ran into, #268, where the operator would update a StatefulSet without waiting for a previously updated StatefulSet to become healthy. The root cause of that issue was that the operator was working with stale copies of the StatefulSets in the cluster, and it was addressed in #271. That change was included in the most recent release, v0.13.0. While it's concerned with StatefulSets and not M3DB cluster CRDs like this issue, it would be interesting to see if the issue still occurs with the latest release. To that end, would it be possible to update your operator to v0.13.0?

One tricky thing to be aware of before upgrading is that v0.12.0 contained breaking changes to the default ConfigMap that the operator uses for M3DB, so if you are relying on the default ConfigMap you'll need to provide the old default as a custom ConfigMap, as sketched below.
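A rough sketch of preserving the default ConfigMap across the upgrade (the ConfigMap and cluster names here are illustrative; check what the operator actually generated in your namespace):

# save the operator-generated ConfigMap before upgrading
❯ kubectl -n m3 get configmap m3db-config-map-m3db -o yaml > m3db-config.yaml
# rename it (metadata.name) to e.g. m3db-config-legacy and re-apply
❯ kubectl -n m3 apply -f m3db-config.yaml

Then point the cluster spec at the saved copy via configMapName:

spec:
  configMapName: m3db-config-legacy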

jeromefroe · Mar 26 '21 20:03