
NatsCluster doesn't update pods with new configurations

Open mfaur opened this issue 4 years ago • 4 comments

I have recently added limits and requests to my NatsCluster resource as follows:

apiVersion: nats.io/v1alpha2
kind: NatsCluster
metadata:
  name: nats-cluster
spec:
  size: 2
  pod:
    resources:
      requests:
        memory: "300Mi"
        cpu: 200m
      limits:
        memory: "700Mi"
        cpu: 400m
    enableConfigReload: true

When I applied the changes, the NatsCluster resource was updated, but the pods it was running weren't. Only after manually deleting the pods did the NatsCluster create new pods with the new configuration.

mfaur avatar Mar 04 '20 17:03 mfaur

I had reason to look into the source code previously, and this appears to be documented behaviour. https://github.com/nats-io/nats-operator/blob/master/pkg/apis/nats/v1alpha2/cluster.go#L315:

	// Resources is the resource requirements for the NATS container.
	// This field cannot be updated once the cluster is created.
	Resources v1.ResourceRequirements `json:"resources,omitempty"`

egoon avatar Apr 17 '20 06:04 egoon

In that case, I assume there's no harm in updating those resource requests manually on the individual Pods?

I also wonder what the expected behavior is for a NatsCluster that has its resource limits updated after initial deployment, but before any new pods that may be needed are deployed.

For instance, let's say I set size to 10 and antiAffinity to true, but I only have 3 nodes in my Kubernetes cluster. Only 3 NATS servers can be scheduled, but this gives me "pseudo-DaemonSet" behavior when used alongside a Cluster Autoscaler.
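For reference, a hedged sketch of the configuration I'm describing (field placement is assumed to mirror the example above, with antiAffinity as a boolean under pod):

apiVersion: nats.io/v1alpha2
kind: NatsCluster
metadata:
  name: nats-cluster
spec:
  # Deliberately larger than the current node count; the extra pods stay
  # Pending until the Cluster Autoscaler adds nodes they can land on.
  size: 10
  pod:
    # At most one NATS server per node, which is what gives the
    # "pseudo-DaemonSet" behavior.
    antiAffinity: true
    resources:
      requests:
        memory: "300Mi"
        cpu: 200m
      limits:
        memory: "700Mi"
        cpu: 400m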

So, if my cluster is scaled up to 4 nodes, this should allow the NATS operator to schedule one additional server on the new node.

Let's say I've updated the resource limits just before this happens - that is, the NatsCluster's limits change right before the new node is added.

Will the new Pod have the newly set resource limit, or will it inherit the old limits?

In any case, I think this feature is certainly worth adding, as I'm sure many would agree. Having to tear down all of the servers just to update resource limits in a managed way is far from ideal.

ubergeek77 avatar Aug 30 '20 09:08 ubergeek77

This is an issue for me too, and it applies more broadly than resource limits. I expected changes to other parts of the NatsCluster configuration that alter a cluster node's Pod specification to trigger a rolling update across the cluster. For example, adding labels to the pod template or enabling the enableConfigReload option.

In reality, that is not the case. I have to manually cycle the pods of the cluster to have new pods created with the new configuration. Reviewing the code, I found that the only changes which actually cycle pods are to the cluster size or the pod container version.
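For concreteness, this is the kind of edit I mean - a hedged example (the label key and value are made up, and I'm assuming labels sits under pod like the other pod-template settings) which, in my testing, leaves the existing pods untouched:

apiVersion: nats.io/v1alpha2
kind: NatsCluster
metadata:
  name: nats-cluster
spec:
  size: 3
  pod:
    # Added after the cluster was created; existing pods keep running
    # without this label until they are manually recreated.
    labels:
      team: messaging
    enableConfigReload: true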

That's not ideal, and not how I expected the operator to behave, so unless I am mistaken in my tests and review, this would be a genuinely useful addition to the controller.

mhuxtable avatar Dec 02 '20 10:12 mhuxtable

Just another comment to say that this behaviour is a surprising omission for a controller.

As noted above, having to manually reload pods for configuration changes to be picked up is not great.

whyman avatar May 11 '21 10:05 whyman