
improve the revision check in upgrader

Open hihihuhu opened this issue 2 years ago • 8 comments

Bug Report

What version of Kubernetes are you using? v1.19

What version of TiDB Operator are you using? from master

What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

What's the status of the TiDB cluster pods?

What did you do? This is not exactly a tidb-operator issue and the root cause could be in k8s, but enhancing tidb-operator could work around the issue.

During a TiDB deployment using tidb-operator, we found the process was stuck, with the status looking like:

jq ".status.tikv.statefulSet"
{
  "collisionCount": 0,
  "currentReplicas": 21,
  "currentRevision": "mussel-prod-tikv-7cb857cc78",
  "observedGeneration": 1,
  "readyReplicas": 30,
  "replicas": 30,
  "updateRevision": "mussel-prod-tikv-7cb857cc78",
  "updatedReplicas": 21
}

The status of the TiKV pods looks like:

mussel-prod-tikv-0                                  3/3     Running       0          5d22h
mussel-prod-tikv-1                                  3/3     Running       0          6d20h
mussel-prod-tikv-10                                 3/3     Running       0          15d
mussel-prod-tikv-11                                 3/3     Running       0          6d20h
mussel-prod-tikv-12                                 3/3     Running       0          7d19h
mussel-prod-tikv-13                                 3/3     Running       0          5d22h
mussel-prod-tikv-14                                 3/3     Running       0          12d
mussel-prod-tikv-15                                 3/3     Running       0          12d
mussel-prod-tikv-16                                 3/3     Running       0          5d19h
mussel-prod-tikv-17                                 3/3     Running       0          6d19h
mussel-prod-tikv-18                                 3/3     Running       0          5d22h
mussel-prod-tikv-19                                 3/3     Running       0          15d
mussel-prod-tikv-2                                  3/3     Running       0          12d
mussel-prod-tikv-20                                 3/3     Running       0          5d22h
mussel-prod-tikv-21                                 3/3     Running       0          5d20h
mussel-prod-tikv-22                                 3/3     Running       0          12d
mussel-prod-tikv-23                                 3/3     Running       0          5d20h
mussel-prod-tikv-24                                 3/3     Running       0          5d19h
mussel-prod-tikv-25                                 3/3     Running       0          5d21h
mussel-prod-tikv-26                                 3/3     Running       0          5d20h
mussel-prod-tikv-27                                 3/3     Running       0          5d21h
mussel-prod-tikv-28                                 3/3     Running       0          5d20h
mussel-prod-tikv-29                                 3/3     Running       0          5d20h
mussel-prod-tikv-3                                  3/3     Running       0          7d19h
mussel-prod-tikv-4                                  3/3     Running       0          7d19h
mussel-prod-tikv-5                                  3/3     Running       0          12d
mussel-prod-tikv-6                                  3/3     Running       0          6d19h
mussel-prod-tikv-7                                  3/3     Running       0          15d
mussel-prod-tikv-8                                  3/3     Running       0          6d20h
mussel-prod-tikv-9                                  3/3     Running       0          12d

All the TiKV pods that have been running for longer than 10 days are not on the latest revision, and the ones on the old revision are not contiguous by ordinal.

This is weird and likely a race condition in k8s rather than in tidb-operator, and we are not sure how we got into this situation. But once it happens, tidb-operator gets stuck and cannot make progress. Digging into the code: the operator checks the status of every pod (https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/tikv_member_manager.go#L992), so it decides there should be an upgrade in this case, but it then skips the upgrade because currentRevision equals updateRevision (https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/tikv_upgrader.go#L105).

So I would suggest also checking whether currentReplicas equals replicas on the line that compares currentRevision. What do you think? If this sounds fine, I could go ahead and add it. A rough sketch of the check is below.
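
A minimal sketch, assuming the field names from the StatefulSetStatus type in k8s.io/api/apps/v1; the helper name here is made up and the actual code in tikv_upgrader.go is organized differently:

package member

import apps "k8s.io/api/apps/v1"

// tikvUpgradeFinished treats the upgrade as finished only when the revisions
// match and every replica is already on the current revision. With the stuck
// status above (currentReplicas: 21, replicas: 30) it returns false, so the
// upgrader would keep rolling pods instead of skipping the upgrade.
func tikvUpgradeFinished(status *apps.StatefulSetStatus) bool {
	return status.CurrentRevision == status.UpdateRevision &&
		status.CurrentReplicas == status.Replicas
}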

What did you expect to see? The operator could make progress.

What did you see instead? The operator was stuck.

hihihuhu avatar Sep 29 '22 19:09 hihihuhu

Hi @hihihuhu, thanks for reporting the issue! Would you like to share the tidbcluster spec you're using? I'm interested in the .spec.statefulSetUpdateStrategy part, since, as in https://github.com/kubernetes/kubernetes/issues/106055, there seem to be some issues regarding the OnDelete strategy.

For the fix you've mentioned, I think it's a reasonable one that makes the IsUpdating semantics consistent. It would be helpful if you could try the fix in your environment and see if it takes effect. Thanks! cc @KanShiori @liubog2008.

hanlins avatar Sep 29 '22 20:09 hanlins

cc @csuzhangxc

hanlins avatar Sep 29 '22 20:09 hanlins

We do not set it in the tidbcluster, so it should be the default value, RollingUpdate. When this happened, the spec.updateStrategy of the TiKV StatefulSet was:

{
  "rollingUpdate": {
    "partition": 30
  },
  "type": "RollingUpdate"
}

hihihuhu avatar Sep 29 '22 20:09 hihihuhu

@hihihuhu Could you get the controller revisions of this StatefulSet? It may be because the current revision was deleted.

See https://github.com/kubernetes/kubernetes/blob/v1.25.2/pkg/controller/statefulset/stateful_set_control.go#L235-L246
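
For reference, the linked logic roughly paraphrased (not a verbatim copy of the upstream code; names are simplified):

package sketch

import apps "k8s.io/api/apps/v1"

// pickCurrentRevision looks up the revision recorded in status.currentRevision
// in the revision history. If it is not found (for example the ControllerRevision
// was deleted, or the StatefulSet was just recreated so its status is empty),
// the controller falls back to the update revision, which makes currentRevision
// equal to updateRevision even though pods from an older revision still exist.
func pickCurrentRevision(recorded string, revisions []*apps.ControllerRevision, updateRevision *apps.ControllerRevision) *apps.ControllerRevision {
	for i := range revisions {
		if revisions[i].Name == recorded {
			return revisions[i]
		}
	}
	return updateRevision
}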

liubog2008 avatar Sep 30 '22 02:09 liubog2008

Oh, great catch, I do believe that is what I saw, because there is another weird thing: according to the k8s audit log, the StatefulSet was deleted and recreated, seemingly by the operator.

September 21st 2022, 06:12:47.701+00:00	kube-audit	update	system:unsecured	mussel-prod-tikv	tidb-mussel-prod
September 20th 2022, 18:37:21.016+00:00	kube-audit	update	system:unsecured	mussel-prod-tikv	tidb-mussel-prod
September 20th 2022, 18:37:20.268+00:00	kube-audit	create	system:serviceaccount:tidb-mussel-prod:tidb-controller-manager	mussel-prod-tikv	tidb-mussel-prod
September 20th 2022, 18:37:16.984+00:00	kube-audit	patch	system:unsecured	mussel-prod-tikv	tidb-mussel-prod
September 20th 2022, 18:37:15.967+00:00	kube-audit	update	system:unsecured	mussel-prod-tikv	tidb-mussel-prod
September 20th 2022, 18:37:15.253+00:00	kube-audit	delete	system:serviceaccount:tidb-mussel-prod:tidb-controller-manager	mussel-prod-tikv	tidb-mussel-prod
September 20th 2022, 17:17:30.207+00:00	kube-audit	update	system:unsecured	mussel-prod-tikv	tidb-mussel-prod

hihihuhu avatar Sep 30 '22 18:09 hihihuhu

Oh, I think the sts is recreated in https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/pvc_resizer.go#L479. This deployment was just to resize the volumes, and the completion timestamp matches the recreation of the sts:

jq '.status.tikv.conditions'
[
  {
    "lastTransitionTime": "2022-09-20T18:37:23Z",
    "message": "All volumes are resized",
    "reason": "EndResizing",
    "status": "False",
    "type": "ComponentVolumeResizing"
  }
]

So what would the consequences be for the existing pods when a sts is recreated with a different spec?
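
For context, a hedged client-go sketch of the delete-and-recreate pattern (this is not the actual pvc_resizer.go code; it assumes the old object is deleted with orphan propagation so the running pods are kept, and a fresh object is then created with the new spec):

package sketch

import (
	"context"

	apps "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// recreateStatefulSet deletes the StatefulSet without deleting its pods
// (orphan propagation) and then creates a fresh object with the desired spec.
// The new object starts with an empty status, so fields such as currentReplicas
// and updatedReplicas are 0 until the controller reconciles it again.
func recreateStatefulSet(ctx context.Context, cli kubernetes.Interface, ns string, desired *apps.StatefulSet) error {
	orphan := metav1.DeletePropagationOrphan
	if err := cli.AppsV1().StatefulSets(ns).Delete(ctx, desired.Name, metav1.DeleteOptions{
		PropagationPolicy: &orphan,
	}); err != nil {
		return err
	}
	desired.ResourceVersion = "" // must be cleared when creating a new object
	_, err := cli.AppsV1().StatefulSets(ns).Create(ctx, desired, metav1.CreateOptions{})
	return err
}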

hihihuhu avatar Oct 01 '22 00:10 hihihuhu

I would guess the status of the sts right after the recreation is something like:

{
  "collisionCount": 0,
  "currentReplicas": 0,
  "currentRevision": "mussel-prod-tikv-7cb857cc78",
  "observedGeneration": 1,
  "readyReplicas": 30,
  "replicas": 30,
  "updateRevision": "mussel-prod-tikv-7cb857cc78",
  "updatedReplicas": 0
}

In this case, what I am proposing could actually be a proper fix for this, so that the operator could rolling-restart the TiKV sts after it is recreated?

hihihuhu avatar Oct 01 '22 00:10 hihihuhu

@liubog2008 @hanlins What do you think about this? Should I add a replica count check in https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/tikv_upgrader.go#L105 as a fix for this problem? Thanks

hihihuhu avatar Oct 04 '22 20:10 hihihuhu

@liubog2008 @hanlins Can we fix this by adding the check for the replica count? WDYT? Thanks

wxiaomou avatar Oct 13 '22 20:10 wxiaomou

@liubog2008 @hanlins Can we fix this by adding the check for the replica count? WDYT? Thanks

Sure, I think the fix is a proper one. Please share your findings on reproducing the issue and whether the fix would work, thanks!

hanlins avatar Oct 26 '22 17:10 hanlins