
Primary changed twice during pdms rolling update

Open Lily2025 opened this issue 11 months ago • 2 comments

Bug Report

What did you do?

1. Update the config of tso

What did you expect to see?

The primary changes once during the tso rolling update.

What did you see instead?

The primary changed twice during the tso rolling update.

What version of PD are you using (pd-server -V)?

./pd-server -V
Release Version: v8.0.0-alpha
Edition: Community
Git Commit Hash: bc92c13c26bb7abbe8f332cacdb5817a509391c9
Git Branch: heads/refs/tags/v8.0.0-alpha
UTC Build Time: 2024-03-27 11:37:04
2024-03-28T15:26:29.929+0800

Lily2025, Mar 28 '24 07:03

/type bug /severity major /assign HuSharp

Lily2025, Mar 28 '24 07:03

For tiup

  1. Suppose we have 3 pdms instances, pdms-0/pdms-1/pdms-2, and pdms-2 is the primary.
  2. Upgrading pdms-2 first may transfer the primary to pdms-0.
  3. Upgrading pdms-0 afterwards transfers the primary again.

Upgrading the pdms primary last (the "defer" feature) avoids the unnecessary primary transfer.

Ref https://github.com/pingcap/tiup/pull/2414

For operator

tidb-operator has no defer feature; it can only upgrade the pods in order.

Furthermore, consider this situation:

  1. Suppose we have 3 pdms instances, pdms-0/pdms-1/pdms-2, and pdms-2 is the primary.
  2. Upgrading pdms-2 first may transfer the primary to pdms-1.
  3. Upgrading pdms-1 may then transfer the primary to pdms-0.

To fix it, assume the current primary's ordinal is x and the ordinal range is [0, n]:

  1. Find the max suitable ordinal in (x, n], because those pods have already been upgraded.
  2. If there is no suitable ordinal, find the min suitable ordinal in [0, x) to reduce the number of transfers.

Ref https://github.com/pingcap/tidb-operator/pull/5643

HuSharp, Jun 25 '24 07:06