redpanda
AlterPartitionReassignments does not change the leader
Version & Environment
$ rpk version
v23.1.1-rc6 (rev dc47c26)
What went wrong?
Given the leaders:
$ rpk topic create test2_replica3_parition6 --partitions 6 --replicas 3
$ rpk topic describe -p test2_replica3_parition6
PARTITION LEADER EPOCH REPLICAS LOG-START-OFFSET HIGH-WATERMARK
0 0 23 [0 1 2] 0 0
1 0 18 [0 1 2] 0 0
2 1 21 [0 1 2] 0 0
3 2 22 [0 1 2] 0 0
4 2 10 [0 1 2] 0 0
5 2 13 [0 1 2] 0 0
I then tried to make broker 2 the leader of the first 3 partitions and broker 0 the leader of the remaining 3 (partition_replica_inverse.json swaps the two groups):
$ cat partition_replica.json
{
"version": 1,
"partitions": [
{
"topic": "test2_replica3_parition6",
"partition": 0,
"replicas": [2,0,1]
},
{
"topic": "test2_replica3_parition6",
"partition": 1,
"replicas": [2,0,1]
},
{
"topic": "test2_replica3_parition6",
"partition": 2,
"replicas": [2,0,1]
},
{
"topic": "test2_replica3_parition6",
"partition": 3,
"replicas": [0,1,2]
},
{
"topic": "test2_replica3_parition6",
"partition": 4,
"replicas": [0,1,2]
},
{
"topic": "test2_replica3_parition6",
"partition": 5,
"replicas": [0,1,2]
}
]
}
$ cat partition_replica_inverse.json
{
"version": 1,
"partitions": [
{
"topic": "test2_replica3_parition6",
"partition": 0,
"replicas": [0,1,2]
},
{
"topic": "test2_replica3_parition6",
"partition": 1,
"replicas": [0,1,2]
},
{
"topic": "test2_replica3_parition6",
"partition": 2,
"replicas": [0,1,2]
},
{
"topic": "test2_replica3_parition6",
"partition": 3,
"replicas": [2,0,1]
},
{
"topic": "test2_replica3_parition6",
"partition": 4,
"replicas": [2,0,1]
},
{
"topic": "test2_replica3_parition6",
"partition": 5,
"replicas": [2,0,1]
}
]
}
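Both files follow the same pattern: the desired leader is listed first in each replica list. As a minimal sketch, they could be generated rather than written by hand. The helper name `reassignment` and the rotation approach are illustrative assumptions, not part of rpk or the Kafka tooling:

```python
import json


def reassignment(topic, leaders_by_partition, brokers):
    """Build a kafka-reassign-partitions.sh payload with the desired leader first.

    Hypothetical helper for illustration only.
    """
    partitions = []
    for partition, leader in sorted(leaders_by_partition.items()):
        # Rotate the broker list so the desired leader comes first,
        # e.g. brokers [0, 1, 2] with leader 2 gives [2, 0, 1].
        i = brokers.index(leader)
        replicas = brokers[i:] + brokers[:i]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}


TOPIC = "test2_replica3_parition6"
BROKERS = [0, 1, 2]

# Broker 2 leads partitions 0-2, broker 0 leads partitions 3-5.
forward = reassignment(TOPIC, {0: 2, 1: 2, 2: 2, 3: 0, 4: 0, 5: 0}, BROKERS)
# The inverse swaps the two groups.
inverse = reassignment(TOPIC, {0: 0, 1: 0, 2: 0, 3: 2, 4: 2, 5: 2}, BROKERS)

print(json.dumps(forward, indent=2))
```

Writing `forward` and `inverse` to disk with `json.dump` would reproduce the contents of the two files shown above.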
Executing the reassignment with partition_replica.json produces:
$ bin/kafka-reassign-partitions.sh --bootstrap-server 192.168.0.5:9092,192.168.0.6:9092,192.168.0.7:9092 --reassignment-json-file partition_replica.json --execute
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test2_replica3_parition6","partition":0,"replicas":[0,1,2],"log_dirs":["any","any","any"]},{"topic":"test2_replica3_parition6","partition":1,"replicas":[0,2,1],"log_dirs":["any","any","any"]},{"topic":"test2_replica3_parition6","partition":2,"replicas":[0,2,1],"log_dirs":["any","any","any"]},{"topic":"test2_replica3_parition6","partition":3,"replicas":[0,1,2],"log_dirs":["any","any","any"]},{"topic":"test2_replica3_parition6","partition":4,"replicas":[2,0,1],"log_dirs":["any","any","any"]},{"topic":"test2_replica3_parition6","partition":5,"replicas":[0,2,1],"log_dirs":["any","any","any"]}]}
Save this to use as the --reassignment-json-file option during rollback
Successfully started partition reassignments for test2_replica3_parition6-0,test2_replica3_parition6-1,test2_replica3_parition6-2,test2_replica3_parition6-3,test2_replica3_parition6-4,test2_replica3_parition6-5
Only partition 5 changed leadership:
$ rpk topic describe -p test2_replica3_parition6
PARTITION LEADER EPOCH REPLICAS LOG-START-OFFSET HIGH-WATERMARK
0 0 24 [0 1 2] 0 0
1 0 19 [0 1 2] 0 0
2 1 22 [0 1 2] 0 0
3 2 23 [0 1 2] 0 0
4 2 10 [0 1 2] 0 0
5 0 14 [0 1 2] 0 0
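To make the comparison explicit, here is a rough sketch that parses the `rpk topic describe -p` output above and reports partitions whose leader is not the first replica requested in partition_replica.json. The function names are hypothetical, and the parser assumes the whitespace-separated column layout shown in this report:

```python
def parse_leaders(describe_output):
    """Map partition id -> leader id from `rpk topic describe -p` text output."""
    leaders = {}
    for line in describe_output.strip().splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip the header row and blank lines
        leaders[int(fields[0])] = int(fields[1])
    return leaders


def leader_mismatches(leaders, requested):
    """Return {partition: (actual_leader, wanted_leader)} where they differ."""
    return {
        partition: (leaders.get(partition), replicas[0])
        for partition, replicas in requested.items()
        if leaders.get(partition) != replicas[0]
    }


# Output copied from the second `rpk topic describe` in this report.
AFTER = """
PARTITION LEADER EPOCH REPLICAS LOG-START-OFFSET HIGH-WATERMARK
0 0 24 [0 1 2] 0 0
1 0 19 [0 1 2] 0 0
2 1 22 [0 1 2] 0 0
3 2 23 [0 1 2] 0 0
4 2 10 [0 1 2] 0 0
5 0 14 [0 1 2] 0 0
"""

# Replica lists requested in partition_replica.json.
REQUESTED = {
    0: [2, 0, 1], 1: [2, 0, 1], 2: [2, 0, 1],
    3: [0, 1, 2], 4: [0, 1, 2], 5: [0, 1, 2],
}

mismatched = leader_mismatches(parse_leaders(AFTER), REQUESTED)
print(sorted(mismatched))
```

With the data above, partition 5 is the only one whose leader matches the first requested replica.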
The broker logs show the update operations being rejected:
Feb 22 10:12:18 mini2 rpk[1234292]: INFO 2023-02-22 10:12:18,767 [shard 2] cluster - controller_backend.cc:778 - [{kafka/test2_replica3_parition6/5}] (retry 2) result: Current node is not a leader for partition operation: {type: update, revision: 276, assignment: { id: 5, group_id: 262, replicas: {{node_id: 2, shard: 2}, {node_id: 0, shard: 2}, {node_id: 1, shard: 2}} }, previous assignment: {{{node_id: 0, shard: 3}, {node_id: 2, shard: 3}, {node_id: 1, shard: 3}}}}
Feb 22 10:12:18 mini2 rpk[1234292]: INFO 2023-02-22 10:12:18,774 [shard 1] cluster - controller_backend.cc:778 - [{kafka/test2_replica3_parition6/1}] (retry 2) result: Current node is not a leader for partition operation: {type: update, revision: 272, assignment: { id: 1, group_id: 258, replicas: {{node_id: 0, shard: 0}, {node_id: 2, shard: 1}, {node_id: 1, shard: 1}} }, previous assignment: {{{node_id: 0, shard: 2}, {node_id: 2, shard: 3}, {node_id: 1, shard: 3}}}}
Feb 22 10:12:18 mini2 rpk[1234292]: INFO 2023-02-22 10:12:18,775 [shard 1] cluster - controller_backend.cc:778 - [{kafka/test2_replica3_parition6/3}] (retry 2) result: Current node is not a leader for partition operation: {type: update, revision: 274, assignment: { id: 3, group_id: 260, replicas: {{node_id: 0, shard: 0}, {node_id: 2, shard: 1}, {node_id: 1, shard: 1}} }, previous assignment: {{{node_id: 0, shard: 2}, {node_id: 1, shard: 3}, {node_id: 2, shard: 3}}}}
I then applied the inverse reassignment:
$ bin/kafka-reassign-partitions.sh --bootstrap-server 192.168.0.5:9092,192.168.0.6:9092,192.168.0.7:9092 --reassignment-json-file partition_replica_inverse.json --execute
What should have happened instead?
Leadership should have changed.
JIRA Link: CORE-1176
Hi @freef4ll,
If you want to change leadership, we recommend the use of the Admin API. For example, to change leadership for partition 0 to node 2:
curl -X POST "http://192.168.0.5:9644/v1/partitions/kafka/test2_replica3_parition6/0/transfer_leadership?target=2"
Changing leadership with the AlterPartitionReassignments API in Redpanda is unsupported and needs to be documented. Thank you for flagging this.
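For several partitions, the same `transfer_leadership` call can be repeated once per partition. The sketch below only builds the request URLs by default (`dry_run=True`). The helper names are hypothetical, the Admin API address is the one from the curl example above, and the live request path has not been verified against a cluster:

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def transfer_url(admin, topic, partition, target, namespace="kafka"):
    """Build the Admin API URL used in the curl example above."""
    return (
        f"{admin}/v1/partitions/{namespace}/{quote(topic)}/"
        f"{partition}/transfer_leadership?target={target}"
    )


def transfer_leadership(admin, topic, targets, dry_run=True):
    """Request a leadership transfer for each {partition: target_node} entry.

    Hypothetical helper: with dry_run=True it only returns the URLs.
    """
    urls = [transfer_url(admin, topic, p, t) for p, t in sorted(targets.items())]
    if not dry_run:
        for url in urls:
            # POST with no body, mirroring `curl -X POST`.
            with urlopen(Request(url, method="POST")) as response:
                print(url, response.status)
    return urls


ADMIN = "http://192.168.0.5:9644"  # Admin API address from the curl example
TOPIC = "test2_replica3_parition6"

# Desired leaders from the report: broker 2 for partitions 0-2, broker 0 for 3-5.
urls = transfer_leadership(ADMIN, TOPIC, {0: 2, 1: 2, 2: 2, 3: 0, 4: 0, 5: 0})
for url in urls:
    print(url)
```

Because the leadership balancer keeps running, leaders moved this way may later be moved again.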
To add a little more context to this:
Redpanda does not have sticky leadership in the same way that Kafka does, and it continuously attempts to maintain leadership balance across the cluster. Therefore, having the same leadership semantics as Kafka for alter partition reassignments wouldn't accomplish much. You could disable the leadership balancer and manually move leadership (as @NyaliaLui mentioned above), but this can be risky because leadership imbalance may still occur.
However, in our next release, 23.2, we plan to have sticky leadership integrated into the system, at which point we'll pass the leader-is-first-in-replica-set information from alter partition reassignments through to the balancer.
@micheleRP this is beta feedback that should be documented. See @NyaliaLui's comment above.
Thanks guys! It would be worthwhile to clarify the leadership change under https://github.com/redpanda-data/documentation/issues/606
Thanks. Adding this with https://github.com/redpanda-data/documentation/pull/1309
This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.
This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.