When a cluster node is removed, consider deleting all non-mirrored queues hosted on it

Open michaelklishin opened this issue 8 years ago • 9 comments

See this rabbitmq-users thread for background. If a node is removed from a cluster, all non-mirrored queues hosted on it should be removed or migrated (depending on their durability: durable queues are not migrated when their hosting node is unavailable but transient queues are).

This is somewhat of an edge case. There can be other edge cases with the proposed solution above.

Update from 2020-2021: with the introduction of maintenance mode and the planned removal of classic queue mirroring in favor of quorum queues and streams, this becomes both less relevant and a smaller, less risky change.

michaelklishin avatar Dec 02 '16 22:12 michaelklishin

+1. Users are most likely unaware of which queues reside on the node being removed, even when some of them are critical, and checking this by hand becomes tedious as queue counts increase. Brainstorming: queue migration policies could put the onus on users to decide how migration is carried out, and for which queues in particular (durable/transient):

  • migrate-to-none - simply delete all non-mirrored queues on removed node
  • migrate-to-any - move matching non-mirrored queues to any other cluster node
  • migrate-to-node - move matching non-mirrored queues to a specific cluster node
  • migrate-to-min-master - move matching non-mirrored queues to the node with the fewest queue masters

Durability is definitely a big factor. Migration would also lengthen the node removal procedure, but that could be the price to pay in order to retain any non-mirrored queues.

Ayanda-D avatar Dec 04 '16 03:12 Ayanda-D
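
To make the four proposed policies concrete, here is a minimal sketch of how a migration target could be chosen. Everything in it is hypothetical: `choose_target` is not a RabbitMQ command, the policy names are only the ones brainstormed above, and the input format (one `node master_count` line per remaining cluster node) is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the migration policies proposed in this thread.
# choose_target POLICY [PREFERRED_NODE]
# Reads "node master_count" lines (remaining cluster nodes) from stdin
# and prints the node that queues would be moved to.
choose_target() {
  policy=$1
  preferred=${2:-}
  case "$policy" in
    migrate-to-none)
      # Delete instead of migrating: there is no target node.
      cat >/dev/null
      echo "none"
      ;;
    migrate-to-any)
      # Any remaining node will do; take the first one listed.
      awk 'NR == 1 { print $1 }'
      ;;
    migrate-to-node)
      # Use the requested node only if it is still in the cluster.
      awk -v want="$preferred" '
        $1 == want { print $1; found = 1 }
        END { if (!found) exit 1 }'
      ;;
    migrate-to-min-master)
      # Pick the node currently hosting the fewest queue masters.
      awk '
        NR == 1 || $2 + 0 < min { min = $2 + 0; node = $1 }
        END { print node }'
      ;;
    *)
      echo "unknown policy: $policy" >&2
      return 2
      ;;
  esac
}
```

For example, `printf 'rabbit@a 5\nrabbit@b 2\n' | choose_target migrate-to-min-master` would select the node with the lower count. The sketch only picks a target; actually moving or deleting queues is outside its scope.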

Will this be implemented soon?

ron819 avatar Sep 26 '18 05:09 ron819

Now that there is a way to transfer leadership of a queue to a different node, it is an option for, say, durable queues.

michaelklishin avatar Nov 30 '20 12:11 michaelklishin

Now that there is a way to transfer leadership of a queue to a different node, it is an option for, say, durable queues.

@michaelklishin Which method are you referring to? Thanks

luddd3 avatar Jul 22 '21 08:07 luddd3

@luddd3 I was referring to an internal API introduced for maintenance mode.

You can put a node under maintenance before removing it: the node won't keep or accept any client connections and won't host any queue leader replicas. So the effect would be comparable to what this issue tries to achieve.

michaelklishin avatar Jul 22 '21 17:07 michaelklishin
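
A possible sequence for the approach described above, as a sketch rather than an official procedure: put the node into maintenance mode with `rabbitmq-upgrade drain`, stop it, then remove it from a remaining node. The helper names `run` and `drain_and_forget` and the `DRY_RUN` switch are assumptions introduced here; by default the script only prints the commands it would execute. The thread does not say how non-replicated classic queues behave during a drain, so that should be checked separately.

```shell
#!/bin/sh
# Sketch: drain a node via maintenance mode, then remove it from the cluster.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to actually execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# drain_and_forget NODE PEER
#   NODE: the node being removed
#   PEER: any node that stays in the cluster
drain_and_forget() {
  node=$1
  peer=$2
  # Enter maintenance mode on the node that is about to be removed.
  run rabbitmq-upgrade -n "$node" drain
  # Stop the RabbitMQ application on the drained node.
  run rabbitmqctl -n "$node" stop_app
  # Ask a remaining node to forget the stopped one.
  run rabbitmqctl -n "$peer" forget_cluster_node "$node"
}
```

Calling `drain_and_forget rabbit@rabbitmqservice-1 rabbit@rabbitmqservice-0` in dry-run mode prints the three commands in order, which makes it easy to review them before running anything against a real cluster.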

Update from 2020-2021: with the introduction of maintenance mode and the planned removal of classic queue mirroring in favor of quorum queues and streams, this becomes both less relevant and a smaller, less risky change.

michaelklishin avatar Jul 22 '21 17:07 michaelklishin

@michaelklishin I tried to reproduce the scenario in RabbitMQ 3.8.22. When I removed the node from the cluster, all the queues hosted on it were deleted. Has this issue been solved? Below are my reproduction steps:

  1. rabbit@rabbitmqservice-0 and rabbit@rabbitmqservice-1 formed a cluster.
  2. Four queues (test, test2, test3, test4) were created across the two nodes:

    queue   durability  node
    test    durable     rabbit@rabbitmqservice-0
    test2   transient   rabbit@rabbitmqservice-0
    test3   durable     rabbit@rabbitmqservice-1
    test4   transient   rabbit@rabbitmqservice-1

  3. Stop rabbitmqservice-1:

    [rabbitmq@rabbitmqservice-1 rabbitmq-service]$ rabbitmqctl stop_app
    Stopping rabbit application on node rabbit@rabbitmqservice-1 ...
    [rabbitmq@rabbitmqservice-1 rabbitmq-service]$

     test3 queue was deleted, test4 queue was down.

  4. Remove rabbitmqservice-1 from the cluster:

    [rabbitmq@rabbitmqservice-0 /]$ rabbitmqctl -n rabbit@rabbitmqservice-0 forget_cluster_node rabbit@rabbitmqservice-1
    Removing node rabbit@rabbitmqservice-1 from the cluster

  5. Only the test and test2 queues were left:

    [rabbitmq@rabbitmqservice-0 /]$ rabbitmqctl list_queues
    Timeout: 60.0 seconds ...
    Listing queues for vhost / ...
    name    messages
    test    0
    test2   0

polaris-alioth avatar Nov 13 '21 09:11 polaris-alioth

@polaris-alioth we can't tell since you haven't shared any details on the properties of those queues. They might have been exclusive, for example. We do not guess in this community.

michaelklishin avatar Nov 15 '21 08:11 michaelklishin
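
The queue properties asked for above can be collected with `rabbitmqctl list_queues`, which accepts column names such as `name`, `durable`, `exclusive` and `pid`. The sketch below filters that output down to the queues hosted on one node. Two things are assumptions here: the helper name `queues_on_node` is made up, and the filter relies on the `pid` column being printed in the form `<node.x.y.z>`, which should be verified against the output of the RabbitMQ version in use.

```shell
#!/bin/sh
# queues_on_node NODE
# Reads tab-separated output of
#   rabbitmqctl -q list_queues name durable exclusive pid
# from stdin and prints name, durable and exclusive for every queue whose
# pid (last column) starts with "<NODE." (assumed pid format).
queues_on_node() {
  awk -F '\t' -v node="$1" '
    index($NF, "<" node ".") == 1 { print $1 "\t" $2 "\t" $3 }'
}

# Intended usage (requires a running cluster, so it is left commented out):
#   rabbitmqctl -q list_queues name durable exclusive pid \
#     | queues_on_node rabbit@rabbitmqservice-1
```

Header and status lines do not end in a matching pid, so they are skipped by the filter even without `-q`.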

@michaelklishin I created four queues, all non-mirrored:

  • test: durable, on rabbit@rabbitmqservice-0
  • test2: transient, on rabbit@rabbitmqservice-0
  • test3: durable, on rabbit@rabbitmqservice-1
  • test4: transient, on rabbit@rabbitmqservice-1

polaris-alioth avatar Nov 16 '21 02:11 polaris-alioth