Do not return nodes in maintenance mode as active stream replicas

Open gerhard opened this issue 4 years ago • 0 comments

After the rabbitmq-upgrade drain command runs successfully, the node is placed in maintenance mode, which closes all of its client-facing TCP sockets, including stream ones. The stream coordinator should therefore not return nodes in maintenance mode in the stream metadata, because clients will fail to connect to them. This is how the Go Stream client currently fails to connect:

2021/09/06 16:09:43 [info] - Silent (0.13-alpha) Simulation, url: [rabbitmq-stream://Z5U5eSZbPmaWvuJUPnOQab1R_WF7SaZS:Y0quG0xQGiZhrvzXrHL2b4LRqHvf_ntL@rabbitmq:5552/%2f] publishers: 0 consumers: 10 streams: [stream-large]
2021/09/06 16:09:43 [info] - Declaring streams: [stream-large]
2021/09/06 16:09:43 [info] - stream stream-large, meta data: leader rabbitmq-server-1.rabbitmq-nodes.lre-3-9:5552, followers rabbitmq-server-0.rabbitmq-nodes.lre-3-9:5552rabbitmq-server-2.rabbitmq-nodes.lre-3-9:5552
2021/09/06 16:09:43 [info] - End Init streams :[stream-large]
2021/09/06 16:09:43 [info] - Starting 10 consumers...
2021/09/06 16:09:43 [info] - Starting consumer number: stream-large-0, form first
2021/09/06 16:09:44 [info] - Starting consumer number: stream-large-1, form first
2021/09/06 16:09:44 [error] - Error creating consumer: dial tcp 10.118.1.70:5552: connect: connection refused
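
To make the requested behaviour concrete, here is a minimal Go sketch of the filtering step: given the replicas of a stream, return only those not in maintenance mode. This is not the server's code (the stream coordinator lives in the Erlang server); the Replica type, its Maintenance field, and the activeReplicas function are hypothetical names used only for illustration, and which node is drained in the example is an assumption.

```go
package main

import "fmt"

// Replica is a hypothetical view of one stream member.
// The Maintenance field is an assumption: it stands for
// "this node has been drained with rabbitmq-upgrade drain".
type Replica struct {
	Host        string
	Port        int
	Maintenance bool
}

// activeReplicas returns only the replicas that are not in
// maintenance mode, preserving their original order.
func activeReplicas(all []Replica) []Replica {
	active := make([]Replica, 0, len(all))
	for _, r := range all {
		if !r.Maintenance {
			active = append(active, r)
		}
	}
	return active
}

func main() {
	// Followers taken from the log above; marking server-0 as
	// drained is an assumption made for this example.
	followers := []Replica{
		{Host: "rabbitmq-server-0.rabbitmq-nodes.lre-3-9", Port: 5552, Maintenance: true},
		{Host: "rabbitmq-server-2.rabbitmq-nodes.lre-3-9", Port: 5552},
	}
	for _, r := range activeReplicas(followers) {
		fmt.Printf("%s:%d\n", r.Host, r.Port)
	}
}
```

With a filter like this applied before the metadata response is built, a client would never be handed the address of a node whose stream listener is closed.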

We should probably also exclude nodes in maintenance mode from the output of the rabbitmq-streams command. Create a new issue if you think this is valid.

Originally posted by @gerhard & @Gsantomaggio in https://github.com/rabbitmq/opportunities/issues/99

gerhard, Sep 06 '21 16:09