stomp.py
Reconnect Never Uses Failover?
I am using this client to interface with an AmazonMQ (ActiveMQ) active/standby setup. I have provided the connection host_and_ports for both the active and standby connections.
self.connection = Connection(host_and_ports=hosts_and_ports, use_ssl=True, reconnect_attempts_max=1000)
self.connection.connect(username, password, wait=True)
To test the failover, I put the connection in an infinite loop where I send a message to the queue, then read from the queue, with a 1s sleep in between.
While that's running, I reboot the AmazonMQ active/standby, which first reboots the active, then the standby, sequentially so at any given point at least 1 ActiveMQ instance is available.
My expectation is that soon after the active instance disconnects, the connection will swap over to the standby. At the very least I would expect (with such a high reconnect_attempts_max) that the connection would be reestablished once the active comes back online. But this is not the case: the connection does not fail over, and it never comes back once it is disconnected.
Am I using this wrong?
Do I need to manually reconnect in the on_disconnected of the ConnectionListener? If so, what is the purpose of providing multiple connections in host_and_ports in the first place?
@micah-williamson - Did you ever come up with the solution to this? Seems like I'm walking in your footsteps right now and came to the same dead end. 😞
For additional color, configuring the connection for n hosts_and_ports (where n is greater than 1) is not enough to configure failover. On initial connect it round robins through the connections until it finds your active broker but will not auto re-connect on failover. I wrote my on_disconnected method to retry the same initial connection and it would give up after two attempts. Adding the following to Connection was...better...:
reconnect_sleep_initial=5, reconnect_sleep_increase=0.5, reconnect_sleep_jitter=0.1, reconnect_sleep_max=120.0, reconnect_attempts_max=10
But still not great. During an AMQ reboot I would expect it to disconnect and reconnect twice (with minimal downtime) to wind up back on the initial broker, but it kept dropping and reconnecting even with the reconnect better configured. It did eventually reconnect to the initial broker though. In a couple of tests, however, it would still fail to reconnect and would eventually throw an exception.
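To get a feel for how long a client configured with the values above might wait between attempts, here is a minimal sketch. It assumes a simple geometric backoff capped at the maximum; the function name backoff_delays is hypothetical, and the formula only illustrates what the parameters mean, it is not necessarily stomp.py's exact internal computation. Jitter is left out so the result is deterministic.

```python
def backoff_delays(initial, increase, maximum, attempts):
    """Return the wait in seconds before each reconnect attempt.

    Assumed model (not taken from stomp.py's source):
        delay_n = min(maximum, initial * (1 + increase) ** n)
    """
    return [min(maximum, initial * (1.0 + increase) ** n) for n in range(attempts)]


# Same values as the Connection arguments quoted above, jitter ignored.
delays = backoff_delays(initial=5, increase=0.5, maximum=120.0, attempts=10)
total = sum(delays)
```

Under this assumed model the delays start at 5s and 7.5s, reach the 120s cap by the ninth attempt, and the ten attempts together span roughly eight minutes.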
Code below for reference if it helps or shows someone else where I've gone wrong. 😂
import time
import stomp


def connect_and_subscribe(conn):
    # reconnect_attempts_max is a Connection option (set below); connect() treats
    # extra keyword arguments as STOMP headers, so it is not passed here.
    conn.connect('test-user', 'test-password', wait=True)
    conn.subscribe(destination='/queue/test', id=1, ack='auto')


class MyListener(stomp.ConnectionListener):
    def __init__(self, conn):
        self.conn = conn

    def on_error(self, frame):
        print('received an error "%s"' % frame.body)

    def on_message(self, frame):
        print('Message: ' + frame.body)

    def on_connected(self, frame):
        print('Connected...')

    def on_connecting(self, host_and_port):
        print('Connecting to: ' + host_and_port[0] + '...')

    def on_disconnected(self):
        # Retry the same connect/subscribe sequence whenever the broker drops us.
        print('Disconnected...')
        connect_and_subscribe(self.conn)


conn = stomp.Connection(
    [('ENDPOINT-1.mq.us-east-2.amazonaws.com', 61614), ('ENDPOINT-2.mq.us-east-2.amazonaws.com', 61614)],
    use_ssl=True, heartbeats=(4000, 4000),
    reconnect_sleep_initial=5, reconnect_sleep_increase=0.5, reconnect_sleep_jitter=0.1,
    reconnect_sleep_max=120.0, reconnect_attempts_max=10)
conn.set_listener('', MyListener(conn))
connect_and_subscribe(conn)
time.sleep(600)  # keep the process alive for 10 minutes while the brokers reboot
print('Disconnecting...')
conn.disconnect()
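For anyone who wants to drive the failover by hand instead of relying on the built-in reconnect, here is a minimal sketch of rotating through the broker list until one accepts a connection. The names reconnect_with_failover and try_connect are hypothetical, not part of stomp.py; try_connect stands in for whatever code opens a connection to a single (host, port) and raises on failure. The sleep function is injectable so the loop can be exercised without a broker.

```python
import itertools
import time


def reconnect_with_failover(hosts, try_connect, max_attempts=10, delay=1.0, sleep=time.sleep):
    """Cycle through hosts until try_connect succeeds or attempts run out.

    hosts:       list of (host, port) tuples, e.g. the active and standby brokers.
    try_connect: hypothetical callable taking (host, port); it should raise on
                 failure and return normally on success.
    Returns the (host, port) that accepted the connection.
    """
    last_error = None
    for attempt, host_and_port in enumerate(itertools.cycle(hosts), start=1):
        try:
            try_connect(*host_and_port)
            return host_and_port
        except Exception as exc:  # a real client would catch a narrower type
            last_error = exc
        if attempt >= max_attempts:
            break
        sleep(delay)  # fixed pause between attempts; a backoff could go here
    raise ConnectionError(f"no broker reachable after {max_attempts} attempts") from last_error
```

In the script above, try_connect could wrap building a connection for one endpoint followed by connect and subscribe. That wiring is left out here because the details depend on the stomp.py version in use.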
@JohnKeippel I checked my implementation and it doesn't look like I came up with anything. We abandoned MQ shortly afterwards, after getting on a call with AWS and finding there is a hard limit of 5 Lambda workers per MQ ESM. AWS treats MQ as no more than a checklist feature. Wish I could offer something actually useful.
Hey, no problem at all. Thanks for the quick reply! Not excited about supporting it either but it is what it is.
AWS treats MQ as no more than a checklist feature.
It really could not be more obvious.
@JohnKeippel Is there any update? I have the same problem: I started a failover ActiveMQ setup on localhost, but the client does not work.
conn = stomp.Connection([('localhost',61613), ('localhost',61614)], reconnect_sleep_initial=5, reconnect_sleep_increase=0.5, reconnect_sleep_jitter=0.1, reconnect_sleep_max=120.0, reconnect_attempts_max=10)
@JohnKeippel @cocowool @micah-williamson Were you able to figure out a solution? I am facing the same issue. I have a broker network and failover configured in the XML on Amazon MQ (ActiveMQ), but the producer and subscriber do not seem to reconnect to another broker when a broker restarts. Also, were you able to figure out how to make the producer retry or switch brokers if a broker connection fails?
@cocowool @a-n-s - Sorry, I didn't have any luck and then was laid off from that role and instantly lost interest in AMQ. 😂
For what it's worth the plan was to eventually retire AMQ entirely for this and other reasons. Good luck!