
Reconnection logic needs improvement

Open bachya opened this issue 4 years ago • 7 comments

I daemonize metrics2mqtt via the suggested method (using supervisor). I'm finding that when I restart my MQTT broker, metrics2mqtt errors out quite rapidly – so rapidly, in fact, that at some point, supervisor gives up. Example:

2020-08-12 21:52:20,388 - metrics2mqtt - ERROR - Error while trying to connect to MQTT broker.
2020-08-12 21:52:20,389 - metrics2mqtt - ERROR - [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/bin/metrics2mqtt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 201, in main
    stats.connect()
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 40, in connect
    self.client.connect(self.broker_host)
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 937, in connect
    return self.reconnect()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 1071, in reconnect
    sock = self._create_socket_connection()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 3522, in _create_socket_connection
    return socket.create_connection(addr, source_address=source, timeout=self._keepalive)
  File "/usr/local/lib/python3.8/socket.py", line 808, in create_connection
    raise err
  File "/usr/local/lib/python3.8/socket.py", line 796, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
2020-08-12 21:52:20,432 INFO exited: metrics2mqtt (exit status 1; not expected)
2020-08-12 21:52:23,438 INFO spawned: 'metrics2mqtt' with pid 36
2020-08-12 21:52:23,863 - metrics2mqtt - ERROR - Error while trying to connect to MQTT broker.
2020-08-12 21:52:23,863 - metrics2mqtt - ERROR - [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/bin/metrics2mqtt", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 201, in main
    stats.connect()
  File "/usr/local/lib/python3.8/site-packages/metrics2mqtt/base.py", line 40, in connect
    self.client.connect(self.broker_host)
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 937, in connect
    return self.reconnect()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 1071, in reconnect
    sock = self._create_socket_connection()
  File "/usr/local/lib/python3.8/site-packages/paho/mqtt/client.py", line 3522, in _create_socket_connection
    return socket.create_connection(addr, source_address=source, timeout=self._keepalive)
  File "/usr/local/lib/python3.8/socket.py", line 808, in create_connection
    raise err
  File "/usr/local/lib/python3.8/socket.py", line 796, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
2020-08-12 21:52:23,921 INFO exited: metrics2mqtt (exit status 1; not expected)
2020-08-12 21:52:24,923 INFO gave up: metrics2mqtt entered FATAL state, too many start retries too quickly

The only way to fix this is to restart supervisor.

After examining the relevant section of the code, I think the problem is that you raise an exception after logging the error message, so the process exits on every failed connection attempt; too many exits too quickly and supervisor gives up. I don't see that exception being caught anywhere.
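For illustration, here is a minimal sketch of catching the connection error and retrying with exponential backoff instead of exiting. It is not metrics2mqtt's actual code: `connect_with_backoff` and its parameters are hypothetical names, and the only thing assumed about the real project is that the connect step raises `ConnectionRefusedError` (an `OSError` subclass), as the traceback above shows.

```python
import logging
import time

_LOGGER = logging.getLogger("metrics2mqtt")


def connect_with_backoff(connect, initial_delay=1.0, max_delay=60.0,
                         max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, waiting longer after each failure.

    Hypothetical helper: `connect` is any zero-argument callable that
    raises OSError (e.g. ConnectionRefusedError) when the broker is down.
    """
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except OSError as err:
            # Give up only if the caller asked for a bounded number of tries.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            _LOGGER.error(
                "Connect attempt %d failed (%s); retrying in %.0f s",
                attempt, err, delay,
            )
            sleep(delay)
            # Double the wait each time, but never exceed max_delay.
            delay = min(delay * 2, max_delay)
```

Assuming the connect call from the traceback, usage might look like `connect_with_backoff(lambda: self.client.connect(self.broker_host))`. With `max_attempts` left at `None`, the process keeps retrying for as long as the broker is down rather than exiting, so supervisor never sees a burst of quick failures.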

bachya avatar Aug 13 '20 02:08 bachya

FYI, ran into this again today. Any thoughts?

bachya avatar Aug 24 '20 19:08 bachya

I have the same issue when trying to run it manually. However, in my case it is failing correctly, because my MQTT server is not listening on the default port and you cannot set the port. Are you sure you can access the MQTT server?

lipoja avatar Feb 05 '21 14:02 lipoja

@lipoja Definitely. And even after the MQTT server has been up and accessible for a while, this library never recovers.

bachya avatar Feb 06 '21 18:02 bachya

@bachya Do you have a patch for this issue? I wanted to give this a try, since I am currently running everything through MQTT at home. Next, I also have to check all the forks; I started refactoring the code as well, and I saw that you've already made some changes too.

Are you using it? Would you recommend it, or should I look for something else?

lipoja avatar Feb 07 '21 14:02 lipoja

@lipoja I don't personally have a patch for this; given this issue's stagnancy, I've gone back to using Glances.

bachya avatar Feb 07 '21 21:02 bachya

@bachya Oh, I'll have to check out Glances. It looks pretty good. Yes, you are right, it seems that this project has frozen. Thank you for the suggestion. Have a nice day :]

lipoja avatar Feb 07 '21 22:02 lipoja

The main reason I wanted to use this repo was that it seemed lightweight and used MQTT, but it seems Glances now also has an option to use MQTT. If anyone can help with setting that up in Home Assistant, preferably with auto 'discovery', I'd appreciate the tip!

seaniedan avatar May 12 '21 20:05 seaniedan