MAD crashes when can't connect to MySQL server

Open kamieniarz opened this issue 4 years ago • 6 comments

This issue is not new, and I'm not the only one affected by it (just search Discord for the error below).

[11-21 06:16:29.04] [       MainThread] [  MITMDataProcessor:38  ] [   ERROR] An error has been caught in function 'run', process 'MITMReceiver-0' (30365), thread 'MainThread' (140234008209216):
Traceback (most recent call last):
  File "start.py", line 234, in <module>
    mitm_mapper, args, mapping_manager, db_wrapper)
  File "/home/user/MAD/mitm_receiver/MITMReceiver.py", line 102, in __init__
    data_processor.start()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 74, in _launch
    code = process_obj._bootstrap()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
> File "/home/user/MAD/mitm_receiver/MITMDataProcessor.py", line 38, in run
    self.process_data(item[0], item[1], item[2])
  File "/home/user/MAD/mitm_receiver/MITMDataProcessor.py", line 73, in process_data
    origin, data["payload"], received_timestamp)
  File "<string>", line 2, in submit_weather_map_proto
  File "/usr/lib/python3.7/multiprocessing/managers.py", line 811, in _callmethod
    raise convert_to_error(kind, result)
mysql.connector.errors.InterfaceError: Can not reconnect to MySQL after 1 attempt(s): 2003: Can't connect to MySQL server on 'localhost:3306' (111 Connection refused)

So this error causes a complete MAD crash: it stops working entirely until the instance is restarted and the devices are rebooted, or RGC is restarted manually. The cause of the error is (probably) a scheduled restart of MySQL, which is enabled by default. I checked the MySQL error log, and this is what happened:

2019-11-21  6:16:28 0 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown
2019-11-21  6:16:28 0 [Note] Event Scheduler: Purging the queue. 0 events
2019-11-21  6:16:28 0 [Note] InnoDB: FTS optimize thread exiting.
2019-11-21  6:16:29 0 [Note] InnoDB: Starting shutdown...
2019-11-21  6:16:29 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2019-11-21  6:16:29 0 [Note] InnoDB: Instance 0, restricted to 2048 pages due to innodb_buf_pool_dump_pct=25
2019-11-21  6:16:29 0 [Note] InnoDB: Buffer pool(s) dump completed at 191121  6:16:29
2019-11-21  6:16:30 0 [Note] InnoDB: Shutdown completed; log sequence number 240614269565; transaction id 125258618
2019-11-21  6:16:30 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2019-11-21  6:16:30 0 [Note] /usr/sbin/mysqld: Shutdown complete

Six seconds after this shutdown sequence, MySQL began its startup sequence. The whole thing (from the beginning of shutdown to a fully operational state) took exactly 11 seconds in my case, but MAD hit the error right at the start of the MySQL shutdown.

As far as I remember, this has happened to me 2 or 3 times in the last 3 months, so restarting everything is not a big deal, BUT it's still an issue that should be fixed.

kamieniarz avatar Nov 21 '19 11:11 kamieniarz

What kind of behaviour do you expect from MAD once the DB is gone? I mean MAD heavily relies on the DB being alive 24/7 or at least while MAD is running.

I highly recommend disabling that scheduled restart of your database. There's almost no reason to perform such restarts anyway.

sn0opy avatar Nov 21 '19 11:11 sn0opy

I only expect MAD not to crash. I can disable restarts, sure, but it's still an issue that was mentioned several times on Discord. It's not an urgent thing obviously

kamieniarz avatar Nov 21 '19 11:11 kamieniarz

Well, instead of crashing, we could just shut down MAD. It makes no sense to keep MAD running without a DB connection.

sn0opy avatar Nov 21 '19 11:11 sn0opy

GitHub conversations are the best :P Couldn't hurt to have MAD stop and attempt to reconnect, could it? I would think a more realistic scenario is that your SQL server is on a VPS and there is internet/server fluctuation. MAD crashing over minor, unavoidable instability wouldn't be super fun.

Boby360 avatar Nov 21 '19 11:11 Boby360

We are also affected by this bug. Maybe it is possible to handle the exception and try to reconnect after X minutes until the DB is back.
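
A minimal sketch of that retry idea, not MAD's actual code: `call_with_retry` and all of its parameters are hypothetical names made up for illustration. It simply re-runs a callable until it stops raising a "DB unreachable" exception, waiting a fixed delay between attempts.

```python
import time


def call_with_retry(operation, retry_on=(ConnectionError,), delay_seconds=60.0,
                    max_attempts=None, sleep=time.sleep):
    """Call operation() until it succeeds.

    retry_on      -- exception types treated as "database unreachable" (assumption)
    delay_seconds -- fixed wait between failed attempts
    max_attempts  -- give up and re-raise after this many tries; None retries forever
    sleep         -- injectable so the wait can be replaced in tests
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return operation()
        except retry_on:
            # Re-raise once the attempt budget is used up, otherwise wait and retry.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay_seconds)
```

In MAD's case, `retry_on` would presumably be `mysql.connector.errors.InterfaceError` (the exception in the traceback above), and the wrapped call would be something like the DB submit in `process_data`. Where exactly the wrapper belongs is an open question, since the exception is re-raised through a multiprocessing manager proxy.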

TobiTo83 avatar Feb 15 '20 10:02 TobiTo83

Just a friendly reminder - not fixed after a year

kamieniarz avatar Nov 27 '20 21:11 kamieniarz