mha4mysql-manager

mha_manager doesn't select new master when one of slaves is dead

Open pzmijewski-neducatio opened this issue 7 years ago • 1 comments

I tried to find information about this issue but could not. The problem looks like this: I have a configuration with 3 servers:

[server1]
hostname=192.168.33.10
candidate_master=1

[server2]
hostname=192.168.33.11
candidate_master=1

[server3]
hostname=192.168.33.12
candidate_master=1

MHA works fine when only the master fails: it picks the first available slave on the list and promotes it to the new master. The problem appears when one of the slaves fails while the manager script is running. If the master then fails as well, no new master is selected, even though one working slave remains.

Here is the end of the log where the error appears:

Thu Oct 19 11:19:37 2017 - [info] MHA::MasterFailover version 0.57.
Thu Oct 19 11:19:37 2017 - [info] Starting master failover.
Thu Oct 19 11:19:37 2017 - [info]
Thu Oct 19 11:19:37 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Oct 19 11:19:37 2017 - [info]
Thu Oct 19 11:19:38 2017 - [info] GTID failover mode = 0
Thu Oct 19 11:19:38 2017 - [info] Dead Servers:
Thu Oct 19 11:19:38 2017 - [info] 192.168.33.11(192.168.33.11:3306)
Thu Oct 19 11:19:38 2017 - [info] 192.168.33.12(192.168.33.12:3306)
Thu Oct 19 11:19:38 2017 - [info] Checking master reachability via MySQL(double check)...
Thu Oct 19 11:19:38 2017 - [info] ok.
Thu Oct 19 11:19:38 2017 - [info] Alive Servers:
Thu Oct 19 11:19:38 2017 - [info] 192.168.33.10(192.168.33.10:3306)
Thu Oct 19 11:19:38 2017 - [info] Alive Slaves:
Thu Oct 19 11:19:38 2017 - [info] 192.168.33.10(192.168.33.10:3306) Version=10.2.9-MariaDB-10.2.9+maria~xenial-log (oldest major version between slaves) log-bin:enabled
Thu Oct 19 11:19:38 2017 - [info] Replicating from 192.168.33.11(192.168.33.11:3306)
Thu Oct 19 11:19:38 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Oct 19 11:19:38 2017 - [error][/usr/local/share/perl/5.22.1/MHA/ServerManager.pm, ln492] Server 192.168.33.12(192.168.33.12:3306) is dead, but must be alive! Check server settings.
Thu Oct 19 11:19:38 2017 - [error][/usr/local/share/perl/5.22.1/MHA/ManagerUtil.pm, ln178] Got ERROR: at /usr/local/share/perl/5.22.1/MHA/MasterFailover.pm line 268.

Any ideas?

pzmijewski-neducatio, Oct 19 '17 11:10

You can set 'ignore_fail=1' to solve this problem.
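
For illustration, here is a minimal sketch of how that could look with the configuration from the question. It assumes that ignore_fail is set inside each server's own section, as described in the MHA parameter documentation, and that any of the three servers may be the one that is down as a slave when the master fails. Which servers to mark is a judgment call for your setup, not something confirmed in this thread.

```ini
# Sketch only: the original three sections with ignore_fail added.
# ignore_fail=1 asks MHA to continue the failover even if this
# server is failing, instead of aborting with "is dead, but must be alive!".

[server1]
hostname=192.168.33.10
candidate_master=1
ignore_fail=1

[server2]
hostname=192.168.33.11
candidate_master=1
ignore_fail=1

[server3]
hostname=192.168.33.12
candidate_master=1
ignore_fail=1
```

The manager most likely has to be restarted to pick up the changed configuration file.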

JiangyueLiu, Sep 28 '20 00:09