replication-manager
After a master switchover, the old master becomes a slave, but the web UI still shows MySQL in Failed status
Hi,
After a master switchover, I restarted the old master and manually made it a slave with this command:
change master to master_user= 'repl',master_host='10.0.0.xxx',master_password='xxx',master_port=3306,MASTER_AUTO_POSITION=1;
The slave status checks out OK.
But the web UI still shows this MySQL server in Failed status, even though systemctl status replication-manager.service reports OK.
How can I solve this? Thanks.
OK, I think I fixed this in a recent commit. Which releases of repman and MySQL are you running, so that I can try to reproduce?
Hi,
replication-manager-osc version: Replication Manager v2.1.7 for MariaDB 10.x and MySQL 5.7 Series, Full Version: v2.1.7, Build Time: 2021-07-20T06:42:24+0000
mysql version: mysqld Ver 5.7.22 for linux-glibc2.12 on x86_64 (MySQL Community Server (GPL))
proxysql version: ProxySQL version 2.2.0-72-ge14accd, codename Truls
OS version: CentOS Linux release 7.9.2009 (Core)
Can you try to reproduce on the latest 2.2 ...
Hi, I installed following https://docs.signal18.io/installation/setup-instructions but yum install replication-manager-osc fails with this error:
To address this issue please refer to the below wiki article https://wiki.centos.org/yum-errors
If above article doesn't help to resolve this issue please use https://bugs.centos.org/.
Error downloading packages:
1651663550:replication-manager-osc-2.2.20-1.x86_64: [Errno 256] No more mirrors to try.
Please try again.
Hi, yum install replication-manager-osc is working now. Thanks.
Hi,
It still shows the same Failed status. It looks like replication-manager needs to re-check the slave status after the slave has been fixed.
Replication on the old master, now a slave, is OK.
The web UI still shows MySQL in Failed status. I tested logging out and back in; still the same.
replication-manager-osc version: Replication Manager v2.2.20 for MariaDB 10.x and MySQL 5.7 Series, Full Version: v2.2.20, Build Time: 2022-05-04T11:25:50+0000
OK, that's an issue on my side, so I'll look deeper into it. Thanks for reporting.
Are you sure you have restarted replication-manager? I cannot reproduce this. I did find other types of issues, such as reloading a dump failing because of this new warning:
"mysql: [Warning] Using a password on the command line interface can be insecure."
Please adapt the following in your setup to match the mysql package installed on the repman server:
backup-mysqlbinlog-path = "/Users/apple/mysql/bin/mysqlbinlog"
backup-mysqldump-path = "/Users/apple/mysql/bin/mysqldump"
backup-mysqldump-options = "--hex-blob --single-transaction --verbose --all-databases"
backup-mysqlclient-path = "/Users/apple/mysql/bin/mysql"
Hi, yes, I have restarted replication-manager. The backup-mydumper* path parameters are OK.
These are my test steps:
1. Current status: 10.0.0.162 is master, 10.0.0.231 is slave.
2. Stop the MySQL master and fail over (HA).
On 10.0.0.162, stop the master MySQL: systemctl stop mysqld
Fail over (HA).
Failover OK. Current status: 10.0.0.231 is the new master.
3. Fix the old MySQL master so that it becomes a slave.
On 10.0.0.162, start MySQL: systemctl start mysqld
Make 10.0.0.162 a slave: change master to master_user= 'repl',master_host='10.0.0.231',master_password='Repl_2019',master_port=3306,MASTER_AUTO_POSITION=1;
Check that the slave status is OK: show slave status\G
Current status: 10.0.0.162 is slave, 10.0.0.231 is master.
4. Check the replication-manager web display. 10.0.0.162 still shows Failed status.
Current status: 10.0.0.162 is slave, 10.0.0.231 is master.
There are no new log messages, and the status stays the same.
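The "check slave status" step above can be made repeatable with a small script. The following is only a sketch: it parses the vertical text output of show slave status\G and checks the two replication threads. The sample text is illustrative and is not the reporter's actual output; the function names are hypothetical.

```python
def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict."""
    fields = {}
    for line in text.splitlines():
        # Skip blank lines and the "*** 1. row ***" separator.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """True when both replication threads run and no error number is set."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Last_IO_Errno", "0") == "0"
        and fields.get("Last_SQL_Errno", "0") == "0"
    )


# Illustrative sample only (assumption), shaped like MySQL 5.7 output.
SAMPLE = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.231
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                Last_IO_Errno: 0
               Last_SQL_Errno: 0
"""

if __name__ == "__main__":
    status = parse_slave_status(SAMPLE)
    print(status["Master_Host"], replication_healthy(status))
```

A check like this only confirms that replication itself is healthy on the database side; it says nothing about why the web display still reports Failed.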
cat config.toml
[db3306]
title = "db3306"
db-servers-hosts = "10.0.0.231:3306,10.0.0.162:3306"
db-servers-prefered-master = "10.0.0.231:3306"
db-servers-credential = "dbadmin:Nfjd_1234"
replication-credential = "repl:Repl_2019"
failover-mode = "manual"
proxysql=true
proxysql-servers="10.0.0.231,10.0.0.162"
proxysql-port=6033
proxysql-admin-port=6032
proxysql-writer-hostgroup="10"
proxysql-reader-hostgroup="20"
proxysql-user="cluster1"
proxysql-password="secret1pass"
proxysql-bootstrap=false
proxysql-bootstrap-hostgroups=false
proxysql-bootstrap-users=false
[Default]
include = "/etc/replication-manager/cluster.d"
monitoring-save-config = false
monitoring-datadir = "/var/lib/replication-manager"
#monitoring-sharedir = "/usr/share/replication-manager"
monitoring-ignore-errors = "WARN0091,WARN0084"
# Timeout in seconds between consecutive monitoring
monitoring-ticker = 2
#########
# LOG
#########
log-file = "/var/log/replication-manager.log"
log-heartbeat = false
log-syslog = false
log-rotate-max-age = 1
log-rotate-max-backup = 7
log-rotate-max-size = 10
#log-sql-in-monitoring = true
#################
# ARBITRATION
#################
arbitration-external = false
arbitration-external-secret = "13787932529099014144"
arbitration-external-hosts = "88.191.151.84:80"
arbitration-peer-hosts ="127.0.0.1:10002"
# Unique value on each replication-manager
arbitration-external-unique-id = 0
##########
# HTTP
##########
http-server = true
http-bind-address = "0.0.0.0"
http-port = "10001"
http-auth = false
http-session-lifetime = 3600
http-bootstrap-button = false
http-refresh-interval = 4000
#########
# API
#########
api-credentials = "admin:repman"
api-port = "10005"
api-https-bind = false
api-credentials-acl-allow = "admin:cluster proxy db prov,dba:cluster proxy db,foo:"
api-credentials-acl-discard = false
api-credentials-external = "dba:repman,foo:bar"
############
# ALERTS
############
mail-from = "replication-manager@localhost"
mail-smtp-addr = "localhost:25"
mail-to = "[email protected]"
mail-smtp-password=""
mail-smtp-user=""
alert-slack-channel = "#support"
alert-slack-url = ""
alert-slack-user = "svar"
##########
# STATS
##########
graphite-metrics = false
graphite-carbon-host = "127.0.0.1"
graphite-carbon-port = 2003
graphite-embedded = false
graphite-carbon-api-port = 10002
graphite-carbon-server-port = 10003
graphite-carbon-link-port = 7002
graphite-carbon-pickle-port = 2004
graphite-carbon-pprof-port = 7007
backup-mydumper-path = "/bin/mydumper"
backup-myloader-path = "/bin/myloader"
backup-mysqlbinlog-path = "/bin/mysqlbinlog"
backup-mysqldump-path = "/bin/mysqldump"
backup-mysqldump-options = "--hex-blob --single-transaction --verbose --all-databases"
##############
# BENCHMARK
##############
sysbench-binary-path = "/usr/bin/sysbench"
sysbench-threads = 4
sysbench-time = 100
sysbench-v1 = true
Hi, I ran strace on the replication-manager PID and found "Resource temporarily unavailable" messages:
ps -ef |grep replication
root 29233 1 3 11:24 ? 00:00:27 /usr/bin/replication-manager-osc monitor
strace -T -tt -s 100 -o strace.log -p 29233
Here is strace.log:
11:31:39.340051 epoll_pwait(4, [], 128, 0, NULL, 31357470) = 0 <0.000040>
11:31:39.340194 nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0 <0.000088>
11:31:39.340370 futex(0xc000492550, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000065>
11:31:39.340537 read(37, 0xc0010184f1, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.000045>
11:31:39.340714 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.000937>
11:31:39.341761 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.009686>
11:31:39.351582 epoll_pwait(4, [], 128, 0, NULL, 31357470) = 0 <0.000067>
11:31:39.351821 nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0 <0.000133>
11:31:39.352100 futex(0xc000600150, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000078>
11:31:39.352301 read(37, 0xc0010184f1, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.000038>
11:31:39.352445 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.002761>
11:31:39.355319 epoll_pwait(4, [], 128, 0, NULL, 31357470) = 0 <0.000064>
11:31:39.355538 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000027>
11:31:39.355691 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.028624>
11:31:39.384457 epoll_pwait(4, [], 128, 0, NULL, 31357470) = 0 <0.000068>
11:31:39.384643 nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0 <0.000119>
11:31:39.384924 futex(0xc000600150, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000066>
11:31:39.385077 read(37, 0xc0010184f1, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.000033>
11:31:39.385193 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.000176>
11:31:39.385439 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.395056>
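When reading a trace like this, it can help to tally which system calls return an error before drawing conclusions. The sketch below assumes the line format produced by strace -T -tt as shown in this thread; the regular expression and function name are hypothetical helpers, and the sample lines are copied from the log above. Note that EAGAIN on a read from a non-blocking descriptor generally just means no data was ready yet, so these entries do not by themselves indicate a fault.

```python
import re
from collections import Counter

# Matches "<time> <syscall>(<args>) = <ret> [<ERRNO> (...)] <duration>".
LINE_RE = re.compile(
    r"^\S+\s+(?P<call>\w+)\(.*\)\s+=\s+(?P<ret>-?\d+)(?:\s+(?P<errno>E[A-Z]+))?"
)


def summarize(lines):
    """Count calls per syscall, and failures per (syscall, errno) pair."""
    total = Counter()
    failed = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # ignore lines that are not complete syscall records
        total[match["call"]] += 1
        if match["errno"]:
            failed[(match["call"], match["errno"])] += 1
    return total, failed


# Lines taken from the strace.log excerpt in this thread.
SAMPLE = [
    "11:31:39.340051 epoll_pwait(4, [], 128, 0, NULL, 31357470) = 0 <0.000040>",
    "11:31:39.340537 read(37, 0xc0010184f1, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.000045>",
    "11:31:39.355538 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000027>",
    "11:31:39.355691 futex(0x3442e70, FUTEX_WAIT_PRIVATE, 0, NULL) = 0 <0.028624>",
]

if __name__ == "__main__":
    total, failed = summarize(SAMPLE)
    print(dict(total))
    print(dict(failed))
```

To run it on the full file, pass open("strace.log") to summarize instead of SAMPLE.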
Interesting. It would really help to get the full replication-manager.log as an attachment.
Now, from the few logs you sent us, there are things that do not work as expected. The defaults of replication-manager were made for MariaDB and are not correct for MySQL.
First, in the config you declare the parameter
backup-mysqlbinlog-path = "/bin/mysqlbinlog"
But in your log we see that it is not called from there; it tries inside /usr/local instead.
--backup-logical-type string type of logical backup: river|mysqldump|mydumper (default "mysqldump")
It is supposed to default to mysqldump, which dumps via pipes directly from the master to the rejoining server, whereas you are set up for mydumper. If you want to use mydumper, please change it to mydumper.
If the mydumper logical backup method is used, you first need to create a master backup with replication-manager so that replication-manager can use it during rejoin. Just click on the master menu and choose backup. Make sure you have room in the /var/lib/replication/backup directory and that the backup works before moving on to the more complex rejoin feature.
Later on it is possible to:
- Auto-schedule the backup time via the replication-manager scheduler
- Store and archive those individual backups in S3
--autorejoin Automatic rejoin a failed master (default true)
NC
--autorejoin-backup-binlog backup ahead binlogs events when old master rejoin (default true)
Please set this to false while testing. Backing up the delta is important, but from your logs it is failing so far.
--autorejoin-flashback Automatic rejoin ahead failed master via binlog flashback
NC is false
--autorejoin-flashback-on-sync Automatic rejoin flashback if election status is semisync SYNC (default true)
I'm not sure binlog flashback is implemented in your MySQL release. MariaDB did it first, 5 years ago, and the code is written for MariaDB. If you can't make it work with MySQL, please set this to false.
--autorejoin-flashback-on-unsync Automatic rejoin flashback if election status is semisync NOT SYNC
NC is false
--autorejoin-logical-backup Automatic rejoin ahead failed master via reseed previous logical backup
It is false by default, so please set it to true so that a previously created master backup gets restored (if you want to keep mydumper and myloader, that looks like a good idea in my opinion).
--autorejoin-mysqldump Automatic rejoin ahead failed master via direct current master dump
Default is false. I'm fixing this now, as a new warning logged to stderr in >= 5.7 is breaking the process. This is the option that should be set to true if you need to rejoin with a stream.
--autorejoin-physical-backup Automatic rejoin ahead failed master via reseed previous phyiscal backup
To be activated, this needs either a dedicated cron to process some jobs, or SSH login enabled from repman to the remote database, with xtrabackup and socat, or mariadb backup, installed there.
--autorejoin-script string Path of old master rejoin script
This calls a local script with some parameters, letting you do the job yourself instead of having it orchestrated by repman.
My theory about the issue is that the code checks all the rejoin methods, can't find any, and gets stuck waiting for one of them before continuing. I have indeed never tested the case where nothing is set up.
Also, can you explain your testing methodology? Do you stop the master, wait for failover, proceed with the failover and restart the master, or do you follow another procedure?
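To make the theory about rejoin methods concrete, here is a minimal sketch that lists which rejoin methods end up enabled for a given set of config overrides. It does not reproduce replication-manager's actual code. The defaults are the ones quoted from the option list in this thread; the defaults for autorejoin-physical-backup and autorejoin-script are not stated there and are assumptions, as is the function name.

```python
# Defaults as quoted in this thread's option list.
DEFAULTS = {
    "autorejoin": True,
    "autorejoin-flashback": False,
    "autorejoin-logical-backup": False,
    "autorejoin-mysqldump": False,
    "autorejoin-physical-backup": False,  # assumption: not stated in the thread
    "autorejoin-script": "",              # assumption: empty means no script
}

# Options that each select one way of rejoining a failed master.
REJOIN_METHODS = [
    "autorejoin-flashback",
    "autorejoin-logical-backup",
    "autorejoin-mysqldump",
    "autorejoin-physical-backup",
    "autorejoin-script",
]


def enabled_rejoin_methods(overrides):
    """Return the rejoin methods enabled once overrides are applied."""
    config = {**DEFAULTS, **overrides}
    if not config["autorejoin"]:
        return []  # autorejoin switched off: no method is attempted
    # A method counts as enabled when its value is truthy
    # (True for booleans, a non-empty path for the script).
    return [name for name in REJOIN_METHODS if config[name]]


if __name__ == "__main__":
    print(enabled_rejoin_methods({}))
    print(enabled_rejoin_methods({"autorejoin-mysqldump": True}))
```

Under these assumed defaults, an untouched configuration has autorejoin on but no method selected, which is the situation the theory describes.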
Hi, my testing methodology: stop the master, wait for failover, then manually fix the old master and make it a slave. I expect replication-manager to pick up the old master's status again automatically and rejoin it to the cluster, so that a new failover can proceed.
About manually fixing the old master and making it a slave: doing this automatically is repman's job. If your plan is to do it yourself, please set --autorejoin=false.
Hi, I saw you closed this. Did --autorejoin=false fix the issue?
Hi,
I added the parameter autorejoin = false to config.toml and tested it. It doesn't work.
After one failover, I manually fixed the old master and made it a slave. replication-manager still can't reconnect to the old master, and the cluster log has stopped.
I think this is a duplicate of https://github.com/signal18/replication-manager/issues/434. Please install mysql-server on the replication-manager server as well, and tell replication-manager where to find the mysql and mysqldump clients.
Here is what i used for reproducing
autorejoin-mysqldump = true
backup-mysqlbinlog-path = "/Users/apple/mysql/bin/mysqlbinlog"
backup-mysqldump-path = "/Users/apple/mysql/bin/mysqldump"
backup-mysqlclient-path = "/Users/apple/mysql/bin/mysql"
Please report if you can reproduce the issue with 2.2.24
It was also a nightmare to fix, as mysqldump on 8.0 reports warnings directly in the dump files :( I think this is solved as of the last minor release.