MHA,增加管理群集(MySQL Replication[主从])+ 新的增强脚本
在前面的文章里面,我在自己的环境中,搭建了一个MHA的架构:
1 |
URL:http://d-prototype.com/archives/6859 |
当时,这个MHA中只包含一个MySQL的主从架构。
而MHA的架构本身不止于只管理一个MySQL主从架构,所以,本文将演示,如何让MHA接管新的MySQL主从架构。
一、MHA Node:新的MySQL主从架构
总体的配置这里不重复说明,详情可以参阅下面的文章:
1 |
URL:http://d-prototype.com/archives/5353 |
需要注意的是,新的MySQL主从架构中,最初的主库的server_id需要为:10
如下:
主库:
1 2 3 4 5 6 7 8 9 |
mysql> show variables like 'server_id'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | server_id | 10 | +---------------+-------+ 1 row in set (0.00 sec) mysql> |
从库:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 |
[root@mha4 ~]# mysql -u root -p'Abcd1@34'; mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 Server version: 5.7.15-9-log Percona Server (GPL), Release 9, Revision 9f0fd0a Copyright (c) 2009-2016 Percona LLC and/or its affiliates Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.158.1.149 Master_User: replme Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000006 Read_Master_Log_Pos: 154 Relay_Log_File: adamhuan_relay-bin.000005 Relay_Log_Pos: 367 Relay_Master_Log_File: mysql-bin.000006 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 154 Relay_Log_Space: 1506 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 Master_UUID: c8db6e7a-b147-11e6-8700-005056847ecb Master_Info_File: /var/lib/mysql/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) mysql> |
否则可能会报错:
1 2 3 4 5 6 7 8 9 10 11 12 |
[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 10:29:32 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 10:29:32 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 10:29:32 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 10:29:32 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 10:29:41 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln188] There is no alive server. We can't do failover Mon Feb 13 10:29:41 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 326 Mon Feb 13 10:29:41 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers. Mon Feb 13 10:29:41 2017 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK! [root@mha1 ~]# |
MySQL的用户权限的配置:
用户:root(所有节点)。
grant all privileges on *.* to root@‘10.158.1.94’ identified by ‘********’; #10.158.1.94,是我的MHA Manager
# MHA Node
grant all privileges on *.* to root@‘10.158.1.149’ identified by ‘********’;
grant all privileges on *.* to root@‘10.158.1.174’ identified by ‘********’;
用户:replme(MHA Node节点)
grant replication slave on *.* to ‘replme’@’10.158.1.173’ identified by ‘************’;
grant replication slave on *.* to ‘replme’@’10.158.1.174’ identified by ‘************’;
如果这里的权限没有设置好,那么会出现这样的问题:
1. repleme,设置不正确会导致MySQL Replication(主从)架构的同步不正常
2. root,配置不正确,会导致:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 10:42:16 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 10:42:16 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 10:42:16 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 10:42:16 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] Got MySQL error when connecting 10.158.1.149(10.158.1.149:3306) :1045:Access denied for user 'root'@'10.158.1.94' (using password: YES), but this is not a MySQL crash. Check MySQL server settings. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 297 Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] Got MySQL error when connecting 10.158.1.174(10.158.1.174:3306) :1045:Access denied for user 'root'@'10.158.1.94' (using password: YES), but this is not a MySQL crash. Check MySQL server settings. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 297 Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 326 Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers. Mon Feb 13 10:42:16 2017 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK! [root@mha1 ~]# |
SSH等价关系:
MHA Manager 与 MHA Node 互信。
ssh-copy-id -i .ssh/id_rsa.pub 10.158.1.94
这一部分,在MHA的搭建文档中已经有提到过,不重述。
软件包安装(MHA Node节点):
YUM安装:mha4mysql-node的软件包。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 |
[root@mha2 ~]# ls -ltr /etc/yum.repos.d/ total 40 -rw-r--r-- 1 root root 1056 Nov 5 2012 epel-testing.repo -rw-r--r-- 1 root root 957 Nov 5 2012 epel.repo -rw-r--r--. 1 root root 6259 May 19 2016 CentOS-Vault.repo -rw-r--r--. 1 root root 630 May 19 2016 CentOS-Media.repo -rw-r--r--. 1 root root 289 May 19 2016 CentOS-fasttrack.repo -rw-r--r--. 1 root root 647 May 19 2016 CentOS-Debuginfo.repo -rw-r--r--. 1 root root 1991 May 19 2016 CentOS-Base.repo -rw-r--r-- 1 root root 70 Nov 23 14:34 percona.repo -rw-r--r-- 1 root root 82 Dec 17 01:34 local.repo [root@mha2 ~]# [root@mha2 ~]# yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes Loaded plugins: fastestmirror, refresh-packagekit, security Setting up Install Process Loading mirror speeds from cached hostfile epel/metalink | 6.2 kB 00:00 * base: mirrors.aliyun.com * epel: mirrors.tuna.tsinghua.edu.cn * extras: mirrors.aliyun.com * updates: mirrors.aliyun.com LocalPercona | 2.9 kB 00:00 ... base | 3.7 kB 00:00 epel | 4.3 kB 00:00 epel/primary_db | 5.9 MB 00:15 extras | 3.4 kB 00:00 local | 2.9 kB 00:00 ... updates | 3.4 kB 00:00 Package perl-DBD-MySQL-4.013-3.el6.x86_64 already installed and latest version Package perl-Config-Tiny-2.12-7.1.el6.noarch already installed and latest version Package perl-Log-Dispatch-2.27-1.el6.noarch already installed and latest version Package perl-Parallel-ForkManager-0.7.9-1.el6.noarch already installed and latest version Package 4:perl-Time-HiRes-1.9721-141.el6_7.1.x86_64 already installed and latest version Nothing to do [root@mha2 ~]# [root@mha2 ~]# ls -ltr /software total 489720 -rw-rw-r-- 1 root root 1281236 Oct 18 16:16 Percona-Server-devel-57-5.7.15-9.1.el6.x86_64.rpm -rw-rw-r-- 1 root root 8404220 Oct 18 16:16 Percona-Server-client-57-5.7.15-9.1.el6.x86_64.rpm -rw-rw-r-- 1 root root 145811460 Oct 18 16:16 Percona-Server-57-debuginfo-5.7.15-9.1.el6.x86_64.rpm -rw-rw-r-- 1 root root 42409500 Oct 18 16:16 Percona-Server-server-57-5.7.15-9.1.el6.x86_64.rpm -rw-rw-r-- 1 root root 30371004 Oct 18 16:16 Percona-Server-test-57-5.7.15-9.1.el6.x86_64.rpm -rw-rw-r-- 1 root root 925228 Oct 18 16:16 Percona-Server-shared-57-5.7.15-9.1.el6.x86_64.rpm -rw-rw-r-- 1 root root 1573044 Oct 18 16:16 Percona-Server-tokudb-57-5.7.15-9.1.el6.x86_64.rpm -rw-r--r-- 1 root root 230789120 Nov 7 12:58 Percona-Server-5.7.15-9-r9f0fd0a-el6-x86_64-bundle.tar -rw-r--r-- 1 root root 87119 Nov 21 19:44 mha4mysql-manager-0.56-0.el6.noarch.rpm -rw-r--r-- 1 root root 36326 Nov 21 19:45 mha4mysql-node-0.56-0.el6.noarch.rpm -rw-r--r-- 1 root root 39758750 Nov 23 13:50 nomachine_5.1.62_1_x86_64.rpm drwxr-xr-x 2 root root 4096 Nov 23 14:27 repodata drwxr-xr-x 2 root root 4096 Nov 25 09:28 aria2 drwxr-xr-x 3 root root 4096 Dec 16 15:29 mysql-cluster [root@mha2 ~]# [root@mha2 ~]# rpm -qa | grep --color mha [root@mha2 ~]# [root@mha2 ~]# rpm -ivh /software/mha4mysql-node-0.56-0.el6.noarch.rpm Preparing... ########################################### [100%] 1:mha4mysql-node ########################################### [100%] [root@mha2 ~]# [root@mha2 ~]# rpm -qa | grep --color mha mha4mysql-node-0.56-0.el6.noarch [root@mha2 ~]# |
如果你没有在MHA Node安装上面的软件包,那么你会遇到下面的错误:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 10:49:50 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 10:49:50 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 10:49:50 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 10:49:50 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 10:49:56 2017 - [info] GTID failover mode = 0 Mon Feb 13 10:49:56 2017 - [info] Dead Servers: Mon Feb 13 10:49:56 2017 - [info] Alive Servers: Mon Feb 13 10:49:56 2017 - [info] 10.158.1.149(10.158.1.149:3306) Mon Feb 13 10:49:56 2017 - [info] 10.158.1.174(10.158.1.174:3306) Mon Feb 13 10:49:56 2017 - [info] Alive Slaves: Mon Feb 13 10:49:56 2017 - [info] 10.158.1.174(10.158.1.174:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 10:49:56 2017 - [info] Replicating from 10.158.1.149(10.158.1.149:3306) Mon Feb 13 10:49:56 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Feb 13 10:49:56 2017 - [info] Current Alive Master: 10.158.1.149(10.158.1.149:3306) Mon Feb 13 10:49:56 2017 - [info] Checking slave configurations.. Mon Feb 13 10:49:56 2017 - [info] Checking replication filtering settings.. Mon Feb 13 10:49:56 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Feb 13 10:49:56 2017 - [info] Replication filtering check ok. Mon Feb 13 10:49:56 2017 - [info] GTID (with auto-pos) is not supported Mon Feb 13 10:49:56 2017 - [info] Starting SSH connection tests.. Mon Feb 13 10:49:57 2017 - [info] All SSH connection tests passed successfully. Mon Feb 13 10:49:57 2017 - [info] Checking MHA Node version.. Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln122] Got error when getting node version. Error: Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln123] bash: apply_diff_relay_logs: command not found Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln150] node version on 10.158.1.174 not found! Is MHA Node package installed ? at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 374 Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. Died at /usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm line 151. Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers. Mon Feb 13 10:49:57 2017 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK! [root@mha1 ~]# |
这一部分,YUM的具体配置,MHA的搭建文档中也有过提及,不重述。
————————————————
如果,顺利完成了上面的步骤,那么MHA的MySQL这一部分的配置就完成了。
二、MHA Manager:配置
默认配置文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
[root@mha1 ~]# cat /etc/masterha_default.cnf [server default] user=root password=Abcd1@34 ssh_user=root master_binlog_dir=/var/lib/mysql remote_workdir=/work_dir/mha_node repl_user=replme repl_password=Or@cle123 master_ip_failover_script=/script/mha/master_ip_failover master_ip_online_change_script=/script/mha/master_ip_online_change ping_interval=1 [root@mha1 ~]# |
对于上面的这个问题件,要注意:
对于同一个MHA,MySQL的口令最好一样。
这样,默认的配置文件跟应用配置文件就可以很好的分开。
如果没有办法做到这一点,那么,默认配置与应用配置就要写成一个文件。
在我的环境中,可以看到我是分开管理的,因此我所有的被MHA管理的MySQL的口令都是一样的。
应用配置文件:
masterha_application_1.cnf,这是之前的搭建文档配置的应用配置文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
[root@mha1 ~]# cat /etc/masterha_application_1.cnf [server default] manager_workdir=/work_dir/mha_manager manager_log=/work_dir/mha_manager/mha_manager.log [server1] hostname=mha2 candidate_master=1 [server2] hostname=mha3 candidate_master=1 [server3] hostname=mha4 #candidate_master=1 no_master=1 [root@mha1 ~]# |
/etc/masterha_application_2_149_174.cnf,是本次(本文)新增的MySQL主从架构:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
[root@mha1 ~]# cat /etc/masterha_application_2_149_174.cnf [server default] manager_workdir=/work_dir/mha_manager manager_log=/work_dir/mha_manager/mha_manager_2.log [server1] hostname=10.158.1.149 candidate_master=1 [server2] hostname=10.158.1.174 candidate_master=1 [root@mha1 ~]# |
这样,MHA的配置的新增操作就完成了。
三、MHA测试:
测试SSH等价关系:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
[root@mha1 ~]# masterha_check_ssh --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 13:46:44 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 13:46:44 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 13:46:44 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 13:46:44 2017 - [info] Starting SSH connection tests.. Mon Feb 13 13:46:48 2017 - [debug] Mon Feb 13 13:46:45 2017 - [debug] Connecting via SSH from root@10.158.1.174(10.158.1.174:22) to root@10.158.1.149(10.158.1.149:22).. Mon Feb 13 13:46:48 2017 - [debug] ok. Mon Feb 13 13:46:53 2017 - [debug] Mon Feb 13 13:46:44 2017 - [debug] Connecting via SSH from root@10.158.1.149(10.158.1.149:22) to root@10.158.1.174(10.158.1.174:22).. Mon Feb 13 13:46:53 2017 - [debug] ok. Mon Feb 13 13:46:53 2017 - [info] All SSH connection tests passed successfully. [root@mha1 ~]# |
测试MySQL Replication(主从):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 |
[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 13:47:18 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 13:47:18 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 13:47:18 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 13:47:18 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 13:47:21 2017 - [info] GTID failover mode = 0 Mon Feb 13 13:47:21 2017 - [info] Dead Servers: Mon Feb 13 13:47:21 2017 - [info] Alive Servers: Mon Feb 13 13:47:21 2017 - [info] 10.158.1.149(10.158.1.149:3306) Mon Feb 13 13:47:21 2017 - [info] 10.158.1.174(10.158.1.174:3306) Mon Feb 13 13:47:21 2017 - [info] Alive Slaves: Mon Feb 13 13:47:21 2017 - [info] 10.158.1.174(10.158.1.174:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 13:47:21 2017 - [info] Replicating from 10.158.1.149(10.158.1.149:3306) Mon Feb 13 13:47:21 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Feb 13 13:47:21 2017 - [info] Current Alive Master: 10.158.1.149(10.158.1.149:3306) Mon Feb 13 13:47:21 2017 - [info] Checking slave configurations.. Mon Feb 13 13:47:21 2017 - [info] Checking replication filtering settings.. Mon Feb 13 13:47:21 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Feb 13 13:47:21 2017 - [info] Replication filtering check ok. Mon Feb 13 13:47:21 2017 - [info] GTID (with auto-pos) is not supported Mon Feb 13 13:47:21 2017 - [info] Starting SSH connection tests.. Mon Feb 13 13:47:22 2017 - [info] All SSH connection tests passed successfully. Mon Feb 13 13:47:22 2017 - [info] Checking MHA Node version.. Mon Feb 13 13:47:22 2017 - [info] Version check ok. Mon Feb 13 13:47:22 2017 - [info] Checking SSH publickey authentication settings on the current master.. Mon Feb 13 13:47:23 2017 - [info] HealthCheck: SSH to 10.158.1.149 is reachable. Mon Feb 13 13:47:23 2017 - [info] Master MHA Node version is 0.56. Mon Feb 13 13:47:23 2017 - [info] Checking recovery script configurations on 10.158.1.149(10.158.1.149:3306).. Mon Feb 13 13:47:23 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000006 Mon Feb 13 13:47:23 2017 - [info] Connecting to root@10.158.1.149(10.158.1.149:22).. Creating /work_dir/mha_node if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /var/lib/mysql, up to mysql-bin.000006 Mon Feb 13 13:47:23 2017 - [info] Binlog setting check done. Mon Feb 13 13:47:23 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Feb 13 13:47:23 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.158.1.174 --slave_ip=10.158.1.174 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 13:47:23 2017 - [info] Connecting to root@10.158.1.174(10.158.1.174:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to adamhuan_relay-bin.000005 Temporary relay log file is /var/lib/mysql/adamhuan_relay-bin.000005 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 13:47:24 2017 - [info] Slaves settings check done. Mon Feb 13 13:47:24 2017 - [info] 10.158.1.149(10.158.1.149:3306) (current master) +--10.158.1.174(10.158.1.174:3306) Mon Feb 13 13:47:24 2017 - [info] Checking replication health on 10.158.1.174.. Mon Feb 13 13:47:24 2017 - [info] ok. Mon Feb 13 13:47:24 2017 - [info] Checking master_ip_failover_script status: Mon Feb 13 13:47:24 2017 - [info] /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.158.1.149 --orig_master_ip=10.158.1.149 --orig_master_port=3306 Mon Feb 13 13:47:24 2017 - [info] OK. Mon Feb 13 13:47:24 2017 - [warning] shutdown_script is not defined. Mon Feb 13 13:47:24 2017 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK. [root@mha1 ~]# |
四、部署增强脚本
这一部分会跟之前的搭建文档不一样。
在搭建文档中,只涉及到一个MySQL主从,所以增强脚本只需要过滤一个manager即可。
而在现在的环境中,我们有两个MySQL主从,如果再用以前的方法过滤MHA Manager上的manager进程,将会列出所有的manager进程,影响所有的MHA接管的MySQL Replication。
先停掉当前的增强脚本:
杀掉【sh call_do_mha.sh】的进程,即可。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
[root@mha1 ~]# ps -ef | grep --color do_mha root 6626 24113 0 13:51 ? 00:00:00 sh /script/shell/do_mha.sh root 6633 14322 0 13:51 pts/2 00:00:00 grep --color do_mha root 24113 1 0 Feb07 ? 01:13:06 sh call_do_mha.sh [root@mha1 ~]# [root@mha1 ~]# ps -ef | grep --color nohup root 23093 14322 0 13:52 pts/2 00:00:00 grep --color nohup [root@mha1 ~]# [root@mha1 ~]# ps -ef | grep --color manager root 1476 1 0 2016 ? 00:00:18 /usr/sbin/modem-manager root 3376 1 0 Feb12 ? 00:03:49 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover root 12318 12181 0 2016 ? 00:02:19 gnome-power-manager root 32527 14322 0 13:52 pts/2 00:00:00 grep --color manager [root@mha1 ~]# [root@mha1 ~]# kill -9 24113 [root@mha1 ~]# [root@mha1 ~]# ps -ef | grep --color do_mha root 17354 14322 0 13:52 pts/2 00:00:00 grep --color do_mha [root@mha1 ~]# [root@mha1 ~]# ps -ef | grep --color nohup root 17376 14322 0 13:53 pts/2 00:00:00 grep --color nohup [root@mha1 ~]# [root@mha1 ~]# ps -ef | grep --color manager root 1476 1 0 2016 ? 00:00:18 /usr/sbin/modem-manager root 3376 1 0 Feb12 ? 00:03:49 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover root 12318 12181 0 2016 ? 00:02:19 gnome-power-manager root 17383 14322 0 13:53 pts/2 00:00:00 grep --color manager [root@mha1 ~]# |
过滤方法:
1 2 3 4 5 6 7 8 |
[root@mha1 shell]# ps -ef | grep masterha_manager root 3376 1 0 Feb12 ? 00:03:50 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover root 17635 14322 0 13:56 pts/2 00:00:00 grep masterha_manager [root@mha1 shell]# [root@mha1 shell]# ps -ef | grep masterha_manager | grep application_1 root 3376 1 0 Feb12 ? 00:03:50 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover [root@mha1 shell]# [root@mha1 shell]# |
新的脚本:
文件:do_mha.sh
使用方法:do_mha.sh
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 |
[root@mha1 shell]# pwd /script/shell [root@mha1 shell]# [root@mha1 shell]# ls -ltr total 3099176 -rw-r--r-- 1 root root 7068 Dec 19 23:46 do_mha.sh_orig -rw-r--r-- 1 root root 76 Dec 19 23:56 call_do_mha.sh -rw------- 1 root root 3173528130 Feb 13 13:52 nohup.out -rw-r--r-- 1 root root 7233 Feb 13 13:56 do_mha.sh [root@mha1 shell]# [root@mha1 shell]# cat do_mha.sh #!/bin/bash # Script Name: MySQL_MHA_Enhance_shell.sh # About: auto register in MySQL Replication # Script Type: Bash Shell # OS: RHEL6 / CentOS6 # Architecture: MySQL MasterHA # Auther: adamhuan # Blog: d-prototype.com # -------------------------- # variable and file path # Part:: this script str_mha_application=$1 # Part:: Linux # Account info str_linux_username="root" # Part:: MySQL # Account info str_mysql_username="root" str_mysql_password=Abcd1@34 str_repl_username="replme" str_repl_password=Or@cle123 # Part:: MHA # pid #str_pid_masterha_manager=`ps -ef | grep masterha_manager | grep perl | awk '{print $2}'` str_pid_masterha_manager=`ps -ef | grep masterha_manager | grep "$str_mha_application" | grep perl | awk '{print $2}'` # file file_conf_mha_global="/etc/masterha_default.cnf" #file_conf_mha_application="/etc/masterha_application_1.cnf" file_conf_mha_application="/etc/$str_mha_application.cnf" # file: relation / by computed file_log_mha_manager=`cat $file_conf_mha_application | grep --color manager_log | cut -d'=' -f2` # variable: ip info # 如果MHA中MySQL主库的候选服务器数量超过了两台,也许下面这个list参数,就会排上用场 list_ip_candicate=`cat $file_conf_mha_application | grep -B 2 "^candidate" | grep "hostname" | cut -d'=' -f2` str_ip_orig_master=`cat $file_log_mha_manager | grep --color "MySQL Master failover" | cut -d'(' -f2 | cut -d':' -f1 | tail -n 1` str_ip_new_master=`cat $file_log_mha_manager | grep --color "MySQL Master failover" | cut -d'(' -f3 | cut -d':' -f1 | tail -n 1` str_ip_mha_manager="10.158.1.94" # 为[change master]准备的参数 str_log_file_new_master="" str_log_pos_new_master="" # Part:: String SQL str_sql_mysql_change_master="" # -------------------------- # function function do_sql() { # variable func_str_ip="$1" func_str_sql="$2" # action # 本场景中不涉及到对MySQL某个库的操作,所以没有选择[db] # mysql -u $user -p"$password" $db -N -e "$f_sql_str" mysql -u $str_mysql_username -h $func_str_ip -p"$str_mysql_password" -N -e "$func_str_sql" } # 获取主库状态信息 #function get_info_mysql_master_new_master() { # version ONE #str_log_file_new_master=`do_sql "$str_ip_new_master" "show master status" | awk '{print $1}'` #str_log_pos_new_master=`do_sql "$str_ip_new_master" "show master status" | awk '{print $2}'` #} # 生成orig_master作为slave加入new_master的[change master]SQL命令 function gen_sql_mysql_change_master() { #if [[ "$str_log_file_new_master" == "" || $str_log_pos_new_master == "" ]] #then # get_info_mysql_master_new_master #fi #str_sql_mysql_change_master="CHANGE MASTER TO MASTER_HOST='$str_ip_new_master',MASTER_USER='$str_repl_username',MASTER_PASSWORD='$str_repl_password',MASTER_LOG_FILE='$str_log_file_new_master',MASTER_LOG_POS=$str_log_pos_new_master;" # version TWO func_temp_master_host_sed=`cat $file_log_mha_manager | grep --color "All other slaves should start" | tail -n 1 | cut -d',' -f1 | cut -d'=' -f2 | cut -d\' -f2` func_temp_repl_password_sed=`cat $file_log_mha_manager | grep --color "All other slaves should start" | tail -n 1 | rev | cut -d\' -f2` echo "======================" echo "@@ func variable: func_temp_repl_password_sed = $func_temp_repl_password_sed" echo "======================" str_sql_mysql_change_master=`cat $file_log_mha_manager | grep --color "All other slaves should start" | tail -n 1 | sed "s/'$func_temp_repl_password_sed'/'$str_repl_password'/g" | cut -d':' -f4` str_sql_mysql_change_master=`echo $str_sql_mysql_change_master | sed "s/'$func_temp_master_host_sed'/'$str_ip_new_master'/g"` } # 对指定主机执行Linux命令 # 前提: # 1. IP可达 # 2. SSH等价关系 function do_linux_by_ssh() { # variable func_str_ip="$1" func_str_user="$2" func_str_command="$3" # action ssh -t $func_str_user@$func_str_ip "$func_str_command" } # 处理VIP的事宜 function do_part_vip() { do_linux_by_ssh "$str_ip_new_master" "root" "service keepalived start" do_linux_by_ssh "$str_ip_orig_master" "root" "service keepalived stop" } function do_part_orig_master_is_new_slave() { do_linux_by_ssh "$str_ip_orig_master" "root" "service mysql start" do_sql "$str_ip_orig_master" "set global read_only=1;" do_sql "$str_ip_orig_master" "$str_sql_mysql_change_master" do_sql "$str_ip_orig_master" "start slave;" } function do_part_mha_master_manager_start() { do_linux_by_ssh "$str_ip_mha_manager" "root" "nohup masterha_manager --conf=$file_conf_mha_application --ignore_last_failover &" } # 如果PID不存在,则执行该脚本,否则,退出 function runable_by_mha_manager_pid() { echo "-----------------" echo "Script for MySQL Master HA" echo "-----------------" echo "Begin:: "`date "+|%Y-%m-%d|%H:%M:%S|"` if [[ "$str_pid_masterha_manager" == "" ]] then echo "## masterha_manager is [NOT ALIVED]." else echo "## masterha_manager is [ALIVED]." echo "[masterha_manager] PID is:: $str_pid_masterha_manager" # do something. echo "## Exit Script" exit 0 fi } # -------------------------- # action # 如果PID不存在,则执行该脚本,否则,退出 echo "------------------" echo "app: runable_by_mha_manager_pid" runable_by_mha_manager_pid echo "" echo "------------------" echo "app: gen_sql_mysql_change_master" gen_sql_mysql_change_master echo "" #echo "------------------" #echo "app: do_part_vip" #do_part_vip #echo "" echo "------------------" echo "app: do_part_orig_master_is_new_slave" do_part_orig_master_is_new_slave echo "" echo "------------------" echo "app: do_part_mha_master_manager_start" #do_part_mha_master_manager_start nohup masterha_manager --conf=$file_conf_mha_application --ignore_last_failover & echo "" # -------------------------- # Show time # --------- # version one # --------- #echo "new master is:: $str_ip_new_master" #echo "Master log file is:: $str_log_file_new_master" #echo "Master log POS is:: $str_log_pos_new_master" #echo "orig master --> new master ## SQL: CHANGE MASTER ## is:: $str_sql_mysql_change_master" # --------- # version two # --------- echo "=================" echo "MySQL info:" echo "## Account and Password" echo "username @ $str_mysql_username" echo "password @ $str_mysql_password" echo "--- for REPLICATION ---" echo "repl @ username ## $str_repl_username" echo "repl @ password ## $str_repl_password" echo "## Master Server info" echo "log file @ Master ## $str_log_file_new_master" echo "log pos @ Master ## $str_log_pos_new_master" echo "" echo "=================" echo "SQL statement:" echo "[CHANGE MASTER] -->" echo "$str_sql_mysql_change_master" echo "" echo "=================" echo "MasterHA info:" echo "## File and Path" echo "MHA Global config file @ $file_conf_mha_global" echo "MHA Application config file @ $file_conf_mha_application" echo "MHA Log file:: masterha_manager @ $file_log_mha_manager" echo "## Architecture" echo "Candicate Server list::" echo "$list_ip_candicate" echo "## IP" echo "MHA Manager Server:: $str_ip_mha_manager" echo "Last:: new master:: $str_ip_new_master" echo "Last:: orig master:: $str_ip_orig_master" echo "" # -------------------------- echo "-----------------" echo "Finished:: "`date "+|%Y-%m-%d|%H:%M:%S|"` # Done [root@mha1 shell]# |
文件:call_do_mha.sh
1 2 3 4 5 6 7 8 9 10 11 12 13 |
[root@mha1 shell]# cat call_do_mha.sh # name: call_do_mha.sh #variables str_mha_app=$1 while [ true ] do sh /script/shell/do_mha.sh $str_mha_app done [root@mha1 shell]# |
启动新的MHA增强脚本:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
[root@mha1 ~]# ps -ef | grep masterha_manager root 2481 2459 0 14:18 pts/0 00:00:00 grep masterha_manager [root@mha1 ~]# [root@mha1 ~]# ls -ltr /etc/ | grep masterha -rw-r--r-- 1 root root 250 Dec 13 13:50 masterha_application_1.cnf -rw-r--r-- 1 root root 312 Dec 13 20:31 masterha_default.cnf -rw-r--r-- 1 root root 212 Feb 13 10:25 masterha_application_2_149_174.cnf [root@mha1 ~]# [root@mha1 ~]# cd /script/shell/ [root@mha1 shell]# ls -ltr total 3099280 -rw-r--r-- 1 root root 7068 Dec 19 23:46 do_mha.sh_orig -rw-r--r-- 1 root root 117 Feb 13 13:59 call_do_mha.sh -rw-r--r-- 1 root root 7293 Feb 13 14:10 do_mha.sh -rw------- 1 root root 3173637364 Feb 13 14:12 nohup.out [root@mha1 shell]# 应用1: [root@mha1 shell]# nohup sh call_do_mha.sh masterha_application_1 & [1] 2493 nohup: ignoring input and appending output to `nohup.out' [root@mha1 shell]# [root@mha1 shell]# [root@mha1 shell]# ps -ef | grep --color masterha_manager root 2549 1 1 14:18 pts/0 00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover root 28289 2459 0 14:19 pts/0 00:00:00 grep --color masterha_manager [root@mha1 shell]# 应用2: [root@mha1 shell]# nohup sh call_do_mha.sh masterha_application_2_149_174 & [3] 16662 nohup: ignoring input and appending output to `nohup.out' [root@mha1 shell]# [root@mha1 shell]# [root@mha1 shell]# ps -ef | grep --color masterha_manager root 2549 1 0 14:18 pts/0 00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover root 3513 1 1 14:19 pts/0 00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_2_149_174.cnf --ignore_last_failover root 30124 2459 0 14:19 pts/0 00:00:00 grep --color masterha_manager [root@mha1 shell]# |
五,测试:新的增强脚本:
首先看看当前的两个MySQL主从的状态:
应用1:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 |
[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_1.cnf Mon Feb 13 14:24:32 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 14:24:32 2017 - [info] Reading application default configuration from /etc/masterha_application_1.cnf.. Mon Feb 13 14:24:32 2017 - [info] Reading server configuration from /etc/masterha_application_1.cnf.. Mon Feb 13 14:24:32 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 14:24:32 2017 - [info] GTID failover mode = 0 Mon Feb 13 14:24:32 2017 - [info] Dead Servers: Mon Feb 13 14:24:32 2017 - [info] Alive Servers: Mon Feb 13 14:24:32 2017 - [info] mha2(10.158.1.95:3306) Mon Feb 13 14:24:32 2017 - [info] mha3(10.158.1.96:3306) Mon Feb 13 14:24:32 2017 - [info] mha4(10.158.1.97:3306) Mon Feb 13 14:24:32 2017 - [info] Alive Slaves: Mon Feb 13 14:24:32 2017 - [info] mha2(10.158.1.95:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 14:24:32 2017 - [info] Replicating from 10.158.1.96(10.158.1.96:3306) Mon Feb 13 14:24:32 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Feb 13 14:24:32 2017 - [info] mha4(10.158.1.97:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 14:24:32 2017 - [info] Replicating from 10.158.1.96(10.158.1.96:3306) Mon Feb 13 14:24:32 2017 - [info] Not candidate for the new Master (no_master is set) Mon Feb 13 14:24:32 2017 - [info] Current Alive Master: mha3(10.158.1.96:3306) Mon Feb 13 14:24:32 2017 - [info] Checking slave configurations.. Mon Feb 13 14:24:32 2017 - [info] Checking replication filtering settings.. Mon Feb 13 14:24:32 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Feb 13 14:24:32 2017 - [info] Replication filtering check ok. Mon Feb 13 14:24:32 2017 - [info] GTID (with auto-pos) is not supported Mon Feb 13 14:24:32 2017 - [info] Starting SSH connection tests.. Mon Feb 13 14:24:34 2017 - [info] All SSH connection tests passed successfully. Mon Feb 13 14:24:34 2017 - [info] Checking MHA Node version.. Mon Feb 13 14:24:34 2017 - [info] Version check ok. Mon Feb 13 14:24:34 2017 - [info] Checking SSH publickey authentication settings on the current master.. Mon Feb 13 14:24:34 2017 - [info] HealthCheck: SSH to mha3 is reachable. Mon Feb 13 14:24:35 2017 - [info] Master MHA Node version is 0.56. Mon Feb 13 14:24:35 2017 - [info] Checking recovery script configurations on mha3(10.158.1.96:3306).. Mon Feb 13 14:24:35 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000028 Mon Feb 13 14:24:35 2017 - [info] Connecting to root@10.158.1.96(mha3:22).. Creating /work_dir/mha_node if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /var/lib/mysql, up to mysql-bin.000028 Mon Feb 13 14:24:35 2017 - [info] Binlog setting check done. Mon Feb 13 14:24:35 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Feb 13 14:24:35 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha2 --slave_ip=10.158.1.95 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 14:24:35 2017 - [info] Connecting to root@10.158.1.95(mha2:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to relay-bin.000002 Temporary relay log file is /var/lib/mysql/relay-bin.000002 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 14:24:35 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha4 --slave_ip=10.158.1.97 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 14:24:35 2017 - [info] Connecting to root@10.158.1.97(mha4:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to relay-bin.000002 Temporary relay log file is /var/lib/mysql/relay-bin.000002 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 14:24:36 2017 - [info] Slaves settings check done. Mon Feb 13 14:24:36 2017 - [info] mha3(10.158.1.96:3306) (current master) +--mha2(10.158.1.95:3306) +--mha4(10.158.1.97:3306) Mon Feb 13 14:24:36 2017 - [info] Checking replication health on mha2.. Mon Feb 13 14:24:36 2017 - [info] ok. Mon Feb 13 14:24:36 2017 - [info] Checking replication health on mha4.. Mon Feb 13 14:24:36 2017 - [info] ok. Mon Feb 13 14:24:36 2017 - [info] Checking master_ip_failover_script status: Mon Feb 13 14:24:36 2017 - [info] /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=mha3 --orig_master_ip=10.158.1.96 --orig_master_port=3306 Mon Feb 13 14:24:36 2017 - [info] OK. Mon Feb 13 14:24:36 2017 - [warning] shutdown_script is not defined. Mon Feb 13 14:24:36 2017 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK. [root@mha1 shell]# |
可以看到:
1 2 3 4 |
Mon Feb 13 14:24:36 2017 - [info] mha3(10.158.1.96:3306) (current master) +--mha2(10.158.1.95:3306) +--mha4(10.158.1.97:3306) |
应用2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 |
[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 14:25:18 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 14:25:18 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 14:25:18 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 14:25:18 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 14:25:18 2017 - [info] GTID failover mode = 0 Mon Feb 13 14:25:18 2017 - [info] Dead Servers: Mon Feb 13 14:25:18 2017 - [info] Alive Servers: Mon Feb 13 14:25:18 2017 - [info] 10.158.1.149(10.158.1.149:3306) Mon Feb 13 14:25:18 2017 - [info] 10.158.1.174(10.158.1.174:3306) Mon Feb 13 14:25:18 2017 - [info] Alive Slaves: Mon Feb 13 14:25:18 2017 - [info] 10.158.1.174(10.158.1.174:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 14:25:18 2017 - [info] Replicating from 10.158.1.149(10.158.1.149:3306) Mon Feb 13 14:25:18 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Feb 13 14:25:18 2017 - [info] Current Alive Master: 10.158.1.149(10.158.1.149:3306) Mon Feb 13 14:25:18 2017 - [info] Checking slave configurations.. Mon Feb 13 14:25:18 2017 - [info] Checking replication filtering settings.. Mon Feb 13 14:25:18 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Feb 13 14:25:18 2017 - [info] Replication filtering check ok. Mon Feb 13 14:25:18 2017 - [info] GTID (with auto-pos) is not supported Mon Feb 13 14:25:18 2017 - [info] Starting SSH connection tests.. Mon Feb 13 14:25:24 2017 - [info] All SSH connection tests passed successfully. Mon Feb 13 14:25:24 2017 - [info] Checking MHA Node version.. Mon Feb 13 14:25:24 2017 - [info] Version check ok. Mon Feb 13 14:25:24 2017 - [info] Checking SSH publickey authentication settings on the current master.. Mon Feb 13 14:25:24 2017 - [info] HealthCheck: SSH to 10.158.1.149 is reachable. Mon Feb 13 14:25:25 2017 - [info] Master MHA Node version is 0.56. Mon Feb 13 14:25:25 2017 - [info] Checking recovery script configurations on 10.158.1.149(10.158.1.149:3306).. Mon Feb 13 14:25:25 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000006 Mon Feb 13 14:25:25 2017 - [info] Connecting to root@10.158.1.149(10.158.1.149:22).. Creating /work_dir/mha_node if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /var/lib/mysql, up to mysql-bin.000006 Mon Feb 13 14:25:25 2017 - [info] Binlog setting check done. Mon Feb 13 14:25:25 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Feb 13 14:25:25 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.158.1.174 --slave_ip=10.158.1.174 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 14:25:25 2017 - [info] Connecting to root@10.158.1.174(10.158.1.174:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to adamhuan_relay-bin.000005 Temporary relay log file is /var/lib/mysql/adamhuan_relay-bin.000005 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 14:25:26 2017 - [info] Slaves settings check done. Mon Feb 13 14:25:26 2017 - [info] 10.158.1.149(10.158.1.149:3306) (current master) +--10.158.1.174(10.158.1.174:3306) Mon Feb 13 14:25:26 2017 - [info] Checking replication health on 10.158.1.174.. Mon Feb 13 14:25:26 2017 - [info] ok. Mon Feb 13 14:25:26 2017 - [info] Checking master_ip_failover_script status: Mon Feb 13 14:25:26 2017 - [info] /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.158.1.149 --orig_master_ip=10.158.1.149 --orig_master_port=3306 Mon Feb 13 14:25:26 2017 - [info] OK. Mon Feb 13 14:25:26 2017 - [warning] shutdown_script is not defined. Mon Feb 13 14:25:26 2017 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK. [root@mha1 shell]# |
可以看到:
1 2 3 |
Mon Feb 13 14:25:26 2017 - [info] 10.158.1.149(10.158.1.149:3306) (current master) +--10.158.1.174(10.158.1.174:3306) |
测试:停掉当前的主库的服务
应用1:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
[root@mha3 ~]# mysql -u root -p'***************' mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2165 Server version: 5.7.15-9-log Percona Server (GPL), Release 9, Revision 9f0fd0a Copyright (c) 2009-2016 Percona LLC and/or its affiliates Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> mysql> show master status; +------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-------------------+ | mysql-bin.000028 | 154 | | | | +------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) mysql> show slave status\G Empty set (0.00 sec) mysql> mysql> exit Bye [root@mha3 ~]# [root@mha3 ~]# service mysql stop Stopping mysqld: [ OK ] [root@mha3 ~]# [root@mha3 ~]# service mysql status mysqld (pid 2680) is running... [root@mha3 ~]# |
应用2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
[root@mha2 ~]# mysql -u root -p'***************'; mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 24 Server version: 5.7.15-9-log Percona Server (GPL), Release 9, Revision 9f0fd0a Copyright (c) 2009-2016 Percona LLC and/or its affiliates Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> show master status; +------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-------------------+ | mysql-bin.000006 | 1916 | | | | +------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) mysql> show slave status\G Empty set (0.00 sec) mysql> mysql> exit Bye [root@mha2 ~]# [root@mha2 ~]# service mysql stop Stopping mysqld: [ OK ] [root@mha2 ~]# [root@mha2 ~]# service mysql status mysqld (pid 1038) is running... [root@mha2 ~]# |
可以看到,上面的服务被停掉之后,就自己起来了。(这是如预期的)
再次查看MySQL主从的状态:
应用1:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 |
[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_1.cnf Mon Feb 13 14:29:29 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 14:29:29 2017 - [info] Reading application default configuration from /etc/masterha_application_1.cnf.. Mon Feb 13 14:29:29 2017 - [info] Reading server configuration from /etc/masterha_application_1.cnf.. Mon Feb 13 14:29:29 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 14:29:36 2017 - [info] GTID failover mode = 0 Mon Feb 13 14:29:36 2017 - [info] Dead Servers: Mon Feb 13 14:29:36 2017 - [info] Alive Servers: Mon Feb 13 14:29:36 2017 - [info] mha2(10.158.1.95:3306) Mon Feb 13 14:29:36 2017 - [info] mha3(10.158.1.96:3306) Mon Feb 13 14:29:36 2017 - [info] mha4(10.158.1.97:3306) Mon Feb 13 14:29:36 2017 - [info] Alive Slaves: Mon Feb 13 14:29:36 2017 - [info] mha3(10.158.1.96:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 14:29:36 2017 - [info] Replicating from 10.158.1.95(10.158.1.95:3306) Mon Feb 13 14:29:36 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Feb 13 14:29:36 2017 - [info] mha4(10.158.1.97:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 14:29:36 2017 - [info] Replicating from 10.158.1.95(10.158.1.95:3306) Mon Feb 13 14:29:36 2017 - [info] Not candidate for the new Master (no_master is set) Mon Feb 13 14:29:36 2017 - [info] Current Alive Master: mha2(10.158.1.95:3306) Mon Feb 13 14:29:36 2017 - [info] Checking slave configurations.. Mon Feb 13 14:29:36 2017 - [info] Checking replication filtering settings.. Mon Feb 13 14:29:36 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Feb 13 14:29:36 2017 - [info] Replication filtering check ok. Mon Feb 13 14:29:36 2017 - [info] GTID (with auto-pos) is not supported Mon Feb 13 14:29:36 2017 - [info] Starting SSH connection tests.. Mon Feb 13 14:29:38 2017 - [info] All SSH connection tests passed successfully. Mon Feb 13 14:29:38 2017 - [info] Checking MHA Node version.. Mon Feb 13 14:29:38 2017 - [info] Version check ok. Mon Feb 13 14:29:38 2017 - [info] Checking SSH publickey authentication settings on the current master.. Mon Feb 13 14:29:38 2017 - [info] HealthCheck: SSH to mha2 is reachable. Mon Feb 13 14:29:38 2017 - [info] Master MHA Node version is 0.56. Mon Feb 13 14:29:38 2017 - [info] Checking recovery script configurations on mha2(10.158.1.95:3306).. Mon Feb 13 14:29:38 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000054 Mon Feb 13 14:29:38 2017 - [info] Connecting to root@10.158.1.95(mha2:22).. Creating /work_dir/mha_node if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /var/lib/mysql, up to mysql-bin.000054 Mon Feb 13 14:29:39 2017 - [info] Binlog setting check done. Mon Feb 13 14:29:39 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Feb 13 14:29:39 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha3 --slave_ip=10.158.1.96 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 14:29:39 2017 - [info] Connecting to root@10.158.1.96(mha3:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to relay-bin.000002 Temporary relay log file is /var/lib/mysql/relay-bin.000002 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 14:29:40 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha4 --slave_ip=10.158.1.97 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 14:29:40 2017 - [info] Connecting to root@10.158.1.97(mha4:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to relay-bin.000002 Temporary relay log file is /var/lib/mysql/relay-bin.000002 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 14:29:41 2017 - [info] Slaves settings check done. Mon Feb 13 14:29:41 2017 - [info] mha2(10.158.1.95:3306) (current master) +--mha3(10.158.1.96:3306) +--mha4(10.158.1.97:3306) Mon Feb 13 14:29:41 2017 - [info] Checking replication health on mha3.. Mon Feb 13 14:29:41 2017 - [info] ok. Mon Feb 13 14:29:41 2017 - [info] Checking replication health on mha4.. Mon Feb 13 14:29:41 2017 - [info] ok. Mon Feb 13 14:29:41 2017 - [info] Checking master_ip_failover_script status: Mon Feb 13 14:29:41 2017 - [info] /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=mha2 --orig_master_ip=10.158.1.95 --orig_master_port=3306 Mon Feb 13 14:29:41 2017 - [info] OK. Mon Feb 13 14:29:41 2017 - [warning] shutdown_script is not defined. Mon Feb 13 14:29:41 2017 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK. [root@mha1 shell]# |
可以看到:
1 2 3 4 |
Mon Feb 13 14:29:41 2017 - [info] mha2(10.158.1.95:3306) (current master) +--mha3(10.158.1.96:3306) +--mha4(10.158.1.97:3306) |
应用2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf Mon Feb 13 14:32:00 2017 - [info] Reading default configuration from /etc/masterha_default.cnf.. Mon Feb 13 14:32:00 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 14:32:00 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf.. Mon Feb 13 14:32:00 2017 - [info] MHA::MasterMonitor version 0.56. Mon Feb 13 14:32:00 2017 - [info] GTID failover mode = 0 Mon Feb 13 14:32:00 2017 - [info] Dead Servers: Mon Feb 13 14:32:00 2017 - [info] Alive Servers: Mon Feb 13 14:32:00 2017 - [info] 10.158.1.149(10.158.1.149:3306) Mon Feb 13 14:32:00 2017 - [info] 10.158.1.174(10.158.1.174:3306) Mon Feb 13 14:32:00 2017 - [info] Alive Slaves: Mon Feb 13 14:32:00 2017 - [info] 10.158.1.149(10.158.1.149:3306) Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled Mon Feb 13 14:32:00 2017 - [info] Replicating from 10.158.1.174(10.158.1.174:3306) Mon Feb 13 14:32:00 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Feb 13 14:32:00 2017 - [info] Current Alive Master: 10.158.1.174(10.158.1.174:3306) Mon Feb 13 14:32:00 2017 - [info] Checking slave configurations.. Mon Feb 13 14:32:00 2017 - [info] read_only=1 is not set on slave 10.158.1.149(10.158.1.149:3306). Mon Feb 13 14:32:00 2017 - [info] Checking replication filtering settings.. Mon Feb 13 14:32:00 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Feb 13 14:32:00 2017 - [info] Replication filtering check ok. Mon Feb 13 14:32:00 2017 - [info] GTID (with auto-pos) is not supported Mon Feb 13 14:32:00 2017 - [info] Starting SSH connection tests.. Mon Feb 13 14:32:01 2017 - [info] All SSH connection tests passed successfully. Mon Feb 13 14:32:01 2017 - [info] Checking MHA Node version.. Mon Feb 13 14:32:01 2017 - [info] Version check ok. Mon Feb 13 14:32:01 2017 - [info] Checking SSH publickey authentication settings on the current master.. Mon Feb 13 14:32:01 2017 - [info] HealthCheck: SSH to 10.158.1.174 is reachable. Mon Feb 13 14:32:01 2017 - [info] Master MHA Node version is 0.56. Mon Feb 13 14:32:01 2017 - [info] Checking recovery script configurations on 10.158.1.174(10.158.1.174:3306).. Mon Feb 13 14:32:01 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000007 Mon Feb 13 14:32:01 2017 - [info] Connecting to root@10.158.1.174(10.158.1.174:22).. Creating /work_dir/mha_node if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /var/lib/mysql, up to mysql-bin.000007 Mon Feb 13 14:32:02 2017 - [info] Binlog setting check done. Mon Feb 13 14:32:02 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Feb 13 14:32:02 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.158.1.149 --slave_ip=10.158.1.149 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Feb 13 14:32:02 2017 - [info] Connecting to root@10.158.1.149(10.158.1.149:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to mysql-relay.000003 Temporary relay log file is /var/lib/mysql/mysql-relay.000003 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Feb 13 14:32:03 2017 - [info] Slaves settings check done. Mon Feb 13 14:32:03 2017 - [info] 10.158.1.174(10.158.1.174:3306) (current master) +--10.158.1.149(10.158.1.149:3306) Mon Feb 13 14:32:03 2017 - [info] Checking replication health on 10.158.1.149.. Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln485] Slave IO thread is not running on 10.158.1.149(10.158.1.149:3306) Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526] failed! Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 417 Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers. Mon Feb 13 14:32:03 2017 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK! [root@mha1 shell]# [root@mha1 shell]# |
可以看到:
1 2 3 |
Mon Feb 13 14:32:03 2017 - [info] 10.158.1.174(10.158.1.174:3306) (current master) +--10.158.1.149(10.158.1.149:3306) |
都做了切换。
当前,MHA Manager的manager服务状态:
1 2 3 4 5 6 7 8 9 10 11 12 |
[root@mha1 shell]# ps -ef | grep masterha_manager root 7851 1 0 14:27 pts/0 00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover root 21892 1 0 14:28 pts/0 00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_2_149_174.cnf --ignore_last_failover root 31057 2459 0 14:34 pts/0 00:00:00 grep masterha_manager [root@mha1 shell]# [root@mha1 shell]# ps -ef | grep --color do_mha root 2493 2459 0 14:18 pts/0 00:00:06 sh call_do_mha.sh masterha_application_1 root 3576 2493 0 14:34 pts/0 00:00:00 sh /script/shell/do_mha.sh masterha_application_1 root 3579 3576 0 14:34 pts/0 00:00:00 sh /script/shell/do_mha.sh masterha_application_1 root 3595 2459 0 14:34 pts/0 00:00:00 grep --color do_mha root 31368 2459 0 14:19 pts/0 00:00:07 sh call_do_mha.sh masterha_application_2_149_174 [root@mha1 shell]# |
至此,为MHA新增MySQL Replication(主从),完成。
——————————————
Done。
Keep Alive不是必须的,但是有更好,… 上面的描述中,缺少的这一部分的描述,其实和搭建文档中的做法一样。