MHA high availability configuration and failover of MySQL

MHA high availability configuration and failover of MySQL

1, MHA overview

1. Introduction to MHA

A set of excellent MySQL high availability environment failover and master-slave replication software

The emergence of MHA is to solve the problem of MySQL single point

In the process of MySQL failure, MHA can automatically complete the failover within 0 ~ 30 seconds

MHA can ensure the consistency of data to the greatest extent, so as to achieve high availability in the real sense

2. Composition of MHA

MHA Manager (management node)

MHA Manager can be deployed on a separate machine to manage multiple master slave sets

Group; It can also be deployed on a slave node.

The MHA Manager will periodically probe the master node in the cluster. When the master fails, it can

To automatically promote the slave of the latest data to the new master, and then re point all other slave

To the new master. The entire failover process is completely transparent to the application.

MHA Node

The MHA Node runs on each MySQL server

3. MHA features

During automatic failover, the MHA view saves binary logs from the down primary server to the greatest extent

Guaranteed data not lost

Using semi synchronous replication can greatly reduce the risk of data loss if only one slave has been received

The latest binary log. MHA can apply the latest binary log to all other slave servers

Therefore, the data consistency of all nodes can be guaranteed

At present, MHA supports a master-slave architecture with at least three services, i.e. one master and two slaves

2, MHA experiment

1. Experimental environment preparation

The serverIPInstall components
Master node server192.168.132.10mysql5.7. MHA node assembly
Slave 1 node server192.168.132.20mysql5.7. MHA node assembly
Slave2 node server192.168.132.30mysql5.7. MHA node assembly
MHA manager node server192.168.132.40MHA node and manager components

2. Experimental purpose
Monitor the MySQL server through MHA, and automatically switch when the server fails, without affecting the business
When the primary server goes down, the alternative primary server automatically becomes the primary server

3. Experimental steps
1. Turn off the firewall of all servers

systemctl stop firewalld
systemctl disable firewalld
setenforce 0

2. Install mysql5 on Master, Slave1 and Slave2 nodes seven

3. Modify the hostname of Master, Slave1 and Slave2 nodes

hostnamectl set-hostname Mysql1
hostnamectl set-hostname Mysql2
hostnamectl set-hostname Mysql3

4. Modify the Mysql main configuration file of Master, Slave1 and Slave2 nodes / etc / my cnf

##Master node##
vim /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin
log-slave-updates = true

systemctl restart mysqld

##slave1,Slave2 node##
vim /etc/my.cnf
server-id = 2 						#The server IDs of three servers cannot be the same
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

systemctl restart mysqld

5. Create two soft links on the Master, Slave1 and Slave2 nodes

ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

6. Configure mysql with one master and two slaves

(1) All database nodes are authorized with mysql

mysql -uroot -p
grant replication slave on *.* to 'myslave'@'192.168.132.%' identified by '123';		#Synchronize use from database
grant all privileges on *.* to 'mha'@'192.168.132.%' identified by 'manager';		#Authorize manager to use

grant all privileges on *.* to 'mha'@'Mysql1' identified by 'manager';				#Prevent the slave library from connecting to the master library through the host name
grant all privileges on *.* to 'mha'@'Mysql2' identified by 'manager';
grant all privileges on *.* to 'mha'@'Mysql3' identified by 'manager';
flush privileges;

(2) View binaries and synchronization points in the Master node

show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 | 1747     |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+

(3) Perform synchronization operations at Slave1 and Slave2 nodes

change master to master_host='192.168.132.10',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=2751; 

start slave;

(4) View the data synchronization results at Slave1 and Slave2 nodes

show slave status\G		
//Ensure that both IO and SQL threads are Yes, which means the synchronization is normal.
Slave_IO_Running: Yes
Slave_SQL_Running: Yes

(5) Both slave libraries must be set to read-only mode:

set global read_only=1;

(6) Insert data to test database synchronization

##stay Master Insert a piece of data into the main database to test whether it is synchronized##
create database test_db;
use test_db;
create table test(id int);
insert into test(id) values (1);

7. Install MHA software

(1)Installed on all servers MHA Dependent environment, first install epel Source (note need to use) yum Online source)
yum install epel-release --nogpgcheck -y

yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

(2)install MHA The software package must be installed on all servers first node assembly
 The version of each operating system is different, here CentOS7.4 You must select 0.57 edition.
Must be installed on all servers node Components, and finally MHA-manager Install on node manager Component, because manager rely on node Components.
cd /opt
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install

(3)stay MHA manager Install on node manager assembly
cd /opt
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install

################################################################
#After the manager component is installed, several tools will be generated under / usr/local/bin, mainly including the following:
masterha_check_ssh inspect MHA of SSH Configuration status
masterha_check_repl inspect MySQL Replication status
masterha_manger start-up manager Script for
masterha_check_status Detect current MHA running state 
masterha_master_monitor testing master Is it down
masterha_master_switch Control failover (automatic or manual)
masterha_conf_host Add or remove configured server information
masterha_stop  close manager

#After the node component is installed, several scripts will also be generated under / usr/local/bin (these tools are usually triggered by MHAManager scripts without manual operation). The main contents are as follows:
save_binary_logs Save and copy master Binary log for
apply_diff_relay_logs Identify differential relay log events and apply their differential events to other events slave
filter_mysqlbinlog Remove unnecessary ROLLBACK Events( MHA (this tool is no longer used)
purge_relay_logs Clear relay log (no blocking) SQL Thread)

8. Configure password less authentication on all servers

(1)stay manager Configure password less authentication to all database nodes on the node
ssh-keygen -t rsa 				#Press enter all the way
ssh-copy-id 192.168.132.10
ssh-copy-id 192.168.132.20
ssh-copy-id 192.168.132.30

(2)stay mysql1 Configure to database node on mysql2 and mysql3 Password free authentication
ssh-keygen -t rsa
ssh-copy-id 192.168.132.20
ssh-copy-id 192.168.132.30

(3)stay mysql2 Configure to database node on mysql1 and mysql3 No password authentication for
ssh-keygen -t rsa
ssh-copy-id 192.168.132.10
ssh-copy-id 192.168.132.30

(4)stay mysql3 Configure to database node on mysql1 and mysql2 Password free authentication
ssh-keygen -t rsa
ssh-copy-id 192.168.132.10
ssh-copy-id 192.168.132.20

9. Configure MHA on the manager node

(1)stay manager Copy related scripts on node to/usr/local/bin catalogue
cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
//After copying, there will be four execution files
ll /usr/local/bin/scripts/
###########################################
master_ip_failover  		#Script of VIP management during automatic switching
master_ip_online_change 	#Management of vip during online switching
power_manager 				#Script to shut down the host after failure
send_report 				#Script for sending alarms after failover

(2)When copying the above automatic switching VIP Managed script to /usr/local/bin Directory, used here master_ip_failover Script to manage VIP And failover
cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin



(3)The modified contents are as follows: (delete the original contents, copy and modify directly) vip Relevant parameters)
Full text comments will appear after copying and pasting, which can be in the last line mode( shift key+ : )Use 2 under,88 s/^#//g remove the comment character # at the beginning of each line
vim /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
#############################Add content section#########################################
my $vip = '192.168.132.200';									#Specify the address of the vip
my $brdc = '192.168.132.255';								#Specify the broadcast address of vip
my $ifdev = 'ens33';										#Specify the network card bound by vip
my $key = '1';												#Specify the serial number of virtual network card bound by vip
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";		#Represents that the value of this variable is ifconfig ens33:1 192.168.200.200
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";		#Represents that the value of this variable is ifconfig ens33:1 192.168.200.200 down
my $exit_code = 0;											#Specifies that the exit status code is 0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

 ## A simple system call that disable the VIP on the old_master

sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}



``
(4)establish MHA Software directory and copy the configuration file, which is used here app1.cnf Configuration files to manage mysql Node server

```c
mkdir /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha

vim /etc/masterha/app1.cnf						#Delete the original content, directly copy and modify the IP address of the node server
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.132.20 -s 192.168.132.30
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.132.10
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.132.20
port=3306

[server3]
hostname=192.168.132.30
port=3306
##################################################
[server default]
manager_log=/var/log/masterha/app1/manager.log      #manager log
manager_workdir=/var/log/masterha/app1.log        #manager working directory
master_binlog_dir=/usr/local/mysql/data/         #The location where the master saves binlog. The path here should be consistent with the path of binlog configured in the master so that MHA can find it
master_ip_failover_script=/usr/local/bin/master_ip_failover  #The switching script when setting automatic failover, that is, the script above
master_ip_online_change_script=/usr/local/bin/master_ip_online_change  #Set the switching script for manual switching
password=manager			#Set the password of the root user in mysql, which is the password of the monitoring user created in the previous article
ping_interval=1				#Set the time interval for monitoring the main database and sending ping packets. The default is 3 seconds. When there is no response after three attempts, the failover will be carried out automatically
remote_workdir=/tmp			#Set the save location of binlog when remote mysql switches
repl_password=123			#Set the password of the replication user
repl_user=myslave			#Set the user of the replication user
report_script=/usr/local/send_report     #Set the script of the alarm sent after switching
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.132.20 -s 192.168.132.30	#Specifies the IP address of the slave server to check
shutdown_script=""			#Set the script to close the failed host after the failure occurs (the main function of this script is to close the host to prevent brain crack, which is not used here)
ssh_user=root				#Set the login user name of ssh
user=mha					#Set monitoring user root

[server1]
hostname=192.168.132.10
port=3306

[server2]
hostname=192.168.132.20
port=3306
candidate_master=1
#Set as the candidate master. After setting this parameter, the slave database will be promoted to the master database after the master-slave switch occurs, even if the master database is not the latest slave in the cluster

check_repl_delay=0
#By default, if a slave lags behind the master by more than 100M of relay logs, MHA will not choose the slave as a new master, because the recovery of the slave takes a long time; By setting check_repl_delay=0, MHA triggers the switch. When selecting a new master, the replication delay will be ignored. This parameter is set to candidate_ The host with master = 1 is very useful because the candidate master must be a new master in the process of switching

[server3]
hostname=192.168.132.30
port=3306

10,The first configuration needs to be in Master Manually turn on virtual on node IP

/sbin/ifconfig ens33:192.168.132.200/24

11. Test ssh password free authentication on the manager node. If it is normal, it will output successfully

masterha_check_ssh -conf=/etc/masterha/app1.cnf

12. The mysql master node is connected normally, and the mysql replication slave node is connected normally.

masterha_check_repl -conf=/etc/masterha/app1.cnf

13. Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

###############################################
--remove_dead_master_conf: This parameter represents the of the old master database after master-slave switching ip Will be removed from the configuration file.
--manger_log: Log storage location.
--ignore_last_failover: By default, if MHA If continuous downtime is detected and the interval between two outages is less than 8 hours, it will not be Failover, The reason for this restriction is to avoid ping-pong effect. This parameter represents ignoring the last MHA Files generated by trigger switching. By default, MHA After switching, the log directory will be recorded in the log directory, that is, the log set above app1.failover.complete File. If it is found that the file exists in this directory during the next switching, the switching will not be allowed unless the file is deleted after the first switching. For convenience, it is set to--ignore_last_failover. 

14. Viewing the MHA status, you can see that the current master is Mysql1 node

masterha_check_status --conf=/etc/masterha/app1.cnf

15. Check the MHA log to see that the current master is 192.168.132.10

cat /var/log/masterha/app1/manager.log | grep "current master"

16. Check whether the VIP address 192.168.132.200 of Mysql1 exists. This VIP address will not disappear because the manager node stops MHA service.
17. Fault simulation

#Monitor observation log records on the manager node
tail -f /var/log/masterha/app1/manager.log

#Stop MySQL service on Master node Mysql1
systemctl stop mysqld
 or
pkill -9 mysql

#After a normal automatic switching, the MHA process will exit. HMA will automatically modify app1 CNF file content, delete the down mysql1 node. Check whether mysql2 takes over VIP
ifconfig

Algorithm of failover alternative master database:

1. Generally, the judgment from the library is to judge the advantages and disadvantages from (position/GTID). There are differences in the data, which is closest to

The slave of the master becomes an alternative master.

2. If the data is consistent, select the alternative main database according to the order of the configuration file.

3. Set the weight (candidate_master=1), and force to specify the alternative master according to the weight.

(1) By default, if a slave lags behind the master's 100m relay logs, even if it has the right to

If it is heavy, it will also fail.

(2) If check_ repl_ If delay = 0, even if it lags behind many logs, it is forced to choose it as an alternative

Lord.

18. Fault repair

1.repair mysql
systemctl restart mysqld

2.Repair master-slave
#Check binary files and synchronization points on the current main database server Mysql2
mysql -uroot -p    #Enter database
show master status;

#Perform synchronization on the original master database server mysql1
mysql -uroot -p    #Enter database
change master to master_host='192.168.132.20',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=2751;

start slave;

3.stay manager Modify profile on node app1.cnf(Add this record again, because it will disappear automatically when it fails)
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.132.10 -s 192.168.132.30
......
[server1]
hostname=192.168.132.20
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.132.10
port=3306

[server3]
hostname=192.168.132.30
port=3306

4.stay manager Start on node MHA
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

#Solve the problem of incompatibility and error reporting between Chinese and English characters
dos2unix /usr/local/bin/master_ip_failover  

Tags: Linux

Posted by daftdog on Sat, 16 Apr 2022 09:26:32 +0930