MySQL High-Availability Solution
1. MySQL High-Availability Architecture
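Compared with a single-node MySQL deployment, the high-availability (HA) solution addresses three problems that arise when the service fails: no automatic failover, long recovery time, and a severe impact on user experience.
As shown in the figure above, one MySQL instance is deployed on each of two independent machines, and the two instances are configured for dual-master (master-master) replication. keepalived is deployed on both machines, as keepalived (master) on one and keepalived (backup) on the other, and is configured with a virtual IP (VIP) that clients connect to.
If the MySQL service on the master node fails, the solution switches to the backup automatically and restores service as quickly as possible. The process is as follows:
(1) Normal operation: keepalived (master) runs on the master machine and the VIP is bound there, so clients are actually using the MySQL service on the master machine.
(2) Failure: keepalived is configured with a health-check script. Every 5 seconds (the sample keepalived.conf in Section 7 uses an interval of 20 seconds), the script runs on the master node, logs in to MySQL through the VIP, executes a query, and also checks that a mysqld process exists on the machine. If either check fails, the script counts the failure and retries, for 5 checks in total. If the checks still fail after 5 attempts, MySQL on the master node is considered unreachable and the script stops the keepalived (master) service. The VIP then floats to the machine running keepalived (backup), and clients are actually using the MySQL service on the slave machine.
Once a MySQL failure on the master machine is detected, it should be repaired promptly. After MySQL on the master node is repaired, export all data from the slave node and import it into MySQL on the master node. When the import finishes, check the master-master replication status. If it is healthy, start the keepalived (master) service manually; the VIP then floats back to the master machine and clients are again using the MySQL service there.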
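2. Data Analysis
The MySQL data directory is given by the datadir setting in /etc/my.cnf and is typically /data/mysql/data. This directory holds the files for the databases, tables, and user data created while MySQL runs, so it is critical to the correct operation of the business system.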
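As a small illustration of looking up the data directory, the sketch below reads the datadir value from a my.cnf-style file. The function name get_datadir is hypothetical, and the sketch assumes a single uncommented datadir line without a trailing comment.

```shell
#!/bin/bash
# Print the datadir value from a my.cnf-style file.
# get_datadir is a hypothetical helper name; /etc/my.cnf is the path used in this document.
get_datadir() {
    local cnf="${1:-/etc/my.cnf}"
    # Match "datadir=..." or "datadir = ..."; commented-out lines do not match.
    # If the option appears more than once, the last occurrence is printed.
    sed -n 's/^[[:space:]]*datadir[[:space:]]*=[[:space:]]*//p' "$cnf" | tail -n 1
}
```

Calling get_datadir with no argument reads /etc/my.cnf; passing a path reads that file instead.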
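In this HA solution MySQL runs in dual-master mode. Each instance replicates the other's changes, which keeps the two data sets in sync and, once a failed master has been repaired, lets it catch up quickly with the data on the slave node.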
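3. Failover Risk Analysis
To preserve performance, the two MySQL instances in dual-master mode replicate asynchronously: a transaction is committed on the master machine first and only afterwards applied on the slave machine. If MySQL on the master fails after committing a transaction but before the slave has received and applied it, that transaction is missing on the slave node after failover. The figure below illustrates this.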
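One rough way to see how exposed a node is to this risk is to look at the replication delay reported by the slave. The sketch below extracts Seconds_Behind_Master from the text output of SHOW SLAVE STATUS\G. The function name replication_lag is hypothetical, and the value is only an approximate indicator: it does not count events the slave has not yet received.

```shell
#!/bin/bash
# Read "SHOW SLAVE STATUS\G" output on stdin and print Seconds_Behind_Master.
# replication_lag is a hypothetical helper name.
replication_lag() {
    # Lines look like "        Seconds_Behind_Master: 0"; split on ": ".
    awk -F': ' '/Seconds_Behind_Master:/ { print $2 }'
}
```

A possible invocation is: mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_lag. A result of NULL means the replication threads are not running.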
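4. Health-Check Mechanism
Automatic failover requires periodically checking that both the MySQL service on the master node and the VIP are healthy. The health-check mechanism for MySQL and the VIP is as follows:
The check uses two commands:
mysql -h <VIP> -u root -p'<database password>' -e "select version();" >/dev/null 2>&1
ps aux | grep mysqld | grep -v grep > /dev/null 2>&1
In other words, the master node periodically logs in to MySQL through the VIP and queries the server version, and checks that a mysqld process exists on the machine. If both checks pass, nothing else is done. If either check fails, the script counts the failure and retries, up to 5 times. If the checks still fail after 5 attempts, it stops keepalived on the master node so that the VIP moves to the slave node and clients use the MySQL service there. The flow is shown in the chart below: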
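The retry-and-count logic described above can be sketched as a small generic helper. This is an illustration only, not the script deployed in Section 7: the function name check_with_retries and the delay parameter are assumptions introduced here.

```shell
#!/bin/bash
# Run a check command up to max_tries times and succeed as soon as it passes.
# check_with_retries and its delay argument are hypothetical, for illustration.
check_with_retries() {
    local max_tries="$1" delay="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # "$@" is the check command, e.g. a mysql login or a process check.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Wait between attempts, but not after the last one.
        [ "$attempt" -le "$max_tries" ] && sleep "$delay"
    done
    return 1
}
```

For example, check_with_retries 5 1 pgrep mysqld returns 0 if a mysqld process is found within five attempts one second apart, and 1 otherwise; the caller can then decide whether to stop keepalived.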
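5. Resource Requirements
For a system with 500 users, the MySQL HA solution needs the resources listed in the table below:
Machine | CPU/Memory | System disk | Attached disk | Purpose |
Machine 1 | 16c/32g | 50G | 1T | mysql, keepalived |
Machine 2 | 16c/32g | 50G | 1T | mysql, keepalived |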
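6. MySQL Dual-Master Deployment
Run on Node1 and Node2:
wget https://cdn.mysql.com//archives/mysql-5.7/mysql-5.7.21-linux-glibc2.12-x86_64.tar.gz
tar -zxvf mysql-5.7.21-linux-glibc2.12-x86_64.tar.gz -C /usr/local/
# rename the extracted directory so that the /usr/local/mysql paths used below exist
mv /usr/local/mysql-5.7.21-linux-glibc2.12-x86_64 /usr/local/mysql
groupadd mysql
useradd -r -g mysql mysql
chown -R mysql:mysql /usr/local/mysql/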
Run on Node1:
mkdir -p /usr/local/mysql/tmp
vi /usr/local/mysql/tmp/error.log (save and exit to create the empty file)
chmod 777 /usr/local/mysql/tmp/error.log
vim /etc/my.cnf
[mysqld]
socket=/usr/local/mysql/tmp/mysql.sock
datadir=/data/mysql/data
port=3306
#sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
symbolic-links=0
max_connections=600
innodb_file_per_table=1
lower_case_table_names=0
character_set_server=utf8
default-storage-engine=INNODB
pid-file=/usr/local/mysql/mysql.pid
log-error=/usr/local/mysql/tmp/error.log ## this file must be created manually and made writable (chmod 777)
user=mysql
############################## replication settings: begin #################
# server-id must be unique within the cluster
server-id=1
max_allowed_packet=100M
log-bin = mysql-bin # enable the MySQL binary log
sync_binlog = 1 # how often the binlog is flushed to disk: 0 leaves flushing to the OS (best performance); 1 flushes at every transaction commit (slowest, safest)
binlog_format = mixed # binlog format; MySQL 5.7.7 and later default to ROW, this setup uses MIXED
expire_logs_days = 7 # days after which binlog files are purged
max_binlog_size = 100m # maximum size of each binlog file
binlog_cache_size = 4m # binlog cache size
max_binlog_cache_size= 512m # maximum binlog cache size
# databases excluded from the binlog; repeat the option once per database
# (a comma-separated list is treated as a single database name)
binlog-ignore-db=mysql
binlog-ignore-db=test
binlog-ignore-db=information_schema
relay-log = mysql-relay-bin
##binlog-do-db = game ## log only this database
############################## replication settings: end #################
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
[mysqld_safe]
Run on Node2:
mkdir -p /usr/local/mysql/tmp
vi /usr/local/mysql/tmp/error.log (save and exit to create the empty file)
chmod 777 /usr/local/mysql/tmp/error.log
vim /etc/my.cnf
[mysqld]
socket=/usr/local/mysql/tmp/mysql.sock
datadir=/data/mysql/data
port=3306
#sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
symbolic-links=0
max_connections=600
innodb_file_per_table=1
lower_case_table_names=0
character_set_server=utf8
default-storage-engine=INNODB
pid-file=/usr/local/mysql/mysql.pid
log-error=/usr/local/mysql/tmp/error.log ## this file must be created manually and made writable (chmod 777)
user=mysql
############################## replication settings: begin #################
# server-id must be unique within the cluster, so Node2 uses 2
server-id=2
max_allowed_packet=100M
log-bin = mysql-bin # enable the MySQL binary log
sync_binlog = 1 # how often the binlog is flushed to disk: 0 leaves flushing to the OS (best performance); 1 flushes at every transaction commit (slowest, safest)
binlog_format = mixed # binlog format; MySQL 5.7.7 and later default to ROW, this setup uses MIXED
expire_logs_days = 7 # days after which binlog files are purged
max_binlog_size = 100m # maximum size of each binlog file
binlog_cache_size = 4m # binlog cache size
max_binlog_cache_size= 512m # maximum binlog cache size
# databases excluded from the binlog; repeat the option once per database
# (a comma-separated list is treated as a single database name)
binlog-ignore-db=mysql
binlog-ignore-db=test
binlog-ignore-db=information_schema
relay-log = mysql-relay-bin
##binlog-do-db = game ## log only this database
############################## replication settings: end #################
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
[mysqld_safe]
Run on Node1 and Node2:
mkdir -p /data/mysql
chown mysql:mysql /data/mysql
cd /usr/local/mysql/bin
./mysqld --initialize --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysql/data
/usr/local/mysql/support-files/mysql.server start
cd /usr/local/mysql/bin
./mysql -u root -p
After logging in, change the root password:
update mysql.user set authentication_string=PASSWORD('123456') where User='root';
Exit mysql.
Edit /etc/my.cnf (vim /etc/my.cnf), comment out the password-free login option, and restart MySQL:
/usr/local/mysql/support-files/mysql.server restart
alter user user() identified by '123456';
CREATE USER 'root'@'%' IDENTIFIED BY '123456';
grant all privileges on *.* to 'root'@'%' identified by '123456';
flush privileges;
Enable start at boot:
cp /usr/local/mysql/support-files/mysql.server /etc/rc.d/init.d/mysqld
chmod +x /etc/init.d/mysqld
chkconfig --add mysqld
chkconfig --list
Run on Node1:
cd /usr/local/mysql/bin
./mysql -uroot -p
CREATE USER repl_user IDENTIFIED BY '123456';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%' identified by '123456';
FLUSH PRIVILEGES;
show master status;
Run on Node2 (set MASTER_LOG_FILE and MASTER_LOG_POS to the File and Position values that show master status reported on Node1):
cd /usr/local/mysql/bin
./mysql -uroot -p
CHANGE MASTER TO
MASTER_HOST = '192.168.92.16',
MASTER_USER = 'repl_user',
MASTER_PASSWORD = '123456',
MASTER_PORT = 3306,
MASTER_LOG_FILE='mysql-bin.000003',
MASTER_LOG_POS=844,
MASTER_RETRY_COUNT = 60,
MASTER_HEARTBEAT_PERIOD = 10000;
start slave;
Check the status:
show slave status\G
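Because the log file name and position differ on every installation, it can help to generate the statement from the live values instead of typing them by hand. The sketch below turns one line of tab-separated show master status output into the matching part of a CHANGE MASTER TO statement. The function name build_change_master is hypothetical, and the user, password, and remaining options are deliberately left out.

```shell
#!/bin/bash
# Read one line of "show master status" output (File, Position, ...) on stdin
# and print a CHANGE MASTER TO statement for the given master host.
# build_change_master is a hypothetical helper name.
build_change_master() {
    local host="$1" file pos
    # The first two whitespace-separated fields are File and Position.
    read -r file pos _
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$host" "$file" "$pos"
}
```

A possible invocation, run from the node that will replicate from 192.168.92.16, is: mysql -h 192.168.92.16 -uroot -p -N -e 'show master status' | build_change_master 192.168.92.16. The -N flag suppresses the column header so that the first line holds the values.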
Run on Node2:
cd /usr/local/mysql/bin
./mysql -uroot -p
CREATE USER repl_user IDENTIFIED BY '123456';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%' identified by '123456';
FLUSH PRIVILEGES;
show master status;
Run on Node1 (set MASTER_LOG_FILE and MASTER_LOG_POS to the File and Position values that show master status reported on Node2):
cd /usr/local/mysql/bin
./mysql -uroot -p
CHANGE MASTER TO
MASTER_HOST = '192.168.92.17',
MASTER_USER = 'repl_user',
MASTER_PASSWORD = '123456',
MASTER_PORT = 3306,
MASTER_LOG_FILE='mysql-bin.000003',
MASTER_LOG_POS=844,
MASTER_RETRY_COUNT = 60,
MASTER_HEARTBEAT_PERIOD = 10000;
start slave;
show slave status\G
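Replication in a given direction is healthy only when both the I/O thread and the SQL thread are running. As a minimal sketch, the helper below checks the two fields in the text output of show slave status\G. The name replication_ok is hypothetical.

```shell
#!/bin/bash
# Read "show slave status\G" output on stdin; succeed only if both
# replication threads report Yes. replication_ok is a hypothetical helper name.
replication_ok() {
    local status
    status="$(cat)"
    # "Slave_SQL_Running: Yes" does not match the separate
    # "Slave_SQL_Running_State:" line because of the ": " after "Running".
    echo "$status" | grep -q 'Slave_IO_Running: Yes' &&
        echo "$status" | grep -q 'Slave_SQL_Running: Yes'
}
```

For example: mysql -uroot -p -e 'show slave status\G' | replication_ok && echo "replication running". In a dual-master setup the check would be run on both nodes.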
7. keepalived Deployment and Configuration
7.1 Master Node
7.1.1 Install the keepalived service
Download the installation packages: https://pan.baidu.com/s/13kW_Bz6RGSo4ewZ68BwCVA (extraction code: 6ktn)
rpm -ivh *.rpm --nodeps --force
7.1.2 Edit keepalived.conf
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
    }
}
vrrp_script check_script {
    # check VIP
    script "/etc/keepalived/checkVip.sh 192.168.92.66"
    interval 10
}
vrrp_script chk_mysql {
    script "/etc/keepalived/check_mysql.sh"
    interval 20
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    nopreempt
    unicast_src_ip 192.168.92.16
    unicast_peer {
        192.168.92.17
    }
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_mysql
    }
    virtual_ipaddress {
        192.168.92.66
    }
}
virtual_server 192.168.92.66 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    real_server 192.168.92.16 3306 {
        weight 10
        MISC_CHECK {
            misc_path "/etc/keepalived/check_mysql.sh"
            misc_timeout 20
        }
    }
}
7.1.3 Create the MySQL health-check script
vim /etc/keepalived/check_mysql.sh
#!/bin/bash
# Maximum number of checks before keepalived is stopped
CHECK_COUNT=5
counter=1
LOG_FILE=/etc/keepalived/check_mysql.log
while true
do
    # Check 1: log in through the virtual IP and run a query
    mysql -h 192.168.92.66 -u root -p'2w#E4r%T6y&Uqwas' -e "select version();" >/dev/null 2>&1
    i=$?
    # Check 2: a mysqld process exists on this machine
    ps aux | grep mysqld | grep -v grep > /dev/null 2>&1
    j=$?
    if [ $i = 0 ] && [ $j = 0 ]
    then
        echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - status OK!" >> $LOG_FILE
        exit 0
    else
        # Give up once CHECK_COUNT checks have failed
        if [ $counter -ge $CHECK_COUNT ]
        then
            echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - status abnormal! Check $counter failed, giving up!" >> $LOG_FILE
            break
        fi
        echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - status abnormal! Check $counter failed!" >> $LOG_FILE
        let counter++
        continue
    fi
done
echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - status abnormal! Stopping keepalived on the master machine!" >> $LOG_FILE
systemctl stop keepalived.service
exit 1
chmod 777 /etc/keepalived/check_mysql.sh
7.1.4 Create the VIP health-check script
vim /etc/keepalived/checkVip.sh
#!/bin/bash
source /etc/profile
VIP=$1
# Log files
LOG_FILE=/etc/keepalived/checkVip.log
LOG_FILE2=/etc/keepalived/checkVip2.log
# Number of checks
CHECK_TIME=3
# VIP_OK is 1 while the VIP is bound on this machine, 0 when it is not
VIP_OK=1
function check_vip_health (){
    ip a | grep $VIP >/dev/null 2>&1
    #curl $VIP:8080 >/dev/null 2>&1
    if [ $? = 0 ] ;then
        VIP_OK=1
    else
        VIP_OK=0
    fi
    return $VIP_OK
}
while [ $CHECK_TIME -ne 0 ]
do
    let "CHECK_TIME -= 1"
    check_vip_health
    if [ $VIP_OK = 1 ] ; then
        CHECK_TIME=0
        echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - VIP $VIP available: success[$VIP_OK]" >> $LOG_FILE
        exit 0
    else
        service keepalived restart
        echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] service keepalived restart" >> $LOG_FILE2
    fi
    if [ $VIP_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
    then
        echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - VIP $VIP invalid." >> $LOG_FILE
        exit 1
    fi
    sleep 10
done
chmod 777 /etc/keepalived/checkVip.sh
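7.1.5 Start the keepalived service
systemctl restart keepalived
Note: to avoid the VIP flapping within a short time because of a server reboot, do not enable keepalived at boot. It must be started manually, not automatically.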
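7.2 Slave Node
Configure keepalived on the slave node following the same steps as above; only replace the IP addresses with those of the slave machine and set a priority lower than the master's.
7.2.1 Install the keepalived service
Download the installation packages: https://pan.baidu.com/s/13kW_Bz6RGSo4ewZ68BwCVA (extraction code: 6ktn)
rpm -ivh *.rpm --nodeps --force
7.2.2 Edit keepalived.conf
vim /etc/keepalived/keepalived.conf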
! Configuration File for keepalived
global_defs {
    notification_email {
    }
}
vrrp_script chk_mysql {
    script "/etc/keepalived/check_mysql.sh"
    interval 20
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90
    nopreempt
    unicast_src_ip 192.168.92.17
    unicast_peer {
        192.168.92.16
    }
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_mysql
    }
    virtual_ipaddress {
        192.168.92.66
    }
}
virtual_server 192.168.92.66 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    real_server 192.168.92.17 3306 {
        weight 10
        MISC_CHECK {
            misc_path "/etc/keepalived/check_mysql.sh"
            misc_timeout 20
        }
    }
}
7.2.3 Create the MySQL health-check script
On the slave node the script only records the state of the MySQL port and the VIP in a log file:
vim /etc/keepalived/check_mysql.sh
#!/bin/bash
# Number of listening sockets on port 3306
Return_code=`netstat -tnlp | grep 3306 | wc -l`
# Number of interface addresses matching the VIP
Return_code1=`ip a | grep 192.168.92.66 | wc -l`
LOG_FILE=/etc/keepalived/check_mysql.log
echo "$(date +"%Y-%m-%d %H:%M:%S") - [INFO] - sudo systemctl status keepalived.service" >> $LOG_FILE
echo "Return_code:" $Return_code code is "netstat -tnlp | grep 3306 | wc -l" >> $LOG_FILE
echo "Return_code1:" $Return_code1 code is "ip a | grep 192.168.92.66 | wc -l" >> $LOG_FILE
7.2.4 Restart the keepalived service
systemctl restart keepalived
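8. Failure Recovery
When MySQL on the master node fails, the health-check script stops keepalived on the master node automatically and the VIP floats to the slave node, so clients are using MySQL on the slave node. After MySQL on the master node has been repaired, proceed as follows:
8.1 Export all data from the slave node
mysqldump -uroot -p --all-databases -h 192.168.92.17 -P 3306 --max_allowed_packet=1024M > allbackup20220707_slave.sql
(enter the MySQL password)
8.2 Import the full export from the slave node into MySQL on the master node
mysql -uroot -p -h 192.168.92.16 -P 3306 --max_allowed_packet=1024M < ./allbackup20220707_slave.sql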
8.3 Reconfigure master-master replication
Run on Node1 (master node):
mysql -uroot -p
stop slave;
show master status;
Run on Node2 (set MASTER_LOG_FILE and MASTER_LOG_POS to the File and Position values just reported on Node1):
mysql -uroot -p
stop slave;
CHANGE MASTER TO
MASTER_HOST = '192.168.92.16',
MASTER_USER = 'repl_user',
MASTER_PASSWORD = '123456',
MASTER_PORT = 3306,
MASTER_LOG_FILE='mysql-bin.000006',
MASTER_LOG_POS=4430,
MASTER_RETRY_COUNT = 60,
MASTER_HEARTBEAT_PERIOD = 10000;
start slave;
Check the status:
show slave status\G
Run on Node2:
mysql -uroot -p
show master status;
Run on Node1 (set MASTER_LOG_FILE and MASTER_LOG_POS to the File and Position values just reported on Node2):
mysql -uroot -p
CHANGE MASTER TO
MASTER_HOST = '192.168.92.17',
MASTER_USER = 'repl_user',
MASTER_PASSWORD = '123456',
MASTER_PORT = 3306,
MASTER_LOG_FILE='mysql-bin.000005',
MASTER_LOG_POS=154,
MASTER_RETRY_COUNT = 60,
MASTER_HEARTBEAT_PERIOD = 10000;
start slave;
show slave status\G
8.4 Start keepalived on the master node
systemctl restart keepalived
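After keepalived is started again, it is worth confirming which machine actually holds the VIP. The sketch below checks the output of ip addr for an exact match on the address, so that a VIP such as 192.168.92.6 would not be mistaken for 192.168.92.66. The function name has_vip is hypothetical.

```shell
#!/bin/bash
# Read "ip -4 addr show" output on stdin; succeed if the given VIP is bound.
# has_vip is a hypothetical helper name.
has_vip() {
    local vip="$1"
    # Addresses appear as "inet <address>/<prefix>"; the fixed-string match
    # including the trailing "/" avoids matching a longer address by prefix.
    grep -qF "inet ${vip}/"
}
```

For example, on either node: ip -4 addr show | has_vip 192.168.92.66 && echo "VIP is on this machine".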
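9. keepalived Mail Notification
When the environment has Internet access and no monitoring service is planned, the people responsible still need to be told immediately when the VIP fails over, so that the failed service on the master node can be repaired and the component kept highly available. keepalived's built-in notification hooks can be used for this.
(1) Write the mail notification script (master machine)
vim /etc/keepalived/notify.sh
#!/bin/bash
# Mailbox that receives the notifications
contact='[email protected]'
notify() {
    local mailsubject="Warning!!! Warning!!! Warning!!!"
    local mailbody="$(date +'%F %T'): Attention: the virtual IP has moved, machine 192.168.92.16 is now $1 !!!"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case $1 in
master)
    notify master
    ;;
backup)
    notify backup
    ;;
fault)
    notify fault
    ;;
*)
    echo "Usage: $(basename $0) {master|backup|fault}"
    exit 1
    ;;
esac
chmod 777 /etc/keepalived/notify.sh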
(2) Write the mail notification script (slave machine)
vim /etc/keepalived/notify.sh
#!/bin/bash
# Mailbox that receives the notifications
contact='[email protected]'
notify() {
    local mailsubject="Warning!!! Warning!!! Warning!!!"
    local mailbody="$(date +'%F %T'): Attention: the virtual IP has moved, machine 192.168.92.17 is now $1 !!!"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case $1 in
master)
    notify master
    ;;
backup)
    notify backup
    ;;
fault)
    notify fault
    ;;
*)
    echo "Usage: $(basename $0) {master|backup|fault}"
    exit 1
    ;;
esac
chmod 777 /etc/keepalived/notify.sh
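(3) Edit the keepalived configuration file and add the notification settings (same on master and slave)
vim /etc/keepalived/keepalived.conf (note: only part of the configuration file is shown below)
global_defs {
    notification_email {
        root@localhost # notification recipient
    }
    notification_email_from [email protected] # sender address
    smtp_server 127.0.0.1 # mail server address
    smtp_connect_timeout 30 # connection timeout, 30 seconds
    router_id LVS_DEVEL # router ID
    vrrp_skip_check_adv_addr # checking every advert is expensive; if an advert comes from the same router as the previous one, skip the source-address check
    # vrrp_strict # strictly follow the VRRP protocol; disallows: 1. no VIP address, 2. unicast peers, 3. IPv6 addresses in VRRP version 2
    vrrp_garp_interval 0 # delay between gratuitous ARP messages
    vrrp_gna_interval 0 # delay between unsolicited NA messages
    #vrrp_iptables # a yum installation generates firewall rules automatically; they can be removed or their generation disabled
}
vrrp_instance VI_1 {
    # virtual IP address
    virtual_ipaddress {
        192.168.92.66
    }
    # scripts triggered when keepalived changes state during a failover
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}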
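(4) Install the mail tool (same on master and slave)
rpm -ivh mailx-12.5-19.el7.x86_64.rpm --nodeps --force
rpm -ivh net-snmp-agent-libs-5.7.2-33.el7_5.2.x86_64.rpm --nodeps --force
rpm -ivh net-snmp-libs-5.7.2-33.el7_5.2.x86_64.rpm --nodeps --force
(5) Configure the mail client (same on master and slave)
vim /etc/mail.rc
set [email protected] # mail server account
set smtp=smtp.qq.com # mail server address
set [email protected] # mail server account
set smtp-auth-password=voyrzvzizjlwcaba # mail server authorization code
set smtp-auth=login
set ssl-verify-ignore
(6) Test the mail notification
(1) Start keepalived on the master node and check that a notification mail arrives
(2) Start keepalived on the slave node and check that a notification mail arrives
(3) Stop keepalived on the master node and check that a notification mail arrives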
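Open Issues
(1) If the application is continuously writing to the master machine and MySQL there stops suddenly, the VIP switches to MySQL on the slave machine. Do the application's in-flight writes fail, and is data lost? How should this be handled?
(2) If keepalived is enabled at boot and the master machine reboots unexpectedly, the VIP moves to the slave machine and clients use the service there for a while. When the master machine comes back up, keepalived starts again because it is enabled at boot, and the VIP moves back to the master. Can the data written to the slave during that interval be guaranteed not to be lost?
Solution: do not enable keepalived at boot. This avoids the VIP flapping within a short time because of a server reboot.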