0%

说明

prometheus监控插件mysqld_exporter可以采集mysql指标并告警展示.

步骤

本文使用docker部署的方式

  1. 创建用户并授权
CREATE USER 'exporter'@'%' IDENTIFIED BY 'exporter#2020';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'exporter'@'%';
GRANT SELECT ON performance_schema.* TO 'exporter'@'%';
-- GRANT ALL ON *.* TO 'exporter'@'%';
  1. 运行mysqld_exporter
docker run -d \
  --name mysql_exporter \
  --restart always \
  -p 9104:9104 \
  -e DATA_SOURCE_NAME="exporter:123456@(192.168.41.141:3306)/" \
  prom/mysqld-exporter
  1. 查看指标

如果没有问题,可以将指标采集到prometheus

  1. 修改prometheus配置文件,scrape_configs下面添加
    ssh no
    如果需要,部署prometheus
# prometheus.yml
global:
  scrape_interval:     15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
rule_files:
  # - "rules.yml"
  # - "second_rules.yml"
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['192.168.41.128:9090']
  - job_name: 'mysql'
    static_configs:
      - targets: ['192.168.41.141:9104']
# 启动
docker run --name=prometheus -d -p 9090:9090  -v /root/prometheus/config/prometheus.yml:/etc/prometheus/prometheus.yml hub.deri.org.cn/k8s_monitor/prometheus:latest
# vim /usr/local/prometheus/prometheus.yml
scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets: ['192.168.41.141:9104']

注意:如果是采用prometheus-operator方式安装prometheus,新增target示例:
新增mysql-exporter-endpoints.yaml,引入外部服务

apiVersion: v1
kind: Endpoints
metadata:
  name: mysql-exporter
  namespace: monitoring
subsets:
- addresses:
  - ip: 192.168.1.47
    targetRef:
      name: huawei-mysql
      namespace: default
    ports:
      - name: metrics
        port: 9104
        protocol: TCP

新增mysql-exporter-service.yaml

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  labels:
    app: prometheus-mysql-exporter
    chart: prometheus-mysql-exporter
    heritage: Tiller
    jobLabel: mysql-exporter
    release: prometheus-operator
  # name与endpoints name保持一致
  name: mysql-exporter
  namespace: monitoring
spec:
  ports:
  - name: metrics
    port: 9104
    targetPort: 9104
    protocol: TCP
  type: ClusterIP

新增mysql-exporter-ServiceMonitor.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # 注意标签 当前版本prometheus-operator是通过release: prometheus-operator标签来发现ServiceMonitor的
  labels:
    app: prometheus-operator-mysql-exporter
    chart: prometheus-operator-8.2.4
    heritage: Tiller
    release: prometheus-operator
  name: prometheus-operator-mysql-exporter
  namespace: monitoring
spec:
  endpoints:
  - port: metrics
  jobLabel: jobLabel
  selector:
    # 通过标签发现service
    matchLabels:
      app: prometheus-mysql-exporter
      release: prometheus-operator
kubectl apply -f .
# 新增上面三个服务之后,prometheus页面就可以发现新增的mysql-exporter.
# prometheus数据库中也能查询到mysql指标数据
  1. 设置报警规则
groups:
- name: GaleraAlerts
  rules:
  - alert: MySQLGaleraNotReady
    expr: mysql_global_status_wsrep_ready != 1
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is not ready.'
      summary: Galera cluster node not ready
  - alert: MySQLGaleraOutOfSync
    expr: (mysql_global_status_wsrep_local_state != 4 and mysql_global_variables_wsrep_desync
      == 0)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is not in sync ({{$value}}
        != 4).'
      summary: Galera cluster node out of sync
  - alert: MySQLGaleraDonorFallingBehind
    expr: (mysql_global_status_wsrep_local_state == 2 and mysql_global_status_wsrep_local_recv_queue
      > 100)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.job}} on {{$labels.instance}} is a donor (hotbackup)
        and is falling behind (queue size {{$value}}).'
      summary: xtradb cluster donor node falling behind
  - alert: MySQLReplicationNotRunning
    expr: mysql_slave_status_slave_io_running == 0 or mysql_slave_status_slave_sql_running
      == 0
    for: 2m
    labels:
      severity: critical
    annotations:
      description: Slave replication (IO or SQL) has been down for more than 2 minutes.
      summary: Slave replication is not running
  - alert: MySQLReplicationLag
    expr: (mysql_slave_lag_seconds > 30) and on(instance) (predict_linear(mysql_slave_lag_seconds[5m],
      60 * 2) > 0)
    for: 1m
    labels:
      severity: critical
    annotations:
      description: The mysql slave replication has fallen behind and is not recovering
      summary: MySQL slave replication is lagging
  - alert: MySQLReplicationLag
    expr: (mysql_heartbeat_lag_seconds > 30) and on(instance) (predict_linear(mysql_heartbeat_lag_seconds[5m],
      60 * 2) > 0)
    for: 1m
    labels:
      severity: critical
    annotations:
      description: The mysql slave replication has fallen behind and is not recovering
      summary: MySQL slave replication is lagging
  - alert: MySQLInnoDBLogWaits
    expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10
    labels:
      severity: warning
    annotations:
      description: The innodb logs are waiting for disk at a rate of {{$value}} /
        second
      summary: MySQL innodb log writes stalling
# 告警规则示例
groups:
- name: MySQLStatsAlert
  rules:
  - alert: MySQL is down
    expr: mysql_up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} MySQL is down"
      description: "MySQL database is down. This requires immediate action!"
  - alert: open files high
    expr: mysql_global_status_innodb_num_open_files > (mysql_global_variables_open_files_limit) * 0.75
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} open files high"
      description: "Open files is high. Please consider increasing open_files_limit."
  - alert: Read buffer size is bigger than max. allowed packet size
    expr: mysql_global_variables_read_buffer_size > mysql_global_variables_slave_max_allowed_packet 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Read buffer size is bigger than max. allowed packet size"
      description: "Read buffer size (read_buffer_size) is bigger than max. allowed packet size (max_allowed_packet).This can break your replication."
  - alert: Sort buffer possibly missconfigured
    expr: mysql_global_variables_innodb_sort_buffer_size <256*1024 or mysql_global_variables_read_buffer_size > 4*1024*1024 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Sort buffer possibly missconfigured"
      description: "Sort buffer size is either too big or too small. A good value for sort_buffer_size is between 256k and 4M."
  - alert: Thread stack size is too small
    expr: mysql_global_variables_thread_stack <196608
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Thread stack size is too small"
      description: "Thread stack size is too small. This can cause problems when you use Stored Language constructs for example. A typical is 256k for thread_stack_size."
  - alert: Used more than 80% of max connections limited 
    expr: mysql_global_status_max_used_connections > mysql_global_variables_max_connections * 0.8
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Used more than 80% of max connections limited"
      description: "Used more than 80% of max connections limited"
  - alert: InnoDB Force Recovery is enabled
    expr: mysql_global_variables_innodb_force_recovery != 0 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Force Recovery is enabled"
      description: "InnoDB Force Recovery is enabled. This mode should be used for data recovery purposes only. It prohibits writing to the data."
  - alert: InnoDB Log File size is too small
    expr: mysql_global_variables_innodb_log_file_size < 16777216 
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Log File size is too small"
      description: "The InnoDB Log File size is possibly too small. Choosing a small InnoDB Log File size can have significant performance impacts."
  - alert: InnoDB Flush Log at Transaction Commit
    expr: mysql_global_variables_innodb_flush_log_at_trx_commit != 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Flush Log at Transaction Commit"
      description: "InnoDB Flush Log at Transaction Commit is set to a values != 1. This can lead to a loss of commited transactions in case of a power failure."
  - alert: Table definition cache too small
    expr: mysql_global_status_open_table_definitions > mysql_global_variables_table_definition_cache
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Table definition cache too small"
      description: "Your Table Definition Cache is possibly too small. If it is much too small this can have significant performance impacts!"
  - alert: Table open cache too small
    expr: mysql_global_status_open_tables >mysql_global_variables_table_open_cache * 99/100
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Table open cache too small"
      description: "Your Table Open Cache is possibly too small (old name Table Cache). If it is much too small this can have significant performance impacts!"
  - alert: Thread stack size is possibly too small
    expr: mysql_global_variables_thread_stack < 262144
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Thread stack size is possibly too small"
      description: "Thread stack size is possibly too small. This can cause problems when you use Stored Language constructs for example. A typical is 256k for thread_stack_size."
  - alert: InnoDB Buffer Pool Instances is too small
    expr: mysql_global_variables_innodb_buffer_pool_instances == 1
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Buffer Pool Instances is too small"
      description: "If you are using MySQL 5.5 and higher you should use several InnoDB Buffer Pool Instances for performance reasons. Some rules are: InnoDB Buffer Pool Instance should be at least 1 Gbyte in size. InnoDB Buffer Pool Instances you can set equal to the number of cores of your machine."
  - alert: InnoDB Plugin is enabled
    expr: mysql_global_variables_ignore_builtin_innodb == 1
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} InnoDB Plugin is enabled"
      description: "InnoDB Plugin is enabled"
  - alert: Binary Log is disabled
    expr: mysql_global_variables_log_bin != 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} Binary Log is disabled"
      description: "Binary Log is disabled. This prohibits you to do Point in Time Recovery (PiTR)."
  - alert: Binlog Cache size too small
    expr: mysql_global_variables_binlog_cache_size < 1048576
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Binlog Cache size too small"
      description: "Binlog Cache size is possibly to small. A value of 1 Mbyte or higher is OK."
  - alert: Binlog Statement Cache size too small
    expr: mysql_global_variables_binlog_stmt_cache_size <1048576 and mysql_global_variables_binlog_stmt_cache_size > 0
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Binlog Statement Cache size too small"
      description: "Binlog Statement Cache size is possibly to small. A value of 1 Mbyte or higher is typically OK."
  - alert: Binlog Transaction Cache size too small
    expr: mysql_global_variables_binlog_cache_size  <1048576
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Binlog Transaction Cache size too small"
      description: "Binlog Transaction Cache size is possibly to small. A value of 1 Mbyte or higher is typically OK."
  - alert: Sync Binlog is enabled
    expr: mysql_global_variables_sync_binlog == 1
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Sync Binlog is enabled"
      description: "Sync Binlog is enabled. This leads to higher data security but on the cost of write performance."
  - alert: IO thread stopped
    expr: mysql_slave_status_slave_io_running != 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} IO thread stopped"
      description: "IO thread has stopped. This is usually because it cannot connect to the Master any more."
  - alert: SQL thread stopped 
    expr: mysql_slave_status_slave_sql_running == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} SQL thread stopped"
      description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."
  - alert: SQL thread stopped
    expr: mysql_slave_status_slave_sql_running != 1
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} Sync Binlog is enabled"
      description: "SQL thread has stopped. This is usually because it cannot apply a SQL statement received from the master."
  - alert: Slave lagging behind Master
    expr: rate(mysql_slave_status_seconds_behind_master[1m]) >30 
    for: 1m
    labels:
      severity: warning 
    annotations:
      summary: "Instance {{ $labels.instance }} Slave lagging behind Master"
      description: "Slave is lagging behind Master. Please check if Slave threads are running and if there are some performance issues!"
  - alert: Slave is NOT read only(Please ignore this warning indicator.)
    expr: mysql_global_variables_read_only != 0
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} Slave is NOT read only"
      description: "Slave is NOT set to read only. You can accidentally manipulate data on the slave and get inconsistencies..."
  1. 监控展示

如果需要,部署grafana

docker run -d -p 3000:3000 --name grafana hub.deri.org.cn/k8s_monitor/grafana:6.4.2

Grafana dashboard一些可用的ID,可直接通过ID导入dashboard.

  • 6239
  • 7362
  • 11329
  • 10654
  • 11323

参考链接

说明

  • 使用的docker镜像mysql:5.7.18;
  • 容器中mysql配置文件目录/etc/mysql/;

步骤

  1. 启动镜像,将镜像中配置目录拷贝到宿主机
#启动,并设置root密码
docker run -d --name mysql  -e MYSQL_ROOT_PASSWORD=123456 hub.deri.org.cn/library/mysql:5.7

#复制
docker cp mysql:/etc/mysql /root
#配置文件目录结构
[root@mysql01 mysql]# tree
.
├── conf.d
│   ├── docker.cnf
│   └── mysql.cnf
├── my.cnf -> /etc/alternatives/my.cnf
├── my.cnf.fallback
├── mysql.cnf
└── mysql.conf.d
    └── mysqld.cnf

2 directories, 6 files
  1. 查看mysql data目录,便于将docker中数据持久化到宿主机
-- 登录
docker exec -it mysql bash
mysql -uroot -p123456
-- 查询
mysql> show variables like '%datadir%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| datadir       | /var/lib/mysql/ |
+---------------+-----------------+
1 row in set (0.01 sec)
  1. 修改mysql.cnf
#参考mysql主从配置
[mysqld]
server-id=1 # 主从设置不一样
log-bin=mysql-bin #备无需开启
auto-increment-increment=2
auto-increment-offset=2 # 主从设置不一样:1,2
log-slave-updates=true
#replicate-ignore-db=information_schema
max_binlog_size=1024M
binlog-format=ROW
#binlog_rows_query_log_events=on # 在row 模式的binlog中包含SQL EVENTS(即SQL语句也会保留)
#master-info-repository=TABLE
#relay-log-info-repository=TABLE
#slave-skip-errors=all # 忽略复制产生的错误
#slave-skip-errors=1062,1032,1060 # 跳过已知错误,主键冲突、表已存在等错误代码如1062,1032,1060等
#relay-log-purge = 1    #是否自动清空不再需要中继日志时。默认值为1(启用)
#expire_logs_days = 30  #超过 30 天的 binlog 删除
  1. 启动mysql
#映射端口、配置文件和data
docker run -d --name mysql -v /root/mysql:/etc/mysql -v /var/lib/mysql/:/var/lib/mysql/ -e MYSQL_ROOT_PASSWORD=123456 -p 3306:3306  hub.deri.org.cn/library/mysql:5.7

#也可以按需只映射一个配置文件mysql.cnf
docker run -d --name mysql -v /root/mysql/mysql.cnf:/etc/mysql/mysql.cnf -v /var/lib/mysql/:/var/lib/mysql/ -e MYSQL_ROOT_PASSWORD=123456 -p 3307:3306  hub.deri.org.cn/library/mysql:5.7
  1. 其它主从配置,参考非docker安装配置
  • 添加主从帐号
  • 配置主从信息
  • 开启半同步复制
  • 开启从节点只读模式
  1. 使用keepalive工具

写一个脚本/etc/docker_mysql.sh判断dockermysql是否存活

#! /bin/bash
count=$(docker ps |grep hub.deri.org.cn/library/mysql:5.7 | grep Up| wc -l)
if [ $count == "1" ];then
    exit 0
else
    exit 1
fi


#测试这个脚本
[root@mysql01 ~]# docker stop mysql
mysql
[root@mysql01 ~]# ./docker_mysql.sh 
[root@mysql01 ~]# ./docker_mysql.sh ;echo $?
1
[root@mysql01 ~]# docker start mysql 
mysql
[root@mysql01 ~]# ./docker_mysql.sh ;echo $?
0

#配置keepalived
vrrp_script chk_mysql {
    script "/etc/docker_mysql.sh"
    interval 2
    timeout 2
    fall 3
}

使用docker run运行一个新的容器的时候,通过参数 --add-host来添加域名和IP信息到容器的/etc/hosts文件中。例如:

docker run -d --name test --add-host=addr.com:10.10.18.11 redis:latest 

mysql慢查询

查看mysql慢查询日志

-- 慢查询日志
show variables like '%slow_query%';
+---------------------+--------------------------------------+
| Variable_name       | Value                                |
+---------------------+--------------------------------------+
| slow_query_log      | OFF                                  |
| slow_query_log_file | /var/lib/mysql/7b32a384231d-slow.log |
+---------------------+--------------------------------------+
2 rows in set

-- 不使用索引查询日志
mysql> show variables like '%log_que%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| log_queries_not_using_indexes | OFF   |
+-------------------------------+-------+
1 row in set

-- 查询时间设置,查询超过多少秒才记录
mysql> show variables like '%long_que%';
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| long_query_time | 10.000000 |
+-----------------+-----------+
1 row in set

开启慢查询日志

-- 方法一
set global  slow_query_log=ON;
set global  log_queries_not_using_indexes=ON;

-- 方法二,修改配置文件my.cnf
slow_query_log = ON
slow_query_log_file = /usr/local/mysql/data/slow.log
long_query_time = 1
-- 方法二需要重启
service mysqld restart

慢查询分析工具

mysqldumpslow

pt-query-digest

复制模式

MySQL主从复制包括异步模式、半同步模式、GTID模式以及多源复制模式,默认是异步模式.

  • 异步复制(Asynchronous replication

所谓异步模式指的是MySQL 主服务器上I/O thread 线程将二进制日志写入binlog文件之后就返回客户端结果,不会考虑二进制日志是否完整传输到从服务器以及是否完整存放到从服务器上的relay log(中继日志)中,这种模式一旦主服务(器)宕机,数据就可能会发生丢失。

  • 全同步复制(Fully synchronous replication

指当主库执行完一个事务,所有的从库都执行了该事务才返回给客户端。因为需要等待所有从库执行完该事务才能返回,所以全同步复制的性能必然会收到严重的影响。

  • 半同步复制(Semisynchronous replication

介于异步复制和全同步复制之间,主库在执行完客户端提交的事务后不是立刻返回给客户端,而是等待至少一个从库接收到并写到relay log中才返回给客户端。相对于异步复制,半同步复制提高了数据的安全性,同时它也造成了一定程度的延迟,这个延迟最少是一个TCP/IP往返的时间。所以,半同步复制最好在低延时的网络中使用。

半同步潜藏问题

mysql5.7版本前,客户端事务在存储引擎层提交后,在得到从库确认的过程中,主库宕机了,客户端会收到事务提交失败的信息,此时,可能的情况有两种:

  • 事务还没发送到从库上
  • 事务已经发送到从库上

此时客户端由于提交失败,再次提交事务,出现问题.

mysql5.7版本引入新的半同步方式,事务提交需要从库同步完成.通过rpl_semi_sync_master_wait_point参数控制:

  • AFTER_SYNC , 这个是新的半同步方案
  • AFTER_COMMIT, 这个是老的半同步方案

开启半同步复制

半同步复制的安装部署条件:

  • MySQL 5.5及以上版本
  • 变量have_dynamic_loadingYES (查看命令:show variables like "have_dynamic_loading";)
  • 主从复制已经配置完成

半同步复制是以插件的形式安装的,默认安装包在/usr/local/mysql/lib/plugin/目录下.可以通过命令查看

-- 查看所有插件
show plugins;

-- 模糊查询
SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS  WHERE PLUGIN_NAME LIKE '%semi%';
  • 启动方式一:通过命令
-- 主库执行安装插件
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
-- 主库执行开启
SET GLOBAL rpl_semi_sync_master_enabled = 1;


-- 从库执行安装插件
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
-- 从库执行开启
SET GLOBAL rpl_semi_sync_master_enabled = 1;


-- 从库重启IO线程
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;

-- 主库卸载插件(先关闭半同步复制功能)
UNINSTALL PLUGIN rpl_semi_sync_master;
-- 从库卸载插件(先关闭半同步复制功能)
UNINSTALL PLUGIN rpl_semi_sync_slave;
  • 启动方式二:通过my.cnf,需要重启mysql
# 主库配置
plugin-load=rpl_semi_sync_master=semisync_master.so
rpl_semi_sync_master_enabled=1

# 从库配置
plugin-load=rpl_semi_sync_slave=semisync_slave.so
rpl_semi_sync_slave_enabled=1

# 如果既是主又是从
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl-semi-sync-master-enabled=1
rpl-semi-sync-slave-enabled=1

需要注意,Mysql半同步复制并不是严格意义上的半同步复制。当半同步复制发生超时时(由rpl_semi_sync_master_timeout参数控制,单位是毫秒,默认为10000,即10s),会暂时关闭半同步复制,转而使用异步复制。当master dump线程发送完一个事务的所有事件之后,如果在rpl_semi_sync_master_timeout内,收到了从库的响应,则主从又重新恢复为半同步复制。一旦有一次超时自动降级为异步.

查看半同步复制状态

-- 查看插件
mysql> SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS  WHERE PLUGIN_NAME LIKE '%semi%';
+----------------------+---------------+
| PLUGIN_NAME          | PLUGIN_STATUS |
+----------------------+---------------+
| rpl_semi_sync_master | ACTIVE        |
+----------------------+---------------+
1 row in set (0.00 sec)

-- 主库
mysql> show status like 'Rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | ON    |
+-----------------------------+-------+
1 row in set (0.01 sec)

-- 从库
mysql> show status like 'Rpl_semi_sync_slave_status';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Rpl_semi_sync_slave_status | ON    |
+----------------------------+-------+
1 row in set (0.00 sec)


-- 查看主库半同步插件运行参数
-- rpl_semi_sync_master_wait_for_slave_count 设置主需要等待多少个slave应答,才能返回给客户端,默认为1
mysql> show variables like '%Rpl%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          | 
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_stop_slave_timeout                    | 31536000   |
+-------------------------------------------+------------+

-- 查看从库半同步插件运行参数
mysql> show variables like '%Rpl%';
+---------------------------------+----------+
| Variable_name                   | Value    |
+---------------------------------+----------+
| rpl_semi_sync_slave_enabled     | ON       |
| rpl_semi_sync_slave_trace_level | 32       |
| rpl_stop_slave_timeout          | 31536000 |
+---------------------------------+----------+
3 rows in set (0.00 sec)


-- 查看主库半同步插件运行状态status
mysql> show status like '%Rpl_semi%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 1     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 1     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | ON    |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 461   |
| Rpl_semi_sync_master_tx_wait_time          | 461   |
| Rpl_semi_sync_master_tx_waits              | 1     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 1     |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)

简单测试

-- 正常情况下插入数据
mysql> insert into test.testttt values(1,'22');
Query OK, 1 row affected (0.00 sec)

-- 从库关闭主从复制
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

-- 再次插入数据,时间与设置的超时时间吻合
mysql> insert into test.testttt values(1,'222');
Query OK, 1 row affected (10.01 sec)

-- 查看半同步插件状态
mysql> show status like 'Rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | OFF   |
+-----------------------------+-------+
1 row in set (0.00 sec)

mysql> show status like 'Rpl_semi_sync_slave_status';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Rpl_semi_sync_slave_status | OFF   |
+----------------------------+-------+
1 row in set (0.00 sec)

-- 从库开启同步
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

-- 再次插入数据,恢复正常
mysql> insert into test.testttt values(1,'222');
Query OK, 1 row affected (0.00 sec)

-- 查看半同步插件状态也是ON

其它注意参数

  • sync_binlog

默认,sync_binlog=0,表示MySQL不控制binlog的刷新,由文件系统自己控制它的缓存的刷新。这时候的性能是最好的,但是风险也是最大的。因为一旦系统Crash,在binlog_cache中的所有binlog信息都会被丢失。

如果sync_binlog>0,表示每sync_binlog次事务提交,MySQL调用文件系统的刷新操作将缓存刷下去。最安全的就是sync_binlog=1了,表示每次事务提交,MySQL都会把binlog刷下去,是最安全但是性能损耗最大的设置。

  • sync_relay_log

默认为10000,即每10000sync_relay_log事件会刷新到磁盘。为0则表示不刷新,交由系统的cache控制

  • 同步刷盘配置
# my.cnf,最安全配置,牺牲性能
sync_binlog=1
sync_master_info=1
sync_relay_log=1
sync_relay_log_info=1

参考链接

问题

双主之间复制如果网络中断或者其它原因导致不可用,此时在不同主上修改了同一条数据,就会存在数据不一致的情况。

解决办法

  1. 过滤到已知错误
#slave-skip-errors=all # 忽略复制产生的错误
#slave-skip-errors=1062,1032,1060 # 跳过已知错误,主键冲突、表已存在等错误代码如1062,1032,1060等

但是这样数据不一致的问题还是存在.

  1. 不用双主模式,改成主从模式,且从节点只能读不能写
    关闭双向复制,设置单向复制,一主一从
# 从节点设置不能写,设置完需要重启mysql服务
read_only=ON
super_read_only=ON

# SQL
FLUSH TABLES WITH READ LOCK;