踏雪寻梅 发表于 2018-10-4 11:11:36

MySQL MHA高可用环境部署

  一,安装MHA基本环境
  安装MHA节点
  (1)基本环境说明
  角色IP地址主机名
  =========================================
  主机192.168.1.121节点1
  从机192.168.1.122节点2
  从机192.168.1.123节点3
  监视主机192.168.1.125节点5
  (2)在node1,node2,node3,node5中操作:
  #vi / etc / hosts
  192.168.1.121 node1
  192.168.1.122 node2
  192.168.1.123 node3
  192.168.1.125 node5
  安装MHA节点节点软件包:
  #rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm#yum
  install perl-DBD-MySQL perl-CPAN -y
  #tar xf mha4mysql -node-0.56.tar.gz
  #cd mha4mysql-node-0.56
  #perl Makefile.PL
  #make && make install
  安装MHA
  在节点5管理节点上操作:注:MHA管理器主机也是需要安装MHA节点,MHA管理器
  #yum install perl-DBD-MySQL perl-CPAN perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-time-HiRes -y
  #tar xf mha4mysql-manager-0.56.tar.gz
  #cd mha4mysql-manager -0.56
  #perl Makefile.PL
  #make && make install
  #说明:安装的脚本程序都在/ usr / local / bin /目录下。
  3.节点间配置SSH登录无密码验证(MHA主机之间使用密钥登录)
  在node5(Monitor)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node1
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node2
  #ssh-   copy-id -i /root/.ssh/id_rsa.pub root @ node3
  在node1(Master)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node2
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node3
  #ssh-   copy-id -i /root/.ssh/id_rsa.pub root @ node5
  在node2(slave)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node1
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node3
  #ssh-   copy-id -i /root/.ssh/id_rsa.pub root @ node5
  在node3(slave)中:
  #ssh-keygen -t rsa
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node1
  #ssh-copy-id -i /root/.ssh/id_rsa.pub root @ node2
  #ssh-   copy-id -i /root/.ssh/id_rsa.pub root @ node5
  二,搭建主从复制环境
  主从复制环境配置过程
  (1)mysql安装过程略,但是三节点要创建如下链接
  node1(主),node2(主备从),node3(从)
  注意:创建如下链接:
  ln -s / usr / local / mysql / bin / * / usr / local / bin /
  node1 my.cnf
  server-id = 1
  binlog-format = ROW
  log-bin = master-bin
  log-bin-index = master-bin.index
  log-slave-updates = true
  relay_log_purge = 0
  node2 my.cnf
  server-id = 2
  binlog-format = ROW
  log-bin = master-bin
  log-bin-index = master-bin.index
  log-slave-updates = true
  relay_log_purge = 0
  node3 my.cnf
  binlog-format = ROW
  log-bin = mysql-bin
  relay-log = slave-relay-bin
  relay-log-index = slave-relay-bin.index
  log-slave-updates = true
  server-id = 11
  skip-name- resolve
  relay_log_purge = 0
  (2)在node1(Master)上备份一份完整的数据:
  #mysqldump -uroot -p123456 --master-data = 2 - 单事务-R - triggers -A> all.sql
  其中--master-data = 2代表备份时刻记录主的Binlog位置和位置。
  (3)在node1(Master)上创建复制用户:
  mysql>授予复制从机*。*到'123456'确定的'repl'@'192.168.1.%';
  刷新权限;
  (4)查看主库备份时的binlog名称和位置,MASTER_LOG_FILE和MASTER_LOG_POS:
  #head -n 30 all.sql | grep'CHANGE MASTER TO'
  - CHANGE MASTER TO MASTER_LOG_FILE ='master-bin.000004',MASTER_LOG_POS = 120;
  (5)把备份复制到192.168.1.122和192.168.1.123
  #scp all.sql
  192.168.1.122:/root/#scp all.sql 192.168.1.123:/root/
  (6)分别在两台服务器上导入备份,执行复制相关命令
  在node2,node3主机上操作:
  #mysql -uroot -p123456start slave;
  显示从属状态\ G
  创建MHA管理用户,在主上创建。
  将*。*的所有权限授予'123456'标识为'root'@'192.168.1.%'的权限;
  刷新权限;
  三,配置Keepal VIP
  vip配置可以采用两种方式,一种通过keepalived的方式管理虚拟ip的浮动;一人是通过脚本方式,本文通过keepalived方式实现
  1.在node1(Master)与node2(备选主节点)安装keepalived。
  #wget的http://www.keepalived.org/software/keepalived-1.2.12.tar.gz
  #焦油XF的keepalived-1.2.12.tar.gz
  #CD的keepalived-1.2.12
  #的./configure前缀= / usr / local / keepalived
  #make && make install
  #cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/#cp
  / usr / local / keepalived / etc / sysconfig / keepalived / etc / sysconfig /
  #mkdir / etc / keepalived
  #cp /usr/local/keepalived/etc/keepalived/keepalived.conf / etc / keepalived /
  #cp / usr / local / keepalived / sbin / keepalived / usr / sbin /
  配置keepalived的配置文件,在node1(master)上配置操作如下:
  注意:keepalived配置成备份 - 备份,即IP地址切换后,主起来后IP地址不切换,本文监控脚本由MHA提供,keepalived不提供对mysqld的监控。
  #vi /etc/keepalived/keepalived.conf
  !保持配置文件
  global_defs {
  notification_email {
  abc@163.com
  }
  notification_email_from dba@dbserver.com
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id MySQL-HA
  }
  vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 51
  priority 150
  advert_int 1
  nopreempt
  身份验证{
  auth_type PASS
  auth_pass 1111
  }
  virtual_ipaddress {
  192.168.1.130
  }
  }
  配置keepalived的配置文件,在node2(备用节点)上配置操作如下:
  #vi /etc/keepalived/keepalived.conf
  !保持配置文件
  global_defs {
  notification_email {
  abc@163.com
  }
  notification_email_from dba@dbserver.com
  smtp_server 127.0.0.1
  smtp_connect_timeout 30
  router_id MySQL-HA
  }
  vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 51
  priority 120
  advert_int 1
  nopreempt
  身份验证{
  auth_type PASS
  auth_pass 1111
  }
  virtual_ipaddress {
  192.168.1.130
  }
  }
  4. node1,node2启动keepalived服务
  #service keepalived start
  #chkconfig keepalived on
  5. node1查看VIP启动情况
  [root @
  node1]#ip a   1:lo: mtu 65536 qdisc noqueue state UNKNOWN
  link / loopback 00:00:00:00:00:00 brd 00:00:00:00:00 :00
  inet 127.0.0.1/8 scope host lo
  inet6 :: 1/128 scope host
  valid_lft forever preferred_lft forever
  2:eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
  link / ether 00:0c :29:4e:53:71 brd ff:ff:ff:ff:ff:ff
  inet 192.168.1.121/24 brd 192.168.1.255范围全局eth0
  inet 192.168.1.130/32范围全局eth0
  inet6 fe80 :: 20c:29ff: fe4e:5371/64范围链接
  valid_lft永远preferred_lft永远
  四,配置MHA
  监控创建MHA的工作目录,并且创建相关配置文件(在软件包解压后的目录里面有样例配置文件)。
  #mkdir -p / etc / masterha
  #mkdir -p / var / log / masterha / app1
  #cp mha4mysql-manager-0.56 / samples / conf / app1.cnf / etc / masterha /
  修改app1.cnf配置文件,修改后的文件内容如下:
  #cat /etc/masterha/app1.cnf
  
  manager_workdir = / var / log / masterha / app1
  manager_log = / var / log / masterha / app1 / manager.log
  master_binlog_dir = / usr / local / mysql / data /
  master_ip_failover_script = / usr / local / bin / master_ip_failover
  master_ip_online_change_script = / usr / local / bin / master_ip_online_change
  password = 123456
  user = root
  ping_interval = 1
  remote_workdir = / tmp
  repl_password = 123456
  repl_user = repl
  report_script = / usr / local / bin / send_report
  ssh_user = root
  
  hostname = 192.168.1.121
  port = 3306
  
  hostname = 192.168.1.122
  port = 3306
  candidate_master = 1
  check_repl_delay = 0
  
  hostname = 192.168.1.123
  port = 3306
  说明:
  master_ip_failover_script = / usr / local / bin / master_ip_failover            #MHA自动切换执行的脚本,需要修改
  master_ip_online_change_script = / usr / local / bin / master_ip_online_change #手动
  切换需要执行的脚本,需要修改   report_script = / usr / local / bin / send_report #切换                              时发送邮件进行报告,需要修改
  2.设置中继日志的清除方式(在每个从节点上):
  (1)在节点2,节点3从节点上操作:
  将relay_log_purge = 0加入my.cnf配置文件,前面已经配置。
  (2)设定定期清理继电器脚本(node2,node3上操作):
  #猫purge_relay_log.sh
  #!/斌/ bash的
  用户=根
  的passwd = 123456
  端口= 3306
  LOG_DIR = '/数据/ masterha /日志'
  WORK_DIR = '/数据'
  清除= '在/ usr / local / bin目录/ purge_relay_logs'
  如果[!-d $ log_dir]
  then
  mkdir $ log_dir -p
  fi
  $ purge --user = $ user --password = $ passwd --disable_relay_log_purge --port = $ port --workdir = $ work_dir >> $ log_dir / purge_relay_logs.log 2>&1
  配置定时计划任务
  #crontab -e   0 4 * * * / bin / bash /root/purge_relay_log.sh
  要求把keepalived服务引入MHA,我们只需要修改切换是触发的脚本文件master_ip_failover即可,在该脚本中添加在master发生宕机时对keepalived的处理。
  (1)编辑脚本/ usr / local / bin / master_ip_failover,修改后如下:
  #vi / usr / local / bin / master_ip_failover
  #!/ usr / bin / env perl
  use strict;
  使用警告FATAL =>'all';
  使用Getopt :: Long;
  我的(
  $命令,$ ssh_user,$ orig_master_host,$ orig_master_ip,
  $ orig_master_port,$ new_master_host,$ new_master_ip,$ new_master_port
  );
  我的$ vip ='192.168.1.130';
  我的$ ssh_start_vip =“/etc/init.d/keepalived start”;
  我的$ ssh_stop_vip =“/etc/init.d/keepalived stop”;
  GetOptions(
  'command = s'=> \ $ command,
  'ssh_user = s'=> \ $ ssh_user,
  'orig_master_host = s'=> \ $ orig_master_host,
  'orig_master_ip = s'=> \ $ orig_master_ip,
  'orig_master_port = '=> \ $ orig_master_port,
  'new_master_host = s'=> \ $ new_master_host,
  'new_master_ip = s'=> \ $ new_master_ip,
  'new_master_port = i'=> \ $ new_master_port,
  );
  exit&main();
  sub main {
  print“\ n \ nIN SCRIPT TEST ==== $ ssh_stop_vip == $ ssh_start_vip === \ n \ n”;
  if($ command eq“stop”|| $ command eq“stopssh”){
  my $ exit_code = 1;
  eval {
  print“禁用旧主机上的VIP:$ orig_master_host \ n”;
  &stop_vip();
  $ exit_code = 0;
  };
  if($ @){
  warn“Got Error:$ @ \ n”;
  退出$ exit_code;
  }
  exit $ exit_code;
  }
  elsif($ command eq“start”){
  我的$ exit_code = 10;
  eval {
  print“启用VIP - $ vip on the new master - $ new_master_host \ n”;
  &start_vip();
  $ exit_code = 0;
  };
  if($ @){
  warn $ @;
  退出$ exit_code;
  }
  exit $ exit_code;
  } {
  el } {
  “ } ”{   “ }   ”\“”
  退出0;
  }
  else {
  &usage();
  出口1;
  }
  }
  sub start_vip(){
  `ssh $ ssh_user \ @ $ new_master_host \“$ ssh_start_vip \”`;
  }
  #一个简单的系统调用,禁用在old_master
  子上的VIP   stop_vip(){
  `ssh $ ssh_user \ @ $ orig_master_host \“$ ssh_stop_vip \”`;
  }}
  sub usage {
  print
  “用法:master_ip_failover --command = start | stop | stopssh | status --orig_master_host = host --orig_master_ip = ip --orig_master_port = port --new_master_host = host --new_master_ip = ip --new_master_port = port \ n“;
  }
  (2)编辑脚本master_ip_online_change,修改后如下:
  #!/ usr / bin / env perl
  #版权所有(C)2011 DeNA有限公司
  ##
  这个程序是免费的软件; 您可以
  根据
  #自由软件基金会发布的GNU通用公共许可证的条款重新分配和/或修改   #    许可证的版本2或
  #(根据您的选择)任何更高版本。
  ##
  这个程序是分发的,希望它是有用的,
  但没有任何的保证; 甚至没有
  #适销性或适用于特定用途的默示保证   。有关
  详细信息,请参阅   #GNU通用公共许可证。
  ##
  您应该已经收到了GNU通用公共许可证
  #的副本   以及该程序; 如果不,
  ##注意:这是一个示例脚本,不完整。根据您的环境修改脚本。
  使用严格
  使用警告FATAL =>'all';
  使用Getopt :: Long;
  使用MHA :: DBHelper;
  使用MHA :: NodeUtil;
  使用Time :: HiRes qw(sleep gettimeofday tv_interval);
  使用Data :: Dumper;
  我的$ _tstart;
  我的$ _running_interval = 0.1;
  我的(
  $命令,$ orig_master_is_new_slave,$ orig_master_host,
  $ orig_master_ip,$ orig_master_port,$ orig_master_user,
  $ orig_master_password,$ orig_master_ssh_user,$ new_master_host,
  $ new_master_ip,$ new_master_port,$ new_master_user,
  $ new_master_password,$ new_master_ssh_user
  )
  我的$ vip ='192.168.1.130/24';
  我的$ key ='1';
  我的$ ssh_start_vip =“/ sbin / ifconfig eth0:$ key $ vip”;
  我的$ ssh_stop_vip =“/ sbin / ifconfig eth0:$ key down”;
  我的$ orig_master_ssh_port = 22;
  我的$ new_master_ssh_port = 22;
  exit&main();
  sub current_time_us {
  my($ sec,$ microsec)= gettimeofday();
  我的$ curdate = localtime($ sec);
  返回$ curdate。“”。sprintf(“%06d”,$ microsec);
  }
  sub sleep_until {
  my $ elapsed = tv_interval($ _ tstart);
  if($ _running_interval> $ elapsed){
  sleep($ _running_interval - $ elapsed);
  }
  }
  sub get_threads_util {
  my $ dbh = shift;
  我的$ my_connection_id = shift;
  我的$ running_time_threshold = shift;
  我的$ type = shift;
  $ running_time_threshold = 0,除非($ running_time_threshold);
  $ type = 0,除非($ type);
  我的@threads;
  我的$ sth = $ dbh-> prepare(“SHOW PROCESSLIST”);
  $ sth-> execute();
  while(my $ ref = $ sth-> fetchrow_hashref()){

  my $>  我的$ user = $ ref - > {User};
  我的$ host = $ ref - > {Host};
  我的$ command = $ ref - > {Command};
  我的$ state = $ ref - > {State};
  我的$ query_time = $ ref - > {Time};
  我的$ info = $ ref - > {Info};
  $ info =s / ^ \ s *(。*?)\ s * $ / $ 1 / if defined($ info);

  next if($ my_connection_id == $>  next if(defined($ query_time)&& $ query_time= 1){
  next if(defined($ command)&& $ command eq“Sleep”);
  next if(defined($ command)&& $ command eq“Connect”);
  }
  if($ type> = 2){
  next if(defined($ info)&& $ info =m / ^ select / i);
  next if(defined($ info)&& $ info =m / ^ show / i);
  }
  推送@threads,$ ref;
  }
  return @threads;
  }
  sub main {
  if($ command eq“stop”){
  ##正当地杀死当前主机上的连接
  #1.在新主机
  #2 上设置read_only = 1。DROP   USER使得没有应用用户可以建立新的连接
  #3。在当前主机
  #4 上设置read_only = 1。杀死当前查询
  #*任何数据库访问失败都会导致脚本死机   。
  我的$ exit_code = 1;
  eval {
  ##在新主机上设置read_only = 1(以避免意外)
  我的$ new_master_handler = new MHA :: DBHelper();
  #args:hostname,port,user,password,raise_error(die_on_error)_or_not
  $ new_master_handler-> connect($ new_master_ip,$ new_master_port,
  $ new_master_user,$ new_master_password,1);
  打印current_time_us()。“设置read_only在新的主人..”;
  $ new_master_handler-> enable_read_only();
  if($ new_master_handler-> is_read_only()){
  print“ok。\ n”;
  }
  else {
  die“Failed!\ n”;
  }
  $ new_master_handler-> disconnect();
  #连接到原始主机,如果发生任何数据库错误,则会死亡
  $ orig_master_handler = new MHA :: DBHelper();
  $ orig_master_handler-> connect($ orig_master_ip,$ orig_master_port,
  $ orig_master_user,$ orig_master_password,1);
  ##删除应用程序用户,以便没有人可以连接。事先禁用每会话binlog
  $ orig_master_handler-> disable_log_bin_local();
  打印current_time_us()。“在原始主机上吸引应用用户.. \ n”;
  #FIXME_xxx_drop_app_user($ orig_master_handler);
  等待N * 100毫秒,以便当前的连接可以退出
  我的$ time_until_read_only = 15;
  $ _tstart = ;
  我的@threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  while($ time_until_read_only> 0 && $#threads> = 0){
  if($ time_until_read_only%5 == 0){
  printf
  “%s等待所有正在运行的%d线程断开连接..(最大%d毫秒)\ n”
  current_time_us(),$#threads + 1,$ time_until_read_only * 100;
  if($#threadsnew([$ _]) - > Indent(0) - > Terse(1) - > Dump。
  “\ n” foreach(@threads);
  }
  }
  sleep_until();
  $ _tstart = ;
  $ time_until_read_only--;
  @threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  }}
  ##在当前主设备上设置read_only = 1,以便没有人(SUPER除外)可以写入
  print_time_us()。“在原始主机上设置read_only = 1”。
  $ orig_master_handler-> enable_read_only();
  if($ orig_master_handler-> is_read_only()){
  print“ok。\ n”;
  }
  else {
  die“Failed!\ n”;
  }}
  等待M * 100毫秒,以便当前的更新查询可以完成
  我的$ time_until_kill_threads = 5;
  @threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  while($ time_until_kill_threads> 0 && $#threads> = 0){
  if($ time_until_kill_threads%5 == 0){
  printf
  “%s等待所有运行的%d查询断开连接..(最大%d毫秒)\ n”
  current_time_us(),$#threads + 1,$ time_until_kill_threads * 100;
  if($#threadsnew([$ _]) - > Indent(0) - > Terse(1) - > Dump。“\ n”
  foreach(@threads);
  }
  }
  sleep_until();
  $ _tstart = ;
  $ time_until_kill_threads--;
  @threads = get_threads_util($ orig_master_handler - > {dbh},
  $ orig_master_handler - > {connection_id});
  }}
  ## Terminating all threads
  print current_time_us() . " Killing all application threads..\n";
  $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
  print current_time_us() . " done.\n";
  $orig_master_handler->enable_log_bin_local();
  $orig_master_handler->disconnect();
  ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
  eval {
  `ssh -p$orig_master_ssh_port $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
  };
  if ($@) {
  warn $@;
  }
  $exit_code = 0;
  };
  if ($@) {
  warn "Got Error: $@\n";
  exit $exit_code;
  }
  exit $exit_code;
  }
  elsif ( $command eq "start" ) {
  ## Activating master ip on the new master
  # 1. Create app user with write privileges
  # 2. Moving backup script if needed
  # 3. Register new master's ip to the catalog database
  # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
  # If exit code is 0 or 10, MHA does not abort
  my $exit_code = 10;
  eval {
  my $new_master_handler = new MHA::DBHelper();
  # args: hostname, port, user, password, raise_error_or_not
  $new_master_handler->connect( $new_master_ip, $new_master_port,
  $new_master_user, $new_master_password, 1 );
  ## Set read_only=0 on the new master
  $new_master_handler->disable_log_bin_local();
  print current_time_us() . " Set read_only=0 on the new master.\n";
  $new_master_handler->disable_read_only();
  ## Creating an app user on the new master
  print current_time_us() . " Creating app user on the new master..\n";
  #FIXME_xxx_create_app_user($new_master_handler);
  $new_master_handler->enable_log_bin_local();
  $new_master_handler->disconnect();
  ## Update master ip on the catalog database, etc
  `ssh -p$new_master_ssh_port $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
  $exit_code = 0;
  };
  if ($@) {
  warn "Got Error: $@\n";
  exit $exit_code;
  }
  exit $exit_code;
  }
  elsif ( $command eq "status" ) {
  # do nothing
  exit 0;
  }
  else {
  &usage();
  exit 1;
  }
  }
  sub usage {
  print
  "Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
  }
  (3) 编辑脚本send_report,修改后如下:
  #!/usr/bin/perl
  use strict;
  use warnings FATAL => 'all';
  use Mail::Sender;
  use Getopt::Long;
  #new_master_host and new_slave_hosts are set only when recovering master succeeded
  my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
  my $smtp='smtp.163.com';
  my $mail_from='xxxx';
  my $mail_user='xxxxx';
  my $mail_pass='xxxxx';
  my $mail_to=['xxxx','xxxx'];
  GetOptions(
  'orig_master_host=s' => \$dead_master_host,
  'new_master_host=s'=> \$new_master_host,
  'new_slave_hosts=s'=> \$new_slave_hosts,
  'subject=s'          => \$subject,
  'body=s'             => \$body,
  );
  mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
  sub mailToContacts {
  my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
  open my $DEBUG, "> /tmp/monitormail.log"
  or die "Can't open the debug      file:$!\n";
  my $sender = new Mail::Sender {
  ctype       => 'text/plain; ',
  encoding    => 'utf-8',
  smtp      => $smtp,
  from      => $mail_from,
  auth      => 'LOGIN',
  TLS_allowed => '0',
  authid      => $user,
  authpwd   => $passwd,
  to          => $mail_to,
  subject   => $subject,
  debug       => $DEBUG
  };
  $sender->MailMsg(
  {   msg   => $msg,
  debug => $DEBUG
  }
  ) or print $Mail::Sender::Error;
  return 1;
  }
  # Do whatever you want here
  exit 0;
  五、MHA的日常管理
  1. 检查SSH配置(node5 Monitor 监控节点上操作),如下:
  # masterha_check_ssh --conf=/etc/masterha/app1.cnf
  Sun May1 22:05:12 2016 - Global configuration file /etc/masterha_default.cnf not found. Skipping.
  Sun May1 22:05:12 2016 - Reading application default configuration from /etc/masterha/app1.cnf..
  Sun May1 22:05:12 2016 - Reading server configuration from /etc/masterha/app1.cnf..
  Sun May1 22:05:12 2016 - Starting SSH connection tests..
  Sun May1 22:05:14 2016 -
  Sun May1 22:05:12 2016 - Connecting via SSH from root@192.168.1.121(192.168.1.121:22) to root@192.168.1.122(192.168.1.122:22)..
  Sun May1 22:05:13 2016 -    ok.
  Sun May1 22:05:13 2016 - Connecting via SSH from root@192.168.1.121(192.168.1.121:22) to root@192.168.1.123(192.168.1.123:22)..
  Sun May1 22:05:13 2016 -    ok.
  Sun May1 22:05:14 2016 -
  Sun May1 22:05:13 2016 - Connecting via SSH from root@192.168.1.122(192.168.1.122:22) to root@192.168.1.121(192.168.1.121:22)..
  Sun May1 22:05:13 2016 -    ok.
  Sun May1 22:05:13 2016 - Connecting via SSH from root@192.168.1.122(192.168.1.122:22) to root@192.168.1.123(192.168.1.123:22)..
  Sun May1 22:05:14 2016 -    ok.
  Sun May1 22:05:14 2016 -
  Sun May1 22:05:13 2016 - Connecting via SSH from root@192.168.1.123(192.168.1.123:22) to root@192.168.1.121(192.168.1.121:22)..
  Sun May1 22:05:14 2016 -    ok.
  Sun May1 22:05:14 2016 - Connecting via SSH from root@192.168.1.123(192.168.1.123:22) to root@192.168.1.122(192.168.1.122:22)..
  Sun May1 22:05:14 2016 -    ok.
  Sun May1 22:05:14 2016 - All SSH connection tests passed successfully.
  2. 检查整个复制环境状况(node5 监控节点上操作),如下:
  # masterha_check_repl --conf=/etc/masterha/app1.cnf
  Sun May1 22:46:44 2016 - Global configuration file /etc/masterha_default.cnf not found. Skipping.
  Sun May1 22:46:44 2016 - Reading application default configuration from /etc/masterha/app1.cnf..
  Sun May1 22:46:44 2016 - Reading server configuration from /etc/masterha/app1.cnf..
  Sun May1 22:46:44 2016 - MHA::MasterMonitor version 0.56.
  Sun May1 22:46:45 2016 - GTID failover mode = 0
  Sun May1 22:46:45 2016 - Dead Servers:
  Sun May1 22:46:45 2016 - Alive Servers:
  Sun May1 22:46:45 2016 -    192.168.1.121(192.168.1.121:3306)
  Sun May1 22:46:45 2016 -    192.168.1.122(192.168.1.122:3306)
  Sun May1 22:46:45 2016 -    192.168.1.123(192.168.1.123:3306)
  Sun May1 22:46:45 2016 - Alive Slaves:
  Sun May1 22:46:45 2016 -    192.168.1.122(192.168.1.122:3306)Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
  Sun May1 22:46:45 2016 -    Replicating from 192.168.1.121(192.168.1.121:3306)
  Sun May1 22:46:45 2016 -    Primary candidate for the new Master (candidate_master is set)
  Sun May1 22:46:45 2016 -    192.168.1.123(192.168.1.123:3306)Version=5.6.29-log (oldest major version between slaves) log-bin:enabled
  Sun May1 22:46:45 2016 -    Replicating from 192.168.1.121(192.168.1.121:3306)
  Sun May1 22:46:45 2016 - Current Alive Master: 192.168.1.121(192.168.1.121:3306)
  Sun May1 22:46:45 2016 - Checking slave configurations..
  Sun May1 22:46:45 2016 - read_only=1 is not set on slave 192.168.1.122(192.168.1.122:3306).

  Sun May1 22:46:45 2016 - >  Sun May1 22:46:45 2016 - read_only=1 is not set on slave 192.168.1.123(192.168.1.123:3306).

  Sun May1 22:46:45 2016 - >  Sun May1 22:46:45 2016 - Checking replication filtering settings..
  Sun May1 22:46:45 2016 - binlog_do_db= , binlog_ignore_db=
  Sun May1 22:46:45 2016 - Replication filtering check ok.
  Sun May1 22:46:45 2016 - GTID (with auto-pos) is not supported
  Sun May1 22:46:45 2016 - Starting SSH connection tests..
  Sun May1 22:46:46 2016 - All SSH connection tests passed successfully.
  Sun May1 22:46:46 2016 - Checking MHA Node version..
  Sun May1 22:46:47 2016 - Version check ok.
  Sun May1 22:46:47 2016 - Checking SSH publickey authentication settings on the current master..
  Sun May1 22:46:47 2016 - HealthCheck: SSH to 192.168.1.121 is reachable.
  Sun May1 22:46:47 2016 - Master MHA Node version is 0.56.
  Sun May1 22:46:47 2016 - Checking recovery script configurations on 192.168.1.121(192.168.1.121:3306)..
  Sun May1 22:46:47 2016 -    Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data/ --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000008
  Sun May1 22:46:47 2016 -    Connecting to root@192.168.1.121(192.168.1.121:22)..
  Creating /tmp if not exists..    ok.
  Checking output directory is accessible or not..
  ok.
  Binlog found at /usr/local/mysql/data/, up to master-bin.000008
  Sun May1 22:46:48 2016 - Binlog setting check done.
  Sun May1 22:46:48 2016 - Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
  Sun May1 22:46:48 2016 -    Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.1.122 --slave_ip=192.168.1.122 --slave_port=3306 --workdir=/tmp --target_version=5.6.29-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info--relay_dir=/usr/local/mysql/data/--slave_pass=xxx
  Sun May1 22:46:48 2016 -    Connecting to root@192.168.1.122(192.168.1.122:22)..
  Checking slave recovery environment settings..
  Opening /usr/local/mysql/data/relay-log.info ... ok.

  >
  Temporary>  Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
  done.
  Testing mysqlbinlog output.. done.
  Cleaning up test file(s).. done.
  Sun May1 22:46:48 2016 -    Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.1.123 --slave_ip=192.168.1.123 --slave_port=3306 --workdir=/tmp --target_version=5.6.29-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info--relay_dir=/usr/local/mysql/data/--slave_pass=xxx
  Sun May1 22:46:48 2016 -    Connecting to root@192.168.1.123(192.168.1.123:22)..
  Checking slave recovery environment settings..
  Opening /usr/local/mysql/data/relay-log.info ... ok.

  >
  Temporary>  Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
  done.
  Testing mysqlbinlog output.. done.
  Cleaning up test file(s).. done.
  Sun May1 22:46:48 2016 - Slaves settings check done.
  Sun May1 22:46:48 2016 -
  192.168.1.121(192.168.1.121:3306) (current master)
  +--192.168.1.122(192.168.1.122:3306)
  +--192.168.1.123(192.168.1.123:3306)
  Sun May1 22:46:48 2016 - Checking replication health on 192.168.1.122..
  Sun May1 22:46:48 2016 - ok.
  Sun May1 22:46:48 2016 - Checking replication health on 192.168.1.123..
  Sun May1 22:46:48 2016 - ok.
  Sun May1 22:46:48 2016 - Checking master_ip_failover_script status:
  Sun May1 22:46:48 2016 -    /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.121 --orig_master_ip=192.168.1.121 --orig_master_port=3306
  IN SCRIPT TEST====/etc/init.d/keepalived stop==/etc/init.d/keepalived start===
  Checking the Status of the script.. OK
  Sun May1 22:46:48 2016 - OK.
  Sun May1 22:46:48 2016 - shutdown_script is not defined.
  Sun May1 22:46:48 2016 - Got exit code 0 (Not master dead).
  MySQL Replication Health is OK.
  #
  3. 开启MHA Manager监控(node5操作)如下:
  # mkdir -p/var/log/masterha/app1/
  # nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
  参数说明:
  --remove_dead_master_conf   #该参数代表当发生主从切换后,老的主库的ip将会从配置文件中移除。
  --manger_log                  #日志存放位置
  --ignore_last_failover      #在缺省情况下,如果MHA检测到连续发生宕机,会生成app1.failover.complete文件,会造成MHA管理进程无法启动。
  4. 查看MHA Manager监控是否正常:
  # masterha_check_status --conf=/etc/masterha/app1.cnf
  app1 (pid:2480) is running(0:PING_OK), master:192.168.1.121
  5. 查看启动日志(node5操作)如下:
  # tail -n20 /var/log/masterha/app1/manager.log
  6. 关闭MHA Manage监控:
  (1) 关闭
  # masterha_stop --conf=/etc/masterha/app1.cnf
  (2) 启动
  # nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1
  六、MHA Failover切换
  1. 自动Failover切换
  (1) 模拟master mysql关闭
  (2) VIP将会切换到node2
  (3) /etc/masterha/app1.cnf中将原主服务器配置文件清掉。
  (4) masterha_manager监控进程会自动退出关闭,并在/var/log/masterha/app1下生成app1.failover.complete文件,manager.log会记录全过程,从服务器会自动从新的主服务器复制。
  (5) 原主服务器mysqld启动的,需要清掉/var/log/masterha/app1下生成app1.failover.complete文件,添加node1配置文件到/etc/masterha/app1.cnf,通过manager.log中的记录的故障点,重新同步主服务器,成为从节点。
  2. 手动Failover切换
  (1) 先停MHA Manager进程。
  masterha_stop --conf=/etc/masterha/app1.cnf
  (2) 停掉master mysqld
  (3) 手动切换,在Manager主机上操作如下:
  # masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.cnf --dead_master_host=192.168.1.122 --dead_master_port=3306 --new_master_host=192.168.1.121 --new_master_port=3306 --ignore_last_failover
  通过观察日志可以观察切换全过程。
  (4) 如上节方式恢复节点为从服务器。
  3.正常运行情况下切换(Master正在运行)
  等补充。
  4. 小结
  通过对MMM,MHA的环境搭建测试,MHA由于采用复制架构,原理简单,在一些对数据要求比较高的环境,为了保证可靠性,最好与半同步结合使用。

页: [1]
查看完整版本: MySQL MHA高可用环境部署