Based on heartbeat v2 and heartbeat

814247614 posted on 2019-1-6 06:49:21

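　　High-availability cluster test for the LVS Director
　　Objectives:
　　1. Test the high-availability cluster of Directors
　　2. Observe how heartbeat-ldirectord checks the health of the back-end Real Servers
　　Environment: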
　　Redhat 5.8
　　VIP          192.168.0.2
Real Server:
RS1       192.168.0.5
RS2       192.168.0.6
Director:
node1    192.168.0.11
node2    192.168.0.12
　　Required rpm packages:
　　heartbeat-2.1.4-9.el5.i386.rpm
heartbeat-stonith-2.1.4-10.el5.i386.rpm
heartbeat-gui-2.1.4-9.el5.i386.rpm
libnet-1.1.4-3.el5.i386.rpm
heartbeat-ldirectord-2.1.4-9.el5.i386.rpm
perl-MailTools-1.77-1.el5.noarch.rpm
heartbeat-pils-2.1.4-10.el5.i386.rpm
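　　Also have the system installation disc ready to use as the yum repository.
　　The detailed procedure follows:
　　I. Configure the Real Servers first
1. Synchronize the time on both Real Servers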
# hwclock -s
2. Install apache
# yum -y install httpd
　　Provide a web page on each Real Server

[*][root@RS1 ~]# echo "Real Server 1" > /var/www/html/index.html
[*][root@RS2 ~]# echo "Real Server 2" > /var/www/html/index.html

[*][root@RS1 ~]# vi /etc/httpd/conf/httpd.conf
[*]             Change: ServerName RS1.yue.com
[*]
[*][root@RS2 ~]# vi /etc/httpd/conf/httpd.conf
[*]             Change: ServerName RS2.yue.com

　　# /etc/init.d/httpd start
　　3. Set the relevant kernel parameters on RS1

[*]# echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore
[*]# echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
[*]# echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
[*]# echo 2 > /proc/sys/net/ipv4/conf/eth0/arp_announce
[*]# ifconfig lo:0 192.168.0.2 broadcast 192.168.0.255 netmask 255.255.255.255 up
[*]# ifconfig
[*]eth0      Link encap:Ethernet  HWaddr 00:0C:29:7E:8B:C6
[*]          inet addr:192.168.0.5  Bcast:192.168.0.255  Mask:255.255.255.0
[*]          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
[*]          RX packets:2719 errors:0 dropped:0 overruns:0 frame:0
[*]          TX packets:3628 errors:0 dropped:0 overruns:0 carrier:0
[*]          collisions:0 txqueuelen:1000
[*]          RX bytes:200533 (195.8 KiB)  TX bytes:644821 (629.7 KiB)
[*]          Interrupt:67 Base address:0x2000
[*]
[*]lo        Link encap:Local Loopback
[*]          inet addr:127.0.0.1  Mask:255.0.0.0
[*]          UP LOOPBACK RUNNING  MTU:16436  Metric:1
[*]          RX packets:71 errors:0 dropped:0 overruns:0 frame:0
[*]          TX packets:71 errors:0 dropped:0 overruns:0 carrier:0
[*]          collisions:0 txqueuelen:0
[*]          RX bytes:5699 (5.5 KiB)  TX bytes:5699 (5.5 KiB)
[*]
[*]lo:0      Link encap:Local Loopback
[*]          inet addr:192.168.0.2  Mask:255.255.255.255
[*]          UP LOOPBACK RUNNING  MTU:16436  Metric:1
[*]
[*]
[*]# elinks -dump http://192.168.0.5
[*]Real Server 1
[*]# elinks -dump http://192.168.0.2
[*]Real Server 1
[*]# route add -host 192.168.0.2 dev lo:0
[*]# route -n
[*]Kernel IP routing table
[*]Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
[*]192.168.0.2     0.0.0.0         255.255.255.255 UH    0      0        0 lo
[*]192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
[*]169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
[*]0.0.0.0         192.168.0.1     0.0.0.0         UG    0      0        0 eth0
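
The echo commands above take effect immediately but do not survive a reboot. A minimal sketch, assuming the interface is eth0 as in this setup, that prints the equivalent entries for /etc/sysctl.conf. The function name arp_sysctl_lines is made up for illustration and is not part of any package:

```shell
#!/bin/sh
# Print sysctl.conf entries matching the arp_ignore/arp_announce
# values set by hand above. arp_sysctl_lines is a hypothetical helper.
arp_sysctl_lines() {
    iface="${1:-eth0}"   # interface facing the Director; eth0 is an assumption
    for scope in all "$iface"; do
        echo "net.ipv4.conf.$scope.arp_ignore = 1"
        echo "net.ipv4.conf.$scope.arp_announce = 2"
    done
}

# Usage (as root): arp_sysctl_lines eth0 >> /etc/sysctl.conf && sysctl -p
```

The lo:0 address and the host route are also lost on reboot and would need their own persistent configuration; that part is not covered by this sketch.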

http://blog.运维网.com/attachment/201208/112906971.jpg
　　Enable the service at boot

[*]# chkconfig --add httpd
[*]# chkconfig httpd on
[*]# chkconfig --list httpd
[*]httpd          0:off 1:off 2:on 3:on 4:on 5:on 6:off

　　4. Apply the same settings on RS2
http://blog.运维网.com/attachment/201208/120054688.jpg
　　II. Configure the Directors
Each node's host name must match the output of "uname -n"
1. Configure node1 first
Synchronize the time:
   # hwclock -s
Host name resolution:
   # vi /etc/hosts    add the following lines
         192.168.0.11 node1.yue.com node1
         192.168.0.12 node2.yue.com node2
Host name:
　　# hostname node1.yue.com

[*]# vi /etc/sysconfig/network
[*]
[*]NETWORKING=yes
[*]NETWORKING_IPV6=no
[*]HOSTNAME=node1.yue.com


　　IP address:

[*]# vi /etc/sysconfig/network-scripts/ifcfg-eth0
[*]
[*]# Advanced Micro Devices 79c970
[*]DEVICE=eth0
[*]BOOTPROTO=none
[*]ONBOOT=yes
[*]HWADDR=00:0c:29:e7:3d:5c
[*]IPADDR=192.168.0.11
[*]GATEWAY=192.168.0.1
[*]NETMASK=255.255.255.0

　　Mutual SSH trust between the two nodes:

[*]# ssh-keygen -t rsa
[*]Generating public/private rsa key pair.
[*]Enter file in which to save the key (/root/.ssh/id_rsa):    press Enter to accept the default path
[*]Created directory '/root/.ssh'.
[*]Enter passphrase (empty for no passphrase):                leave the passphrase empty (press Enter)
[*]Enter same passphrase again:                               press Enter again
[*]Your identification has been saved in /root/.ssh/id_rsa.
[*]Your public key has been saved in /root/.ssh/id_rsa.pub.
[*]The key fingerprint is:
[*]0f:c8:62:6b:2e:68:4c:8b:ce:0f:25:52:23:93:c7:0a root@node1.yue.com
[*]# ssh-copy-id -i .ssh/id_rsa.pub root@node2.yue.com       copy the public key to node2 (by default into the .ssh directory under root's home directory)
[*]The authenticity of host 'node2.yue.com (192.168.0.12)' can't be established.
[*]RSA key fingerprint is 9d:d9:14:94:81:c2:7b:d5:7b:af:2c:64:58:8f:e3:49.
[*]Are you sure you want to continue connecting (yes/no)? yes       prompt asking whether to accept the connection
[*]root@node2.yue.com's password:                                  enter node2's root password
[*]Now try logging into the machine, with "ssh 'root@node2.yue.com'", and check in:
[*]
[*].ssh/authorized_keys
[*]
[*]to make sure we haven't added extra keys that you weren't expecting.

　　Test it:

[*]# ssh node2 'ifconfig'                   run a command remotely on node2
[*]The authenticity of host 'node2 (192.168.0.12)' can't be established.
[*]RSA key fingerprint is 9d:d9:14:94:81:c2:7b:d5:7b:af:2c:64:58:8f:e3:49.
[*]Are you sure you want to continue connecting (yes/no)? yes
[*]Warning: Permanently added 'node2' (RSA) to the list of known hosts.
[*]eth0      Link encap:Ethernet  HWaddr 00:0C:29:D9:75:DF
[*]          inet addr:192.168.0.12  Bcast:192.168.0.255  Mask:255.255.255.0
[*]          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
[*]          RX packets:629 errors:0 dropped:0 overruns:0 frame:0
[*]          TX packets:528 errors:0 dropped:0 overruns:0 carrier:0
[*]          collisions:0 txqueuelen:1000
[*]          RX bytes:60497 (59.0 KiB)  TX bytes:56572 (55.2 KiB)
[*]          Interrupt:67 Base address:0x2000
[*]
[*]lo        Link encap:Local Loopback
[*]          inet addr:127.0.0.1  Mask:255.0.0.0
[*]          UP LOOPBACK RUNNING  MTU:16436  Metric:1
[*]          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
[*]          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
[*]          collisions:0 txqueuelen:0
[*]          RX bytes:692 (692.0 b)  TX bytes:692 (692.0 b)

　　2. Apply the corresponding configuration on node2
　　3. Install the required packages:

[*]
[*]# ls
[*]heartbeat-2.1.4-9.el5.i386.rpm
[*]heartbeat-stonith-2.1.4-10.el5.i386.rpm
[*]heartbeat-gui-2.1.4-9.el5.i386.rpm
[*]libnet-1.1.4-3.el5.i386.rpm
[*]heartbeat-ldirectord-2.1.4-9.el5.i386.rpm
[*]perl-MailTools-1.77-1.el5.noarch.rpm
[*]heartbeat-pils-2.1.4-10.el5.i386.rpm
[*]
[*]# yum --nogpgcheck localinstall *.rpm
[*]
[*]# chkconfig --list ldirectord
[*]ldirectord       0:off 1:off 2:off 3:on 4:off 5:on 6:off
[*]# chkconfig ldirectord off
[*]
[*]
[*][root@node2 tmp]# yum --nogpgcheck localinstall *.rpm
[*][root@node2 tmp]# chkconfig ldirectord off
[*][root@node2 tmp]# chkconfig --list ldirectord
[*]ldirectord       0:off 1:off 2:off 3:off 4:off 5:off 6:off

　　Configuration files:

[*]# cp /usr/share/doc/heartbeat-ldirectord-2.1.4/ldirectord.cf /etc/ha.d/
[*]# cd /usr/share/doc/heartbeat-2.1.4/
[*]# cp ha.cf authkeys haresources /etc/ha.d/
[*]# cd /etc/ha.d/
[*]# chmod 600 authkeys          the permissions must be changed, otherwise heartbeat reports an error at startup

　　(1)
# vi /etc/ha.d/ha.cf

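[*]logfile /var/log/ha-log
[*]logfacility local0
[*]keepalive 2                   interval between heartbeat messages
[*]deadtime 30                   declare the peer dead after this long without a heartbeat
[*]warntime 10                   warn about a late heartbeat after this long
[*]initdead 120                dead time used at first startup; should be at least twice deadtime
[*]udpport 694
[*]bcast eth0    # Linux          send heartbeats by broadcast
[*]auto_failback on                whether resources move back automatically when the original node returns
[*]node node1.yue.com                node list; names must match the definitions in /etc/hosts
[*]node node2.yue.com
[*]ping    192.168.0.1       a ping node (usually the nearest gateway) used to check this node's own connectivity when deciding whether the peer is dead
[*]compression bz2             compress the messages
[*]compression_threshold 2       lower size limit for compression
[*]
[*]Also add this line: crm respawn     (enables the CRM, that is, heartbeat v2 mode)
[*]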
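
The timing directives depend on each other: keepalive < warntime < deadtime, and initdead at least twice deadtime. A rough sketch of a sanity check, assuming all four values are written as plain seconds (heartbeat also accepts other units, which this sketch does not handle). check_ha_timing is a hypothetical helper, not a heartbeat tool:

```shell
#!/bin/sh
# Read ha.cf-style text on stdin and check the timing directives.
# check_ha_timing is a hypothetical helper, not part of heartbeat.
# Assumes values are plain seconds (no "ms" suffix).
check_ha_timing() {
    awk '
        $1 == "keepalive" { keepalive = $2 }
        $1 == "warntime"  { warntime  = $2 }
        $1 == "deadtime"  { deadtime  = $2 }
        $1 == "initdead"  { initdead  = $2 }
        END {
            bad = 0
            if (!(keepalive + 0 < warntime + 0)) { print "warntime should exceed keepalive"; bad = 1 }
            if (!(warntime + 0 < deadtime + 0))  { print "deadtime should exceed warntime"; bad = 1 }
            if (initdead + 0 < 2 * deadtime)     { print "initdead should be at least 2 x deadtime"; bad = 1 }
            exit bad
        }'
}

# Usage: check_ha_timing < /etc/ha.d/ha.cf
```

The function prints nothing and returns 0 when the values are consistent; otherwise it prints one line per problem and returns 1.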

　　(2)

[*]# dd if=/dev/urandom count=1 bs=512 | md5sum
[*]1+0 records in
[*]1+0 records out
[*]512 bytes (512 B) copied, 0.00025866 seconds, 2.0 MB/s
[*]4faf8724bc49da78b21fc04ceb7b5bc3  -

　　# vi /etc/ha.d/authkeys

[*]#auth 1             which authentication method to use, given by its number
[*]#1 crc
[*]#2 sha1 HI!
[*]#3 md5 Hello!
[*]
[*]auth 1                use the method numbered 1
[*]1 sha1 1daea09a52368d9fe65a37163d4ae3ea          method 1 is sha1, followed by the key
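
The two steps above (generate a random string, then write it into authkeys with tight permissions) can be combined. A minimal sketch, assuming md5sum and /dev/urandom are available as on the RHEL systems used here. make_authkeys is a hypothetical helper name:

```shell
#!/bin/sh
# Write an authkeys file with a random sha1 key and mode 600.
# make_authkeys is a hypothetical helper; the target path is an argument.
make_authkeys() {
    target="$1"
    # 512 random bytes hashed to a 32-character hex string, as above
    key="$(dd if=/dev/urandom count=1 bs=512 2>/dev/null | md5sum | awk '{print $1}')"
    old_umask="$(umask)"
    umask 077                      # never create the file world-readable
    printf 'auth 1\n1 sha1 %s\n' "$key" > "$target"
    umask "$old_umask"
    chmod 600 "$target"            # also fixes the mode if the file already existed
}

# Usage (as root): make_authkeys /etc/ha.d/authkeys
```

Both nodes must end up with the same key, so the file should be generated once and then copied to the other node rather than generated on each.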

　　(3).
# vi /etc/ha.d/ldirectord.cf

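[*]virtual=192.168.0.2:80                VIP
[*]    real=192.168.0.5:80 gate          gate: direct routing (DR) mode
[*]    real=192.168.0.6:80 gate
[*]
[*]    fallback=127.0.0.1:80 gate    if both Real Servers are down, serve clients a notice page from this host
[*]    service=http                   protocol used to check the back-end Real Servers
[*]    request="test.html"             page requested for the check
[*]    receive="Real Server OK"       content expected in the check page
[*]
[*]    scheduler=rr
[*]#    persistent=600                persistence timeout
[*]    netmask=255.255.255.255          netmask for persistence granularity
[*]    protocol=tcp
[*]    checktype=negotiate             check type: negotiate (send a request and match the response)
[*]    checkport=80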
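
With checktype=negotiate, ldirectord requests the page named in request= and looks for the receive= pattern in the response. The same check can be imitated by hand when debugging. This is only a rough imitation, not ldirectord's own code; body_matches is a hypothetical helper, and curl being installed is an assumption:

```shell
#!/bin/sh
# Succeed if the page body read from stdin matches the expected pattern.
# body_matches is a hypothetical helper that imitates the receive= test.
body_matches() {
    grep -q -- "$1"
}

# Usage, assuming curl is installed on the Director:
#   curl -s --max-time 5 http://192.168.0.5:80/test.html | body_matches "Real Server OK" \
#       && echo "RS1 healthy" || echo "RS1 would be removed"
```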

　　Provide the check page on each Real Server:

[*][root@RS1 ~]# vi /var/www/html/test.html
[*]                Real Server OK
[*]
[*]
[*][root@RS2 ~]# vi /var/www/html/test.html
[*]                Real Server OK

　　Copy the configuration files to node2:

[*]
[*]# scp authkeys ha.cf haresources ldirectord.cf node2:/etc/ha.d/
[*]authkeys                          100%  692     0.7KB/s   00:00
[*]ha.cf                             100%   10KB  10.4KB/s   00:00
[*]haresources                       100% 5905     5.8KB/s   00:00
[*]ldirectord.cf                     100% 7689     7.5KB/s   00:00

　　Start heartbeat:
　　Start order matters: start heartbeat on node1 first, then start heartbeat on node2 remotely from node1; stopping heartbeat on node2 must likewise be done remotely from node1.

[*]# /etc/init.d/heartbeat start                   start on node1 first
[*]Starting High-Availability services:                      [  OK  ]
[*]# ssh node2 '/etc/init.d/heartbeat start'       start heartbeat on node2 remotely from node1
[*]Starting High-Availability services:
[*]

　　Check the current state of the cluster nodes:

[*]# crm_mon -1          show the current cluster status once, then exit
[*]
[*]============
[*]Last updated: Sun Aug  5 09:12:40 2012
[*]Current DC: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70)
[*]2 Nodes configured.                two nodes
[*]0 Resources configured.          no resources yet
[*]============
[*]
[*]Node: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70): online
[*]Node: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff): online

[*]# netstat -tnlp    check that port 5560 (mgmtd, used by hb_gui) is listening
[*]tcp    0    0 0.0.0.0:5560             0.0.0.0:*                LISTEN    31039/mgmtd

[*]# crmadmin --status node1.yue.com          check the node status
[*]Status of crmd@node1.yue.com: S_IDLE (ok)             this node is the DC (Designated Coordinator)
[*]# crmadmin --status node2.yue.com
[*]Status of crmd@node2.yue.com: S_NOT_DC (ok)

[*]# tail -1 /etc/passwd       set a password for the hacluster user
[*]hacluster:x:101:157:heartbeat user:/var/lib/heartbeat/cores/hacluster:/sbin/nologin
[*]
[*]# passwd hacluster
[*]Changing password for user hacluster.
[*]New UNIX password:                         enter the password
[*]BAD PASSWORD: it is based on a dictionary word
[*]Retype new UNIX password:                enter it again
[*]passwd: all authentication tokens updated successfully.

　　Configure the resources:
　　web_ip
web_ldirectord
　　# hb_gui &
3535
http://blog.运维网.com/attachment/201208/132154513.jpg
http://blog.运维网.com/attachment/201208/132840867.jpg
http://blog.运维网.com/attachment/201208/133255333.jpg
　　Create a resource group: web_server
http://blog.运维网.com/attachment/201208/133702661.jpg
http://blog.运维网.com/attachment/201208/133419557.jpg
　　Create resources in the group:
http://blog.运维网.com/attachment/201208/133901111.jpg
　　Create the resource web_ip
http://blog.运维网.com/attachment/201208/134215712.jpg
　　Create the resource web_ldirectord
http://blog.运维网.com/attachment/201208/134433430.jpg
　　Start the resources:
http://blog.运维网.com/attachment/201208/134745875.jpg
http://blog.运维网.com/attachment/201208/135054401.jpg

[*]# crm_mon -1
[*]
[*]
[*]============
[*]Last updated: Sun Aug  5 10:03:17 2012
[*]Current DC: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff)
[*]2 Nodes configured.
[*]1 Resources configured.
[*]============
[*]
[*]Node: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70): online
[*]Node: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff): online
[*]
[*]Resource Group: Web_server
[*] web_ldirectord (ocf::heartbeat:ldirectord): Started node1.yue.com
[*] web_ip (ocf::heartbeat:IPaddr2): Started node1.yue.com

[*]# ip addr show
[*]1: lo: mtu 16436 qdisc noqueue
[*] link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[*] inet 127.0.0.1/8 scope host lo
[*]2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
[*] link/ether 00:0c:29:e7:3d:5c brd ff:ff:ff:ff:ff:ff
[*] inet 192.168.0.11/24 brd 192.168.0.255 scope global eth0
[*] inet 192.168.0.2/32 brd 192.168.0.255 scope global eth0

[*]# ipvsadm -Ln
[*]IP Virtual Server version 1.2.1 (size=4096)
[*]Prot LocalAddress:Port Scheduler Flags
[*]  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
[*]TCP  192.168.0.2:80 rr
[*]  -> 192.168.0.5:80               Route   1      0          3
[*]  -> 192.168.0.6:80               Route   1      0          2
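
When checking which Real Servers are currently in the LVS table, it can help to reduce the ipvsadm output to just the real-server rows. A small sketch under the assumption that the output has the layout shown above. list_real_servers is a hypothetical helper:

```shell
#!/bin/sh
# List the real servers found in "ipvsadm -Ln" output read from stdin.
# list_real_servers is a hypothetical helper.
list_real_servers() {
    # Real-server rows start with "->" followed by an address:port;
    # the header row "-> RemoteAddress:Port" is skipped by the digit test.
    awk '$1 == "->" && $2 ~ /^[0-9]/ { print $2, $3 }'
}

# Usage (as root on the active Director): ipvsadm -Ln | list_real_servers
```

If a Real Server fails its health check, it should disappear from this list; if both fail, the fallback entry from ldirectord.cf would be expected to show up instead.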

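　　Now test in a browser: open http://192.168.0.2 and check that the page loads and that requests are balanced across the Real Servers
　　Put node1 into standby and check whether the resources fail over: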
http://blog.运维网.com/attachment/201208/143919610.jpg

[*]# ipvsadm -Ln
[*]IP Virtual Server version 1.2.1 (size=4096)
[*]Prot LocalAddress:Port Scheduler Flags
[*]  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
[*]TCP  192.168.0.2:80 rr
[*]  -> 192.168.0.5:80               Route   1      0          0
[*]  -> 192.168.0.6:80               Route   1      0          0       the VIP service is now active on this node

[*]# ip addr show
[*]1: lo: mtu 16436 qdisc noqueue
[*] link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[*] inet 127.0.0.1/8 scope host lo
[*]2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
[*] link/ether 00:0c:29:d9:75:df brd ff:ff:ff:ff:ff:ff
[*] inet 192.168.0.12/24 brd 192.168.0.255 scope global eth0
[*] inet 192.168.0.2/32 brd 192.168.0.255 scope global eth0       the VIP is up

[*]# crm_mon -1
[*]
[*]
[*]============
[*]Last updated: Sun Aug  5 10:34:17 2012
[*]Current DC: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff)
[*]2 Nodes configured.
[*]1 Resources configured.
[*]============
[*]
[*]Node: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70): standby
[*]Node: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff): online
[*]
[*]Resource Group: Web_server
[*] web_ldirectord (ocf::heartbeat:ldirectord): Started node2.yue.com
[*] web_ip (ocf::heartbeat:IPaddr2): Started node2.yue.com
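
To confirm a failover from a script rather than by eye, the node running a given resource can be pulled out of the crm_mon output. A minimal sketch that assumes the line layout shown above (resource name first, "Started" and the node name last). resource_node is a hypothetical helper:

```shell
#!/bin/sh
# Print the node on which a resource is started, given "crm_mon -1"
# output on stdin. resource_node is a hypothetical helper.
resource_node() {
    awk -v res="$1" '$1 == res && $(NF - 1) == "Started" { print $NF }'
}

# Usage: crm_mon -1 | resource_node web_ip
```

Before the standby test this should print node1.yue.com for both resources, and afterwards node2.yue.com.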

http://blog.运维网.com/attachment/201208/143002949.jpg
http://blog.运维网.com/attachment/201208/143044507.jpg
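　　III. Observe how heartbeat-ldirectord checks the health of the back-end Real Servers
　　Stop one of the Real Servers, then access http://192.168.0.2 in a browser
　　Refresh the page repeatedly to observe whether heartbeat-ldirectord detects the health state of the back-end Real Servers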
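
Refreshing by hand works, but a short loop makes the result easier to read. A sketch that counts how often each Real Server answered, assuming curl is available on the client and the index pages contain the "Real Server 1" / "Real Server 2" text set up earlier. tally_responses is a hypothetical helper:

```shell
#!/bin/sh
# Count how many times each distinct response line appears on stdin.
# tally_responses is a hypothetical helper.
tally_responses() {
    sort | uniq -c | awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n "\t" $0 }'
}

# Usage, assuming curl is installed on the client:
#   for i in 1 2 3 4 5 6; do curl -s http://192.168.0.2/; done | tally_responses
```

With both Real Servers up and the rr scheduler, the counts should be roughly equal; after one is stopped and ldirectord has removed it, all responses should come from the remaining server.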
