Configuring a simple high-availability (HA) cluster with shared resources using Heartbeat + haresources + NFS

Posted by koflover on 2019-1-7 09:11:59

First install the openssh and openssh-clients packages on both nodes.
192.168.139.2
# ssh-keygen -t rsa -P '' //set up mutual SSH trust between the two nodes
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.139.4
　　___________________________________________________________________________________________
　　192.168.139.4
# ssh-keygen -t rsa -P ''
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.139.2
___________________________________________________________________________________________
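Synchronize the clocks and add a cron job that resyncs every five minutes. I use a public NTP server on the Internet, so both nodes must have Internet access.
192.168.139.2
# ntpdate 0.uk.pool.ntp.org
2 Nov 19:43:56 ntpdate: step time server 109.74.192.97 offset -28799.081856 sec
# vim /var/spool/cron/root //add the following line

*/5 * * * * /usr/sbin/ntpdate 0.uk.pool.ntp.org > /dev/null 2>&1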
　　___________________________________________________________________________________________
　　192.168.139.4
# ntpdate 0.uk.pool.ntp.org
2 Nov 19:43:56 ntpdate: step time server 109.74.192.97 offset -28799.081856 sec
# vim /var/spool/cron/root //add the following line

*/5 * * * * /usr/sbin/ntpdate 0.uk.pool.ntp.org > /dev/null 2>&1
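The cron entry above is added by hand in vim on each node. As a rough sketch, the same step could be scripted so that running it twice does not duplicate the line. The function name add_cron_line is made up for this illustration; the file path and schedule are the ones used above.

```shell
#!/bin/sh
# Append a crontab entry only if the exact line is not already present.
# CRON_LINE mirrors the entry used above; add_cron_line is an illustrative name.
CRON_LINE='*/5 * * * * /usr/sbin/ntpdate 0.uk.pool.ntp.org > /dev/null 2>&1'

add_cron_line() {
    file=$1
    line=$2
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# On a real node (as root): add_cron_line /var/spool/cron/root "$CRON_LINE"
```

Because the check matches the whole line literally, an entry with a different schedule or server would be treated as a different line and appended next to the old one.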
　　___________________________________________________________________________________________
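Install the software. Only the steps on 192.168.139.2 are shown; 192.168.139.4 is set up the same way.
heartbeat website: www.linux-ha.org
pacemaker website: www.clusterlabs.org
EPEL: www.fedoraproject.org/wiki/EPEL (the Fedora Project's extra package repository)

Or do as I did: add the Fedora Project EPEL repository as a third-party yum source and install heartbeat directly with yum.
# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
# rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
# yum -y install heartbeat
# rpm -q heartbeat //check the installed version: 3.0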
　　heartbeat-3.0.4-2.el6.x86_64
　　

# rpm -qi heartbeat //show the package metadata: version, origin, description and so on
　　Name    : heartbeat                Relocations: (not relocatable)
　　Version : 3.0.4                         Vendor: Fedora Project
　　Release : 2.el6                      Build Date: Tue 03 Dec 2013 12:37:21 AM CST
　　Install Date: Wed 02 Nov 2016 11:48:07 PM CST    Build Host: buildvm-14.phx2.fedoraproject.org
　　Group    : System Environment/Daemons Source RPM: heartbeat-3.0.4-2.el6.src.rpm
　　Size    : 269152                         License: GPLv2 and LGPLv2+
　　Signature : RSA/8, Tue 03 Dec 2013 06:59:13 AM CST, Key ID 3b49df2a0608b895
　　Packager : Fedora Project
　　URL       : http://linux-ha.org/
　　Summary : Messaging and membership subsystem for High-Availability Linux
　　Description :
　　heartbeat is a basic high-availability subsystem for Linux-HA.
　　It will run scripts at initialization, and when machines go up or down.
　　This version will also perform IP address takeover using gratuitous ARPs.
　　

　　Heartbeat contains a cluster membership layer, fencing, and local and
　　clusterwide resource management functionality.
　　

　　When used with Pacemaker, it supports "n-node" clusters with significant
　　capabilities for managing resources and dependencies.
　　

　　In addition it continues to support the older release 1 style of
　　2-node clustering.
　　

　　It implements the following kinds of heartbeats:
　　- Serial ports
　　- UDP/IP multicast (ethernet, etc)
　　- UDP/IP broadcast (ethernet, etc)
　　- UDP/IP heartbeats
　　- "ping" heartbeats (for routers, switches, etc.)
　　(to be used for breaking ties in 2-node systems)
# rpm -ql heartbeat //files installed by the heartbeat package
　　/etc/ha.d
　　/etc/ha.d/README.config
　　/etc/ha.d/harc
　　/etc/ha.d/rc.d
　　/etc/ha.d/rc.d/ask_resources
　　/etc/ha.d/rc.d/hb_takeover
　　/etc/ha.d/rc.d/ip-request
　　/etc/ha.d/rc.d/ip-request-resp
　　/etc/ha.d/rc.d/status
　　/etc/ha.d/resource.d
　　/etc/ha.d/resource.d/AudibleAlarm
　　/etc/ha.d/resource.d/Delay
　　/etc/ha.d/resource.d/Filesystem
　　

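# ls /usr/share/doc/heartbeat-3.0.4/
apphbd.cf  authkeys  AUTHORS  ChangeLog  COPYING  COPYING.LGPL  ha.cf  haresources
authkeys (mode 600) holds the key the nodes use to authenticate each other's messages. It verifies that a node is legitimate, so that nobody can join the cluster simply by configuring the VIP and resources on an extra server.
ha.cf is the configuration file of the heartbeat service itself.
haresources is the resource-management configuration file, i.e. the configuration of the built-in CRM; in heartbeat v3 the full CRM was split out into a separate project called pacemaker.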
　　

　　# cp /usr/share/doc/heartbeat-3.0.4/{authkeys,haresources,ha.cf} /etc/ha.d/ -p
# dd if=/dev/random bs=512 count=1 |md5sum //read up to 512 bytes of random data and take the MD5 hash as the key
0+1 records in
0+1 records out
53 bytes (53 B) copied, 0.000186312 s, 284 kB/s
e4b8f2837725f10ed16bfd1738b89541  -
# vim /etc/ha.d/authkeys
auth 1
1 md5 e4b8f2837725f10ed16bfd1738b89541 //authenticate heartbeat messages with MD5
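The key generation and the authkeys edit above can be combined into one step. The following is only a sketch: make_authkeys is a hypothetical helper name, and it reads from /dev/urandom instead of /dev/random (an assumption made here so that the read does not come up short, as the 53-byte result above did).

```shell
#!/bin/sh
# Write a heartbeat authkeys file with a fresh MD5-derived key and mode 600.
# make_authkeys is an illustrative name, not part of heartbeat.
make_authkeys() {
    path=$1
    # 32 hex characters: the MD5 hash of 512 random bytes
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    (
        umask 077    # a newly created file is readable by its owner only
        printf 'auth 1\n1 md5 %s\n' "$key" > "$path"
    )
    chmod 600 "$path"    # also tighten the mode if the file already existed
}

# On a real node (as root): make_authkeys /etc/ha.d/authkeys
```

Generate the file on one node only and copy it to the other, since both nodes must hold the same key.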
　　

# vim /etc/ha.d/ha.cf
　　

　　#
　　#    File to write debug messages to
#debugfile /var/log/ha-debug //debug log
　　#
　　#
　　#    File to write other messages to
　　#
logfile    /var/log/ha-log //heartbeat's main log file
　　#
　　#
　　#    Facility to use for syslog()/logger
　　#
#logfacility local0 //local0 is a syslog facility; enabling this line logs through syslog (this setup uses logfile instead)
　　#
　　#
　　#    A note on specifying "how long" times below...
　　#
　　#    The default time unit is seconds
　　#             10 means ten seconds
　　#
　　#    You can also specify them in milliseconds
　　#             1500ms means 1.5 seconds
　　#
　　#
　　#    keepalive: how long between heartbeats?
　　#
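keepalive 2 //send a heartbeat every 2 seconds
deadtime 30 //declare the peer dead after 30 seconds without a heartbeat
#udpport    694 //heartbeats are sent over UDP port 694
#bcast eth0          # Linux //broadcast heartbeats out of interface eth0
#bcast eth1 eth2    # Linux
#mcast eth0 225.0.0.1 694 1 0 //multicast to group 225.0.0.1 via eth0 on port 694, followed by the TTL and loop values
#ucast eth0 192.168.1.2 //unicast heartbeats to 192.168.1.2 via eth0
#auto_failback on //whether resources move back to the primary node once it recovers
#node ken3
#node kathy //add your own cluster nodes below; each name must match the output of uname -n
node www.rs1.com
node www.rs2.com
#
ping 10.10.10.254 //sample value; in this lab, ping the gateway 192.168.139.1 so that a node can tell whether it is the one that has lost connectivity
#ping_group group1 10.10.10.254 10.10.10.253 //alternatively, a reply from any member of the group counts as reachable
#respawn hacluster /usr/lib/heartbeat/ipfail //run ipfail as user hacluster and restart it whenever it exits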
　　

# grep -v "^#" /etc/ha.d/ha.cf //in the end only these directives need to be enabled; the node and bcast lines alone are enough
logfile /var/log/ha-log
keepalive 2
deadtime 30
bcast eth0 # Linux
node www.rs1.com
node www.rs2.com
ping 192.168.139.1
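As a minimal sketch, the short ha.cf above can also be generated from a few values and checked against the node name. render_ha_cf and node_listed are hypothetical helper names introduced only for this example; the directives and values are the ones shown above.

```shell
#!/bin/sh
# Print a minimal ha.cf for a small cluster.
# Arguments: heartbeat interface, ping target, then the node names.
render_ha_cf() {
    iface=$1
    gateway=$2
    shift 2
    printf 'logfile /var/log/ha-log\n'
    printf 'keepalive 2\n'
    printf 'deadtime 30\n'
    printf 'bcast %s\n' "$iface"
    for node in "$@"; do
        printf 'node %s\n' "$node"
    done
    printf 'ping %s\n' "$gateway"
}

# Succeeds if the given name appears as a "node" line in the config on stdin.
node_listed() {
    grep -qxF -- "node $1"
}

# Example (as root):
#   render_ha_cf eth0 192.168.139.1 www.rs1.com www.rs2.com > /etc/ha.d/ha.cf
#   node_listed "$(uname -n)" < /etc/ha.d/ha.cf || echo "local node name missing from ha.cf" >&2
```

The node_listed check catches the most common mistake in this step, a node line that does not match uname -n.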
Define the cluster resources
　　

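# vim /etc/ha.d/haresources
#
#    An example where a shared filesystem is to be used.
#    Note that multiple arguments are passed to this script using
#    the delimiter '::' to separate each argument.
#
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2

Each line defines one cluster service (a resource group):

node1: the name of the primary node; it must match the output of uname -n
10.0.0.170: the VIP, the first resource in the group
Filesystem: a resource agent and the second resource in the group; its arguments are separated by ::
1. /dev/sda1: first Filesystem argument, the device to mount
2. /data1: second Filesystem argument, the mount point
3. ext2: third Filesystem argument, the filesystem type
Together, arguments 1 to 3 mount /dev/sda1 on /data1 as ext2.

#just.linux-ha.org    135.9.216.110 135.9.215.111 135.9.216.112 httpd //several VIPs plus the httpd service in one group; the whole group runs together on whichever node currently holds it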
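To make the line format described above concrete, here is a small sketch that splits one haresources line into its node and resources. explain_haresources_line is a name invented for this example, and the output wording is arbitrary. A bare IP address is reported as IPaddr, since haresources treats it as shorthand for that agent.

```shell
#!/bin/sh
# Explain one haresources line: the first field is the preferred node,
# a bare IP address stands for IPaddr::<ip>, and the remaining fields are
# resource scripts with optional "::"-separated arguments.
explain_haresources_line() {
    set -f        # no glob expansion while splitting
    set -- $1     # split the line on whitespace
    set +f
    printf 'node: %s\n' "$1"
    shift
    for res in "$@"; do
        case $res in
            *::*)
                name=${res%%::*}
                args=$(printf '%s' "${res#*::}" | sed 's/::/ /g')
                printf 'resource: %s args: %s\n' "$name" "$args"
                ;;
            [0-9]*.[0-9]*.[0-9]*.[0-9]*)
                printf 'resource: IPaddr args: %s\n' "$res"
                ;;
            *)
                printf 'resource: %s args:\n' "$res"
                ;;
        esac
    done
}

explain_haresources_line 'node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2'
# node: node1
# resource: IPaddr args: 10.0.0.170
# resource: Filesystem args: /dev/sda1 /data1 ext2
```

Resources are listed in the order heartbeat starts them, so the VIP comes up before the filesystem is mounted.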
　　

　　

# cd /etc/ha.d/resource.d/
# ls //the resource agents, such as Filesystem and IPaddr, live here
apache  AudibleAlarm  db2  Delay  Filesystem  hto-mapfuncs  ICP  ids  IPaddr  IPaddr2  IPsrcaddr  IPv6addr  LinuxSCSI  LVM  MailTo  OCF  portblock  Raid1  SendArp  ServeRAID  WAS  WinPopup  Xinetd

# chkconfig httpd off //never let the service start at boot; the CRM decides where it runs, and in this experiment the CRM is haresources

# vim /etc/ha.d/haresources //define the resources
www.rs1.com IPaddr::192.168.139.10/24/eth0 httpd //www.rs1.com is the primary node; the VIP 192.168.139.10 with a 24-bit netmask is added as an alias on eth0
# scp -p /etc/ha.d/{authkeys,ha.cf,haresources} 192.168.139.4:/etc/ha.d/ //both nodes use identical configuration files, so copy them over
　　

# service heartbeat start //start the primary node first

# ssh 192.168.139.4 service heartbeat start //start the backup node over ssh

# ip addr show //the VIP is now active
　　: eth0:mtu 1500 qdisc pfifo_fast state UP qlen 1000
　　link/ether 00:0c:29:1c:13:12 brd ff:ff:ff:ff:ff:ff
　　inet 192.168.139.2/24 brd 192.168.139.255 scope global eth0
　　inet 192.168.139.10/24 brd 192.168.139.255 scope global secondary eth0
　　inet6 fe80::20c:29ff:fe1c:1312/64 scope link
　　valid_lft forever preferred_lft forever
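Instead of reading the ip addr show output by eye, the check for the VIP can be scripted. This is a sketch only; has_vip is a hypothetical helper that reads ip addr show output on stdin.

```shell
#!/bin/sh
# Succeeds if the given address appears as an "inet" entry in
# `ip addr show` output read from stdin.
has_vip() {
    vip=$1
    # Escape the dots so they match literally; the trailing "/" keeps
    # 192.168.139.1 from matching 192.168.139.10
    pattern=$(printf '%s' "$vip" | sed 's/\./\\./g')
    grep -Eq "inet ${pattern}/"
}

# On a node:
#   ip addr show | has_vip 192.168.139.10 && echo "VIP is held by $(uname -n)"
```

Running the same check on both nodes after a failover shows which one currently holds the address.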
# netstat -unlp //heartbeat is running as well
　　Active Internet connections (only servers)
　　Proto Recv-Q Send-Q Local Address             PID/Program name
　　udp    0    0 127.0.0.1:659          1323/rpc.statd
　　udp    0    0 0.0.0.0:694          1604/heartbeat: wri
　　udp    0    0 0.0.0.0:55890       1604/heartbeat: wri
　　udp    0    0 0.0.0.0:111          1301/rpcbind
　　udp    0    0 0.0.0.0:628          1301/rpcbind
　　udp    0    0 0.0.0.0:52727       1323/rpc.statd
　　udp    0    0 :::46780          1323/rpc.statd
　　udp    0    0 :::111             1301/rpcbind
　　udp    0    0 :::628             1301/rpcbind
# netstat -tnlp |grep httpd //httpd has started as well
tcp    0    0 :::80    :::*    LISTEN    2284/httpd
# iptables -F //flush the iptables rules

Test: browsing to the VIP 192.168.139.10 shows the page of the primary node, RS1
http://s1.运维网.com/wyfs02/M01/89/BC/wKiom1ga_eTxfIJxAABeBtvyUbQ831.png-wh_500x0-wm_3-wmp_4-s_2120442027.png
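Flushing every rule with iptables -F is fine for a lab. A narrower alternative, sketched here under the assumption that only heartbeat (UDP 694, the udpport default shown in ha.cf) and HTTP (TCP 80) need to get through, is to add two accept rules. cluster_fw_rules is an invented helper that only prints the commands, so they can be reviewed before they are run.

```shell
#!/bin/sh
# Print (do not run) iptables commands that admit heartbeat traffic from
# the peer node and HTTP traffic from anywhere.
cluster_fw_rules() {
    peer=$1
    hb_port=${2:-694}    # heartbeat's UDP port (udpport in ha.cf)
    printf 'iptables -I INPUT -p udp -s %s --dport %s -j ACCEPT\n' "$peer" "$hb_port"
    printf 'iptables -I INPUT -p tcp --dport 80 -j ACCEPT\n'
}

# Review the output first; to apply it on 192.168.139.2 as root:
#   cluster_fw_rules 192.168.139.4 | sh
```

The NFS part of this setup would need further rules for rpcbind and the NFS daemons, which this sketch does not cover.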
　　

# service heartbeat stop //stop heartbeat on 192.168.139.2 and check whether the resources move
　　Stopping High-Availability services: Done.
　　___________________________________________________________________________________________
　　192.168.139.4
　　

# vim /var/log/ha-log //the log shows that the VIP and httpd have been started on 192.168.139.4
　　Nov 03 17:52:46 www.rs1.com heartbeat: : info: Local Resource acquisition completed.
　　harc(default): 2016/11/03_17:52:46 info: Running /etc/ha.d//rc.d/status status
　　mach_down(default):    2016/11/03_17:52:46 info: mach_down takeover complete for node www.rs2.com.
　　Nov 03 17:52:46 www.rs1.com heartbeat: : info: Initial resource acquisition complete (status)
　　harc(default): 2016/11/03_17:52:46 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
　　ip-request-resp(default): 2016/11/03_17:52:46 received ip-request-resp IPaddr::192.168.139.10/24/eth1 OK yes
　　ResourceManager(default): 2016/11/03_17:52:46 info: Acquiring resource group: www.rs1.com IPaddr::192.168.139.10/24/eth1 httpd
　　/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.139.10): 2016/11/03_17:52:47 INFO:Resource is stopped
　　ResourceManager(default): 2016/11/03_17:52:47 info: Running /etc/ha.d/resource.d/IPaddr 192.168.139.10/24/eth1 start
　　IPaddr(IPaddr_192.168.139.10): 2016/11/03_17:52:47 INFO: Adding inet address 192.168.139.10/24 with broadcast address 192.168.139.255 to device eth1
　　IPaddr(IPaddr_192.168.139.10): 2016/11/03_17:52:47 INFO: Bringing device eth1 up
　　IPaddr(IPaddr_192.168.139.10): 2016/11/03_17:52:47 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.139.10 eth1 192.168.139.10 auto not_used not_used
　　/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.139.10): 2016/11/03_17:52:47 INFO:Success
ResourceManager(default): 2016/11/03_17:52:47 info: Running /etc/init.d/httpd start
　　harc(default): 2016/11/03_17:52:48 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
　　ip-request-resp(default): 2016/11/03_17:52:48 received ip-request-resp IPaddr::192.168.139.10/24/eth1 OK yes
　　ResourceManager(default): 2016/11/03_17:52:48 info: Acquiring resource group: www.rs1.com IPaddr::192.168.139.10/24/eth1 httpd
　　/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.139.10): 2016/11/03_17:52:48 INFO:Running OK
　　Nov 03 17:54:30 www.rs1.com heartbeat: : info: Link www.rs1.com:eth1 up.
# ip addr show //the VIP has moved over
　　eth1:mtu 1500 qdisc pfifo_fast state UP qlen 1000

　　link/ether 00:0c:29:5f:68:2f brd ff:ff:ff:ff:ff:ff
　　inet 192.168.139.4/24 brd 192.168.139.255 scope global eth1
　　inet 192.168.139.10/24 brd 192.168.139.255 scope global secondary eth1
　　inet6 fe80::20c:29ff:fe5f:682f/64 scope link
　　valid_lft forever preferred_lft forever
　　

# netstat -tnlp //httpd has moved over as well
tcp    0    0 :::80    :::*    LISTEN    2952/httpd

# iptables -F //flush the iptables rules

A browser test now shows RS2
http://s1.运维网.com/wyfs02/M02/89/BD/wKiom1gbCqPzDehfAABQDDUKy6Q348.png-wh_500x0-wm_3-wmp_4-s_765467296.png
　　___________________________________________________________________________________________
　　192.168.139.2
# service heartbeat start //start heartbeat again
　　Starting High-Availability services: INFO:Resource is stopped
　　Done.
___________________________________________________________________________________________
192.168.139.4

# service heartbeat stop //stop heartbeat
　　Stopping High-Availability services: Done.
　　___________________________________________________________________________________________
　　192.168.139.2
　　

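Test once more in the browser: the resources have moved back. If auto_failback on were enabled, the resources would move back automatically as soon as node RS1 recovered; it is not enabled in this experiment.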
http://s4.运维网.com/wyfs02/M01/89/BB/wKioL1gbDerivZTHAABSPa2UTZQ282.png-wh_500x0-wm_3-wmp_4-s_495756621.png
Now add one more host, 192.168.139.3, as the NFS server that exports the shared web pages
　　___________________________________________________________________________________________
　　192.168.139.2
# ssh 192.168.139.4 service heartbeat stop //stop heartbeat on the backup node first
　　Stopping High-Availability services: Done.
　　

# service heartbeat stop //then stop heartbeat on the primary node
　　Stopping High-Availability services: Done.
　　___________________________________________________________________________________________
　　192.168.139.3
# vim /etc/exports //edit the NFS exports file
/web/htdocs 192.168.139.0/24(ro) //export /web/htdocs read-only to the 192.168.139.0/24 network

# mkdir -pv /web/htdocs
# cd /web/htdocs
# vim index.html //create the home page that the browser will show once the share is mounted

www.NFS.com
# service rpcbind start //start rpcbind
Starting rpcbind:
# service nfs start //start NFS
　　Starting NFS services:
　　Starting NFS quotas:
　　Starting NFS mountd:
　　Starting NFS daemon:
　　Starting RPC idmapd:
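showmount -e checks whether the directory is exported. In my test it seemed that 192.168.139.2 also had to run service nfs start before the export became visible; otherwise the command failed with:
clnt_create: RPC: Port mapper failure - Unable to receive: errno 113 (No route to host)
(errno 113 usually means a firewall is blocking the rpcbind port on the host being queried.)
Even so, the export can still be mounted from 192.168.139.2: # mount 192.168.139.3:/web/htdocs /mnt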
　　# showmount -e 192.168.139.2
　　Export list for 192.168.139.2:
　　/web/htdocs 192.168.139.0/24
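The export list above can also be checked from a script before heartbeat is asked to mount the share. This is a sketch; is_exported is a made-up helper that reads showmount -e output on stdin and compares the first column against the directory.

```shell
#!/bin/sh
# Succeeds if the directory is listed in `showmount -e` output read from stdin.
is_exported() {
    dir=$1
    # Skip the "Export list for ..." header line, then compare the first column exactly
    awk -v d="$dir" 'NR > 1 && $1 == d { found = 1 } END { exit !found }'
}

# From a cluster node, querying the NFS server:
#   showmount -e 192.168.139.3 | is_exported /web/htdocs || echo "export missing" >&2
```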
　　___________________________________________________________________________________________
　　192.168.139.2
　　# mount 192.168.139.3:/web/htdocs /mnt
# ll /mnt
total 4
-rw-r--r--. 1 nobody nobody 21 Nov  4 12:55 index.html
# umount /mnt
　　

　　# vim /etc/ha.d/haresources
　　www.rs1.com IPaddr::192.168.139.10/24/eth0 Filesystem::192.168.139.3:/web/htdocs::/var/www/html::nfs httpd
　　

# scp /etc/ha.d/haresources 192.168.139.4:/etc/ha.d/haresources
# service heartbeat start
# ssh 192.168.139.4 service heartbeat start
# iptables -F
# setenforce 0
The first browser request may still show the default Apache page; refresh once more.
http://s1.运维网.com/wyfs02/M01/89/C7/wKiom1gcI6nCEqsAAABYS8ItThA798.png-wh_500x0-wm_3-wmp_4-s_485390907.png
　　# service heartbeat stop
　　Stopping High-Availability services:
　　___________________________________________________________________________________________
　　192.168.139.4
# netstat -tnlp |grep httpd //the resources have moved over
tcp 0 0 :::80 :::* LISTEN 2445/httpd
# iptables -F
# setenforce 0

Browser test
http://s5.运维网.com/wyfs02/M00/89/C4/wKioL1gcJSmwyxTAAABbReTneWc995.png-wh_500x0-wm_3-wmp_4-s_830265182.png
　　

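The primary and backup nodes now share a single NFS server. When the primary node fails and the resources move to the backup node, that node mounts the same NFS export, so the page content stays identical. With three hosts, a simple HA + NFS cluster is complete.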
　　
