hgrewe posted on 2015-11-3 09:09:52

Compiling, Installing, and Configuring Heartbeat

I. Preparation
Heartbeat 3.0.6:

# wget -O heartbeat-3.0.6.tar.bz2 http://hg.linux-ha.org/heartbeat-STABLE_3_0/archive/958e11be8686.tar.bz2




Cluster Glue 1.0.12:

# wget -O cluster-glue-1.0.12.tar.bz2 http://hg.linux-ha.org/glue/archive/0a7add1d9996.tar.bz2




Resource Agents 3.9.6:

# wget -O resource-agents-3.9.6.tar.gz https://github.com/ClusterLabs/resource-agents/archive/v3.9.6.tar.gz






Install the build dependencies and create the haclient group and hacluster user:

# yum install gcc gcc-c++ autoconf automake libtool glib2-devel libxml2-devel bzip2 bzip2-devel e2fsprogs-devel libxslt-devel libtool-ltdl-devel asciidoc
# groupadd haclient
# useradd -g haclient hacluster
# yum install httpd






II. Compiling Cluster Glue

# tar -jxvf cluster-glue-1.0.12.tar.bz2
# cd Reusable-Cluster-Components-glue--0a7add1d9996/
# ./autogen.sh
# ./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient --enable-fatal-warnings=no LIBS='/lib64/libuuid.so.1'
# make
# make install

Note: on a 32-bit system, drop the "64" from the LIBS path (LIBS='/lib/libuuid.so.1').





Build error 1:

Making all in libltdl
gmake: Entering directory '/root/Reusable-Cluster-Components-glue--0a7add1d9996/libltdl'
gmake: *** No rule to make target 'all'.  Stop.
gmake: Leaving directory '/root/Reusable-Cluster-Components-glue--0a7add1d9996/libltdl'
make: *** Error 1




Fix:

# yum install libtool-ltdl-devel





Build error 2:

collect2: error: ld returned 1 exit status
gmake: *** Error 1
gmake: Leaving directory '/root/Reusable-Cluster-Components-glue--0a7add1d9996/lib/clplumbing'
gmake: *** Error 1
gmake: Leaving directory '/root/Reusable-Cluster-Components-glue--0a7add1d9996/lib'
make: *** Error 1




Fix: pass the libuuid library to configure through LIBS:

# ./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient --enable-fatal-warnings=no LIBS='/lib64/libuuid.so.1'

Note: on a 32-bit system, change LIBS to LIBS='/lib/libuuid.so.1'.
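The 32-bit/64-bit choice in the note above can be scripted instead of edited by hand. This is only a sketch: the function name uuid_lib_for_bits is made up for illustration, and it assumes the library lives in one of the two locations named in this post.

```shell
# Pick the libuuid path to pass to ./configure from the word size.
# Assumption: only the two paths mentioned in this post are considered.
uuid_lib_for_bits() {
    case "$1" in
        64) echo "/lib64/libuuid.so.1" ;;
        32) echo "/lib/libuuid.so.1" ;;
        *)  echo "unsupported word size: $1" >&2; return 1 ;;
    esac
}

# getconf LONG_BIT prints 32 or 64 on Linux.
UUID_LIB=$(uuid_lib_for_bits "$(getconf LONG_BIT)")
echo "LIBS='$UUID_LIB'"
```

The result could then be handed to configure as LIBS="$UUID_LIB" in place of the hard-coded path.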

Build error 3:

gmake: a2x: Command not found
gmake: *** Error 127
gmake: Leaving directory '/root/Reusable-Cluster-Components-glue--0a7add1d9996/doc'
gmake: *** Error 1
gmake: Leaving directory '/root/Reusable-Cluster-Components-glue--0a7add1d9996/doc'
make: *** Error 1




Fix:

# yum install asciidoc






III. Compiling Resource Agents

# tar -zxvf resource-agents-3.9.6.tar.gz
# cd resource-agents-3.9.6
# ./autogen.sh
# ./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient --enable-fatal-warnings=no LIBS='/lib64/libuuid.so.1'
# make
# make install







IV. Compiling Heartbeat

# tar -jxvf heartbeat-3.0.6.tar.bz2
# cd Heartbeat-3-0-958e11be8686/
# ./bootstrap
# export CFLAGS="$CFLAGS -I/usr/local/heartbeat/include -L/usr/local/heartbeat/lib"
# ./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient --enable-fatal-warnings=no LIBS='/lib64/libuuid.so.1'
# make
# make install






# cp doc/{ha.cf,haresources,authkeys} /usr/local/heartbeat/etc/ha.d/
# chkconfig --add heartbeat
# chkconfig heartbeat on
# chmod 600 /usr/local/heartbeat/etc/ha.d/authkeys
# mkdir -pv /usr/local/heartbeat/usr/lib/ocf/lib/heartbeat/
# cp /usr/lib/ocf/lib/heartbeat/ocf-* /usr/local/heartbeat/usr/lib/ocf/lib/heartbeat/
# ln -svf /usr/local/heartbeat/lib64/heartbeat/plugins/RAExec/* /usr/local/heartbeat/lib/heartbeat/plugins/RAExec/
# ln -svf /usr/local/heartbeat/lib64/heartbeat/plugins/* /usr/local/heartbeat/lib/heartbeat/plugins/





Build error 1:

checking heartbeat/glue_config.h usability... no
checking heartbeat/glue_config.h presence... no
checking for heartbeat/glue_config.h... no
configure: error: in `/root/Heartbeat-3-0-958e11be8686':
configure: error: Core development headers were not found
See `config.log' for more details




Fix: point the compiler at the Cluster Glue headers and libraries before running configure:

# export CFLAGS="$CFLAGS -I/usr/local/heartbeat/include -L/usr/local/heartbeat/lib"





Build error 2:

In file included from ../include/lha_internal.h:41:0,
               from uuid_parse.c:25:
/usr/local/heartbeat/include/heartbeat/glue_config.h:105:0: error: "HA_HBCONF_DIR" redefined [-Werror]
#define HA_HBCONF_DIR "/usr/local/heartbeat/etc/ha.d/"
^
In file included from ../include/lha_internal.h:38:0,
               from uuid_parse.c:25:
../include/config.h:390:0: note: this is the location of the previous definition
#define HA_HBCONF_DIR "/usr/local/heartbeat/etc/ha.d"
^
uuid_parse.c:36:26: fatal error: replace_uuid.h: No such file or directory
#include <replace_uuid.h>
                        ^
cc1: all warnings being treated as errors
compilation terminated.
gmake: *** Error 1
gmake: Leaving directory '/root/Heartbeat-3-0-958e11be8686/replace'
make: *** Error 1




Fix: run configure with --enable-fatal-warnings=no so that warnings are no longer treated as errors:

# ./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient --enable-fatal-warnings=no LIBS='/lib64/libuuid.so.1'





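V. Configuring Heartbeat
Heartbeat configuration mainly involves three files: ha.cf, haresources, and authkeys. ha.cf is the main configuration file, haresources lists the services that Heartbeat manages, and authkeys specifies the authentication method Heartbeat uses.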


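1. Configuring ha.cf: the main configuration file

# grep '^[^#]' /usr/local/heartbeat/etc/ha.d/ha.cf
debugfile /var/log/ha-debug         ## file for Heartbeat debug messages
logfile /var/log/ha-log             ## file for Heartbeat log messages
logfacility local0                  ## syslog facility used for Heartbeat logging
keepalive 2                         ## heartbeat (monitoring) interval: 2 seconds
deadtime 30         ## if the standby node receives no heartbeat from the primary within 30 seconds, it takes over the primary's resources
warntime 10         ## if the standby node receives no heartbeat from the primary within 10 seconds,
                    ## a warning is written to the log, but no failover takes place
initdead 120        ## grace period after boot or reboot; should be at least twice deadtime
udpport 694                         ## UDP port used for broadcast/unicast communication
bcast eno16777736 # Linux           ## send heartbeats by broadcast on interface eno16777736
#mcast eth0 225.0.0.1 694 1 0      ## carry heartbeats by UDP multicast on eth0; usually used when there is
                                   ## more than one standby node. bcast, ucast, and mcast (broadcast, unicast,
                                   ## multicast) are the ways to carry heartbeats; choose one
#ucast eno16777736 192.168.10.133      ## carry heartbeats by UDP unicast on eno16777736; the IP address is the peer node's address
auto_failback on               ## whether services move back automatically once the primary node recovers
#watchdog /dev/watchdog         ## optional; lets Heartbeat monitor the system's running state
node node1                     ## primary node name; must match the output of uname -n
node node2                     ## standby node name
ping 192.168.10.1              ## ping the gateway to check connectivity; used only to test the network
respawn hacluster /usr/local/heartbeat/libexec/heartbeat/ipfail   ## optional; process started and stopped together with heartbeat
#apiauth ipfail gid=haclient uid=hacluster   ## user and group allowed to run ipfail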




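Notes:
① watchdog /dev/watchdog: optional; lets Heartbeat monitor the system's running state. This feature needs the "softdog" kernel module, which provides the actual device file. If the kernel does not include the module, enable it and rebuild the kernel. Afterwards, load the module with "insmod softdog", run "grep misc /proc/devices" and "cat /proc/misc | grep watchdog", and finally create the device file with "mknod /dev/watchdog c 10 130".

② respawn hacluster /usr/local/heartbeat/libexec/heartbeat/ipfail: optional; names a process that is started and stopped together with heartbeat. Such processes are usually plugins integrated with Heartbeat, and they are restarted automatically if they fail. The ipfail process detects and handles network failures; it relies on the ping directive, which names the ping node used to test connectivity. hacluster is the user that runs ipfail.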
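The timing directives in ha.cf are related: the comments above say initdead should be at least twice deadtime, and a warning only makes sense if warntime is below deadtime. A minimal sketch of a sanity check follows; check_hacf_timing is a hypothetical helper written for this post, not part of Heartbeat, and it assumes the values are whole seconds and that all three directives are present.

```shell
# Check warntime < deadtime and initdead >= 2 * deadtime in an ha.cf file.
# Commented-out directives are ignored because their first field starts with "#".
check_hacf_timing() {
    awk '
        $1 == "deadtime" { dead = $2 }
        $1 == "warntime" { warn = $2 }
        $1 == "initdead" { init = $2 }
        END {
            bad = 0
            if (warn + 0 >= dead + 0) { print "warntime should be below deadtime"; bad = 1 }
            if (init + 0 < 2 * dead)  { print "initdead should be at least 2 x deadtime"; bad = 1 }
            exit bad
        }
    ' "$1"
}

# Try it on a throwaway file holding the values used in this post.
sample=$(mktemp)
printf '%s\n' 'keepalive 2' 'deadtime 30' 'warntime 10' 'initdead 120' > "$sample"
check_hacf_timing "$sample" && echo "timing values look consistent"
rm -f "$sample"
```

On a real node the function would be pointed at /usr/local/heartbeat/etc/ha.d/ha.cf instead of the temporary file.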


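2. Configuring haresources: the resource file
The haresources file specifies the primary node of the two-node system, the cluster IP address, netmask, broadcast address, and the cluster resources (services) to start. Each line can contain one or more resource script names. Resources are separated by spaces, and arguments are separated by two colons. The haresources file must be exactly the same on the primary and standby nodes.

General format:
node-name network <resource-group>
node-name is the host name of the primary node and must match a node name given in ha.cf. network sets the cluster IP address, netmask, network device, and so on. resource-group lists the services to be managed by Heartbeat (that is, services that Heartbeat starts and stops).

Note: the IP address given here is the address on which the cluster provides its service.
   To have Heartbeat manage a service, the service must be wrapped in a script that supports start/stop and placed in /etc/init.d/ or /etc/ha.d/resource.d/. Heartbeat looks up the script by name in /etc/init.d or /etc/ha.d/resource.d and uses it to start or stop the service.

# grep -v "#" /usr/local/heartbeat/etc/ha.d/haresources
node1 IPaddr::192.168.10.222/24/eno16777736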





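node1 is the primary node of the HA cluster, and IPaddr is a script shipped with Heartbeat. Heartbeat first runs /usr/local/heartbeat/etc/ha.d/resource.d/IPaddr 192.168.10.222/24/eno16777736 start, which brings up the virtual IP address 192.168.10.222 with netmask 255.255.255.0. This is the address on which Heartbeat provides the service to the outside; the entry also names the network interface that carries it.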
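To make the structure of the entry concrete, the fields can be pulled apart with plain shell parameter expansion. This is an illustrative sketch only: parse_haresources_line is a made-up helper, and it handles just the single-resource IPaddr form used in this post, not the full haresources syntax.

```shell
# Split one haresources entry of the form
#   node IPaddr::ADDRESS/PREFIX/INTERFACE
# and print: node address prefix interface
parse_haresources_line() {
    set -- $1                 # word-split the entry: $1 = node, $2 = resource
    node=$1
    spec=${2#IPaddr::}        # drop the script name and the "::" separator
    addr=${spec%%/*}          # text before the first "/"
    rest=${spec#*/}           # text after the first "/"
    prefix=${rest%%/*}
    iface=${rest#*/}
    echo "$node $addr $prefix $iface"
}

parse_haresources_line "node1 IPaddr::192.168.10.222/24/eno16777736"
```

The "::" separates the script name from its argument, and the "/" characters separate the address, the prefix length, and the interface inside that argument.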

Note: a detailed explanation of haresources (in Chinese) is available at:

http://blog.iyunv.com/uid-20788470-id-1841644.html

3. Configuring authkeys: the heartbeat authentication key file

# grep -v "#" /usr/local/heartbeat/etc/ha.d/authkeys
auth 2
2 sha1 HI!




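Note: the number after auth is an index and can be any value, but the second line must begin with the same index, followed by the authentication method (crc, md5, and sha1 are supported) and finally a key of your own choosing.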
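Since the key is free-form, one option is to generate a random one rather than type a short word. The sketch below is a suggestion, not what the original setup used: make_authkeys is a hypothetical helper, the index 1 is arbitrary, and it relies on sha1sum from coreutils being installed.

```shell
# Write an authkeys file with a random 40-hex-digit key and mode 600.
# $1 is the target path.
make_authkeys() {
    key=$(head -c 512 /dev/urandom | sha1sum | cut -d' ' -f1)
    # Create the file with a restrictive umask so it is never world-readable.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}

# Demonstrate on a temporary file.
out=$(mktemp)
make_authkeys "$out"
cat "$out"
rm -f "$out"
```

The same file has to be present on both nodes, so it would be generated once and then copied to the standby along with the other configuration files.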

VI. Setting Up Mutual SSH Trust Between the Two Nodes (Optional) and Copying Files to the Standby Node

HA-01(192.168.10.132):

ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.10.133




HA-02(192.168.10.133):

ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.10.132





Copy the configuration files to the standby node:

# scp /usr/local/heartbeat/etc/ha.d/* root@192.168.10.133:/usr/local/heartbeat/etc/ha.d/






VII. Testing

# systemctl start httpd
# /etc/init.d/heartbeat start                   ## start heartbeat
# setenforce 0
# systemctl stop firewalld





Check the log:

# tail /var/log/ha-log
Oct 26 10:07:18 node1 heartbeat: : ERROR: Illegal directive in /usr/local/heartbeat/etc/ha.d//ha.cf
Oct 26 10:07:18 node1 heartbeat: : ERROR: Illegal directive in /usr/local/heartbeat/etc/ha.d//ha.cf
Oct 26 10:07:18 node1 heartbeat: : ERROR: Client child command is not executable
Oct 26 10:07:18 node1 heartbeat: : ERROR: Heartbeat not started: configuration error.
Oct 26 10:07:18 node1 heartbeat: : ERROR: Configuration error, heartbeat not started.




Fix:
Correct the ipfail path in ha.cf:

respawn hacluster /usr/local/heartbeat/libexec/heartbeat/ipfail




Create the plugin symlinks:

# ln -svf /usr/local/heartbeat/lib64/heartbeat/plugins/RAExec/* /usr/local/heartbeat/lib/heartbeat/plugins/RAExec/
# ln -svf /usr/local/heartbeat/lib64/heartbeat/plugins/* /usr/local/heartbeat/lib/heartbeat/plugins/




Check the log again:

# tail /var/log/ha-log
Oct 26 13:11:46 node1 heartbeat: : info: remote resource transition completed.
Oct 26 13:11:46 node1 heartbeat: : info: node1 wants to go standby
Oct 26 13:11:46 node1 heartbeat: : info: standby: node2 can take our foreign resources
Oct 26 13:11:46 node1 heartbeat: : info: give up foreign HA resources (standby).
Oct 26 13:11:46 node1 heartbeat: : info: foreign HA resource release completed (standby).
Oct 26 13:11:46 node1 heartbeat: : info: Local standby process completed .
Oct 26 13:11:47 node1 heartbeat: : WARN: 1 lost packet(s) for
Oct 26 13:11:47 node1 heartbeat: : info: remote resource transition completed.
Oct 26 13:11:47 node1 heartbeat: : info: No pkts missing from node2!
Oct 26 13:11:47 node1 heartbeat: : info: Other node completed standby takeover of foreign resources.





Fix:

# vi /usr/local/heartbeat/etc/ha.d/haresources
node1 IPaddr::192.168.10.222/24/eno16777736




Note: the haresources entry must include the IPaddr:: prefix.

Problem:

# tail /var/log/ha-log
Oct 26 17:01:55 node1 heartbeat: : WARN: Message hist queue is filling up (425 messages in queue)
Oct 26 17:01:56 node1 heartbeat: : WARN: Message hist queue is filling up (426 messages in queue)
Oct 26 17:01:57 node1 heartbeat: : WARN: Message hist queue is filling up (427 messages in queue)
Oct 26 17:01:57 node1 heartbeat: : WARN: Message hist queue is filling up (428 messages in queue)
Oct 26 17:01:58 node1 heartbeat: : WARN: Message hist queue is filling up (429 messages in queue)
Oct 26 17:01:59 node1 heartbeat: : WARN: Message hist queue is filling up (430 messages in queue)
Oct 26 17:01:59 node1 heartbeat: : WARN: Message hist queue is filling up (431 messages in queue)
Oct 26 17:02:00 node1 heartbeat: : WARN: Message hist queue is filling up (432 messages in queue)
Oct 26 17:02:01 node1 heartbeat: : WARN: Message hist queue is filling up (433 messages in queue)
Oct 26 17:02:01 node1 heartbeat: : WARN: Message hist queue is filling up (434 messages in queue)




Fix: the firewall on node2 had not been stopped. Running systemctl stop firewalld on node2 resolved the problem.
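Stopping firewalld is what was done here. An alternative that this post did not test is to leave the firewall running and open only the heartbeat port (udpport 694 in ha.cf). The sketch below wraps the two firewall-cmd calls in a hypothetical helper, open_heartbeat_port, with a DRY_RUN switch so the commands can be reviewed before they are run as root.

```shell
# Open the Heartbeat UDP port in firewalld instead of stopping the service.
# With DRY_RUN=1 the commands are only printed, not executed.
open_heartbeat_port() {
    port=${1:-694}
    for cmd in "firewall-cmd --permanent --add-port=${port}/udp" "firewall-cmd --reload"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

# Print the commands without touching the firewall.
DRY_RUN=1 open_heartbeat_port 694
```

The rule would be needed on both nodes, since each one has to receive the other's heartbeats.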

Problem:

# tail /var/log/ha-log
IPaddr(IPaddr_192.168.10.222):2015/10/26_17:20:58 ERROR: Setup problem: couldn't find command: ifconfig
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.222):2015/10/26_17:20:58 ERROR:Program is not installed




Fix: run yum install net-tools, which provides the ifconfig command.

Restart heartbeat and check the log again:

# systemctl restart heartbeat
# tail /var/log/ha-log
Oct 26 19:25:36 node1 heartbeat: : info: Configuration validated. Starting heartbeat 3.0.6
Oct 26 19:25:37 node1 heartbeat: : info: heartbeat: version 3.0.6
Oct 26 19:25:37 node1 heartbeat: : info: Heartbeat generation: 1445827146
Oct 26 19:25:37 node1 heartbeat: : info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eno16777736
Oct 26 19:25:37 node1 heartbeat: : info: glib: UDP Broadcast heartbeat closed on port 694 interface eno16777736 - Status: 1
Oct 26 19:25:37 node1 heartbeat: : info: glib: ping heartbeat started.
Oct 26 19:25:37 node1 heartbeat: : info: Local status now set to: 'up'
Oct 26 19:25:37 node1 heartbeat: : info: Link 192.168.10.1:192.168.10.1 up.
Oct 26 19:25:37 node1 heartbeat: : info: Status update for node 192.168.10.1: status ping
Oct 26 19:25:37 node1 heartbeat: : info: Link node1:eno16777736 up.




Check the addresses with the ifconfig command.


Open http://localhost in a browser to check.


Stop heartbeat on node1 and check whether the resources move to node2.
node1:

# systemctl stop heartbeat




node2:

# tail /var/log/ha-log
mach_down(default):2015/10/26_20:03:58 info: Taking over resource group IPaddr::192.168.10.222/24/eno16777736
ResourceManager(default):2015/10/26_20:03:58 info: Acquiring resource group: node1 IPaddr::192.168.10.222/24/eno16777736
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.222):2015/10/26_20:03:58 INFO:Resource is stopped
ResourceManager(default):2015/10/26_20:03:58 info: Running /usr/local/heartbeat/etc/ha.d//resource.d/IPaddr 192.168.10.222/24/eno16777736 start
IPaddr(IPaddr_192.168.10.222):2015/10/26_20:03:58 INFO: Using calculated netmask for 192.168.10.222: 255.255.255.0
IPaddr(IPaddr_192.168.10.222):2015/10/26_20:03:58 INFO: eval ifconfig eno16777736:0 192.168.10.222 netmask 255.255.255.0 broadcast 192.168.10.255
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.222):2015/10/26_20:03:58 INFO:Success
mach_down(default):2015/10/26_20:03:58 info: /usr/local/heartbeat/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down(default):2015/10/26_20:03:58 info: mach_down takeover complete for node node1.
Oct 26 20:03:58 node2 heartbeat: : info: mach_down takeover complete.
mach_down(default):2015/10/26_20:03:58 info: mach_down takeover complete for node node1.
Oct 26 20:03:58 node2 heartbeat: : info: mach_down takeover complete.
Oct 26 20:04:29 node2 heartbeat: : WARN: node node1: is dead
Oct 26 20:04:29 node2 heartbeat: : info: Dead node node1 gave up resources.
Oct 26 20:04:29 node2 heartbeat: : info: Link node1:eno16777736 dead.
Oct 26 20:04:29 node2 ipfail: : info: Status update: Node node1 now has status dead
Oct 26 20:04:29 node2 ipfail: : info: NS: We are still alive!
Oct 26 20:04:29 node2 ipfail: : info: Link Status update: Link node1/eno16777736 now has status dead
Oct 26 20:04:30 node2 ipfail: : info: Asking other side for ping node count.
Oct 26 20:04:30 node2 ipfail: : info: Checking remote count of ping nodes.




Use ifconfig to check whether the IP has moved to node2:

The IP has moved to node2. Open http://localhost in a browser to check.

All OK!
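The "has the IP moved?" check can also be done without reading ifconfig output by eye. A small sketch, assuming the ip command from iproute2 is available; has_vip is a made-up helper, and the sample line is invented data built from the address and interface used in this post.

```shell
# Succeeds if the given IPv4 address appears in `ip -o -4 addr show` output
# read from standard input. -F makes grep treat the dots literally.
has_vip() {
    grep -qF "inet $1/"
}

# Invented sample line in the one-line format printed by `ip -o -4 addr show`.
sample='2: eno16777736    inet 192.168.10.222/24 brd 192.168.10.255 scope global secondary eno16777736:0'
if echo "$sample" | has_vip 192.168.10.222; then
    echo "VIP is on this node"
fi
```

On a live node the check would read real data: ip -o -4 addr show | has_vip 192.168.10.222. Running it on node1 and node2 before and after stopping heartbeat shows which side currently holds the address.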


Appendix: official Heartbeat site:
http://www.linux-ha.org/wiki/Main_Page

sunzhongyu posted on 2015-11-3 23:38:45

Saved, thanks.

sunzhongyu posted on 2015-11-3 23:39:06

Saved, thanks.