Corosync+Pacemaker+Ldirectord+Lvs+Httpd
I. Hardware Environment
Four virtual machines on the same network segment.
OS: CentOS 6.3
Script to disable unnecessary system services (whitelist the essentials, turn everything else off):
#!/bin/bash
services=`chkconfig --list|cut -f1|cut -d" " -f1`
for ser in $services
do
if [ "$ser" == "network" ] || [ "$ser" == "rsyslog" ] || [ "$ser" == "sshd" ] || [ "$ser" == "crond" ] || [ "$ser" == "atd" ];
then
chkconfig "$ser" on
else
chkconfig "$ser" off
fi
done
reboot
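After the reboot, you can confirm that only the whitelisted services remain enabled in the default runlevel; a quick check, assuming runlevel 3:
# chkconfig --list | grep "3:on"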
II. IP Address Plan
master 172.30.82.45
slave 172.30.82.58
node1 172.30.82.3
node2 172.30.82.11
VIP 172.30.82.61
III. Notes:
1. Keep time synchronized across all nodes (a combined sketch for items 1-3 follows this list):
ntpdate 172.30.82.254 &>/dev/null
2. Make the nodes reachable from each other by hostname; edit /etc/hosts accordingly.
3. The output of uname -n must match each node's hostname.
4. Make sure the ldirectord service does not start at boot (Pacemaker will manage it as a cluster resource):
chkconfig ldirectord off
5. Disable SELinux:
setenforce 0
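For items 1-3, a minimal sketch of what this can look like (the cron interval is an assumption; hostnames and IPs follow the plan in section II):
# echo "*/10 * * * * /usr/sbin/ntpdate 172.30.82.254 &>/dev/null" >>/var/spool/cron/root
# cat >>/etc/hosts <<EOF
172.30.82.45 master
172.30.82.58 slave
172.30.82.3 node1
172.30.82.11 node2
EOF
# hostname master # on master; use the matching name on each node
# sed -i 's/^HOSTNAME=.*/HOSTNAME=master/' /etc/sysconfig/network # persist across reboots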
IV. Software Downloads and Installation
Starting with pacemaker 1.1.8, crm was split out into an independent project called crmsh. In other words, installing pacemaker no longer provides the crm command; to manage cluster resources we also need to install crmsh separately.
Download pssh-2.3.1-4.1.x86_64.rpm, crmsh-2.1-1.6.x86_64.rpm, and python-pssh-2.3.1-4.1.x86_64.rpm from:
http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/
libdnet download:
http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/letter_l.group.html
ldirectord-3.9.6-0rc1.1.1.x86_64.rpm download:
http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/
yum install -y corosync pacemaker libesmtp
yum install -y python-dateutil python-lxml redhat-rpm-config cluster-glue cluster-glue-libs resource-agents
yum --nogpgcheck localinstall pssh-2.3.1-4.1.x86_64.rpm crmsh-2.1-1.6.x86_64.rpm python-pssh-2.3.1-4.1.x86_64.rpm ldirectord-3.9.6-0rc1.1.1.x86_64.rpm
V. Configure High Availability on the Director Nodes
1. Copy the sample configuration files:
cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
cp /usr/share/doc/ldirectord-3.9.6/ldirectord.cf /etc/ha.d/
2. Generate the authkey file:
corosync-keygen
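Note that corosync-keygen reads from /dev/random, so on an idle machine it may block until enough entropy is available (keyboard or disk activity helps). The key is written to /etc/corosync/authkey; verify it exists:
# ls -l /etc/corosync/authkey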
3. Edit corosync.conf:
totem {
version: 2
secauth: off # whether to enable authentication of cluster messages
threads: 0 # number of threads used for authenticated cluster messages
interface {
ringnumber: 0 # ring number for this interface, to avoid redundant ring loops
bindnetaddr: 172.30.82.0 # network address the cluster binds to
mcastaddr: 239.238.16.1 # multicast address for cluster communication
mcastport: 5405 # service port
ttl: 1
}
}
logging {
fileline: off # whether to include file and line numbers in log messages
to_stderr: no # whether to log to standard error (the console)
to_logfile: yes # log to the file defined below
logfile: /var/log/corosync.log
to_syslog: no # whether to also log to syslog
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
service { # start pacemaker when corosync starts
ver: 0
name: pacemaker
}
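Once corosync is started (section VIII), a quick sanity check that the node actually joined the multicast group configured above:
# netstat -gn | grep 239.238.16.1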
4. Edit the ldirectord configuration file ldirectord.cf:
checktimeout=3 # health-check timeout (seconds)
checkinterval=1 # health-check interval (seconds)
autoreload=yes # reload the configuration file automatically when it changes
logfile="/var/log/ldirectord.log" # log file path
#logfile="local0" # alternative: log to the local0 syslog facility instead of a file
quiescent=no # remove a failed realserver from the LVS table; re-add it automatically when it recovers
virtual=172.30.82.61:80 # VIP address and port to listen on
    real=172.30.82.3:80 gate # realserver IP and port; gate = direct routing (DR) mode
    real=172.30.82.11:80 gate
    fallback=127.0.0.1:80 gate # fall back to the loopback address if all realservers are down
    service=http # service type
    request=".text.html" # file in each realserver's web root; fetched to decide whether it is alive
    receive="OK" # expected content of the check file
    scheduler=rr # scheduling algorithm (round robin)
    protocol=tcp # protocol for the virtual service
    checktype=negotiate # check type: fetch the request file and compare against receive
    checkport=80 # port used for the health check
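The health check defined here is simply an HTTP GET of the request file compared against receive. Once the realservers are set up (section VII), the same check can be reproduced by hand from the director; both commands should print OK:
# curl http://172.30.82.3/.text.html
# curl http://172.30.82.11/.text.html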
5. Copy the configuration files to the standby node:
scp -p /etc/corosync/authkey /etc/corosync/corosync.conf slave:/etc/corosync/
scp -p /etc/ha.d/ldirectord.cf slave:/etc/ha.d/
VI. Realserver Configuration Script for the DR Model:
#!/bin/bash
VIP=172.30.82.61
host=`/bin/hostname`
case "$1" in
start)
# Start LVS-DR real server on this machine.
/sbin/ifconfig lo down
/sbin/ifconfig lo up
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
/sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 up
/sbin/route add -host $VIP dev lo:0
;;
stop)
# Stop LVS-DR real server loopback device(s).
/sbin/ifconfig lo:0 down
echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
;;
status)
# Status of LVS-DR real server.
islothere=`/sbin/ifconfig lo:0 | grep $VIP`
isrothere=`netstat -rn | grep "lo" | grep $VIP`
if [ ! "$islothere" -o ! "$isrothere" ];then
# Either the route or the lo:0 device
# not found.
echo "LVS-DR real server is stopped."
else
echo "LVS-DR real server is running."
fi
;;
*)
# Invalid entry.
echo "$0: Usage: $0 {start|status|stop}"
exit 1
;;
esac
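To make these realserver settings survive a reboot on node1 and node2, one simple option (the script path and name here are illustrative) is to run the script from rc.local:
# cp lvs-dr-rs.sh /usr/local/sbin/lvs-dr-rs.sh
# chmod +x /usr/local/sbin/lvs-dr-rs.sh
# echo "/usr/local/sbin/lvs-dr-rs.sh start" >>/etc/rc.local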
VII. Install httpd on the Realservers and Add Test Pages
1. node1
yum install -y httpd
echo "Welcome to realserver 1" >/var/www/html/index.html
echo "OK" >/var/www/html/.text.html
service httpd start
2. node2
yum install -y httpd
echo "Welcome to realserver 2" >/var/www/html/index.html
echo "OK" >/var/www/html/.text.html
service httpd start
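Before bringing up the cluster, it is worth confirming that each realserver answers on its own RIP; from either director:
# curl http://172.30.82.3/
Welcome to realserver 1
# curl http://172.30.82.11/
Welcome to realserver 2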
VIII. Start, Configure, and Test the HA Cluster Service
1. On master, run:
service corosync start
ssh slave 'service corosync start'
Note: start corosync on slave from master using the command above; do not start it directly on slave.
Check whether the corosync engine started properly:
# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/corosync.log
May 19 23:11:05 corosync Corosync Cluster Engine exiting with status 0 at main.c:2055.
May 19 23:11:46 corosync Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
May 19 23:11:46 corosync Successfully read main configuration file '/etc/corosync/corosync.conf'
Check that the initial membership notifications went out properly:
# grep TOTEM /var/log/corosync.log
May 19 19:59:44 corosync Initializing transport (UDP/IP Multicast).
May 19 19:59:44 corosync Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 19 19:59:44 corosync The network interface is now up.
May 19 19:59:44 corosync A processor joined or left the membership and a new membership was formed.
Check for errors during startup:
# grep ERROR: /var/log/corosync.log
Check that pacemaker started properly:
# grep pcmk_startup /var/log/corosync.log
May 19 23:11:46 corosync info: pcmk_startup: CRM: Initialized
May 19 23:11:46 corosync Logging: Initialized pcmk_startup
May 19 23:11:46 corosync info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 19 23:11:46 corosync info: pcmk_startup: Service: 9
May 19 23:11:46 corosync info: pcmk_startup: Local hostname: master
Use the following command to check the startup state of the cluster nodes:
# crm status
Last updated: Wed May 20 00:10:38 2015
Last change: Tue May 19 22:49:50 2015
Stack: classic openais (with plugin)
Current DC: slave - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
Online: [ master slave ]
2. Configure the cluster resources; we need two primitive resources and one group resource.
a. Configure the VIP:
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=172.30.82.61 nic=eth0 cidr_netmask=24
b. Configure the ldirectord service resource:
crm(live)configure# primitive ldir lsb:ldirectord
c. Configure the group resource. A group ties its member primitives to the same server; by default the cluster balances resources across all nodes.
crm(live)configure# group lvsserver vip ldir
d. Instead of defining a group, resource stickiness and resource constraints can also express placement preferences; the following are only examples:
Order constraint: controls the start order of resources.
crm(live)configure# order vip_before_ldir mandatory: vip ldir
Colocation constraint: which resources run together.
crm(live)configure# colocation ldir_with_vip inf: ldir vip
Location constraint: which node a resource prefers to run on.
crm(live)configure# location vip_on_master vip rule 100: #uname eq master
e. A few other settings
Disable STONITH (there are no fencing devices in this test environment):
crm(live)configure# property stonith-enabled=false
Set the cluster to ignore loss of quorum; with only two nodes this is the only workable choice:
crm(live)configure# property no-quorum-policy=ignore
Corosync's architecture, inner workings, and full configuration reference are left for self-study; the focus here is on building and testing the environment.
View the cluster configuration database:
crm(live)configure# show
node master
node slave
primitive ldir lsb:ldirectord
primitive vip IPaddr \
params ip=172.30.82.61 nic=eth0 cidr_netmask=24
group lvsserver vip ldir
property cib-bootstrap-options: \
dc-version=1.1.11-97629de \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore
Verify the configuration syntax:
crm(live)configure# verify
If no errors are reported, commit to persist the configuration:
crm(live)configure# commit
3. Test the cluster service: have a client access 172.30.82.61 (a quick round-robin check is sketched below).
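A minimal way to watch the rr scheduler from a client on the subnet; the two test pages from section VII should alternate (the starting page may differ):
# for i in 1 2 3 4; do curl -s http://172.30.82.61/; done
Welcome to realserver 1
Welcome to realserver 2
Welcome to realserver 1
Welcome to realserver 2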
a. On master, run:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
-> 172.30.82.3:80 Route 1 0 13
-> 172.30.82.11:80 Route 1 0 14
b. On slave, run:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
This shows the cluster resources are running only on master.
4. Resource failover test
a. On master, run:
service corosync stop
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
b. On slave, run:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
-> 172.30.82.3:80 Route 1 1 17
-> 172.30.82.11:80 Route 1 0 18
This shows the cluster resources failed over to slave successfully.
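To complete the round trip, start corosync on master again and watch where the group runs; since no resource stickiness is configured here (default 0), the resources may move back to master:
# service corosync start
# crm status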
c. Backend service failure detection: on node1, run:
service httpd stop
Check the cluster service on master:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
-> 172.30.82.11:80 Route 1 0 0
Restore the service on node1:
service httpd start
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.30.82.61:80 rr
-> 172.30.82.11:80 Route 1 0
-> 172.30.82.3:80 Route 1