Posted by 在水一万 on 2015-9-8 11:57:48

Distributed deployment of a Nagios monitoring system with CentOS 6.6 + Puppet 3.7.4

  Test environment




CentOS-6.6-x86_64(minimal)

puppet-3.7.4

nagios-4.0.8.tar.gz

nagios-plugins-2.0.3.tar.gz

nrpe-2.15.tar.gz

192.168.188.10 mirrors.redking.com

192.168.188.20 master.redking.com

192.168.188.20 nagios.redking.com

192.168.188.31 agent1.redking.com

192.168.188.32 agent2.redking.com

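192.168.188.33 agent3.redking.com

  Puppet requires every machine to have a fully qualified domain name (FQDN). If no DNS server provides the names, set the hostname on each machine yourself. Set the hostname before installing Puppet: the hostname is written into the SSL certificate that Puppet generates, and the client and server need that certificate to communicate. Because I have DNS configured, I did not need to edit the hosts file; without DNS, add the entries above to the hosts file.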
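
  Without DNS, the name-to-address mapping listed above has to exist in the hosts file of every machine. A minimal sketch, using the addresses from the test environment. `HOSTS_FILE` and `add_host` are names introduced here for illustration only; the variable defaults to a scratch file so the snippet can be tried safely, and would be set to /etc/hosts (as root) on a real machine:

```shell
#!/bin/sh
# Sketch: append the test environment's name mappings to a hosts file.
# HOSTS_FILE is a hypothetical variable; set it to /etc/hosts on a real machine.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host() {
    # $1 = IP address, $2 = FQDN; skip the entry if the name is already listed
    if ! grep -qwF -- "$2" "$HOSTS_FILE"; then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

add_host 192.168.188.10 mirrors.redking.com
add_host 192.168.188.20 master.redking.com
add_host 192.168.188.20 nagios.redking.com
add_host 192.168.188.31 agent1.redking.com
add_host 192.168.188.32 agent2.redking.com
add_host 192.168.188.33 agent3.redking.com
```

  Because each name is checked before it is appended, running the snippet a second time does not duplicate entries.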
  1. Disable SELinux and iptables, and set up NTP
  The systems were installed from CentOS-6.6-x86_64.iso using the minimal install option.
  Disable SELinux




# cat /etc/selinux/config

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

# enforcing - SELinux security policy is enforced.

# permissive - SELinux prints warnings instead of enforcing.

# disabled - No SELinux policy is loaded.

SELINUX=enforcing

# SELINUXTYPE= can take one of these two values:

# targeted - Targeted processes are protected,

# mls - Multi Level Security protection.

SELINUXTYPE=targeted

# sed -i '/SELINUX/ s/enforcing/disabled/g' /etc/selinux/config

# cat /etc/selinux/config

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

# enforcing - SELinux security policy is enforced.

# permissive - SELinux prints warnings instead of enforcing.

# disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

# targeted - Targeted processes are protected,

# mls - Multi Level Security protection.

SELINUXTYPE=targeted

# setenforce 0  
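
  The sed expression above works on the stock file because only the `SELINUX=enforcing` line contains both `SELINUX` and `enforcing`. A slightly stricter variant, shown here as a sketch, anchors on the start of the line so that comment lines and `SELINUXTYPE=` can never be touched. `SELINUX_CONFIG` is a name introduced for illustration; it defaults to a scratch file and would be /etc/selinux/config on a real host:

```shell
#!/bin/sh
# Sketch: switch SELinux to disabled by rewriting only the SELINUX= setting line.
# SELINUX_CONFIG is a hypothetical variable; use /etc/selinux/config on a real host.
SELINUX_CONFIG="${SELINUX_CONFIG:-$(mktemp)}"

# Scratch runs only: create a small sample file when the target is empty.
if [ ! -s "$SELINUX_CONFIG" ]; then
    cat > "$SELINUX_CONFIG" <<'EOF'
# enforcing - SELinux security policy is enforced.
SELINUX=enforcing
SELINUXTYPE=targeted
EOF
fi

# ^SELINUX= does not match SELINUXTYPE= or any comment line.
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$SELINUX_CONFIG"

# Show the resulting setting line.
grep '^SELINUX=' "$SELINUX_CONFIG"
```

  The file change takes effect at the next boot; `setenforce 0`, as used above, switches the running system to permissive mode in the meantime.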
  
  Stop iptables




# chkconfig --list |grep tables

ip6tables 0:off 1:off 2:on 3:on 4:on 5:on 6:off

iptables 0:off 1:off 2:on 3:on 4:on 5:on 6:off

# chkconfig ip6tables off

# chkconfig iptables off

# service ip6tables stop

ip6tables: Setting chains to policy ACCEPT: filter [ OK ]

ip6tables: Flushing firewall rules: [ OK ]

ip6tables: Unloading modules: [ OK ]

# service iptables stop

iptables: Setting chains to policy ACCEPT: filter [ OK ]

iptables: Flushing firewall rules: [ OK ]

iptables: Unloading modules: [ OK ]

#  
  Set up NTP




# ntpdate pool.ntp.org

# chkconfig --list|grep ntp

ntpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

ntpdate 0:off 1:off 2:off 3:off 4:off 5:off 6:off

# chkconfig ntpd on

# service ntpd start

Starting ntpd: [ OK ]

#  
  
  
  2. Install the Puppet service
  Puppet is not in the CentOS base repositories, so add the official repository provided by PuppetLabs:




# wget http://yum.puppetlabs.com/el/6/products/x86_64/puppetlabs-release-6-7.noarch.rpm

# rpm -ivh puppetlabs-release-6-7.noarch.rpm

# yum update -y  
  Install and enable the Puppet services on the master:




# yum install -y puppet-server

# chkconfig puppet on

# chkconfig puppetmaster on

# service puppet start

Starting puppet agent:                                    

# service puppetmaster start

Starting puppetmaster:                                    

#  
  Install the Puppet client on the clients




# yum install -y puppet

# chkconfig puppet on

# service puppet start  
  3. Configure Puppet
  On each Puppet client, edit /etc/puppet/puppet.conf to specify the master server.
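
  The original post does not show the edited file. The following is a sketch under two assumptions: that the master is master.redking.com from the test environment, and that the `server` setting is placed in the `[agent]` section. `PUPPET_CONF` is a name introduced for illustration; it defaults to a scratch file and would be /etc/puppet/puppet.conf on a real client:

```shell
#!/bin/sh
# Sketch: point a Puppet agent at the master by adding "server" under [agent].
# PUPPET_CONF is a hypothetical variable; use /etc/puppet/puppet.conf on a real client.
PUPPET_CONF="${PUPPET_CONF:-$(mktemp)}"

# Scratch runs only: start from a minimal sample file.
if [ ! -s "$PUPPET_CONF" ]; then
    printf '[main]\nlogdir = /var/log/puppet\n[agent]\nreport = true\n' > "$PUPPET_CONF"
fi

# Add the setting once, directly after the [agent] section header (GNU sed).
if ! grep -q '^[[:space:]]*server[[:space:]]*=' "$PUPPET_CONF"; then
    sed -i '/^\[agent\]/a server = master.redking.com' "$PUPPET_CONF"
fi

cat "$PUPPET_CONF"
```

  The guard around the sed call keeps the edit from being applied twice if the snippet is re-run.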

  Then restart the puppet service:




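# service puppet restart
  4. Clients request certificates
  Configure automatic certificate signing on the server
  To have the master sign all certificates automatically, we only need to create an autosign.conf file in the /etc/puppet directory. (There is no need to modify /etc/puppet/puppet.conf, because I have not changed the default location of autosign.conf.)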




# cat > /etc/puppet/autosign.conf <<EOF

> *.redking.com

> EOF

# service puppetmaster restart

Stopping puppetmaster:                                    

Starting puppetmaster:                                    

#  
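  With this in place, requests from every machine in redking.com are signed automatically. A client has to send a request to the server before the server can manage it; this is really a certificate-signing process. The first time the Puppet client runs, it generates an SSL certificate request and sends it to the Puppet server; if the server agrees to manage the client, it signs the certificate. The following command sends the request. Because the server address is already set on the client, there is no need to pass it on the command line: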




# puppet agent --test
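  This submits the certificate request. Because I configured automatic signing, the certificate is signed immediately. To check, run the following on the server: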




# puppet cert list --all
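
  As a quick sanity check of the autosign pattern configured above, the sketch below tests which certificate names a `*.redking.com` entry would cover. It uses shell glob matching as an approximation, which is an assumption of this example and not Puppet's own matching code; `would_autosign` and the example.org name are hypothetical:

```shell
#!/bin/sh
# Sketch: approximate the autosign.conf glob "*.redking.com" with a shell pattern.
# This is shell matching, not Puppet's own matching code.
would_autosign() {
    # $1 = certificate name (normally the client's FQDN)
    case "$1" in
        *.redking.com) echo "$1: autosigned" ;;
        *)             echo "$1: needs manual signing" ;;
    esac
}

would_autosign agent1.redking.com
would_autosign master.redking.com
would_autosign agent1.example.org
```

  A name that falls outside the pattern stays in the pending list on the master until it is signed by hand with `puppet cert sign <certname>`.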
  Nagios server installation
  1. Install the packages Nagios depends on




# yum install -y httpd php gcc glibc glibc-common gd gd-devel openssl-devel
  2. Create the Nagios user and group




# useradd -m nagios

# passwd nagios  
  Create the nagcmd group, which is used to run commands submitted through the web interface, and add the nagios and apache users to it




# groupadd nagcmd

# usermod -a -G nagcmd nagios

# usermod -a -G nagcmd apache
  3. Download the Nagios and Plugins packages
  Download Nagios Core and Nagios Plugins from http://www.nagios.org/download/

  4. Compile and install Nagios




# tar zxf nagios-4.0.8.tar.gz

# cd nagios-4.0.8
  Run the Nagios configure script, passing in the nagcmd group created earlier:




# ./configure --with-command-group=nagcmd
  Compile the Nagios source:




# make all
  Install the binaries, the init script and the sample configuration files, and set the permissions on the external command directory:




# make install

# make install-init

# make install-config

# make install-commandmode
  5. Edit the configuration files
  The sample configuration files are in /usr/local/nagios/etc; you can change the contact email address there.
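
  For an unattended install, the address can also be changed without opening an editor. The sketch below assumes the sample contact definition still carries the default `nagios@localhost` address. `CONTACTS_CFG` and `NEW_EMAIL` are names introduced for illustration, and admin@redking.com is a made-up address; the file defaults to a scratch copy and would be /usr/local/nagios/etc/objects/contacts.cfg on the Nagios server:

```shell
#!/bin/sh
# Sketch: replace the default contact address in contacts.cfg with sed.
# CONTACTS_CFG and NEW_EMAIL are hypothetical names; the address is an example.
CONTACTS_CFG="${CONTACTS_CFG:-$(mktemp)}"
NEW_EMAIL="${NEW_EMAIL:-admin@redking.com}"

# Scratch runs only: a cut-down stand-in for the sample contact definition.
if [ ! -s "$CONTACTS_CFG" ]; then
    cat > "$CONTACTS_CFG" <<'EOF'
define contact{
        contact_name    nagiosadmin
        use             generic-contact
        alias           Nagios Admin
        email           nagios@localhost
        }
EOF
fi

# Swap the placeholder address for the real one.
sed -i "s/nagios@localhost/$NEW_EMAIL/" "$CONTACTS_CFG"

# Show the resulting email line.
grep 'email' "$CONTACTS_CFG"
```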




# vim /usr/local/nagios/etc/objects/contacts.cfg
  6. Configure the web interface
  Install the Nagios web configuration file into Apache's conf.d directory




# make install-webconf
  Create the nagiosadmin account for logging in to the Nagios web interface




# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

# service httpd start

Starting httpd:                                          

# chkconfig httpd on
  Starting httpd applies the configuration, and chkconfig enables the service at boot.

  7. Compile and install Nagios Plugins




# tar zxvf nagios-plugins-2.0.3.tar.gz

# cd nagios-plugins-2.0.3

# ./configure --with-nagios-user=nagios --with-nagios-group=nagios

# make && make install
  8. Compile and install NRPE




# tar zxvf nrpe-2.15.tar.gz

# cd nrpe-2.15

# ./configure

# make all

# make install-plugin

# make install-daemon

# make install-daemon-config
  9. Start Nagios
  On the local host, the HTTP and SSH services show a warning in the Notifications column. To fix this:




# vim /usr/local/nagios/etc/objects/localhost.cfg

# Define a service to check SSH on the local machine.

# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{

use                           local-service         ; Name of service template to use

host_name                     localhost

service_description             SSH

check_command                   check_ssh

notifications_enabled         1        ; change this value to 1

}

# Define a service to check HTTP on the local machine.

# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{

use                           local-service         ; Name of service template to use

host_name                     localhost

service_description             HTTP

check_command                   check_http

notifications_enabled         1        ; change this value to 1

}

# touch /var/www/html/index.html
  Test the configuration files before starting Nagios




# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
  Start Nagios and NRPE and enable them at boot




# chkconfig nagios --add

# chkconfig --list |grep nagios

nagios          0:off   1:off   2:off   3:on    4:on    5:on    6:off

# chkconfig nagios on

# service nagios start

Starting nagios: done.

# echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.d/rc.local

# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

# netstat -tunpl |grep nrpe

tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 70100/nrpe

tcp 0 0 :::5666 :::* LISTEN 70100/nrpe

#
  Run /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 to check that the connection works.

  Log in to Nagios with the nagiosadmin account and password created earlier. Address: http://192.168.188.20/nagios/


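  Create Nagios client monitoring
  1. Install the required modules on the Puppet master
  Nagios does not currently provide an official package repository. For bulk deployment you can use the third-party EPEL repository together with the puppet-nrpe module from Example42 to roll NRPE out to Linux servers. The client deployment uses three modules: epel, nrpe and puppi.
  The epel module sets up the repository from which the NRPE package is installed; the nrpe module deploys NRPE, which collects the host information; puppi is an Example42 component that must be loaded whenever Example42 modules are used.
  Puppi is a Puppet module and CLI command that standardizes and automates rapid application deployment, and provides quick, standard query commands for checking system resources.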




# git clone https://github.com/puppetlabs/puppetlabs-stdlib /etc/puppet/modules/stdlib

# git clone https://github.com/example42/puppi /etc/puppet/modules/puppi

# git clone https://github.com/example42/puppet-nrpe /etc/puppet/modules/nrpe

# puppet module install stahnma/epel

# vim /etc/puppet/puppet.conf



modulepath = /etc/puppet/modules/
  2. Create the configuration file for the agent node group




# mkdir /etc/puppet/manifests/nodes

# vim /etc/puppet/manifests/nodes/agentgroup.pp

node /^agent\d+\.redking\.com$/ {

include stdlib

include epel

class { 'puppi': }

class { 'nrpe':

require => Class['epel'],

allowed_hosts => ['127.0.0.1',$::ipaddress,'192.168.188.20'],

template => 'nrpe/nrpe.cfg.erb',

}

}

# vim /etc/puppet/manifests/site.pp

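import "nodes/agentgroup.pp"
  3. Configure Nagios to monitor the agent*.redking.com hosts
  Edit /usr/local/nagios/etc/objects/commands.cfg
  command_name check_nrpe: defines the command name as check_nrpe; the service definitions must refer to it by this name.
  command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$: $USER1$ stands for /usr/local/nagios/libexec.
  The command_line defines the plugin program that is actually run. It must follow the usage of the check_nrpe command exactly; if you are unsure of the usage, run check_nrpe -h. The $ARG1$ parameter after -c is the check command handed to the NRPE daemon for execution, and it must be one of the five commands defined in nrpe.cfg.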
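
  The macro substitution described above can be sketched in a few lines of shell. This only mimics what Nagios does internally; `expand_check_nrpe` is a hypothetical helper written for this illustration, not part of Nagios or NRPE:

```shell
#!/bin/sh
# Sketch: mimic how Nagios fills in the macros of the check_nrpe command_line.
# expand_check_nrpe is a hypothetical helper, not part of Nagios.
USER1=/usr/local/nagios/libexec   # value of $USER1$ as described in the text

expand_check_nrpe() {
    # $1 = host address ($HOSTADDRESS$)
    # $2 = check_command value, e.g. "check_nrpe!check_users"
    hostaddress=$1
    # Fields after the command name are $ARG1$, $ARG2$, ... separated by "!"
    arg1=$(printf '%s' "$2" | cut -d'!' -f2)
    # The command_line only references $ARG1$, so later fields are not used.
    printf '%s/check_nrpe -H %s -c %s\n' "$USER1" "$hostaddress" "$arg1"
}

expand_check_nrpe 192.168.188.31 'check_nrpe!check_users'
expand_check_nrpe 192.168.188.31 'check_nrpe!check_load!15,10,5!30,25,20'
```

  Note that with this command definition anything after the first argument in a check_command (for example the thresholds in `check_nrpe!check_load!15,10,5!30,25,20`) never reaches the command line, so the thresholds that apply are the ones written in nrpe.cfg on the agent.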




# vim /usr/local/nagios/etc/objects/commands.cfg

# 'check_nrpe' command definition

define command{

command_name check_nrpe

command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

}
  Edit /usr/local/nagios/etc/nagios.cfg




# vim /usr/local/nagios/etc/nagios.cfg

cfg_file=/usr/local/nagios/etc/objects/agent1.redking.com.cfg

cfg_file=/usr/local/nagios/etc/objects/agent2.redking.com.cfg

cfg_file=/usr/local/nagios/etc/objects/agent3.redking.com.cfg
  Add the agent1~3.redking.com.cfg configuration files
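
  The three files differ only in host name and address, so they could be generated with a loop instead of being typed by hand. A minimal sketch that writes just the host definitions, using the addresses from the test environment. `OBJECTS_DIR` is a name introduced for illustration; it defaults to a scratch directory and would be /usr/local/nagios/etc/objects on the Nagios server:

```shell
#!/bin/sh
# Sketch: generate one host definition file per agent.
# OBJECTS_DIR is a hypothetical variable; on the Nagios server it would be
# /usr/local/nagios/etc/objects.
OBJECTS_DIR="${OBJECTS_DIR:-$(mktemp -d)}"

for i in 1 2 3; do
    host="agent${i}.redking.com"
    addr="192.168.188.3${i}"      # agent1..3 use .31 to .33 in the test environment
    cat > "$OBJECTS_DIR/$host.cfg" <<EOF
define host{
    use        linux-server
    host_name  $host
    alias      $host
    address    $addr
}
EOF
    # Print the matching cfg_file line for nagios.cfg
    echo "cfg_file=/usr/local/nagios/etc/objects/$host.cfg"
done
```

  The service definitions (PING, Current Users, Current Load, Swap Usage) still have to be added to each file, as in the agent1 listing that follows.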




# vim /usr/local/nagios/etc/objects/agent1.redking.com.cfg

define host{

use             linux-server

host_name       agent1.redking.com

alias agent1.redking.com

address         192.168.188.31

}

define service{

use                     generic-service

host_name               agent1.redking.com

service_description   PING

check_command         check_ping!100.0,20%!500.0,60%

}

define service{

use                     generic-service

host_name               agent1.redking.com

service_description   Current Users

check_command         check_nrpe!check_users!10!5

}

define service{

use                     generic-service

host_name               agent1.redking.com

service_description   Current Load

check_command         check_nrpe!check_load!15,10,5!30,25,20

}

define service{

use                     generic-service

host_name               agent1.redking.com

service_description   Swap Usage

check_command         check_nrpe!check_swap!20!40

}
  Verify the Nagios configuration, then restart the services so the changes take effect




# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# service nagios restart

# service puppetmaster restart  

  Client test




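# puppet agent --test
  The client deploys NRPE automatically.

  Below is the Nagios monitoring page showing the data collected after NRPE was deployed automatically on the clients.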






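  The nrpe.cfg defined by the NRPE module contains a large number of check commands. We can use them as they are, or modify the nrpe.cfg.erb template ourselves. For bulk deployment you can use modules you have written yourself or existing ones; existing modules cover roughly 90% of day-to-day system administration tasks, and the remaining 10% can be customized to fit the production workload.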

  ========================END=================================
  http://redking.blog.iyunv.com/27212/1612136