scuess posted on 2015-9-8 10:26:08

(Repost) Setting up Nagios monitoring on CentOS

  A. Nagios server
1. Install packages

[*]yum install -y httpd
  2. Download Nagios

[*]wget http://syslab.comsenz.com/downloads/linux/nagios-3.0.5.tar.gz
[*]wget http://syslab.comsenz.com/downloads/linux/nagios-plugins-1.4.13.tar.gz
[*]wget http://syslab.comsenz.com/downloads/linux/nrpe-2.12.tar.gz
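Before unpacking, it can be worth confirming that each download completed. The following is only a sketch, assuming the three tarballs are in the current directory; `check_tarball` is a hypothetical helper name, not part of Nagios.

```shell
# check_tarball FILE: succeed only if FILE is a readable gzip-compressed tar archive.
check_tarball() {
    # tar -tzf lists the archive contents without extracting anything
    if tar -tzf "$1" >/dev/null 2>&1; then
        echo "OK: $1"
    else
        echo "BROKEN: $1"
        return 1
    fi
}
```

For example: `for f in nagios-3.0.5.tar.gz nagios-plugins-1.4.13.tar.gz nrpe-2.12.tar.gz; do check_tarball "$f"; done`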
  3. Add the nagios account

[*]useradd nagios
  4. Compile and install Nagios

[*]mkdir /opt/hadoop/
[*]tar -xzvf nagios-3.0.5.tar.gz
[*]cd nagios-3.0.5
[*]./configure --prefix=/opt/hadoop/nagios
[*]make all
[*]make fullinstall
[*]mkdir /opt/hadoop/nagios/etc
[*]mkdir /opt/hadoop/nagios/etc/objects
[*]cp ./sample-config/cgi.cfg /opt/hadoop/nagios/etc/
[*]cp ./sample-config/nagios.cfg /opt/hadoop/nagios/etc/
[*]cp ./sample-config/resource.cfg /opt/hadoop/nagios/etc/
[*]cp ./sample-config/template-object/commands.cfg /opt/hadoop/nagios/etc/objects/
[*]cp ./sample-config/template-object/contacts.cfg /opt/hadoop/nagios/etc/objects/
[*]cp ./sample-config/template-object/timeperiods.cfg /opt/hadoop/nagios/etc/objects/
[*]cp ./sample-config/template-object/templates.cfg /opt/hadoop/nagios/etc/objects/
[*]cp ./sample-config/template-object/localhost.cfg /opt/hadoop/nagios/etc/objects/
[*]touch /opt/hadoop/nagios/var/nagios.log
[*]chmod -R 755 /opt/hadoop/nagios/etc/
[*]chown -R nagios:nagios /opt/hadoop/nagios
  5. Compile and install nagios-plugins

[*]tar zxvf nagios-plugins-1.4.13.tar.gz
[*]cd nagios-plugins-1.4.13
[*]./configure --prefix=/opt/hadoop/nagios --with-nagios-user=nagios --with-nagios-group=nagios
[*]make && make install
  Check that the installation succeeded by confirming that this directory contains the plugin files

[*]ls /opt/hadoop/nagios/libexec/
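If you would rather have a number than read through the `ls` output, the check can be sketched like this. `count_plugins` is a hypothetical helper name; it simply counts the executable files directly inside the directory it is given.

```shell
# count_plugins DIR: print the number of executable regular files directly in DIR.
count_plugins() {
    n=0
    for f in "$1"/*; do
        # skip directories and non-executable files
        if [ -f "$f" ] && [ -x "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Running `count_plugins /opt/hadoop/nagios/libexec` and getting 0 would suggest that `make install` did not put the plugins where expected.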
  6. Install NRPE

[*]tar zxvf nrpe-2.12.tar.gz
[*]cd nrpe-2.12
[*]./configure --prefix=/opt/hadoop/nagios --enable-ssl --enable-command-args
[*]make all
[*]make install-plugin
[*]make install-daemon
[*]make install-daemon-config
  7. Configure httpd
Add a web account

[*]htpasswd -c /opt/hadoop/nagios/etc/htpasswd.users nagiosadmin
  B. Nagios client
1. Prepare the packages

[*]wget http://syslab.comsenz.com/downloads/linux/nagios-plugins-1.4.13.tar.gz
[*]wget http://syslab.comsenz.com/downloads/linux/nrpe-2.12.tar.gz
  2. Add the nagios account and prepare the installation directory

[*]mkdir -p /opt/hadoop/nagios
[*]useradd nagios
  3. Compile and install NRPE

[*]tar -xzvf nrpe-2.12.tar.gz
[*]cd nrpe-2.12
[*]./configure --prefix=/opt/hadoop/nagios --enable-ssl --enable-command-args
[*]make all
[*]make install-plugin
[*]make install-daemon
[*]make install-daemon-config
  4. Install nagios-plugins

[*]tar -xzvf nagios-plugins-1.4.13.tar.gz
[*]cd nagios-plugins-1.4.13
[*]./configure --prefix=/opt/hadoop/nagios --with-nagios-user=nagios --with-nagios-group=nagios
[*]make && make install
  Check that the installation succeeded by confirming that this directory contains the plugin files

[*]ls /opt/hadoop/nagios/libexec/
  5. Configure NRPE

[*]vim /opt/hadoop/nagios/etc/nrpe.cfg
[*]Find "allowed_hosts=127.0.0.1" and change it to "allowed_hosts=127.0.0.1,10.130.2.72"; the second IP is the Nagios server's IP
[*]Find "dont_blame_nrpe=0" and change it to "dont_blame_nrpe=1"
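The two edits above can also be scripted instead of made by hand in vim. This is a sketch only, assuming GNU sed (as shipped with CentOS) and that both settings already exist uncommented at the start of a line; `configure_nrpe` is a hypothetical helper name.

```shell
# configure_nrpe CFG SERVER_IP: allow SERVER_IP to query NRPE and enable command arguments.
configure_nrpe() {
    cfg=$1
    server_ip=$2
    # keep a backup of the original file next to it
    cp "$cfg" "$cfg.bak"
    # rewrite the two settings in place; every other line is left untouched
    sed -i \
        -e "s/^allowed_hosts=.*/allowed_hosts=127.0.0.1,$server_ip/" \
        -e "s/^dont_blame_nrpe=.*/dont_blame_nrpe=1/" \
        "$cfg"
}
```

With the addresses used in this post, the call would be `configure_nrpe /opt/hadoop/nagios/etc/nrpe.cfg 10.130.2.72`.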
  6. An NRPE start/stop script; save it as /etc/init.d/nrpe

[*]#!/bin/bash
[*]#
[*]# chkconfig: 2345 55 25
[*]# description: NRPE Daemon
[*]#
[*]
[*]# source function library
[*]. /etc/rc.d/init.d/functions
[*]
[*]RETVAL=0
[*]
[*]prog='nrpe'
[*]NRPE_CFG='/opt/hadoop/nagios/etc/nrpe.cfg'
[*]NRPE_PRG='/opt/hadoop/nagios/bin/nrpe'
[*]NRPE_OPT='-d'
[*]PID_FILE='/var/run/nrpe.pid'
[*]
[*]start()
[*]{
[*]    echo -n $"Starting $prog: "
[*]    # remove a stale PID file left over from a previous run
[*]    [ -f $PID_FILE ] && rm -f $PID_FILE
[*]    $NRPE_PRG -c $NRPE_CFG $NRPE_OPT
[*]    # record the PID of the daemon that was just started
[*]    pid=`ps aux | grep -v grep | grep $NRPE_PRG | awk '{print $2}'`
[*]    echo $pid > $PID_FILE
[*]
[*]    if ps aux | grep -v grep | grep -q $NRPE_PRG; then
[*]        RETVAL=0
[*]        success
[*]    else
[*]        RETVAL=1
[*]        failure
[*]    fi
[*]    echo
[*]}
[*]
[*]stop()
[*]{
[*]    echo -n $"Stopping $prog: "
[*]    # only kill the process if the recorded PID is still alive
[*]    ps --pid=`cat $PID_FILE` &>/dev/null
[*]    if [ $? -eq 0 ]; then
[*]        kill -9 `cat $PID_FILE`
[*]        RETVAL=0
[*]    fi
[*]    success
[*]    echo
[*]    RETVAL=0
[*]}
[*]
[*]case "$1" in
[*]    start)
[*]        start
[*]        ;;
[*]    stop)
[*]        stop
[*]        ;;
[*]    restart)
[*]        stop
[*]        start
[*]        ;;
[*]    status)
[*]        status -p $PID_FILE $prog
[*]        RETVAL=$?
[*]        ;;
[*]    *)
[*]        echo $"Usage: $0 {start|stop|restart|status}"
[*]        RETVAL=1
[*]        ;;
[*]esac
[*]exit $RETVAL
  7. Start NRPE

[*]/etc/init.d/nrpe start
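After starting, it does no harm to confirm that the daemon really is running. A rough sketch, assuming the PID file path /var/run/nrpe.pid from the script above and that the file holds a single PID; `pid_alive` is a hypothetical helper name.

```shell
# pid_alive PIDFILE: succeed if PIDFILE names a process that currently exists.
pid_alive() {
    [ -s "$1" ] || return 1          # missing or empty PID file
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests whether the process can be signalled
    kill -0 "$pid" 2>/dev/null
}
```

Run it as root, for example `pid_alive /var/run/nrpe.pid && echo "nrpe is running"`. Another quick check is `/opt/hadoop/nagios/libexec/check_nrpe -H 127.0.0.1`, which should print the NRPE version when the daemon is reachable.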
  C. Adding a monitored host on the Nagios server
1. Set up the directory for monitored hosts

[*]mkdir /opt/hadoop/nagios/etc/servers
[*]vim /opt/hadoop/nagios/etc/nagios.cfg, then append cfg_dir=/opt/hadoop/nagios/etc/servers
  2. Add the host to be monitored
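If you script this step, it helps to make the append idempotent so that running it twice does not add the line twice. A minimal sketch; `add_cfg_dir` is a hypothetical helper name.

```shell
# add_cfg_dir CFG DIR: append "cfg_dir=DIR" to CFG unless that exact line is already present.
add_cfg_dir() {
    line="cfg_dir=$2"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$line" "$1" || echo "$line" >> "$1"
}
```

Usage: `add_cfg_dir /opt/hadoop/nagios/etc/nagios.cfg /opt/hadoop/nagios/etc/servers`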

[*]vim /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg
[*]define host{
[*]       use                     linux-server
[*]       host_name               10.130.2.22
[*]       alias                   10.130.2.22
[*]       address               10.130.2.22
[*]}
[*]define service{
[*]       use                     generic-service
[*]       host_name               10.130.2.22
[*]       service_description   check_ping
[*]       check_command         check_ping!100.0,20%!200.0,50%
[*]       max_check_attempts      5
[*]       normal_check_interval   1
[*]}
[*]define service{
[*]       use                     generic-service
[*]       host_name               10.130.2.22
[*]       service_description   check_ssh
[*]       check_command         check_ssh
[*]       max_check_attempts      5
[*]       normal_check_interval   1
[*]}
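When there are many hosts to add, the file above can be generated rather than typed. The sketch below prints the same three definitions for any IP. `make_host_cfg` is a hypothetical helper name, and it assumes `check_ping` and `check_ssh` are defined in the sample commands.cfg copied earlier.

```shell
# make_host_cfg IP: print a host definition plus ping and SSH checks for IP.
make_host_cfg() {
    ip=$1
    cat <<EOF
define host{
       use                     linux-server
       host_name               $ip
       alias                   $ip
       address                 $ip
}
EOF
    # one service block per check; the description is the command name before the first "!"
    for check in 'check_ping!100.0,20%!200.0,50%' 'check_ssh'; do
        cat <<EOF
define service{
       use                     generic-service
       host_name               $ip
       service_description     ${check%%!*}
       check_command           $check
       max_check_attempts      5
       normal_check_interval   1
}
EOF
    done
}
```

For a second host (10.130.2.23 is a made-up address here): `make_host_cfg 10.130.2.23 > /opt/hadoop/nagios/etc/servers/10.130.2.23.cfg`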
  3. Reload the Nagios server so the configuration takes effect

[*]service nagios reload
  After reloading Nagios, the newly monitored host appears in the Nagios web interface.
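A syntax error in a new .cfg file can cause the reload to fail, so it is safer to validate first. Nagios has a built-in check, `nagios -v <main config file>`. The wrapper below is only a sketch of that idea; `validate_then_reload` is a hypothetical helper name.

```shell
# validate_then_reload NAGIOS_BIN CFG RELOAD_CMD...: run RELOAD_CMD only if "NAGIOS_BIN -v CFG" succeeds.
validate_then_reload() {
    nagios_bin=$1
    cfg=$2
    shift 2
    if "$nagios_bin" -v "$cfg" >/dev/null; then
        # configuration passed the pre-flight check, so run the reload command
        "$@"
    else
        echo "config check failed; not reloading" >&2
        return 1
    fi
}
```

Usage: `validate_then_reload /opt/hadoop/nagios/bin/nagios /opt/hadoop/nagios/etc/nagios.cfg service nagios reload`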
4. Add NRPE-based monitoring

[*]Add the following lines to /opt/hadoop/nagios/etc/objects/commands.cfg
[*]define command{
[*]       command_name    check_nrpe
[*]       command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
[*]}
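  Add the following lines to the host's monitoring configuration file on the server, and make sure the NRPE service on the monitored host is running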

[*]define service{
[*]       use                     generic-service
[*]       host_name               10.130.2.22
[*]       service_description   check_load
[*]       check_command         check_nrpe!check_load
[*]       max_check_attempts      5
[*]       normal_check_interval   1
[*]}
  Reload Nagios so the configuration takes effect.

[*]service nagios reload
  5. Custom monitoring script
Write the script check_diskmount.sh

[*]vim /opt/hadoop/nagios/libexec/check_diskmount.sh
[*]#!/bin/bash
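[*]# count the entries in /proc/mounts that contain '/disk'
[*]num=`cat /proc/mounts | grep '/disk' | wc -l`
[*]if [ $num -eq 12 ]; then
[*]   echo "OK - mount disk is $num"
[*]   exit 0
[*]else
[*]   echo "Critical - mount disk is $num"
[*]   # exit code 2 is what Nagios treats as CRITICAL (1 would be WARNING)
[*]   exit 2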
[*]fi
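Nagios decides a service's state from the plugin's exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. To see which state a script would report before wiring it into NRPE, something like the following sketch can be used; `run_plugin` is a hypothetical helper name.

```shell
# run_plugin CMD [ARGS...]: run a plugin, then print the Nagios state its exit code maps to.
run_plugin() {
    rc=0
    # discard the plugin's own output; only the exit code matters here
    "$@" >/dev/null 2>&1 || rc=$?
    case $rc in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "OUT_OF_RANGE ($rc)" ;;
    esac
}
```

Usage: `run_plugin /opt/hadoop/nagios/libexec/check_diskmount.sh`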
  Make it executable

[*]chmod +x /opt/hadoop/nagios/libexec/check_diskmount.sh
  Add the custom script's path to nrpe.cfg on the monitored host

[*]vim /opt/hadoop/nagios/etc/nrpe.cfg
[*]command[check_diskmount]=/opt/hadoop/nagios/libexec/check_diskmount.sh
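NRPE looks up the name passed with `check_nrpe -c` among the `command[<name>]=<command line>` entries in nrpe.cfg, so the name in the brackets has to match the one used on the server side. A small sketch for checking that an entry is present and not commented out; `nrpe_has_command` is a hypothetical helper name.

```shell
# nrpe_has_command CFG NAME: succeed if CFG has an uncommented line starting with "command[NAME]=".
nrpe_has_command() {
    # the anchor ^ skips commented-out entries such as "#command[...]="
    grep -q "^command\[$2]=" "$1"
}
```

Usage: `nrpe_has_command /opt/hadoop/nagios/etc/nrpe.cfg check_diskmount || echo "check_diskmount is not defined"`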
  Restart NRPE

[*]/etc/init.d/nrpe restart
  Add the configuration on the Nagios server

[*]vim /opt/hadoop/nagios/etc/servers/10.130.2.22.cfg
[*]define service{
[*]       use                     generic-service
[*]       host_name               10.130.2.22
[*]       service_description   check_diskmount
[*]       check_command         check_nrpe!check_diskmount
[*]       max_check_attempts      3
[*]       normal_check_interval   1
[*]}
  Reload Nagios so the configuration takes effect

[*]service nagios reload

  Source: http://www.opstool.com/article/236