liukaida posted on 2019-1-12 13:26:57

Nagios installation and configuration

  

  Detailed Nagios installation and configuration! http://guojiping.blog.运维网.com/5635432/1293933
I. Preparation before installation
1. Download the core source code:
  wget http://sourceforge.net/projects/nagios/files/nagios-3.x/nagios-3.4.3/nagios-3.4.3.tar.gz/download

2. Download the plugin package nagios-plugins-1.4.16.tar.gz:
http://download.chinaunix.net/download.php?id=26943&ResourceID=7184
3. Download the Nagios server-side and client-side software (NRPE):
  # wget http://nchc.dl.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.13/nrpe-2.13.tar.gz

4. Before building from source, make sure Apache, gcc, the GD library and its development package are installed:

  yum install -y httpd php gcc glibc glibc-common gd gd-devel perl make
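
One quick way to confirm the prerequisites before building is to check that the expected commands are on the PATH. This is only a sketch: the helper name missing_tools is made up for illustration, and finding a command does not prove that development libraries such as gd-devel are installed.

```shell
# missing_tools: print every named command that cannot be found on PATH.
# (hypothetical helper, not part of Nagios)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands that the build steps in this post rely on.
missing=$(missing_tools gcc make perl httpd php)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "still missing:" $missing
else
    echo "all build tools found"
fi
```

Anything reported as missing can be installed with the yum command above before continuing.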
  

II. Monitoring server:
Step 1:
  # useradd -s /sbin/nologin nagios
# mkdir /usr/local/nagios
# chown -R nagios:nagios /usr/local/nagios   # improves security

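  Create an account named nagios and give it a login password
# useradd nagios   # skip this if the account was already created above
passwd nagios   # password: 123456
      Create a group named nagcmd, used for running external commands from the web interface
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache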
http://img1.运维网.com/attachment/201309/10/5635432_13788182023ZAB.jpg
  # yum install -y sendmail
  # cd nagios
Run the Nagios configure script, using the user and group created earlier:
./configure --with-command-group=nagcmd
  I did not specify the option shown below:
http://img1.运维网.com/attachment/201309/10/5635432_1378818209nu28.jpg
http://img1.运维网.com/attachment/201309/10/5635432_13788182106Sc9.jpg
  make all
make install
http://img1.运维网.com/attachment/201309/10/5635432_1378818210pBwA.jpg
make install-init
http://img1.运维网.com/attachment/201309/10/5635432_1378818210NUQV.jpg
  make install-config
http://img1.运维网.com/attachment/201309/10/5635432_1378818212mUiI.jpg
  make install-commandmode
  Add Nagios to the service list so that it starts automatically at boot:
http://img1.运维网.com/attachment/201309/10/5635432_1378818217vwSW.jpg
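
The screenshot with the actual commands is not reproduced in this copy. On a SysV-init system of that era (CentOS/RHEL 5 or 6), registering the init script installed by make install-init is usually done with chkconfig, roughly as below. The run wrapper and the DRY_RUN variable are illustrative assumptions; with DRY_RUN left at its default of 1 the commands are only printed.

```shell
# run: print the command when DRY_RUN=1 (the default here), execute it otherwise.
# Hypothetical wrapper so the sketch can be tried without changing the system.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run chkconfig --add nagios   # register the nagios init script
run chkconfig nagios on      # enable it for the default runlevels
```

Setting DRY_RUN=0 as root would apply the changes for real.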
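  Do not start Nagios yet; there is still more to do...
Customizing the configuration
The sample configuration files are installed by default in /usr/local/nagios/etc. These samples are enough to get Nagios running; only one small change is needed...
Use your preferred editor to edit /usr/local/nagios/etc/objects/contacts.cfg and change the email address in the nagiosadmin contact definition to your own address, so that you receive the alerts.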
vi /usr/local/nagios/etc/objects/contacts.cfg
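
The same edit can be scripted instead of done in vi. The sketch below rewrites the email directive in a throwaway file that mimics the nagiosadmin contact definition; the helper name set_contact_email and the address admin@example.com are placeholders, and the function prints the result rather than modifying the file in place.

```shell
# set_contact_email FILE ADDRESS: print FILE with the value of every
# "email" directive replaced by ADDRESS (hypothetical helper).
# ADDRESS must not contain "/" or "&", which are special to sed here.
set_contact_email() {
    sed "s/^\([[:space:]]*email[[:space:]][[:space:]]*\)[^[:space:];]*/\1$2/" "$1"
}

# Try it on a temporary file shaped like the sample contact definition.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
define contact{
        contact_name    nagiosadmin
        email           nagios@localhost  ; change this
        }
EOF
set_contact_email "$tmp" admin@example.com
rm -f "$tmp"
```

To apply it to the real file, redirect the output to a new file, review it, and then move it over contacts.cfg.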
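  Configure the web interface
Install the Nagios web configuration file into Apache's conf.d directory
  Run the following inside the nagios directory extracted from nagios-3.4.3.tar.gz: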
  # cd nagios
  # make install-webconf
  Afterwards, ls /etc/httpd/conf.d/ should show a nagios.conf file

Step 2: Nagios plugins
  
http://img1.运维网.com/attachment/201309/10/5635432_1378818363BfOa.jpg
  The version used below differs from the one shown above:
Unpack the Nagios plugins source package
cd /root
tar xzf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
yum install -y openssl openssl-devel
Compile and install the plugins
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make && make install
Step 3: install the Nagios Chinese localization plugin (I skipped this step)
http://img1.运维网.com/attachment/201309/10/5635432_1378818388BUxN.jpg
http://img1.运维网.com/attachment/201309/10/5635432_13788184024smW.jpg
Step 4:
http://img1.运维网.com/attachment/201309/10/5635432_13788184365iE1.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818473PmDc.jpg
  I installed it with yum here!
http://img1.运维网.com/attachment/201309/10/5635432_13788184883ft6.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818531GP03.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818564TNHC.jpg
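  The above is from the reference document; my own configuration is as follows:
  Because I created a group earlier and added both nagios and apache to it, I did not need to change this item: http://img1.运维网.com/attachment/201309/10/5635432_1378818564j0SR.jpg. Apart from that one, the other modifications are as shown above.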
http://img1.运维网.com/attachment/201309/10/5635432_1378818572ver6.jpg
  Create a user named nagiosadmin for logging in to the Nagios web interface. Note the password you set; you will need it shortly.
htpasswd -c /usr/local/nagios/etc/htpasswd nagiosadmin   # password: 123456
http://img1.运维网.com/attachment/201309/10/5635432_1378818578hSu3.jpg
  Restart the Apache service for the settings to take effect.
service httpd restart
  If the service fails to start and you see something like this:
http://img1.运维网.com/attachment/201309/10/5635432_13788185811Rce.jpg
http://img1.运维网.com/attachment/201309/10/5635432_13788185830DyM.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818587ZclE.jpg
  Start Nagios
Verify the Nagios sample configuration files
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are no errors, start the Nagios service:
# service nagios start
  Starting nagios: done.
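
Since a broken configuration keeps Nagios from starting, the check and the restart can be tied together. The following is a minimal sketch; the function name check_then_restart is invented here, and the paths are the ones used in this post.

```shell
NAGIOS_BIN=/usr/local/nagios/bin/nagios
NAGIOS_CFG=/usr/local/nagios/etc/nagios.cfg

# check_then_restart: run the pre-flight check (-v) and restart the
# service only if the check exits successfully (hypothetical helper).
check_then_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        service nagios restart
    else
        echo "configuration check failed; Nagios left untouched" >&2
        return 1
    fi
}
```

Calling check_then_restart after each configuration change avoids taking down a running instance because of a typo.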
  Web interface:
  http://192.168.1.104:1080/nagios/
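  When prompted, enter the user name and password: these are the ones recorded in /usr/local/nagios/etc/htpasswd, i.e. the nagiosadmin account created with htpasswd above.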
http://img1.运维网.com/attachment/201309/211524500.jpg
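III. Problems encountered along the way and their solutions:
  
At this point the monitoring server is configured:
  http://192.168.1.104:1080/nagios/
  However, the page could not be reached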
http://img1.运维网.com/attachment/201309/10/5635432_1378818596oFxz.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818600WGKd.jpg
http://img1.运维网.com/attachment/201309/10/5635432_13788186027VFo.jpg
  Apache must be restarted for the change to take effect!
http://img1.运维网.com/attachment/201309/10/5635432_1378818606a4RL.jpg
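  If the page cannot be reached, check whether the Apache service is running. If it will not start, find the cause; in my case SELinux had not been turned off. Edit /etc/selinux/config with vim, comment out the enforcing line and set disabled (the file is read at boot, so the change applies after a reboot; setenforce 0 switches to permissive mode immediately):
  #SELINUX=enforcing
  SELINUX=disabled
  If Apache is fine but the web page still will not open, check iptables.
  For example, flush the rules with iptables -F and it works.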
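
The SELinux change can also be previewed with sed before touching the real file. This is a sketch under assumptions: set_selinux_mode is a made-up helper that prints the modified file instead of editing it, and the sample content only imitates /etc/selinux/config. Using permissive instead of disabled is a less drastic option if logging of denials is still wanted.

```shell
# set_selinux_mode FILE MODE: print FILE with its SELINUX= line set to MODE
# (hypothetical helper; the SELINUXTYPE= line is not affected).
set_selinux_mode() {
    sed "s/^SELINUX=.*/SELINUX=$2/" "$1"
}

# Preview the result on a temporary file shaped like /etc/selinux/config.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
SELINUX=enforcing
SELINUXTYPE=targeted
EOF
set_selinux_mode "$tmp" disabled
rm -f "$tmp"
```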
IV. Monitoring remote hosts with NRPE
  

Install on the monitoring server
  cd /root
tar xvf nrpe-2.13.tar.gz
cd nrpe-2.13
./configure
make all && make install-plugin
Append the following to the end of /usr/local/nagios/etc/objects/commands.cfg:
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$
}
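
To see what this definition turns into at check time, the macro substitution can be imitated by hand. The sketch below is a simplified illustration, not how Nagios is implemented: expand_check_nrpe is a made-up helper, $USER1$ is assumed to be the plugin directory /usr/local/nagios/libexec (its usual value in resource.cfg), and $ARG1$ is whatever follows the "!" in a service's check_command.

```shell
# expand_check_nrpe USER1 HOSTADDRESS ARG1: print the command line that
# results from filling the three macros into the check_nrpe template.
expand_check_nrpe() {
    template='$USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$'
    printf '%s\n' "$template" |
        sed -e 's|\$USER1\$|'"$1"'|' \
            -e 's|\$HOSTADDRESS\$|'"$2"'|' \
            -e 's|\$ARG1\$|'"$3"'|'
}

# A service with "check_command check_nrpe!check_disk" on host 192.168.1.106:
expand_check_nrpe /usr/local/nagios/libexec 192.168.1.106 check_disk
```

The printed line has the same shape as the manual check_nrpe test run later in this post, with the -t 30 timeout added.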
  Define the host:
http://img1.运维网.com/attachment/201309/10/5635432_1378818612wIvz.jpg
  vim /usr/local/nagios/etc/nagios.cfg
http://img1.运维网.com/attachment/201309/10/5635432_1378818615Euyk.jpg
  /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
  The check output looks like this. There is 1 warning here, which is not a problem!
http://img1.运维网.com/attachment/201309/10/5635432_1378818619IQ9f.jpg
  /etc/init.d/nagios restart
  The interface then looks like this:
http://img1.运维网.com/attachment/201309/10/5635432_1378818625RvoA.jpg
  Define the corresponding services:
  vim /usr/local/nagios/etc/objects/service.cfg (a file I created myself)
  # cat service.cfg
  define service{
  use generic-service;
  host_name test;
  service_description users;
  check_command check_nrpe!check_users;
  }
  define service{
  use generic-service;
  host_name test;
  service_description load;
  check_command check_nrpe!check_load;
  }
  define service{
  use generic-service;
  host_name test;
  service_description disk;
  check_command check_nrpe!check_disk;
  }
  define service{
  use generic-service;
  host_name test;
  service_description zombie;
  check_command check_nrpe!check_zombie_procs;
  }
  define servicegroup{
  servicegroup_name servergroup;
  alias server-group;
  members test,users,test,load,test,disk,test,zombie;
  }
  (This file corresponds to the commands defined in /usr/local/nagios/etc/nrpe.cfg on the monitored host, e.g. command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 16)
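
Because the four service blocks differ only in their description and check name, they can be generated rather than typed. The sketch below assumes the same host name (test) and template (generic-service) as above; nrpe_service is a made-up helper that prints one definition to standard output.

```shell
# nrpe_service HOST DESCRIPTION COMMAND: print one service definition that
# runs COMMAND on HOST through check_nrpe (hypothetical helper).
nrpe_service() {
    cat <<EOF
define service{
  use generic-service
  host_name $1
  service_description $2
  check_command check_nrpe!$3
}
EOF
}

# Reproduce the four services defined above.
nrpe_service test users  check_users
nrpe_service test load   check_load
nrpe_service test disk   check_disk
nrpe_service test zombie check_zombie_procs
```

Redirecting the output to a file and validating it with nagios -v keeps the definitions consistent when more hosts are added.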
  After writing it, verify the configuration:
  /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
  Then remember to restart Nagios
http://img1.运维网.com/attachment/201309/10/5635432_1378818626Dycw.jpg
  If the above appears:
  On the monitored host 192.168.1.106:
http://img1.运维网.com/attachment/201309/10/5635432_1378818629OLIu.jpg
  On the monitoring server:
  # /usr/local/nagios/libexec/check_nrpe -H 192.168.1.106 -c check_disk
  DISK OK - free space: / 6666 MB (72% inode=95%);| /=2521MB;9188;9488;0;9688
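
The output line has two parts separated by "|": the human-readable status text, and performance data in the form label=value;warn;crit;min;max. A small sketch of splitting them with shell parameter expansion follows; the helper names plugin_text and plugin_perfdata are made up for illustration.

```shell
# plugin_text LINE: everything before the first "|" (the status text).
plugin_text() {
    printf '%s\n' "${1%%|*}"
}

# plugin_perfdata LINE: everything after the first "|", without the
# single leading space that check_disk prints there.
plugin_perfdata() {
    perf=${1#*|}
    printf '%s\n' "${perf# }"
}

line='DISK OK - free space: / 6666 MB (72% inode=95%);| /=2521MB;9188;9488;0;9688'
plugin_text "$line"
plugin_perfdata "$line"
```

Here the performance data reads as: label /, value 2521MB, warning threshold 9188, critical threshold 9488, minimum 0, maximum 9688.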
http://img1.运维网.com/attachment/201309/10/5635432_1378818636IV7T.jpg
  For the HTTP error above: open port 80 by adding Listen 80 to httpd.conf, then restart the Apache service. Solved!
http://img1.运维网.com/attachment/201309/10/5635432_1378818638711F.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818641vVAA.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818641FMko.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818644cMzg.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818644VjXQ.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818651vjQq.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818654nojr.jpg
  On the monitored host:
http://img1.运维网.com/attachment/201309/10/5635432_1378818659g86J.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818659jYcm.jpg
  Here is what I did:
groupadd nagios
useradd -g nagios -d /usr/local/nagios -s /sbin/nologin nagios
cd /tmp/
tar xvf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-redhat-pthread-workaround
make && make install
http://img1.运维网.com/attachment/201309/10/5635432_1378818659M7m5.jpg
cd /root
tar xvfz nrpe-2.13.tar.gz
cd nrpe-2.13
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
http://img1.运维网.com/attachment/201309/10/5635432_1378818661v4ht.jpg
http://img1.运维网.com/attachment/201309/10/5635432_1378818664mKwF.jpg
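  The specific steps are as follows:
  Edit /usr/local/nagios/etc/nrpe.cfg and add the monitoring server IPs after the allowed_hosts parameter (separate multiple IPs with commas; the monitoring server IPs are currently 222.189.237.136 and 112.84.184.111)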
http://img1.运维网.com/attachment/201309/10/5635432_13788186686ExZ.jpg
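
A monitoring server that is missing from allowed_hosts is refused by the NRPE daemon, so it is worth checking the list after editing. The sketch below tests whether an IP appears in the allowed_hosts line of a file; host_allowed is a made-up helper, the sample file only imitates nrpe.cfg, and entries are assumed to be plain comma-separated addresses without spaces.

```shell
# host_allowed FILE IP: succeed if IP is one of the comma-separated
# entries on FILE's allowed_hosts= line (hypothetical helper).
host_allowed() {
    grep '^allowed_hosts=' "$1" | cut -d= -f2 | tr ',' '\n' | grep -qxF "$2"
}

# Check a temporary file that uses the addresses from this post.
tmp=$(mktemp)
echo 'allowed_hosts=127.0.0.1,222.189.237.136,112.84.184.111' > "$tmp"
if host_allowed "$tmp" 222.189.237.136; then
    echo "222.189.237.136 is allowed"
fi
rm -f "$tmp"
```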
  You can append the following to the end of /etc/services:
echo 'nrpe 5666/tcp # NRPE' >> /etc/services
http://img1.运维网.com/attachment/201309/10/5635432_1378818669If28.jpg
/usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.13
http://img1.运维网.com/attachment/201309/10/5635432_1378818674rOvI.jpg
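  Edit /usr/local/nagios/etc/nrpe.cfg and comment out all the original command entries.
The nrpe.cfg file contains the commands used to check the remote host. Each entry normally takes the form command[<name>]=<command line>, where <name> is what the monitoring server passes with check_nrpe -c; the bracketed names appear to have been lost from the listing below. For example: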
echo '
command=/usr/local/nagios/libexec/check_users -w 10 -c 16
command=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command=/usr/local/nagios/libexec/check_disk -w 500 -c 200 -p /
command=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
  command=/usr/local/nagios/libexec/check_procs -w 40 -c 80 -m CPU
  command=/usr/local/nagios/libexec/check_procs -w 150 -c 200 ' >> /usr/local/nagios/etc/nrpe.cfg
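
Every name passed with check_nrpe -c on the monitoring server needs a matching command[<name>] entry in nrpe.cfg on the monitored host. The sketch below lists the names defined in a file and reports any that are missing. The four bracketed names are inferred from the service definitions earlier in this post (check_users, check_load, check_disk, check_zombie_procs), not copied from the original nrpe.cfg, and defined_nrpe_commands is a made-up helper.

```shell
# defined_nrpe_commands FILE: list the NAME part of every
# command[NAME]=... line in FILE (hypothetical helper).
defined_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Temporary file with the four entries the service definitions rely on.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 16
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 500 -c 200 -p /
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
EOF

# Report any check name used on the monitoring server that is not defined.
for name in check_users check_load check_disk check_zombie_procs; do
    defined_nrpe_commands "$tmp" | grep -qxF "$name" || echo "not defined: $name"
done
rm -f "$tmp"
```

Pointing the function at the real /usr/local/nagios/etc/nrpe.cfg shows which names the daemon will actually accept.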
  7) Manual start
Start it manually with: /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
  8) Start at boot
  echo '/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d' >> /etc/rc.local
9) Verify that nrpe is listening
netstat -tanp | grep nrpe
  

---end---

  

  



