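Nagios + Cacti integration (Chinese edition)
Installation guide for integrating Nagios with Cacti
Installing the LAMP environment
1.1 Install httpd, mysqld, php and their dependencies with yum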
[*]yum -y install gcc* glibc* gd* php* mysql* http*
[*]yum -y install httpd mysql-server perl-DBI perl-DBD-MySQL php php-devel php-mysql php-snmp php-pdo php-gd lm_sensors net-snmp net-snmp-libs net-snmp-utils net-snmp-devel
[*]chkconfig mysqld on
[*]chkconfig httpd on
[*]chkconfig snmpd on
[*]service mysqld start
[*]service httpd start
[*]service snmpd start
1.2 Verify that MySQL, PHP and Apache are installed correctly
Create a PHP test page:
[*]vim /var/www/html/index.php
[*]
Then open http://10.7.7.22/index.php in a browser.
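The body of the test page is missing from this copy of the post. A minimal sketch, assuming the usual one-line phpinfo() page is what was intended; the helper name write_php_test_page and its optional document-root argument are hypothetical and not part of the original:

```shell
#!/bin/sh
# Write a one-line PHP test page into the web server's document root.
# Assumption: a phpinfo() page is a sufficient test; the original page body is unknown.
write_php_test_page() {
    docroot="${1:-/var/www/html}"   # default matches the path used in this guide
    cat > "$docroot/index.php" <<'EOF'
<?php phpinfo(); ?>
EOF
}

# Usage: write_php_test_page        (writes /var/www/html/index.php)
```

If PHP and Apache work together, the page should render the PHP configuration tables rather than showing the raw source. Remove the page afterwards, since it exposes configuration details.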
1.3 Create the cacti database and the cacti user, and grant the user privileges
[*]mysqladmin -u root password '123'
[*][root@localhost yum.repos.d]# mysql -u root -p
[*]Enter password:
[*]Welcome to the MySQL monitor. Commands end with ; or \g.
[*]Your MySQL connection id is 4
[*]Server version: 5.1.73 Source distribution
[*]Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
[*]Oracle is a registered trademark of Oracle Corporation and/or its
[*]affiliates. Other names may be trademarks of their respective
[*]owners.
[*]Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
[*]mysql>Create database cacti default character set utf8;
[*]Query OK, 1 row affected (0.00 sec)
[*]mysql>Grant all on cacti.* to cacti@localhost identified by '123';
[*]Query OK, 0 rows affected (0.03 sec)
[*]mysql>Flush privileges;
[*]Query OK, 0 rows affected (0.00 sec)
[*]mysql> quit
[*]Bye
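The interactive session above could also be run non-interactively. The following is only a sketch: cacti_db_sql is a hypothetical helper, IF NOT EXISTS is an addition so the script can be rerun, and the GRANT ... IDENTIFIED BY form is assumed to be acceptable because the session shows MySQL 5.1 (newer MySQL releases need a separate CREATE USER).

```shell
#!/bin/sh
# Print the SQL that section 1.3 types interactively.
# cacti_db_sql is a hypothetical helper; arguments: database, user, password.
cacti_db_sql() {
    db="$1"; user="$2"; pass="$3"
    cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db} DEFAULT CHARACTER SET utf8;
GRANT ALL ON ${db}.* TO '${user}'@'localhost' IDENTIFIED BY '${pass}';
FLUSH PRIVILEGES;
EOF
}

# Usage (prompts for the MySQL root password set with mysqladmin):
#   cacti_db_sql cacti cacti 123 | mysql -u root -p
```

Printing the SQL instead of executing it directly makes it easy to review the statements before piping them into mysql.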
Installing Cacti
2.1 Install the packages rrdtool depends on
[*]yum -y install cairo-devel libxml2-devel pango pango-devel
2.2 Install rrdtool
[*]tar zxvf rrdtool-1.4.4.tar.gz
[*]cd rrdtool-1.4.4
[*]./configure --prefix=/usr/local/rrdtool
[*]make
[*]make install
[*]ln -s /usr/local/rrdtool/bin/* /usr/local/bin/ # this step is essential
2.3 Install the Chinese edition of cacti-0.8.7e
[*]wget http://blogimg.chinaunix.net/blog/upfile2/090815172648.gz
[*]tar xvf 090815172648.gz -C /var/www/html
[*]cd /var/www/html
[*]mv cacti-0.8.7e-cn-utf8/ cacti
[*]mysql -u cacti -p cacti < [...]
[*][...] > /dev/null 2>&1
[*]service crond restart
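Only the tail of the cron entry (the redirect to /dev/null) survives above. As a hedged sketch of what such an entry usually looks like, the helper below prints a crontab line for the Cacti poller. The 5-minute interval, the helper name cacti_cron_line and the poller path under /var/www/html/cacti are assumptions based on the install location used in section 2.3, not text recovered from the original.

```shell
#!/bin/sh
# Print a crontab line that runs the Cacti poller at a fixed interval.
# Assumptions: a 5-minute interval and the install path from section 2.3.
cacti_cron_line() {
    minutes="${1:-5}"
    poller="${2:-/var/www/html/cacti/poller.php}"
    printf '*/%s * * * * /usr/bin/php %s > /dev/null 2>&1\n' "$minutes" "$poller"
}

# Usage: append the line to the current user's crontab, keeping existing entries.
#   ( crontab -l 2>/dev/null; cacti_cron_line ) | crontab -
```

The interval should match the poller interval configured in Cacti; otherwise the graphs may show gaps.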
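Initializing Cacti in the browser
3.1 Open http://10.7.7.22/cacti/install in a browser to initialize Cacti
If the page comes up blank, the database was not imported. Import the database again and restart the httpd service.
Note: the RRDTool version defaults to 1.0.x; change it to 1.3.x.
3.2 If no graphs have been generated, run the poller manually to produce them
[*]/usr/bin/php /var/www/html/cacti/poller.php &>/dev/null
3.3 Fixing fonts that render incorrectly in graphs
Download and install a Chinese font; the one used here is WenQuanYi Micro Hei.
[*]wget http://sourceforge.net/projects/wqy/files/wqy-microhei/0.2.0-beta/wqy-microhei-0.2.0-beta.tar.gz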
[*]tar zxvf wqy-microhei-0.2.0-beta.tar.gz
[*]cd wqy-microhei
[*]cp wqy-microhei.ttc /usr/share/fonts/wqy-microhei.ttc
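After installation, change the following two required settings under "Settings".
General -> RRDTool Utility Version: change from the default 1.0.x to 1.3.x; otherwise graphs may not display correctly.
Paths -> RRDTool Default Font Path: set it to the font file installed above, e.g. /usr/share/fonts/wqy-microhei.ttc
Installing Cacti plugins
4.1 Install the cacti-plugin patch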
[*]gunzip 51CTO下载-cacti-plugin-0.8.7d-PA-v2.4-cn-utf8.diff.gz
[*]mv 51CTO下载-cacti-plugin-0.8.7d-PA-v2.4-cn-utf8.diff /var/www/html/cacti/
[*]cd /var/www/html/cacti/
[*]mv 51CTO下载-cacti-plugin-0.8.7d-PA-v2.4-cn-utf8.diff cacti-plugin-0.8.7d-PA-v2.4-cn-utf8.diff
[*]patch -p1 -N < cacti-plugin-0.8.7d-PA-v2.4-cn-utf8.diff
[...]
[*]mysql> ALTER TABLE npc_eventhandlers ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.02 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_hostchecks ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.02 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_hoststatus ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.03 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_notifications ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.03 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_servicechecks ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.02 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_servicestatus ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.02 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_statehistory ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.03 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_systemcommands ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;
[*]Query OK, 0 rows affected, 1 warning (0.03 sec)
[*]Records: 0  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_services ADD importance smallint(6) NOT NULL DEFAULT '0';
[*]Query OK, 8 rows affected (0.04 sec)
[*]Records: 8  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_hosts ADD importance smallint(6) NOT NULL DEFAULT '0';
[*]Query OK, 1 row affected (0.02 sec)
[*]Records: 1  Duplicates: 0  Warnings: 0
[*]
[*]mysql> ALTER TABLE npc_contacts ADD minimum_importance smallint(6) NOT NULL DEFAULT '0';
[*]Query OK, 1 row affected (0.02 sec)
[*]Records: 1  Duplicates: 0  Warnings: 0
[*]
[*]mysql> quit
[*]Bye
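The eleven ALTER TABLE statements in the session above follow two patterns, so they could be generated rather than typed one by one. This is a sketch only: npc_alter_sql is a hypothetical helper name, and the database to run it against is not shown in the surviving text, so it appears as a placeholder.

```shell
#!/bin/sh
# Print the ALTER TABLE statements from the session above.
# npc_alter_sql is a hypothetical helper name.
npc_alter_sql() {
    # Tables that gain a long_output column after the existing output column.
    for t in eventhandlers hostchecks hoststatus notifications \
             servicechecks servicestatus statehistory systemcommands; do
        printf "ALTER TABLE npc_%s ADD long_output TEXT NOT NULL DEFAULT '' AFTER output;\n" "$t"
    done
    # Tables that gain an importance column.
    for t in services hosts; do
        printf "ALTER TABLE npc_%s ADD importance smallint(6) NOT NULL DEFAULT '0';\n" "$t"
    done
    printf "ALTER TABLE npc_contacts ADD minimum_importance smallint(6) NOT NULL DEFAULT '0';\n"
}

# Usage (replace <database> with the database that holds the npc_ tables):
#   npc_alter_sql | mysql -u root -p <database>
```

Running the statements a second time fails with a duplicate-column error, so the script is meant to be applied once per database.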
6.8 Restart the services
[*]service mysqld restart
[*]service httpd restart
[*]service nagios restart
[*]ps -ef | grep ndo2db
[*]kill -9 <PID of ndo2db>
[*]# start ndo2db again
[*]/usr/local/nagios/bin/ndo2db-3x -c /usr/local/nagios/etc/ndo2db.cfg
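Adding monitored hosts to Cacti
7.1 Settings on the monitoring server
To add a monitored host in Cacti:
① Add the device
Error: SNMP error
Cause: snmpd is not running on the monitored host. Work through the following steps.
First, restart snmpd on the monitored host:
[*]service snmpd restart
Second, on the Cacti monitoring server, query the monitored host (192.168.1.222 is the IP of the monitored host):
[*]snmpwalk -c public -v 2c 192.168.1.222
If data comes back from the monitored host, its SNMP configuration is complete and correct. If nothing comes back, go on to the third step.
Third, log in to the monitored host as root and edit the snmpd configuration file:
[*]vi /etc/snmp/snmpd.conf
[*]# the final configuration looks like this:
[*]syslocation Server Room
[*]syscontact Sysadmin (root@localhost)
[*]rocommunity public 127.0.0.1
[*]agentaddress 161
[*]rocommunity public
[*]rwcommunity private
[*]# trapsink is the destination for traps; the original notes 192.168.124.14 as the monitored host's IP
[*]trapsink 192.168.124.14 public 162
Then restart snmpd and repeat the second step.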
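The snmpwalk check from the second step can be wrapped so that it gives a clear pass or fail result. A minimal sketch, assuming SNMP v2c and the public community used above; snmp_reachable is a hypothetical helper, and querying only the system subtree is a choice made here to keep the output short.

```shell
#!/bin/sh
# Report whether a host answers SNMP v2c queries for the "system" subtree.
# snmp_reachable is a hypothetical helper; snmpwalk comes from net-snmp-utils.
snmp_reachable() {
    host="$1"
    community="${2:-public}"
    if snmpwalk -v 2c -c "$community" "$host" system >/dev/null 2>&1; then
        echo "OK: $host answers SNMP"
    else
        echo "FAIL: no SNMP reply from $host"
        return 1
    fi
}

# Usage (on the Cacti monitoring server):
#   snmp_reachable 192.168.1.222 public
```

A FAIL result does not distinguish between a stopped snmpd, a wrong community string and a firewall blocking UDP port 161, so each of those still has to be checked by hand.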
② Add graphs
Monitoring remote hosts with Nagios NRPE
8.1 Settings on the monitoring server
Install NRPE on the monitoring server
[*]wget http://nchc.dl.sourceforge.net/sourceforge/nagios/nrpe-2.12.tar.gz
[*]tar zxvf nrpe-2.12.tar.gz
[*]cd nrpe-2.12
[*]./configure --prefix=/usr/local/nagios
[*]make all
[*]make install-plugin
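[*]# the monitoring server only needs the installation up to this step
8.2 Settings on the monitored host
8.2.1 Add the nagios user on the monitored host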
[*]groupadd nagios
[*]useradd -g nagios -d /usr/local/nagios -s /sbin/nologin nagios
8.2.2 Install the nagios-plugins package on the monitored host
[*]wget http://nchc.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.13.tar.gz
[*]tar zxf nagios-plugins-1.4.13.tar.gz
[*]cd nagios-plugins-1.4.13
[*]./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios --with-ping-command="/bin/ping" --with-mysql=/opt/mysql
[*]make
[*]make install
# check that the plugin files were installed into this directory
[*]ls /usr/local/nagios/libexec
8.2.3 Install NRPE on the monitored host
[*]wget http://nchc.dl.sourceforge.net/sourceforge/nagios/nrpe-2.12.tar.gz
[*]tar zxvf nrpe-2.12.tar.gz
[*]cd nrpe-2.12
[*]./configure --prefix=/usr/local/nagios
[*]make all
[*]make install-plugin
[*]make install-daemon
[*]make install-daemon-config
[*]chown -R nagios:nagios /usr/local/nagios
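8.2.4 Configure NRPE on the monitored host:
[*]vi /usr/local/nagios/etc/nrpe.cfg
[*]# address or hostname of the Nagios monitoring server
[*]allowed_hosts=127.0.0.1,192.168.1.22
Add the monitoring server's IP to /etc/hosts.allow:
[*]echo 'nrpe:192.168.1.22' >> /etc/hosts.allow
8.2.5 Start the NRPE daemon: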
[*]/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
8.2.6 Add nrpe to /etc/rc.local so that it starts automatically at boot.
[*]echo "/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d" >> /etc/rc.local
8.2.7 Check that NRPE is working. On the monitored host:
[*]/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
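Check that the port is listening:
[*]netstat -an | grep 5666
Open port 5666 in the firewall for the LAN or for fixed IP addresses.
On the monitoring server (192.168.1.222 is the address of the monitored host):
[*]/usr/local/nagios/libexec/check_nrpe -H 192.168.1.222
Both commands should print the NRPE version: NRPE v2.12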
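The two checks above can be scripted so that anything other than a version banner is flagged. A sketch under stated assumptions: nrpe_alive and the CHECK_NRPE variable are hypothetical names, and the only thing tested is that the reply starts with "NRPE v", as in the expected output above.

```shell
#!/bin/sh
# Check that check_nrpe, called without -c, answers with an NRPE version banner.
# nrpe_alive is a hypothetical helper; the plugin path is the one used in this guide.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

nrpe_alive() {
    reply=$("$CHECK_NRPE" -H "$1" 2>&1)
    case "$reply" in
        "NRPE v"*) echo "OK: $1 replied '$reply'" ;;
        *)         echo "FAIL: $1 replied '$reply'"; return 1 ;;
    esac
}

# Usage:
#   nrpe_alive 127.0.0.1        (on the monitored host)
#   nrpe_alive 192.168.1.222    (on the monitoring server)
```

If the local check passes but the remote one fails, the likely causes are the allowed_hosts setting, /etc/hosts.allow, or the firewall rule for port 5666.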
8.2.8 Check which services can be monitored. The nrpe.cfg file on the monitored host contains entries like this:
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
This one checks the CPU load.
8.2.9 Configure nrpe.cfg on the monitored host
[*]vim /usr/local/nagios/etc/nrpe.cfg
[*]# add the following:
[*]command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
[*]command[check_load]=/usr/local/nagios/libexec/check_cpu.sh -w 80% -c 90%
[*]command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
[*]command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
[*]command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
[*]command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
[*]command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
[*]command[check_iostat]=/usr/local/nagios/libexec/check_iostat.sh -d sda -w 6 -c 10
[*]command[check_mysql]=/usr/local/nagios/libexec/check_mysql -H 192.168.0.22 -u nagios -p 123456 -d nagios
[*]command[check_nginx]=/usr/local/nagios/libexec/check_nginx.sh -u 192.168.0.22 -p /status -w 4000 -c 5000
[*]command[check_mem]=/usr/local/nagios/libexec/check_memory.pl -f -w 20 -c 10
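To pass custom arguments from the monitoring server, define the command like this:
command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
and set dont_blame_nrpe=1 in nrpe.cfg (NRPE must also have been built with --enable-command-args).
Enabling command arguments carries a security risk.
8.2.10 Restart nrpe on the monitored host
[*]ps aux | grep nrpe
[*]kill <PID of nrpe>
[*]/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
8.3 NRPE settings on the monitoring server
8.3.1 Add the check_nrpe command definition to commands.cfg on the monitoring server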
[*]vim /usr/local/nagios/etc/objects/commands.cfg
[*]# 'check_nrpe ' command definition
[*]define command{
[*]command_name check_nrpe
[*]command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
[*]}
8.3.2 On the monitoring server, create a configuration file for the monitored host, app1.cfg, in /usr/local/nagios/etc/objects/.
Register app1.cfg in the main Nagios configuration:
[*]echo "cfg_file=/usr/local/nagios/etc/objects/app1.cfg" >> /usr/local/nagios/etc/nagios.cfg
To add another host B, repeat the same procedure with its own file.
[*]vim app1.cfg
[*]define host{
[*]use linux-server
[*]host_name nagios-client
[*]alias nagios-client
[*]address 192.168.1.222 ; IP address of the monitored host
[*]icon_image server.gif
[*]statusmap_image server.gd2
[*]2d_coords 500,200
[*]3d_coords 500,200,100
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name *
[*]service_description PING
[*]check_command check_ping!100.0,20%!500.0,60%
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description boot partition
[*]check_command check_nrpe!check_sda1
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description root partition
[*]check_command check_nrpe!check_sda2
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description Logged-in users
[*]check_command check_nrpe!check_users
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description Total processes
[*]check_command check_nrpe!check_total_procs
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description CPU load average
[*]check_command check_nrpe!check_load
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description Swap
[*]check_command check_nrpe!check_swap
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description SSH
[*]check_command check_nrpe!check_ssh
[*]notifications_enabled 0
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description Zombie processes
[*]check_command check_nrpe!check_zombie_procs
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description iostat
[*]check_command check_nrpe!check_iostat
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description mysql
[*]check_command check_nrpe!check_mysql
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description nginx
[*]check_command check_nrpe!check_nginx
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description memory
[*]check_command check_nrpe!check_mem
[*]}
[*]define service{
[*]use local-service ; Name of service template to use
[*]host_name nagios-client
[*]service_description IP connections
[*]check_command check_nrpe!check_ip_conn
[*]}
8.3.3 Check the Nagios configuration files for errors
[*]/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
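The nagios -v check validates the object files on the monitoring server only; it cannot tell whether a command name passed to check_nrpe is actually defined in the monitored host's nrpe.cfg. The sketch below cross-checks the two files. missing_nrpe_commands is a hypothetical helper, and it assumes a copy of the remote nrpe.cfg is available locally.

```shell
#!/bin/sh
# List NRPE commands that an object file calls through check_nrpe but that the
# given nrpe.cfg does not define. missing_nrpe_commands is a hypothetical helper.
missing_nrpe_commands() {
    objects_cfg="$1"   # e.g. /usr/local/nagios/etc/objects/app1.cfg
    nrpe_cfg="$2"      # a local copy of the monitored host's nrpe.cfg
    # Extract the name after "check_nrpe!" from every check_command line.
    sed -n 's/^.*check_nrpe!\([A-Za-z0-9_]*\).*$/\1/p' "$objects_cfg" | sort -u |
    while read -r name; do
        # Print the name if no "command[name]=" line exists in nrpe.cfg.
        grep -q "^command\[$name\]=" "$nrpe_cfg" || echo "$name"
    done
}

# Usage:
#   missing_nrpe_commands /usr/local/nagios/etc/objects/app1.cfg ./nrpe.cfg
```

Applied to the listings in this guide, app1.cfg refers to check_ssh and check_ip_conn, which the entries added in section 8.2.9 do not define; unless they exist elsewhere in nrpe.cfg, those two services would report that the command is not defined.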
Reload Nagios on the monitoring server
[*]service nagios reload
Open http://localhost/nagios to see the newly added host.