Setting Up a Nagios Monitoring Platform on Linux
I. Deploying the Nagios monitoring server
# Install Nagios and the packages it depends on
# yum install -y httpd php gcc glibc glibc-common net-snmp nagios nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe gd gd-devel openssl openssl-devel
# Set the account and password for logging in to Nagios
# htpasswd -c /etc/nagios/passwd nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
# Validate the configuration file
# nagios -v /etc/nagios/nagios.cfg
Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config directory '/etc/nagios/conf.d'...
Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
Checked 8 services.
Checking hosts...
Checked 1 hosts.
Checking host groups...
Checked 1 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 24 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
# service httpd start
Starting httpd: [OK]
# service nagios start
Starting nagios: done.
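Open the Nagios web interface in a browser and log in with the account and password set earlier.
After logging in, you can view the hosts and the services being monitored.
Monitored hosts: by default, only the monitoring server itself is listed.
Monitored services:
II. With the basic monitoring server in place, we now configure a monitored host and see how to add it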
# Install the required packages on the client (the monitored host)
# yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe openssl openssl-devel
# vim /etc/nagios/nrpe.cfg
................................
# Add the address of the monitoring server that is allowed to connect
allowed_hosts=127.0.0.1,192.168.1.132
.................................
# Start the service
# service nrpe start
Starting nrpe: [OK]
# On the monitoring server: Nagios organizes hosts by configuration file, so we only need to create a configuration file for each host
# cd /etc/nagios/objects/
# This directory holds several configuration files with different purposes; we use the default localhost configuration as the template for the host configuration file
# ls
commands.cfg  contacts.cfg  localhost.cfg  printer.cfg  switch.cfg  templates.cfg  timeperiods.cfg  windows.cfg
# vim /etc/nagios/conf.d/web1.cfg
# Basic form of a monitoring configuration
define host{
    use                     linux-server
    host_name               web1
    alias                   web1.com
    address                 192.168.1.130
}
# PING: warning at 100 ms average round-trip time or 20% packet loss, critical at 500 ms or 60%
define service{
    use                     generic-service
    host_name               web1
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    max_check_attempts      5
    normal_check_interval   1
    notification_interval   60
}
define service{
    use                     generic-service
    host_name               web1
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}
define service{
    use                     generic-service
    host_name               web1
    service_description     HTTP
    check_command           check_http
    notifications_enabled   1
    contact_groups          admins
    notification_period     24x7
    notification_options    w,u,c,r
}
# Check the configuration for errors
# nagios -v /etc/nagios/nagios.cfg
Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config file '/etc/nagios/objects/windows.cfg'...
Processing object config directory '/etc/nagios/conf.d'...
Processing object config file '/etc/nagios/conf.d/web1.cfg'...
Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
Checked 25 services.
Checking hosts...
Checked 3 hosts.
Checking host groups...
Checked 2 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 25 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
On the Hosts page, a new host named web1 now appears.
Figure 1
Figure 2
The basic setup is now complete.
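Since every monitored host gets its own file under /etc/nagios/conf.d, adding further hosts mostly repeats the web1 stanza. The following is a minimal sketch, assuming a hypothetical helper named gen_host_cfg and an example host web2 at 192.168.1.131; neither appears in the original setup, and the helper is not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print a host definition plus a
# PING service for one monitored machine, following the layout of web1.cfg.
gen_host_cfg() {
    name="$1"     # Nagios host_name, e.g. web2
    address="$2"  # IP address of the monitored host
    cat <<EOF
define host{
    use                     linux-server
    host_name               ${name}
    alias                   ${name}
    address                 ${address}
}
define service{
    use                     generic-service
    host_name               ${name}
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}
EOF
}

# Example: preview the stanza for an assumed second host.
gen_host_cfg web2 192.168.1.131
```

Redirecting the output to a file such as /etc/nagios/conf.d/web2.cfg and then running nagios -v /etc/nagios/nagios.cfg would follow the same steps used for web1.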
Additional features:
1. Monitoring load and disk status
# Modify the configuration on the monitoring server
# vim /etc/nagios/objects/commands.cfg
..............................
# Add the following definition to the file
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
..............................
# vim /etc/nagios/conf.d/web1.cfg
# Add services that monitor system load and disk status
....................................
define service{
    use                     generic-service
    host_name               web1
    service_description     check_load
    check_command           check_nrpe!check_load
    max_check_attempts      5
    normal_check_interval   1
}
define service{
    use                     generic-service
    host_name               web1
    service_description     check_disk_hda1
    check_command           check_nrpe!check_hda1
    max_check_attempts      5
    normal_check_interval   1
}
# nagios -v /etc/nagios/nagios.cfg
..........................................
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
Modify the configuration on the monitored host
# vim /etc/nagios/nrpe.cfg
..........................................
# In the check_hda1 command, change the trailing /dev/hda1 to /dev/sda1
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
.........................................
# /etc/init.d/nrpe restart
Shutting down nrpe: [OK]
Starting nrpe: [OK]
Check the browser again: the two checks just configured now report their status correctly.
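Each name that a service passes to check_nrpe (for example check_nrpe!check_load) has to be defined on the monitored host as a command[NAME]=... line in nrpe.cfg, otherwise the check cannot run. The sketch below lists the names a file defines so they can be compared with the service definitions. The helper list_nrpe_commands and the sample file contents are assumptions made for illustration, not part of NRPE.

```shell
#!/bin/sh
# Hypothetical helper: print the command names defined in an NRPE config file.
# Matching lines have the form: command[NAME]=/path/to/plugin args
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demonstration on a throwaway sample file; on a real client you would pass
# the actual file, e.g. /etc/nagios/nrpe.cfg.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
allowed_hosts=127.0.0.1,192.168.1.132
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
# command[check_disabled]=/bin/true
EOF
list_nrpe_commands "$sample"
rm -f "$sample"
```

Commented-out definitions are skipped because the pattern only matches lines that begin with the word command.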
2. Configuring alerts
# Modify the configuration on the monitoring server
# vim /etc/nagios/objects/contacts.cfg
define contact{
    contact_name    nagiosadmin
    use             generic-contact
    alias           Nagios Admin
    email           nagios@localhost    ; change this to the address that should receive alerts
}
define contactgroup{
    contactgroup_name   admins
    alias               Nagios Administrators
    members             nagiosadmin
}
# Modify the service to be monitored
# vim /etc/nagios/conf.d/web1.cfg
.................................................
# Configure notifications for the HTTP service
define service{
    use                     generic-service
    host_name               web1
    service_description     HTTP
    check_command           check_http
    notifications_enabled   1
    contact_groups          admins
    notification_period     24x7
    notification_options    w,u,c,r
}
..................................................
# nagios -v /etc/nagios/nagios.cfg
Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL
...........................................................
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
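The notification_options value w,u,c,r in the HTTP service is a comma-separated list of one-letter service states. As a small illustration, the sketch below expands such a value into readable names; describe_notification_options is a hypothetical helper written for this note, not a Nagios command.

```shell
#!/bin/sh
# Hypothetical helper: expand a service notification_options value such as
# "w,u,c,r" into the state names Nagios uses for services.
describe_notification_options() {
    old_ifs="$IFS"
    IFS=','
    for opt in $1; do
        case "$opt" in
            w) echo "warning" ;;
            u) echo "unknown" ;;
            c) echo "critical" ;;
            r) echo "recovery" ;;
            f) echo "flapping" ;;
            s) echo "scheduled downtime" ;;
            n) echo "none" ;;
            *) echo "unrecognized option: $opt" ;;
        esac
    done
    IFS="$old_ifs"
}

# The value used for the HTTP service in web1.cfg.
describe_notification_options "w,u,c,r"
```

With this setting, the admins contact group is notified when HTTP enters a warning, unknown, or critical state and again when it recovers.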
# Install and start the mail service used to send alerts
# yum install -y sendmail
# /etc/init.d/sendmail start
Starting sendmail: [OK]
Starting sm-client: [OK]
To test the alert email, stop the httpd service on the client.
# /etc/init.d/httpd stop
Stopping httpd: [OK]