erw23312 posted on 2015-6-16 09:03:21

Setting Up a Nagios Monitoring Platform on Linux

I. First, deploy Nagios on the monitoring server

# Install Nagios and the packages it depends on
# yum install -y httpd php gcc glibc glibc-common net-snmp nagios nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe gd gd-devel openssl openssl-devel
# Set the account and password used to log in to Nagios
# htpasswd -c /etc/nagios/passwd nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
# Check the configuration files
# nagios -v /etc/nagios/nagios.cfg

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config directory '/etc/nagios/conf.d'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
      Checked 8 services.
Checking hosts...
      Checked 1 hosts.
Checking host groups...
      Checked 1 host groups.
Checking service groups...
      Checked 0 service groups.
Checking contacts...
      Checked 1 contacts.
Checking contact groups...
      Checked 1 contact groups.
Checking service escalations...
      Checked 0 service escalations.
Checking service dependencies...
      Checked 0 service dependencies.
Checking host escalations...
      Checked 0 host escalations.
Checking host dependencies...
      Checked 0 host dependencies.
Checking commands...
      Checked 24 commands.
Checking time periods...
      Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
# service httpd start
Starting httpd:                                            [  OK  ]
# service nagios start
Starting nagios: done.
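
The pre-flight summary above is what decides whether it is safe to start or restart Nagios. As a rough sketch, the same decision could be scripted by reading the "Total Errors" line. The helper names preflight_errors and preflight_ok are made up for this example and are not part of Nagios.

```shell
#!/bin/sh
# Sketch: decide from "nagios -v" output whether a restart looks safe.
# preflight_errors and preflight_ok are hypothetical helper names.

# Print the number that follows "Total Errors:" in the text read from stdin.
preflight_errors() {
    awk -F: '/^Total Errors/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Succeed only if the summary reports exactly 0 errors.
preflight_ok() {
    [ "$(preflight_errors)" = "0" ]
}

# Demo using the two summary lines from the output above.
sample='Total Warnings: 0
Total Errors:   0'

if printf '%s\n' "$sample" | preflight_ok; then
    echo "config ok, safe to restart"
fi
```

On the monitoring server this would presumably be used as: nagios -v /etc/nagios/nagios.cfg | preflight_ok && service nagios restart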




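Open the Nagios web interface in a browser and log in with the account and password set earlier.

After logging in, we can view the hosts and the monitored services.


Monitored hosts: by default, only the monitoring server itself is listed.

Monitored services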


II. The basic framework of the monitoring server is now in place. Next, we configure a monitored host and see how to add it to Nagios.

# Install the required packages on the client (the monitored host)
# yum install -y nagios-plugins nagios-plugins-all nrpe nagios-plugins-nrpe openssl openssl-devel
# vim /etc/nagios/nrpe.cfg
................................
# Add the address of the host that is allowed to connect (the monitoring server)
allowed_hosts=127.0.0.1,192.168.1.132
.................................
# Start the service
# service nrpe start
Starting nrpe:                                             [  OK  ]

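# On the monitoring server: Nagios organizes hosts by configuration file, so we only need to create a configuration file for the new host
# cd /etc/nagios/objects/
# This directory holds several configuration files with different purposes; we use the default localhost configuration as a template for the new host's file
# ls
commands.cfg  contacts.cfg  localhost.cfg  printer.cfg  switch.cfg  templates.cfg  timeperiods.cfg  windows.cfg
# vim /etc/nagios/conf.d/web1.cfg
# Basic host and service check definitions
define host{
      use                     linux-server
      host_name               web1
      alias                   web1.com
      address                 192.168.1.130
      }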

define service{
      use                     generic-service
      host_name               web1
      service_description     PING
      check_command           check_ping!100.0,20%!500.0,60%
      max_check_attempts      5
      normal_check_interval   1
      notification_interval   60
      }

define service{
      use                     generic-service
      host_name               web1
      service_description     SSH
      check_command           check_ssh
      notifications_enabled   0
      }

define service{
      use                     generic-service
      host_name               web1
      service_description     HTTP
      check_command           check_http
      notifications_enabled   1
      contact_groups          admins
      notification_period     24x7
      notification_options    w,u,c,r
      }
# Verify that the configuration files are correct
# nagios -v /etc/nagios/nagios.cfg

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config file '/etc/nagios/objects/windows.cfg'...
Processing object config directory '/etc/nagios/conf.d'...
Processing object config file '/etc/nagios/conf.d/web1.cfg'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
      Checked 25 services.
Checking hosts...
      Checked 3 hosts.
Checking host groups...
      Checked 2 host groups.
Checking service groups...
      Checked 0 service groups.
Checking contacts...
      Checked 1 contacts.
Checking contact groups...
      Checked 1 contact groups.
Checking service escalations...
      Checked 0 service escalations.
Checking service dependencies...
      Checked 0 service dependencies.
Checking host escalations...
      Checked 0 host escalations.
Checking host dependencies...
      Checked 0 host dependencies.
Checking commands...
      Checked 25 commands.
Checking time periods...
      Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
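
A note on the PING service in web1.cfg: the check_command value is split on "!" into the command name and its arguments. In the sample commands.cfg, as far as I know, the first argument is passed to check_ping as the warning threshold and the second as the critical threshold, each written as round-trip time in milliseconds, then a comma, then packet loss. The sketch below only splits and labels the value; describe_ping is a made-up helper, not a Nagios tool.

```shell
#!/bin/sh
# Sketch: split a check_command value on "!" and label the ping thresholds.
# describe_ping is a hypothetical helper used only for illustration.
describe_ping() {
    # $1 looks like: check_ping!<warn_rta>,<warn_loss>!<crit_rta>,<crit_loss>
    old_ifs=$IFS
    IFS='!'
    set -- $1          # now $1=command name, $2=first argument, $3=second argument
    IFS=$old_ifs
    warn=$2
    crit=$3
    echo "warning:  rta ${warn%%,*} ms, loss ${warn##*,}"
    echo "critical: rta ${crit%%,*} ms, loss ${crit##*,}"
}

describe_ping 'check_ping!100.0,20%!500.0,60%'
```

So the PING service defined for web1 uses 100 ms / 20% loss as its warning level and 500 ms / 60% loss as its critical level.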




On the Hosts page, a new host named web1 now appears.

Figure 1

Figure 2

The basic setup is now complete.

Additional features:
1. Monitoring load and disk status

# Edit the configuration on the monitoring server
# vim /etc/nagios/objects/commands.cfg
..............................
# Add the following to the file
define command{
      command_name    check_nrpe
      command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
      }
..............................


# vim /etc/nagios/conf.d/web1.cfg
# Add services that monitor system load and disk status
....................................
define service{
      use                     generic-service
      host_name               web1
      service_description     check_load
      check_command           check_nrpe!check_load
      max_check_attempts      5
      normal_check_interval   1
      }


define service{
      use                     generic-service
      host_name               web1
      service_description     check_disk_hda1
      check_command           check_nrpe!check_hda1
      max_check_attempts      5
      normal_check_interval   1
      }
# nagios -v /etc/nagios/nagios.cfg
..........................................
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
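
To make the check_nrpe definition above less abstract, here is a small sketch of how the macros in command_line get filled in for web1. The function name expand_check_nrpe is made up, and the plugin directory is only an assumed value for $USER1$ (the real one is set in resource.cfg and may differ on your system).

```shell
#!/bin/sh
# Sketch: imitate how Nagios fills the macros in the check_nrpe command_line.
# expand_check_nrpe and the USER1 path below are illustrative assumptions.
USER1=/usr/lib64/nagios/plugins   # assumed value of $USER1$ from resource.cfg

expand_check_nrpe() {
    host_address=$1              # the host's "address" directive -> $HOSTADDRESS$
    check_command=$2             # e.g. check_nrpe!check_load
    arg1=${check_command#*!}     # text after the first "!" -> $ARG1$
    echo "$USER1/check_nrpe -H $host_address -c $arg1"
}

expand_check_nrpe 192.168.1.130 'check_nrpe!check_load'
```

The name after -c (check_load, check_hda1) is not a plugin on the monitoring server. It is the name of a command that must be defined in nrpe.cfg on the monitored host.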





Edit the configuration on the monitored host


# vim /etc/nagios/nrpe.cfg
..........................................
# Change /dev/hda1 at the end of the check_hda1 command to /dev/sda1
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
.........................................
# /etc/init.d/nrpe restart
Shutting down nrpe:                                        [  OK  ]
Starting nrpe:                                             [  OK  ]
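
Each name that the monitoring server passes to check_nrpe has to exist in nrpe.cfg as a command[name]=... line. A quick way to see which names a file defines might look like the sketch below. list_nrpe_commands is a made-up helper, and the sample file contents are only illustrative.

```shell
#!/bin/sh
# Sketch: list the command names that an nrpe.cfg defines.
# list_nrpe_commands is a hypothetical helper, not shipped with NRPE.
list_nrpe_commands() {
    # Keep only lines of the form command[<name>]=<command line> and print <name>.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo on a throwaway file instead of the real /etc/nagios/nrpe.cfg.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# comment lines are ignored
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1
EOF
list_nrpe_commands "$cfg"
rm -f "$cfg"
```

On the monitored host, running list_nrpe_commands /etc/nagios/nrpe.cfg should show both check_load and check_hda1 before the two new services can work.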





Check the browser again: the two checks configured above are now working.
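
If the two NRPE checks report a connection or handshake error instead, one common cause is that the monitoring server's address is missing from allowed_hosts in nrpe.cfg on the monitored host (in this post that address appears to be 192.168.1.132). A minimal sketch of such a check follows. host_allowed is a made-up helper that only does exact string matching, so it does not understand network ranges or hostnames.

```shell
#!/bin/sh
# Sketch: check whether an address is listed in allowed_hosts.
# host_allowed is a hypothetical helper; exact string matching only.
host_allowed() {
    cfg_file=$1
    ip=$2
    # Take the value after "allowed_hosts=", split it on commas,
    # strip spaces, and look for a line equal to the address.
    sed -n 's/^allowed_hosts=//p' "$cfg_file" | tr ',' '\n' | tr -d ' ' | grep -qxF "$ip"
}

# Demo on a throwaway file instead of the real /etc/nagios/nrpe.cfg.
cfg=$(mktemp)
echo 'allowed_hosts=127.0.0.1,192.168.1.132' > "$cfg"
if host_allowed "$cfg" 192.168.1.132; then
    echo "192.168.1.132 may query NRPE"
fi
rm -f "$cfg"
```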


2. Configuring alerts

# Edit the configuration on the monitoring server
# vim /etc/nagios/objects/contacts.cfg

define contact{
      contact_name                  nagiosadmin            
      use                           generic-contact      
      alias                           Nagios Admin      
      email                           nagios@localhost   ; change this to your own email address
      }
define contactgroup{
      contactgroup_name       admins
      alias                   Nagios Administrators
      members               nagiosadmin
      }


# Edit the service to be monitored
# vim /etc/nagios/conf.d/web1.cfg
.................................................

# Service definition that monitors HTTP and sends notifications
define service{
      use                     generic-service
      host_name               web1
      service_description     HTTP
      check_command           check_http
      notifications_enabled   1
      contact_groups          admins
      notification_period     24x7
      notification_options    w,u,c,r
      }
..................................................

# nagios -v /etc/nagios/nagios.cfg

Nagios Core 3.5.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-30-2013
License: GPL

...........................................................

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
# /etc/init.d/nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.

# Install and start the mail delivery service
# yum install -y sendmail
# /etc/init.d/sendmail start
Starting sendmail:                                         [  OK  ]
Starting sm-client:                                        [  OK  ]





To test the alert email, stop the httpd service on the client.

# /etc/init.d/httpd stop
Stopping httpd:                                            [  OK  ]
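
Do not expect the email the moment httpd stops. Nagios normally rechecks a failing service a few times and only notifies once the problem is confirmed. A rough way to estimate the wait is sketched below. The numbers passed in are assumptions, not values read from this setup (the HTTP service inherits them from the generic-service template in templates.cfg), and hard_state_delay is a made-up helper.

```shell
#!/bin/sh
# Sketch: rough delay between the first failed check and the confirmed
# (hard) problem state that triggers a notification.
# hard_state_delay is a hypothetical helper; its inputs are assumptions.
hard_state_delay() {
    max_attempts=$1      # max_check_attempts
    retry_interval=$2    # retry interval between rechecks, in minutes
    # The first failure counts as attempt 1; each further attempt
    # waits one retry interval.
    echo $(( (max_attempts - 1) * retry_interval ))
}

echo "about $(hard_state_delay 3 2) minutes after the first failed check"
```

On top of that comes the time until the next regular check notices that httpd is down, so the first email can take several minutes to arrive.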






