Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Configuration Recipe for monitoring ZooKeeper using Nagios
----------------------------------------------------------

I will start by assuming that you already have a working Nagios install.

WARNING: I wrote these instructions while installing and configuring the plugin on my desktop computer running Ubuntu 9.10, with Nagios installed using apt-get.

WARNING: You should customize the config files as suggested in order to match your Nagios and ZooKeeper install.

WARNING: This README assumes you know how to configure Nagios and how it works.

WARNING: You should customize the warning and critical levels on service checks to meet your own needs.

1. Install the plugin

$ cp check_zookeeper.py /usr/lib/nagios/plugins/

2. Install the new commands

$ cp zookeeper.cfg /etc/nagios-plugins/config

3. Update the list of servers in zookeeper.cfg for the command 'check_zookeeper' and update the port for the command 'check_zk_node' (default: 2181).

4. Create a virtual host in Nagios used for monitoring the cluster as a whole -OR- create a hostgroup named 'zookeeper-servers' and add all the ZooKeeper cluster nodes to it.

5. Define service checks as illustrated below, or just use the provided definitions.

define service {
    use                 generic-service
    host_name           zookeeper-cluster
    service_description ...
    check_command       check_zookeeper!<exported-var>!<warning-level>!<critical-level>
}

define service {
    hostgroup_name      zookeeper-servers
    use                 generic-service
    service_description ZK_Open_File_Descriptors_Count
    check_command       check_zk_node!<exported-var>!<warning-level>!<critical-level>
}

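The check_command strings above follow the usual Nagios convention of separating the command name from its arguments with '!'; the command definition then receives them as $ARG1$, $ARG2$ and $ARG3$. As a rough sketch of what the three placeholders mean, the Python below splits such a string and maps a value to a Nagios state. The function names and the "greater than or equal" comparison are assumptions made for illustration and are not taken from check_zookeeper.py, so check the plugin itself for the exact comparison it applies.

```python
# Illustrative only: these helpers are hypothetical, not part of check_zookeeper.py.
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def split_check_command(check_command):
    """Split 'name!arg1!arg2!arg3' into the command name and its arguments."""
    name, *args = check_command.split("!")
    return name, args


def nagios_state(value, warning, critical):
    """Map a numeric value to a Nagios state (assumes higher is worse)."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK


name, (key, warning, critical) = split_check_command(
    "check_zookeeper!zk_open_file_descriptor_count!500!800"
)
# A made-up reading of 650 open file descriptors falls between the two levels.
state = nagios_state(650, int(warning), int(critical))
print(name, key, state)
```

With the levels 500 and 800, the made-up reading of 650 maps to state 1 (WARNING) under these assumptions.
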
Ex:

a. check the number of open file descriptors

define service {
    use                 generic-service
    host_name           zookeeper-cluster
    service_description ZK_Open_File_Descriptor_Count
    check_command       check_zookeeper!zk_open_file_descriptor_count!500!800
}

b. check the number of ephemeral nodes

define service {
    use                 generic-service
    host_name           localhost
    service_description ZK_Ephemerals_Count
    check_command       check_zookeeper!zk_ephemerals_count!10000!100000
}

c. check the number of open file descriptors for each host in the group

define service {
    hostgroup_name      zookeeper-servers
    use                 generic-service
    service_description ZK_Open_File_Descriptors_Count
    check_command       check_zk_node!zk_open_file_descriptor_count!500!800
}
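
The variable names used in these examples (zk_open_file_descriptor_count, zk_ephemerals_count) match keys reported by ZooKeeper's 'mntr' four-letter command, which prints one tab-separated key/value pair per line. To see which values a server exports before choosing warning and critical levels, a minimal sketch along these lines may help. The helper names, the localhost address, the timeout and the sample values are all assumptions, and this is not the plugin's own code; on newer ZooKeeper releases 'mntr' may also need to be enabled through the 4lw.commands.whitelist setting.

```python
import socket


def parse_mntr(text):
    """Turn 'key<TAB>value' lines into a dict, skipping lines without a tab."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send 'mntr' to a ZooKeeper server and return its raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Made-up sample reply, used here so the sketch runs without a server.
sample = "zk_open_file_descriptor_count\t23\nzk_ephemerals_count\t0\n"
stats = parse_mntr(sample)
print(stats["zk_open_file_descriptor_count"])

# Against a reachable server (assumed to listen on localhost:2181):
# stats = parse_mntr(fetch_mntr("localhost", 2181))
```

Values come back as strings, so convert them with int() before comparing them against warning and critical levels.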