TUT:DisMan Monitoring

Latest revision as of 00:18, 12 March 2011

The DisMan (Distributed Management) working group in the IETF produced RFC 2981, which defines a set of MIB tables describing an Event Management System. With this system in place, devices can monitor themselves and neighboring devices for problems and report those problems via SNMP notifications. This can greatly reduce the amount of traffic a network management station needs to send to every device on the network.

The snmpd.conf manual page (http://www.net-snmp.org/docs/man/snmpd.conf.html#lbAX) describes this feature at length.

Because it's the most typical usage, the remainder of this page discusses self-management (the agent querying itself) rather than querying external devices, and it shows how to set this up using snmpd.conf configuration directives. It is also possible to configure an agent entirely through SNMP SETs to the DISMAN-EVENT-MIB tables themselves.

Getting Started

Step 1: Defining the Query Access Credentials

The internal monitoring makes its queries through SNMP itself, so you need to authorize the monitoring service to read data from the agent.

The first requirement is an SNMPv3 user for the agent to use when performing self-management. The three directives below create the user (createUser), grant it read-only access (rouser), and tell the agent to use it for its internal queries (iquerySecName). PLEASE PICK A NAME AND PASSWORD DIFFERENT FROM myMonitoringName and mysecretpassword, which are used below. Any name unique to your institution will work just fine.

 createUser    myMonitoringName SHA mysecretpassword AES
 rouser        myMonitoringName
 iquerySecName myMonitoringName
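
If you generate snmpd.conf from a script, the three directives above can be produced for whatever name you choose. The following is only a minimal Python sketch: the function name credential_lines is hypothetical, and it assumes the same SHA/AES choices as the example above. The 8-character check reflects the minimum passphrase length that Net-SNMP enforces for SNMPv3 users.

```python
def credential_lines(user, passphrase):
    """Return the three snmpd.conf lines for a self-monitoring user.

    Hypothetical helper for illustration; it only formats text and
    does not talk to the agent.
    """
    if not user or any(ch.isspace() for ch in user):
        raise ValueError("user name must be a single non-empty token")
    # Assumption stated in the lead-in: passphrases shorter than
    # 8 characters are rejected by Net-SNMP.
    if len(passphrase) < 8:
        raise ValueError("passphrase must be at least 8 characters")
    return [
        f"createUser {user} SHA {passphrase} AES",  # create the SNMPv3 user
        f"rouser {user}",                           # read-only access
        f"iquerySecName {user}",                    # user for internal queries
    ]


if __name__ == "__main__":
    for line in credential_lines("acmeMonitor", "a-long-passphrase"):
        print(line)
```

The names acmeMonitor and a-long-passphrase are placeholders; substitute values unique to your institution.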

Step 2: Define where to send the notifications

We'll assume a simple trap receiver with only SNMPv2c support:

 trap2sink myhost.example.com myCommunity

Step 3: Define what to monitor

Now you need to define exactly what you want to monitor. What is important to monitor depends on your environment, so we'll just give an example of something that is popular to monitor.

Monitoring processor load

Let's say you want to be notified when your processor load goes over 90%. You can do this by examining the HOST-RESOURCES-MIB::hrProcessorLoad column from the hrProcessorTable. A simple expression is enough:

 monitor machineTooBusy hrProcessorLoad > 90

The machineTooBusy token is a name we've given to this particular monitor line. It is reported back in the notification as the mteHotTrigger value.

This will generate a notification like this (reformatted output from snmptrapd):

 DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (6539) 0:01:05.39
 SNMPv2-MIB::snmpTrapOID.0 = OID: DISMAN-EVENT-MIB::mteTriggerFired
 DISMAN-EVENT-MIB::mteHotTrigger.0 = STRING: machineTooBusy
 DISMAN-EVENT-MIB::mteHotTargetName.0 = STRING:  
 DISMAN-EVENT-MIB::mteHotContextName.0 = STRING:     
 DISMAN-EVENT-MIB::mteHotOID.0 = OID: HOST-RESOURCES-MIB::hrProcessorLoad.769
 DISMAN-EVENT-MIB::mteHotValue.0 = INTEGER: 92

mteTriggerFired is the notification that is actually sent. The machineTooBusy value of mteHotTrigger is our own name for the monitor line. The value that triggered this notification came from the hrProcessorLoad.769 OID instance (mteHotOID), which had a value of 92 (mteHotValue).
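
A script on the receiving side may want to pick these fields out of the notification. The following is a minimal Python sketch, assuming the varbinds arrive one per line in the "NAME = TYPE: VALUE" form shown above; the function names parse_varbinds and summarize are hypothetical and not part of Net-SNMP.

```python
SAMPLE = """\
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (6539) 0:01:05.39
SNMPv2-MIB::snmpTrapOID.0 = OID: DISMAN-EVENT-MIB::mteTriggerFired
DISMAN-EVENT-MIB::mteHotTrigger.0 = STRING: machineTooBusy
DISMAN-EVENT-MIB::mteHotTargetName.0 = STRING:
DISMAN-EVENT-MIB::mteHotContextName.0 = STRING:
DISMAN-EVENT-MIB::mteHotOID.0 = OID: HOST-RESOURCES-MIB::hrProcessorLoad.769
DISMAN-EVENT-MIB::mteHotValue.0 = INTEGER: 92
"""


def parse_varbinds(text):
    """Map each varbind name to a (type, value) tuple."""
    varbinds = {}
    for line in text.splitlines():
        line = line.strip()
        if " = " not in line:
            continue  # skip blank or unrecognised lines
        name, rest = line.split(" = ", 1)
        # Split on the first colon only, so values that contain
        # colons (timeticks, MIB::object names) stay intact.
        type_name, _, value = rest.partition(":")
        varbinds[name.strip()] = (type_name.strip(), value.strip())
    return varbinds


def summarize(varbinds):
    """Return 'trigger: oid = value' for an mteTriggerFired notification."""
    trigger = varbinds["DISMAN-EVENT-MIB::mteHotTrigger.0"][1]
    oid = varbinds["DISMAN-EVENT-MIB::mteHotOID.0"][1]
    value = varbinds["DISMAN-EVENT-MIB::mteHotValue.0"][1]
    return f"{trigger}: {oid} = {value}"


if __name__ == "__main__":
    print(summarize(parse_varbinds(SAMPLE)))
```

The parser relies only on the textual layout of the sample above, so it would need adjusting if snmptrapd is configured with a different output format.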

Another notification will be triggered later, after the load falls back below the 90% mark.

Other Configuration Options

Setting the monitoring frequency

The agent doesn't check the values continuously, because that would take up all your processing time. You can, however, control how frequently it checks for problems. The default is every 600 seconds (10 minutes), but you can change this with the -r switch, which takes the interval in seconds:

 monitor -r 30 machineTooBusy hrProcessorLoad > 90
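
If monitor lines are generated by a script, the optional -r switch can be added only when a non-default interval is wanted. This is a minimal Python sketch under that assumption; monitor_line is a hypothetical helper, and it assumes the monitor name is a single token with no spaces.

```python
def monitor_line(name, expression, frequency=None):
    """Build one snmpd.conf 'monitor' directive as a string.

    frequency is the polling interval in seconds; when it is None the
    -r switch is omitted and the agent's default (600 seconds) applies.
    """
    parts = ["monitor"]
    if frequency is not None:
        if frequency <= 0:
            raise ValueError("frequency must be a positive number of seconds")
        parts += ["-r", str(frequency)]
    parts += [name, expression]
    return " ".join(parts)


if __name__ == "__main__":
    print(monitor_line("machineTooBusy", "hrProcessorLoad > 90"))
    print(monitor_line("machineTooBusy", "hrProcessorLoad > 90", frequency=30))
```

A shorter interval detects problems sooner, at the cost of more frequent internal queries.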