16th August 2008, 09:09 PM #1
Nagios argument with HP ProCurve switch memory
I'm configuring Nagios to monitor my HP ProCurve switches. I found excellent command and service definitions at NagiosExchange and all is working wonderfully except for the service that monitors free memory.
The definitions that I'm using are copied and pasted directly from the above-mentioned site. They are:
command in commands.cfg:
command_line $USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .188.8.131.52.184.108.40.206.220.127.116.11.18.104.22.168.1.6.1 -t 5 -w $ARG2$ -c $ARG3$ -u bytes -l free
service in switch.cfg:
# Service definition MEM-FREE
use generic-service ; Name of service template to use
The switches list about 150MB of total memory, with about 109MB free when I view status from the switch console itself. Nagios is correctly reporting the free 109MB, but is showing the state as critical.
I've done a good bit of googling to try to understand how the "2000:30000000" and "1000:30000000" sections work. I realize that those are ARG2 and ARG3, and that ARG2 is the warning level and ARG3 is the critical level. What I don't understand is how to adjust those numbers so that warning and critical status trigger at the levels I want on my particular switches. I've found info stating that two numbers separated by a colon are a range, and other info saying they are a lower-than:higher-than definition for when to return the state defined by the command.
What I'd like is to have the following:
-More than 60MB of free memory = OK
-Between 60MB and 40MB of free memory = Warning
-Less than 40MB of free memory = Critical
I will likely adjust those values once I get a better idea of memory usage under different loads.
I'd like to understand how to adjust the numbers in the service definition so that my service monitors will work as listed above. Can someone explain this, or point me to a resource that explains what the colon-separated numbers mean on this particular command? I haven't had any luck in my searching, but I'm continuing to look for as much information as I can to understand this.
16th August 2008, 09:21 PM #2
Looking at other examples on the wiki (Nagios), it looks like the arguments are in KB.
16th August 2008, 09:42 PM #3
Thanks for the link, that provides some helpful information.
I think in this case it's bytes because of the "-u bytes" in the command definition. What I'm not understanding is what the colon does in the argument.
I think I've narrowed it down to meaning that anything outside the range of 2000 to 30000000 (lower than 2000 or higher than 30000000) would cause a warning state, and anything lower than 1000 or higher than 30000000 would cause a critical state.
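That reading matches the range syntax described in the Nagios plugin development guidelines, where a threshold of the form start:end raises an alert when the value falls outside the range, a leading @ inverts that, an empty end means "no upper limit", and ~ as the start means "no lower limit". The Python sketch below is only an illustration of those rules under that assumption. It is not check_snmp's actual code, the function names are made up for the example, and older check_snmp builds may interpret thresholds differently. It shows why 109,254,008 bytes free comes out critical with the original arguments: the value is above the 30000000 upper bound of both ranges.

```python
import math

def parse_range(spec):
    """Parse a Nagios-style threshold string into (low, high, inside)."""
    inside = spec.startswith("@")      # '@' inverts: alert when INSIDE the range
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start, end = spec.split(":", 1)
    else:
        start, end = "0", spec         # a bare number N means the range 0..N
    low = -math.inf if start == "~" else float(start or 0)
    high = math.inf if end == "" else float(end)
    return low, high, inside

def alerts(value, spec):
    """True when the value should trigger the threshold described by spec."""
    low, high, inside = parse_range(spec)
    in_range = low <= value <= high
    return in_range if inside else not in_range

def state(value, warn, crit):
    """Critical is checked first, then warning, as plugins normally do."""
    if alerts(value, crit):
        return "CRITICAL"
    if alerts(value, warn):
        return "WARNING"
    return "OK"

free_bytes = 109_254_008               # value reported by the switch CLI
print(state(free_bytes, "2000:30000000", "1000:30000000"))  # prints CRITICAL
```

Running check_snmp by hand from the command line with the same -w and -c values is the quickest way to confirm that the installed version behaves this way.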
So, I modified my config to use the following:
Now the status is showing OK, and it again shows the correct amount of free memory. The switch CLI shows 109,254,008 free, as does the Nagios service monitor. I'm just not positive that I have the correct values on either side of the colons yet.
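For the levels asked for in the first post (OK above 60MB free, warning between 40MB and 60MB, critical below 40MB), a threshold with only a lower bound should be enough, since free memory is only a problem when it gets too low. Under the standard plugin range syntax, "N:" alerts when the value drops below N. The sketch below works out what the two arguments would look like and which state a few sample values would get. It assumes decimal megabytes (1MB = 1,000,000 bytes, which matches the switch showing 109,254,008 as about 109MB), and the helper names are hypothetical.

```python
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes, as the switch appears to report

def below_threshold(mb):
    """Build an 'N:' threshold string: alert when the value drops below N bytes."""
    return f"{mb * BYTES_PER_MB}:"

def classify(free_bytes, warn_mb=60, crit_mb=40):
    """State for a free-memory reading, checking critical before warning."""
    if free_bytes < crit_mb * BYTES_PER_MB:
        return "CRITICAL"
    if free_bytes < warn_mb * BYTES_PER_MB:
        return "WARNING"
    return "OK"

# Candidate values for ARG2 (warning) and ARG3 (critical)
print(below_threshold(60), below_threshold(40))   # prints 60000000: 40000000:

for sample in (109_254_008, 50_000_000, 30_000_000):
    print(sample, classify(sample))
```

With those values, ARG2 would be 60000000: and ARG3 would be 40000000:, so the current reading of 109,254,008 stays OK. It is worth testing against the installed check_snmp before relying on it, since older versions handled ranges differently.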