Nagios - do you use it?
Do many people use Nagios for monitoring? I know our networking team uses it on a different site but not on ours so far.
I would be interested to see how you are using it. I would like to try it out in the next few weeks on a Fedora box (btw, which distro would you recommend: Fedora, RHEL or Ubuntu?) to monitor our managed switches, printers, a few servers and a few hundred cluster PCs as a test.
What kind of things have you got it monitoring on Windows PCs?
Have a search, there are a few threads about it. I would suggest using Debian or Ubuntu.
I don't use it here, but I implemented it in my last place. Don't bother monitoring PCs and printers, you'll just get a silly number of alerts as people reboot them etc (key printers perhaps, if you like, but there's HP Web JetAdmin for that). I monitored switchgear, servers, and the outside world because I wanted to prove to the LA that their WAN sucked. I expect it probably still does.
Check out the wiki... there's a guide on how to set it up ;)
You might have guessed that I use it - Debian is my distro of choice.
We use it on CentOS. It monitors switches, servers, printers and some services, but we don't get it to email us every time a printer is switched off :eek:
CentOS if you are an RPM lover, or Debian (or Ubuntu if you need more cuddling) if you aren't.
If you are a masochist you could try FreeBSD (or if you are a total masochist you can do like I've been doing and try DragonFly BSD)
Ubuntu - monitors switches/router/servers - very useful for monitoring Windows service statuses too (especially when ePortal likes to fall over every couple of days).
I've never used Nagios, but we do use HypericHQ:
Network & Systems Monitoring Software | Hyperic
Debian all the way.
We are monitoring all the PCs in the building.
I am working on a script/plugin at the moment that sends a WOL packet if a classroom PC gets turned off.
It will only report to me if the machine is down for more than 15 minutes.
This should hopefully allow me to identify unplugged/failed PCs relatively quickly, as the teachers never seem to report them.
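For anyone wanting to try something similar, here is a rough sketch of the two pieces described above: building and sending a Wake-on-LAN magic packet, and only reporting once a machine has been down longer than 15 minutes. It is not the poster's actual plugin. The function names, the broadcast address and UDP port 9 are assumptions chosen for illustration.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Magic packet layout: six 0xFF bytes, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


def should_report(seconds_down: float, threshold_minutes: int = 15) -> bool:
    """True only once the machine has been down longer than the threshold."""
    return seconds_down > threshold_minutes * 60
```

Calling send_wol("00:11:22:33:44:55") would attempt to wake that (made-up) MAC. Whether the PC actually wakes depends on its BIOS and NIC settings.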
I use Nagios, but I found it hard to set up, so I downloaded a version of Linux called GroundWork and installed it on an old PC with loads of RAM, and it just works. It will scan your address range and help you set it up.
Have a look at Download & Install GroundWork Monitor Community Edition :: GroundWork Open Source
We use it to monitor things that users can't or are unlikely to turn off. This is the key to eliminating junk alerts.
So: switches, UPSes, servers, routers, external connectivity, thin clients (they're always on as they're used constantly, so if Nagios can't connect to them it needs fixing), printers, air-con, room temperature etc. SNMP is your friend.
Essentially if X can provide data in plain text, Nagios can monitor it. So you could (for example) run a WMI query against your workstations, dump it to a text file and pull that textfile into Nagios. (You can also make WMI queries from Nagios using NSClient++, but since workstations are often turned off, it's not a great plan).
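To make the "plain text in, Nagios status out" idea a bit more concrete, here is a minimal sketch of a plugin that reads a value from a dumped text file. It relies on the standard plugin convention of exit codes 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN plus one line of status text. Everything else is an assumption for illustration: that the file holds a single "percent free disk" number, the thresholds, and the function names.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(text: str, warn: float = 20.0, crit: float = 10.0):
    """Map a dumped 'free disk %' value to an (exit_code, message) pair."""
    try:
        free = float(text.strip())
    except ValueError:
        return UNKNOWN, f"UNKNOWN - could not parse {text.strip()!r}"
    if free < crit:
        return CRITICAL, f"CRITICAL - {free:.1f}% free"
    if free < warn:
        return WARNING, f"WARNING - {free:.1f}% free"
    return OK, f"OK - {free:.1f}% free"


def main(path: str) -> int:
    """Read the dump file, print one status line, return the exit code."""
    try:
        with open(path, encoding="utf-8") as fh:
            code, message = evaluate(fh.read())
    except OSError as exc:
        # A missing or unreadable dump is reported as UNKNOWN, not OK.
        code, message = UNKNOWN, f"UNKNOWN - cannot read {path}: {exc}"
    print(message)
    return code
```

To use it as a real plugin, the script would finish with sys.exit(main(sys.argv[1])) so that Nagios sees the exit code. It would probably also be worth checking the file's age, since a stale dump from a switched-off workstation would otherwise keep reporting OK.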
We run it on Ubuntu 6.06 LTS. If you do monitor printers, make sure they're not set to page you when they get a paper jam. Our system pages my mobile if a server or core switch goes down, but not on external connectivity, printers or the router.
BTW, the title could have been so much better: "Nagios, $word*, do you speak it?", but the mods wouldn't have been that impressed ;)
I've set up Nagios in a secondary school. It was a basic server: 256 MB, first-gen P4, FSB400, about 1.6 GHz, nothing great, running OpenBSD, and it worked pretty well. Easy enough to set up, and it's part of the package tree too.
The only problem was that I monitored printers, which doesn't help when the cleaner turns off the printer at the end of the day. I came in the next day to a large inbox. All I did was create a rule in my Outlook. I suppose I could have set it up to integrate into a support desk system, for example: SNMP monitor, "server" appears offline, auto logs a call... that would be smart.
It's rather advanced. Just getting a basic ping test going is simple enough, but you do have to do a lot of Nagios setup to get there. It can monitor loads of things, like load etc. I've heard it can link into Snort (IDS), so you could then pick up on "bad" traffic as well.
In your default template for a generic printer, you can set the notification period which will apply to all printers. So if cleaners are likely to turn printers off after 15:30, set it to not notify you of printer problems from 15:30 to 08:00.
Originally Posted by matt40k
In templates.cfg you can define a new service template which has notifications disabled, then have the printer/workstation services use that template (rather than generic-service). Hey presto, no more notifications on user devices.
Yuppies, that is correct. I would set it up to be offline at x time each day, or schedule downtime etc. However, we have (well, did; I've moved from the school to an LEA) staff remoting in, then printing stuff at home.
It's a good idea if you've got to print off coursework but want to check it first. The only thing is you need to have the printer full of paper and toner before you go home, else it'll just sit in the print queue. Unless you know a nice techy who either (a) stays late or (b) turns up early to restock the printer before you get in ;)
Anyway, that's off topic. We were in Cambs, so the switches were looked after by the LEA and I didn't bother setting them up. Shame really that I left; I found out a few days before that he (a guy at Cambs LEA) had set up SNMP, so I could have included them!! It would have been helpful when we were having issues with the 2324 dropping out, which turned out (again, I'm going off topic) to be a memory leak. Even more off topic: why did HP make the 2324 ONLY able to have its firmware updated over a serial cable at a huge 9600 baud!! Lol...
Bah, was fun times.