0 - Swatch, Feb/2000 - Alek's Free Personal Home Pages

Abstract

Event Management/Notification is one of the buzzwords one hears today in managing computer systems and networks. The definition goes something like this: the computer tells you when some exception condition occurs, routes that to the right people, logs it for future reference, and (in an ideal world) even fixes the problem (but then we would not have jobs! ;-)

But like many buzzwords, it's not clear how this is done in the real world ... especially on large UNIX client-server networks. Several commercial solutions are available (e.g., CA-Unicenter) that claim to provide this functionality ... but they are often proprietary, expensive, monolithic, and difficult to use.

Swatch is a nifty, freely available piece of software that integrates well with the UNIX syslog system. It is used at a Large Aerospace Company to help manage our nationwide network of more than 1000 HP, SGI, and Sun workstations. It provides instant and summary notification of exception conditions to a large number of Sysadmins and Users ... and has been very helpful in a number of situations. It's also an extremely good feeling to know that "Mr. Swatch" is keeping an eye on things for us 24 hours/day, 365 days/year. In addition to making it much easier to figure out what the problems are, it now "auto-fixes" a couple of things for us - way cool! ;-)
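
To make the idea of "instant and summary notification" concrete, here is a minimal Python sketch of the general pattern-to-action approach: scan syslog-style lines, raise an alert right away for urgent matches, and count everything for a later summary. This is only an illustration, not Swatch's own configuration syntax or implementation; the rule names, patterns, and sample log lines are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical rules: (name, pattern, alert immediately?).
# These are illustrative assumptions, not Swatch's configuration format.
RULES = [
    ("disk_full", re.compile(r"file system full", re.IGNORECASE), True),
    ("su_failed", re.compile(r"su: .*failed", re.IGNORECASE), False),
]


def scan(lines):
    """Return (instant_alerts, summary_counts) for an iterable of log lines."""
    instant = []
    summary = Counter()
    for line in lines:
        for name, pattern, immediate in RULES:
            if pattern.search(line):
                summary[name] += 1  # every match feeds the summary report
                if immediate:
                    instant.append(f"ALERT [{name}]: {line.strip()}")
                break  # first matching rule wins
    return instant, summary


# Made-up sample lines in a syslog-like shape, for illustration only.
SAMPLE = [
    "Feb  1 10:00:01 hostA unix: NOTICE: alloc: /var: file system full",
    "Feb  1 10:00:05 hostB su: 'su root' failed for alice on /dev/pts/3",
    "Feb  1 10:00:09 hostC sendmail[123]: message accepted",
]

if __name__ == "__main__":
    alerts, counts = scan(SAMPLE)
    for alert in alerts:
        print(alert)       # instant notification
    print(dict(counts))    # summary notification
```

In a real setup the instant alerts would go to mail or a pager and the counts to a periodic report; the split between the two is the design point the sketch is meant to show.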

This talk will present a "case study" of how we use it, what additional software we have had to write for our large environment, and lessons learned. This will be a fairly technical discussion geared toward Sysadmins.