Monitoring Windows 2000/XP/2003/Vista
Article created 2001-04-23 by Rainer Gerhards
Article last updated on 2007-04-12 by Florian Riedl
Monitoring Windows 2000/XP/2003/Vista is important even for small environments. With
automatic monitoring, critical failures can often be avoided. But how do you monitor a
system without too much effort? The basic idea behind a successful monitoring and
alerting system is to centralize all system events at a single monitoring
station. Once the information is centralized, it can be used to build an
alerting system or even to carry out corrective actions.
What is a Monitoring System made up of?
Successful monitoring systems usually have many components. These are
typically loosely coupled so that new requirements can be added easily. Keep in
mind how often systems and environments change - flexibility in a monitoring
system is nowadays a "must have". Typically, a system consists of:
- Data collector processes
- Storage engine
- Analysis console
- Background processes
In this scenario, the data collectors run on the monitored systems.
These should be lightweight processes because they shouldn't put too much burden
on the host system. This is especially important if high-performance systems like
web servers are to be monitored. The data collector picks up
"interesting" events and forwards them to the central storage engine.
The storage engine then stores the received event notifications to
persistent storage. That way, it is safe from any manipulations or technical
problems at the monitored systems. The storage engine typically runs on a limited
number of machines. Often, there is only a single storage engine in a whole
network. That's really not a bad idea, as the whole concept of monitoring is to
have all information in one central place. Multiple storage engines, on the other
hand, are typically used in complex scenarios, mostly with WAN links in between.
There, a local storage engine serves as a hub for one location and forwards the
information to the central system.
Finally, the analysis console is used by the system administrators. It
is the interface that lets them view consolidated reports and
drill down into more specific topics. Ideally, the console supports multiple
concurrent users and provides some hints for fixing detected problems.
Integrated links to vendor knowledge bases or public search & discussion
services are a valuable help here.
Of course, data collectors and the storage engine are background processes.
But there can also be background processes that consolidate and monitor the
storage engine's data on a schedule - e.g. daily. Administrators then receive
either an activity overview report or an exception report (for important and
urgent matters).
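The two report types can be sketched in a few lines of Python. This is only an illustration of the idea, not how any of the products discussed here work internally: the Event record, the function names and the severity threshold are all assumptions made for the example (severity uses the syslog convention, where a lower number is more urgent).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stored event; the field names are assumptions."""
    host: str
    severity: int  # syslog convention: 0 = emergency ... 7 = debug
    message: str


def activity_overview(events):
    """Routine report: number of stored events per host."""
    return Counter(e.host for e in events)


def exception_report(events, max_severity=3):
    """Exception report: only events at or above the urgency threshold."""
    return [e for e in events if e.severity <= max_severity]
```

A scheduled background job would call both functions on the events of the last day and mail the exception report only if it is not empty.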
What about Windows?
Windows 2000/XP/2003/Vista do not come with a built-in monitoring solution. So you
need some tools to get it going.
Windows logs the most important state data to the event log. Third party
vendors are also encouraged to log their events to the event log. For example,
most Anti-Virus products will log caught viruses here. The event log is
definitely the place to look if you'd like to monitor a Windows system's
health. As a built-in tool, only the Windows event viewer is available (part of
the computer management MMC under Windows 2000 and XP). That tool allows
interactive display of current events but was never meant to be part of an
automated monitoring solution.
What we need is a data collector that can run in the background. For this, we use
EventReporter. This product monitors the event log in near real time and forwards
all newly logged messages to the storage engine via the SETP or syslog protocol.
See the difference between the SETP and syslog protocols.
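To give an idea of what a forwarded message can look like on the wire, here is a minimal Python sketch of a classic BSD-style (RFC 3164) syslog message sent over UDP. It is an assumption-laden illustration, not EventReporter's actual output format, and SETP is not shown at all. The function names, the host name and the tag in the usage note are made up for the example.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_syslog(facility, severity, when: datetime, hostname, tag, text):
    """Build a BSD-style syslog line: <PRI>TIMESTAMP HOSTNAME TAG: MSG."""
    pri = facility * 8 + severity  # priority value as defined by the protocol
    # The day of month is padded with a space, not a zero.
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {hostname} {tag}: {text}"


def send_udp(message, server, port=514):
    """Send one message to a syslog server (UDP, default port 514)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (server, port))
```

For example, format_syslog(1, 4, datetime(2007, 4, 2, 9, 5, 7), "winbox", "collector", "disk nearly full") yields a line starting with "<12>", because facility 1 and severity 4 combine to 1 * 8 + 4 = 12.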
Why did I say "near real time"?
Well, EventReporter by design does not operate on Windows event notifications,
which have proven not to be fully reliable under extreme scenarios.
Instead, it polls the event logs on a pre-set schedule. Resource usage is very
moderate, so the schedule can be set to run every 30 seconds - even more often
in very security sensitive environments. EventReporter not only forwards the
logs but also checks whether someone truncates them (via Windows Event Viewer or
an API call). If that happens, a notification is sent to the storage engine. This
functionality is important, as such log truncations can be a good indication of
an intruder. EventReporter is installed on each system that is to be monitored.
It runs on all flavors of NT (even Alpha), so really all systems can be
monitored.
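The polling approach with truncation detection can be sketched as follows. This is a simplified Python model under stated assumptions, not EventReporter's implementation: the log is represented by a hypothetical read_log callable that returns (record_number, message) pairs with ascending record numbers, and a truncation is assumed whenever the highest record number falls below the last one seen.

```python
import time


class LogPoller:
    """Polls a log source and reports new records and truncations."""

    def __init__(self, read_log):
        self.read_log = read_log  # hypothetical: returns [(record_number, message), ...]
        self.last_seen = 0

    def poll(self):
        records = self.read_log()
        notifications = []
        highest = max((number for number, _ in records), default=0)
        if highest < self.last_seen:
            # Records we already saw are gone: the log was cleared.
            notifications.append("log truncated")
            self.last_seen = 0
        new = [(n, m) for n, m in records if n > self.last_seen]
        if new:
            self.last_seen = max(n for n, _ in new)
        return new, notifications


def run(poller, forward, interval_seconds=30):
    """Poll forever on a fixed schedule and pass everything to 'forward'."""
    while True:
        new, notifications = poller.poll()
        for note in notifications:
            forward(note)
        for record in new:
            forward(record)
        time.sleep(interval_seconds)
```

Note the limitation of this simple check: if a log is cleared and then refills beyond the last seen record number between two polls, the truncation goes unnoticed. A shorter polling interval narrows that window.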
Storing the Events...
Now we need something to store the events collected and reported by EventReporter.
We use WinSyslog for this. This enhanced syslog daemon works much like its Unix
counterpart. But besides writing to flat files, it can also log to a database and
carry out flexible actions.
In our monitoring system, we use it for two functions. First of all, it
stores all events. In our case, events are written to both a flat file and
the database. We use this approach because bulk analysis is done fastest with
the help of flat files. However, viewing event details is done best by using a
database. So we've taken the route to simply write to both stores and have the
best of both worlds. A large hard disk is of course helpful here...
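The "write to both stores" idea is easy to illustrate. The following Python sketch uses SQLite and a tab-separated text file as stand-ins; the table layout, the column names and the function names are assumptions for the example and do not reflect the WinSyslog database schema or file format.

```python
import sqlite3


def open_store(db_path):
    """Open (and if necessary create) the hypothetical event database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(received TEXT, host TEXT, severity INTEGER, message TEXT)"
    )
    return conn


def store_event(conn, log_file, received, host, severity, message):
    """Write one event to the flat file and to the database."""
    # Flat file: one tab-separated line per event, good for bulk scans.
    log_file.write(f"{received}\t{host}\t{severity}\t{message}\n")
    # Database: one row per event, good for looking up details.
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        (received, host, severity, message),
    )
    conn.commit()
```

The log_file argument can be any writable text file object, for example one opened with open("events.log", "a").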
Besides storing events, WinSyslog also acts as an alerting engine. It can be
configured to detect important message fragments or high priority messages and
to forward these to an email account. If your paging provider supports an
email to pager interface, this is also the way to call a pager in case of an
emergency.
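In WinSyslog, such rules are set up in the product's configuration. Purely to illustrate the logic, here is a Python sketch of a rule that triggers on message fragments or on high priority and builds an alert email. The fragments, the threshold and the addresses are made-up values for the example.

```python
from email.message import EmailMessage

ALERT_FRAGMENTS = ("virus", "disk failure")  # hypothetical fragments to watch for
ALERT_MAX_SEVERITY = 2  # syslog convention: lower number = more urgent


def needs_alert(severity, message):
    """True for high priority messages or messages containing a watched fragment."""
    text = message.lower()
    return severity <= ALERT_MAX_SEVERITY or any(f in text for f in ALERT_FRAGMENTS)


def build_alert(host, severity, message, recipient):
    """Build the alert mail; sending is left to the caller."""
    mail = EmailMessage()
    mail["From"] = "monitor@example.com"  # placeholder sender address
    mail["To"] = recipient
    mail["Subject"] = f"[ALERT] {host}: severity {severity}"
    mail.set_content(message)
    return mail
```

The resulting message could then be handed to smtplib.SMTP(...).send_message(mail). If the recipient is an email to pager gateway address, the same code pages the administrator.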
Typically, only a single instance of WinSyslog is needed. However, it has
support for syslog cascading. Cascading is used if a reporting hierarchy is
built. This is most often done in corporate networks involving WAN links where
only higher importance messages should be sent to a central data store while
less important messages are stored locally at the individual sites. That way,
complete data is available for drill-down, but it is not necessarily
transmitted over the WAN. WinSyslog fully supports cascading. It is also able to
forward only selected messages based on rules.
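A site-level relay in such a hierarchy can be modeled roughly as follows. This Python sketch only illustrates the principle (store everything locally, forward only what matters); the class name, the callback and the severity threshold are assumptions, not WinSyslog settings.

```python
class SiteRelay:
    """Keeps all events at the site and forwards only important ones."""

    def __init__(self, forward_to_central, forward_max_severity=3):
        self.local_store = []  # stands in for the site's own data store
        self.forward_to_central = forward_to_central  # hypothetical callback
        self.forward_max_severity = forward_max_severity

    def receive(self, severity, message):
        # Complete data stays available locally for drill-down.
        self.local_store.append((severity, message))
        # Only higher importance messages cross the WAN link.
        if severity <= self.forward_max_severity:
            self.forward_to_central(severity, message)
```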
Analyzing the Events
Now we come to the analysis part. In most cases, administrators do not like to be
bothered with routine information. They just want to be notified when things go
terribly wrong (hopefully a bit before it really hurts), or to get a regular
report confirming that all is well.
In our system, we have MonitorWare Console running to provide daily reports.
The reports generated by MonitorWare Console tell administrators about system
health and security. MonitorWare Console offers flexible and extensible reporting
features. In addition to the several useful report templates provided with the
application, users can order new reports built to their own requirements, and
these new reports are incorporated into MonitorWare Console seamlessly.
You have the option of generating the reports on the fly from a "Database" or
from "Log Files". Even if MonitorWare Console is connected to a different
database, you can still supply the DSN of any MonitorWare database, and the
report will be generated from the database that the DSN points to. The same is
true for log files.
The integrated Solution
As you can see, the system is made up of three main components. Each of these has
specific duties to perform. The modular approach provides the flexibility needed
in today's environments. For example, if Cisco information is to be integrated
into the system, you simply need to point the Cisco boxes to the WinSyslog
server. The storage engine then saves the new events. Even though MonitorWare
Console does not (yet) pick up and analyze the Cisco events, they can be viewed
with the WinSyslog web interface, which might be very helpful during analysis.
Also, an administrator has the option to add his or her own custom scripts to
be executed on the stored event data. The open system architecture provides
unlimited flexibility to do so.
It is also easy to integrate Unix and Linux machines into the scenario. They
support syslog natively and as such can both send and consume syslog messages.
In fact, the EventReporter product alone is often used as a tool to integrate
Windows events into Unix-based management systems.
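On the receiving side, any syslog consumer first has to decode the priority value at the start of each message. The following Python sketch shows that step for classic BSD-style (RFC 3164) messages. It is a minimal illustration with a made-up function name, not a complete syslog parser.

```python
import re

# A syslog message starts with the priority value in angle brackets.
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_pri(line):
    """Split a syslog line into (facility, severity, rest), or None if malformed."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    # The priority value encodes facility * 8 + severity.
    return pri // 8, pri % 8, match.group(2)
```

For the priority value 34, this yields facility 4 and severity 2, because 34 = 4 * 8 + 2.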
Conclusion
An effective monitoring solution can save the administrator a lot (and I mean
a lot) of work. It can also help prevent major system breakdowns, as critical
situations can be detected early and - hopefully - solved before any damage
occurs. This is especially true if you think about security monitoring.
As I have outlined, a monitoring system need not be very complex or hard
to set up. Just use some ready to run tools, integrate them and enjoy the
benefits of the system.
Tools used
The following tools were used to build the monitoring system:
- EventReporter
- WinSyslog
- MonitorWare Console
If you'd like to build your own system, you can download free evaluation
copies from the respective web sites. Detailed installation instructions
are available in the additional article "How To setup Windows NT centralized
Monitoring".
I hope this article is helpful. If you have any questions or remarks, please do not
hesitate to contact me at rgerhards@adiscon.com.