Performance Optimizing Syslog Server
We are quite often asked how many syslog messages per second MonitorWare Agent can receive. The answer is not as simple as it may look: it largely depends on your setup. So I finally decided to write this brief article on the factors that influence syslog server performance. Obviously, you can also use it as a rough guide to optimizing your setup.
Let me try to outline a few factors influencing the performance.
To get started, you should know that MonitorWare Agent is optimized for large traffic bursts. A little understanding of how this is done helps with hardware sizing. MonitorWare Agent is multithreaded. There are threads that receive messages and there are threads that process them. These two thread types are loosely coupled via an in-memory queue. The receiver threads (by default) have a higher priority than the processing threads. So when a lot of traffic comes in, the receiver threads receive the messages very quickly and store them in memory. The messages are only processed by the processing threads when no receiver thread is using the system. So on a busy system, an in-memory queue of received but unprocessed messages builds up. Obviously, if the machine continuously receives messages at a very high rate, the in-memory buffer fills up and received data begins to get lost. However, this design guarantees that as many messages are received as the machine is capable of receiving. At least for traffic bursts, this decouples the receive capability from the speed of the rule set actions.
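The decoupling described above can be illustrated with a small simulation. This is only a sketch of the general idea (a bounded in-memory queue between a receiver and a processor), not MonitorWare Agent's actual implementation. The names `BurstBuffer` and `simulate_burst` and the capacity figures are made up for illustration, and the "threads" are simulated sequentially so that the result is deterministic.

```python
import queue


class BurstBuffer:
    """Bounded in-memory queue between a receiver and a processor (hypothetical)."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def receive(self, message):
        """Receiver side: never blocks; the message is lost if the buffer is full."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def process_one(self, action):
        """Processor side: run the rule set action on one queued message, if any."""
        try:
            message = self._queue.get_nowait()
        except queue.Empty:
            return False
        action(message)
        return True

    def pending(self):
        return self._queue.qsize()


def simulate_burst(burst_size, capacity):
    """Return (messages buffered, messages dropped, messages processed)."""
    buffer = BurstBuffer(capacity)
    # Burst phase: the receiver has priority, so nothing is processed yet.
    for i in range(burst_size):
        buffer.receive(f"<14>test message {i}")
    buffered = buffer.pending()
    # Quiet phase: the processor drains the backlog at its own pace.
    written = []
    while buffer.process_one(written.append):
        pass
    return buffered, buffer.dropped, len(written)


if __name__ == "__main__":
    print(simulate_burst(burst_size=1000, capacity=800))
```

A burst that fits into the buffer is received completely, however slow the action is. Only when the burst exceeds the buffer capacity are messages lost.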
The rule set is another very important factor in deciding how many messages a given system can process. Naturally, writing received messages to a file is much less performance-intensive than writing to a database. A small rule set with only a single rule, no filters and a single action is also faster to process than a complex rule set with many rules, complex filters and actions. Of course, MonitorWare Agent is optimized to process complex rule sets quickly… but even this takes time. So if you would like to squeeze the most out of a given machine (or need to process vast amounts of incoming messages), it is worth tweaking the rule set. If you would like to build a high-traffic central syslog server for creating a central archive… just do that. Create a rule set with a single rule, no filters and just a “write to file” (NOT “write to database”!) action. This will give you the optimal performance.
For the highest performance, you will obviously dedicate a machine just to MonitorWare Agent. Most importantly, if you must log to a database, make sure the database server is on different hardware! Database server software needs considerable CPU resources and you will definitely not want to take them away from your syslog server. If you use a remote database, however, make sure that the network connection to it is good. So you should invest in a good Gigabit Ethernet card and create an exclusive connection between your syslog server and your database server (you may use an Ethernet switch that is not connected to anything else).
OK, now we have arrived at hardware. Of course, the faster the box, the better. Add plenty of memory to take care of traffic bursts. The more physical memory the machine has, the better it can absorb traffic bursts. If you do not expect traffic bursts, memory is not that important. However, it is advisable to add some extra memory just in case of unusual amounts of messages, e.g. caused by a malware outbreak or other exceptional situations.
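As a rough back-of-the-envelope aid for memory sizing, the backlog that builds up during a burst can be estimated from the difference between the receive rate and the processing rate. The function name and all figures below are assumptions chosen for illustration, not measured MonitorWare Agent values. The estimate also ignores any per-message bookkeeping overhead, so treat the result as a lower bound.

```python
def burst_backlog_bytes(receive_rate, process_rate, burst_seconds, avg_message_bytes):
    """Estimate the memory needed to buffer a burst (hypothetical helper).

    receive_rate and process_rate are in messages per second. While the
    burst lasts, the backlog grows by the difference between the two.
    """
    backlog_per_second = max(0, receive_rate - process_rate)
    backlog_messages = backlog_per_second * burst_seconds
    return backlog_messages * avg_message_bytes


if __name__ == "__main__":
    # Assumed example: 20,000 msg/s arriving, 5,000 msg/s processed,
    # a 60 second burst, 512 bytes per message on average.
    needed = burst_backlog_bytes(20_000, 5_000, 60, 512)
    print(f"{needed / 2**20:.0f} MiB")
```

If processing is starved completely during the burst, pass a `process_rate` of 0 to get the worst case.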
A fast CPU is obviously important. Multiple fast CPUs are better. Because of our threaded design, multiple CPUs are used very efficiently. However, do not add CPUs without reason. If you have a central server which just runs one syslog service (e.g. at the standard port 514) and one rule that writes the messages to a file, you actually have 2 “real” threads running (plus a little overhead). So it obviously is a good idea to have at least two CPUs. A third CPU may bring some extra performance, but I would expect only a moderate increase. A fourth or any further CPU will not do any good: at best they idle, at worst they add OS overhead. So for a typical configuration, a dual-CPU system is the best fit. If you run multiple listeners, additional CPUs help to improve performance greatly. They scale more or less one-to-one. Adding CPUs does not work equally well to improve complex rule set performance. This is due to some of the internal ordered queue handling and the need to keep things in order. So as another piece of general advice, it is good to add CPUs for additional listeners, but it seldom makes sense to add them for a complex rule set.
If you store data locally, you will obviously want as fast a hard disk as you can get. If you plan for highest performance, stay away from RAID 5 arrays. They have bad write performance. Use RAID 0+1 instead. Use it at the hardware level! Make sure that the disk is defragmented; this is often overlooked. On a fast, defragmented disk, MonitorWare Agent’s file monitor can actually “stream” messages right to the hardware. It does not need to seek on disk, so you can expect performance close to the disk’s physical maximum.
You intend to shuffle a lot of data through the system, so make sure your motherboard is well designed. We have seen bad motherboards slowing down an otherwise very capable system.
Most importantly, make sure the system can talk nicely to the network. Use a brand-name, high-performance bus-mastering NIC (network interface card). I am in favor of brand-name hardware because of the drivers. Most brand-name products come with drivers that actually allow you to leverage the hardware. Some (but definitely not all) non-brand cards may come with more or less the same hardware, but with drivers that are too slow. If you expect high traffic, make sure your card can handle it. If in doubt, add a second or third card. If you do, make sure it is connected to a different switch, otherwise the switch may become the bottleneck.
In general, make sure that the network is capable of carrying the traffic. This is especially important if log data is flowing in via the WAN.
Another factor greatly influencing MonitorWare’s performance is the syslog protocol used. UDP provides the lowest overhead, but for obvious reasons messages which cannot be received are lost. TCP-based logging comes with some overhead, but it will guarantee that all messages are received (at least until the sender is internally overrun). So TCP-based protocols lower the absolute reception rate, but are often a better choice because they offer a much better guarantee that no data is lost (there is no 100% guarantee, but this is beyond the scope of this discussion). Please note that there are multiple options for TCP delivery nowadays: “plain” TCP (not standardized) and RFC 3195 compliant “syslog-reliable”. The latter is more reliable but unfortunately very seldom found in actual devices today. Even “plain” TCP is implemented in only a few devices, so in practice your choice may be limited to UDP. Keep in mind, however, that MonitorWare Agent can run multiple listeners. So you could, for example, run three listeners, one for RFC 3195, one for “plain” TCP and one for UDP (as a reminder, a 4-CPU system would play nicely with that). Using multiple listeners brings you the best of all worlds. Please note that by default the TCP-based listeners have a lower thread priority, so this also gives you a little more headroom when it comes to UDP bursts.
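To make the idea of several listeners side by side more concrete, here is a minimal sketch that uses Python's standard `socketserver` module. It is not how MonitorWare Agent is implemented. The handler and helper names are made up, the listeners bind to ephemeral ports on localhost rather than port 514, and the one-message-per-line framing for “plain” TCP is an assumption, since that transport is not standardized. RFC 3195 is left out because it needs a full BEEP implementation.

```python
import queue
import socket
import socketserver
import threading

# Shared in-memory queue that both listeners feed (hypothetical design).
received = queue.Queue()


class UdpSyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data, _sock = self.request  # for UDP servers, request is (data, socket)
        received.put(("udp", data.decode("utf-8", errors="replace").strip()))


class TcpSyslogHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Assumption: one message per line, terminated by a newline.
        for line in self.rfile:
            received.put(("tcp", line.decode("utf-8", errors="replace").strip()))


def start_listeners(host="127.0.0.1", udp_port=0, tcp_port=0):
    """Start one UDP and one TCP listener, each in its own thread."""
    udp = socketserver.ThreadingUDPServer((host, udp_port), UdpSyslogHandler)
    tcp = socketserver.ThreadingTCPServer((host, tcp_port), TcpSyslogHandler)
    for server in (udp, tcp):
        threading.Thread(target=server.serve_forever, daemon=True).start()
    return udp, tcp


def send_udp(address, text):
    """Fire-and-forget: the sender never learns whether the message arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(text.encode("utf-8"), address)


def send_tcp(address, text):
    """Connection-oriented: delivery to the receiver's socket is acknowledged."""
    with socket.create_connection(address) as s:
        s.sendall(text.encode("utf-8") + b"\n")


if __name__ == "__main__":
    udp, tcp = start_listeners()
    send_udp(udp.server_address, "<14>hello via UDP")
    send_tcp(tcp.server_address, "<14>hello via TCP")
    for _ in range(2):
        print(received.get(timeout=5))
    for server in (udp, tcp):
        server.shutdown()
        server.server_close()
```

Each listener runs in its own thread, which is why additional listeners are the case where additional CPUs pay off.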
Finally, think about the message sender (the “device”, which may of course also be another server emitting syslog messages). In rare circumstances (such as a worm outbreak) it may happen that the sender itself exhausts its resources and is not capable of actually sending all messages over the network. With UDP, this may simply happen because the (not very large) IP stack send buffer overflows. Even with TCP it can happen; it just can’t happen at the protocol level. But the application emitting syslog messages may overflow its internal buffers so that the data never makes it to the TCP queue. While sender overrun is seldom experienced (and even more seldom detected as such!), it has happened in reality and may happen to you.
The especially bad thing about a sender emitting massive amounts of data is that it may also overrun other parts of the network, thus affecting otherwise unaffected systems. Recent Internet worms have provided good examples of this in the wild. So if you can tune the sender, try to place some safeguards in there. For example, you could limit the number of messages of a specific type that are sent within a specific period of time. Or, as another example, you can set MonitorWare’s Windows event log monitor to emit only 10 messages per second. If you do this, you may lose some messages from the Windows box in case some malware takes it over, but you keep the rest of your syslog system healthy.
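A sender-side safeguard of this kind can be as simple as a counter that is reset at fixed intervals. The sketch below is a generic fixed-window limiter, not the mechanism used by the event log monitor. The class name `RateLimiter` and its parameters are hypothetical, and the limit of 10 messages per second is taken from the example above.

```python
import time


class RateLimiter:
    """Allow at most max_messages per period; suppress the rest (hypothetical)."""

    def __init__(self, max_messages, period_seconds=1.0, clock=time.monotonic):
        self.max_messages = max_messages
        self.period = period_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._window_start = clock()
        self._sent = 0
        self.suppressed = 0

    def allow(self):
        """Return True if a message may be sent now, False if it must be dropped."""
        now = self._clock()
        if now - self._window_start >= self.period:
            # A new window begins: reset the counter.
            self._window_start = now
            self._sent = 0
        if self._sent < self.max_messages:
            self._sent += 1
            return True
        self.suppressed += 1
        return False


if __name__ == "__main__":
    limiter = RateLimiter(max_messages=10, period_seconds=1.0)
    sent = sum(limiter.allow() for _ in range(1000))
    print(f"sent {sent}, suppressed {limiter.suppressed}")
```

Counting the suppressed messages is a deliberate choice: it lets the sender report afterwards how much it discarded, so the loss is at least visible.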
I hope this clarifies many of the important factors behind syslog server performance. And now back to the basic question: how many messages can MonitorWare Agent handle? Honestly, I don’t know. You may now better understand why I do not. It depends on what you do with it. I know, however, that we have quite a few customers processing vast amounts of data, including burst traffic, with our products. At the other extreme, in our lab, I have configured MonitorWare Agent to write log data to a Microsoft Access database on a diskette drive… It could handle large bursts, but it took hours to write the messages to the database… 😉