If there's gold in log files, Splunk, Inc's Splunk will help you to find it. Splunk bridges the gap between simple log management and security information and event management (SIEM) products from vendors such as ArcSight, RSA, Q1 Labs, and Symantec.
Splunk lets you gather log data from systems and devices, and run queries on that data to find issues and debug problems. Splunk's capabilities also include reporting and alerting, pushing it ever-so-slightly into the world of SIEM.
What separates Splunk from the world of Syslog servers and SIEM tools is Splunk Apps, a library of nearly 200 add-ons that make Splunk smarter about particular types of log information, change its look-and-feel, or add new types of analysis.
We found Splunk to be a powerful, if complicated, tool. Moving from a simple syslog server or event viewer with a "search" command to Splunk is not a one-day operation. But Splunk pays off in the power that it brings, and the information it can pull out of your logs. Network managers who look at their logs every day, and those who wish they could get useful performance, capacity, and security information out of the gigabytes of log data stacking up should take a close look at Splunk.
Getting started with Splunk
There's a free version of Splunk for small and midsized deployments, so if your log files don't add up to 500MB each day, Splunk can be yours for the cost of the server you run it on. Some features, such as alerting, role-based access control, and distributed searching, are not available in the free version; you also can't run premium applications on top of the free version.
But Splunk is designed to scale up, way, way up. With distributed search databases, role-based access control, and the ability to eat terabytes of log data each day, Splunk is aimed at the large enterprise.
Splunk wants to be fed everything, including system, web, security, and every other type of log or performance data you can find. We didn't want to go quite that large, so we tested using Splunk on our own small data center, using live data.
Getting data into Splunk follows the same paths as any log management solution. We set up Splunk on a Linux system (Windows and other Unix flavors are also supported), a simple matter of an RPM installation, and had it listen for data sent to it with Syslog, probably the most common way to get your log data off systems and into an analysis tool.
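As a rough illustration of that Syslog path, the Python sketch below builds a minimal BSD-style syslog line and sends it over UDP. It is not taken from our test setup: the function names and the "demo" tag are invented for the example, and it assumes a collector is already listening for Syslog traffic on UDP port 514.

```python
import socket

# Facility and severity codes from the BSD syslog convention.
FACILITY_LOCAL0 = 16
SEVERITY_INFO = 6


def build_syslog_line(facility: int, severity: int, tag: str, message: str) -> str:
    """Return a minimal syslog line of the form <PRI>tag: message."""
    priority = facility * 8 + severity  # PRI encodes facility and severity together
    return f"<{priority}>{tag}: {message}"


def send_syslog(line: str, host: str, port: int = 514) -> None:
    """Send one syslog line over UDP (fire and forget, no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

Calling send_syslog(build_syslog_line(FACILITY_LOCAL0, SEVERITY_INFO, "demo", "link up"), host) with the address of your own collector would deliver a single test event. UDP keeps the sender simple, at the cost of silently losing messages if the collector is down.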
For Windows systems, Splunk provides its Universal Forwarder, an application that will pull Windows WMI data and forward it off to a Splunk server. The Universal Forwarder can also monitor file systems for changes and forward data from remote systems back to a central Splunk installation. We only used it to pull Windows event log information.
Splunk isn't too particular about where and how it gets data, with options for scripting and other network input sources.
Our initial contact with Splunk's input system, however, gave us a pretty good feel for Splunk's operational style. Splunk is not a do-it-yourself piece of open source software, but it also doesn't have the smooth polish we have seen from other commercial products. Splunk has an internal complexity that the Splunk team is happy to share with everyone through an extensive on-line documentation system.
If you want to make Splunk work, you've got to be ready to abandon the slick GUI and dive deep into difficult technical configuration, editing configuration files, writing regular expressions, and taking the time to understand where your data are coming from and how Splunk will see them.
We got Splunk working very smoothly in our multi-vendor environment, but only after investing serious effort in understanding how Splunk collects and indexes data. Our installation is slightly unusual, because we already have a central Syslog server and simply sent a copy of the data over to Splunk for indexing. But in these days of compliance and audits, having centralized Syslog for archiving purposes doesn't seem that unusual.
Overall, getting data into Splunk is much more of your typical open source experience, with a confusing maze of pointers, wikis, product tech notes, and documentation, but backed up by Splunk's technical support staff. Plan on spending more than a few moments getting started.
Getting information out of Splunk
If getting information into Splunk takes a while, getting information out of Splunk is a breeze, and can be fun to boot. Splunk has intentionally copied the Google minimalist search bar: to find information, you just type into a large box, select a time range, and click the green "go" button.
Immediately, log entries that have the text you typed begin showing up, while the query continues to run in the background if you selected a particularly wide time range.
But this isn't just your normal text search. Start typing in the search bar and pause for a moment. Splunk creates a drop-down with the most frequently found terms in your logs that contain what you've typed so far, along with the frequency counts.
The other thing that Splunk has copied from Google is speed. This log search is fast. We tested Splunk for two months on a very modest hardware platform: a single core, 2.3 GHz speed virtual machine with 1GB of memory, dropping in about 30 million log entries. Every standard search, even using regular expressions, returned the first screen of data within one or two seconds. If you're looking for something, Splunk is not going to get in the way of finding it.
Not every search is lightning fast; some are merely speedy. For example, we ran a search asking for the most common access point names coming out of our Aruba wireless controller. That search took 19 seconds to summarize the 31,942 records from the Aruba controller, giving us the most common values, sorted by frequency.
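The idea behind that kind of "most common values" summary can be sketched in a few lines of Python. This only illustrates the counting, not how Splunk implements it. The log lines and the ap_name= field are invented for the example and do not reflect what an Aruba controller actually emits.

```python
import re
from collections import Counter

# Hypothetical log lines; real Aruba controller messages look different.
SAMPLE_LOGS = [
    "Oct 1 10:00:01 controller assoc ap_name=lobby-ap1 client=aa:bb:cc:00:00:01",
    "Oct 1 10:00:05 controller assoc ap_name=lab-ap2 client=aa:bb:cc:00:00:02",
    "Oct 1 10:00:09 controller assoc ap_name=lobby-ap1 client=aa:bb:cc:00:00:03",
    "Oct 1 10:00:12 controller assoc ap_name=roof-ap3 client=aa:bb:cc:00:00:04",
    "Oct 1 10:00:20 controller assoc ap_name=lab-ap2 client=aa:bb:cc:00:00:05",
    "Oct 1 10:00:31 controller assoc ap_name=lobby-ap1 client=aa:bb:cc:00:00:06",
]

# Assumed field format: ap_name=<value with no spaces>.
AP_NAME = re.compile(r"ap_name=(\S+)")


def top_values(lines, pattern, limit=10):
    """Count the first capture group of pattern per line, most common first."""
    counts = Counter()
    for line in lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(limit)
```

Here top_values(SAMPLE_LOGS, AP_NAME, 2) returns the two busiest access point names with their counts. Lines that do not contain the field are simply skipped.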
Since we didn't even try to optimize the performance of the Splunk server, this snappy response on low-end hardware was a great surprise. Splunk specifically discourages using virtual machines for performance reasons, but dedicating a physical server to log management doesn't seem necessary unless your volume is dramatically higher, into the gigabytes per day range. Splunk does provide guidance in its installation guide on how to select hardware.
Simply searching for text in log entries doesn't even scratch the surface of what Splunk searches can do. The Search manual is 289 pages long, and starts with Splunk's idea of the top search commands you have to learn. That top list has 23 commands, ranging from the easy-to-understand "search," "sort," and "top" to the incredibly interesting "rare", "dedup," and "transaction" (which groups log entries into a single transaction), to the confusing and difficult-to-use "xmlkv" and "bucket." Of course, that's just the start: there are almost 125 search commands.
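To give a feel for what three of those commands do, here is a conceptual Python sketch that treats events as dictionaries. It is a simplification under our own assumptions, not Splunk's implementation: the function names merely echo the command names, the host, session, and msg fields are hypothetical, and the real commands accept many options that are ignored here.

```python
from collections import Counter, defaultdict

# Hypothetical events, listed in the order a search might return them.
EVENTS = [
    {"host": "web1", "session": "a1", "msg": "login"},
    {"host": "web1", "session": "a1", "msg": "view cart"},
    {"host": "web2", "session": "b2", "msg": "login"},
    {"host": "web1", "session": "a1", "msg": "logout"},
    {"host": "db1", "session": "b2", "msg": "query"},
]


def dedup(events, field):
    """Keep only the first event seen for each distinct value of field."""
    seen = set()
    kept = []
    for event in events:
        value = event.get(field)
        if value not in seen:
            seen.add(value)
            kept.append(event)
    return kept


def rare(events, field, limit=10):
    """Return (value, count) pairs for the least common values, rarest first."""
    counts = Counter(event[field] for event in events if field in event)
    return sorted(counts.items(), key=lambda item: item[1])[:limit]


def transaction(events, field):
    """Group events that share a value of field, preserving event order."""
    groups = defaultdict(list)
    for event in events:
        if field in event:
            groups[event[field]].append(event)
    return dict(groups)
```

With the sample data, dedup(EVENTS, "host") keeps one event per host, rare(EVENTS, "host") surfaces the hosts that appear least often, and transaction(EVENTS, "session") collects each session's events into a single group.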
Fortunately, Splunk is like Excel: with either product, it's easy to learn the basics and get a lot of value very quickly, but you can keep digging deeper, and deeper, and deeper to get even more power. Network managers who only dig into their logs once a month will not become expert at Splunk, and this is not the right product for them.
However, if you use your logs frequently, or want to get even more information out of the logs you have, taking the time to get good at Splunk is probably worth the investment.
Reporting and alerting
Splunk can generate a variety of reports, including simple graphs (such as pie charts or bar charts) as well as normal textual, tabular reports. Compared with other common reporting tools, Splunk's reports may seem fairly basic.
The underlying tools for generating reports out of Splunk are great. Everything from time charts to rare events to statistics to frequencies and correlations is possible. Reporting is aimed at the GUI as the main tool for viewing and consuming report data. Each report shows up in a dashboard, giving the viewer great options for customizing, drilling down into data, and changing timeframes and data sources.
Getting those dashboard views into traditional reports, though, is hard in Splunk. You can simply generate PDFs from the screen reports, and even do it on a schedule. But complex reports, especially multi-part ones with the sorts of decorations that auditors have come to expect, are not strengths of Splunk.
Alerting, available only in the premium version of Splunk, is not a particularly strong feature. If you are looking for alerts as a major part of your log management system, Splunk probably won't fit the bill. Alerts can only be generated based on standard Splunk queries, so any relationship between alerts or dependencies will be either difficult or impossible to express in Splunk's alerting system. Alerts can be generated via email, RSS feeds, and scripts, which makes alerting infinitely configurable.
Reports, in the form of dashboards and visualizations of your log data, are a powerful part of Splunk and valuable analysis tools. Traditional reporting and alerting, such as you might find in a SIEM, aren't what Splunk is all about.
Making Splunk smarter with applications
Splunk doesn't just index log data; it makes a valiant attempt to parse the log data. We discovered that the more time you put into teaching Splunk what kind of log data you're feeding it, the more Splunk can parse and report on information within the logs. You can do this by yourself, building search strings and regular expressions, adding tags, and so on, which is what we did to make Splunk understand our Aruba wireless network. But there's an alternative, the most interesting part of Splunk and its biggest differentiator: the add-on library of applications called "Splunkbase." Many, but not all, of the applications in Splunkbase are free.
Splunk is pushing hard on applications, probably because they understand that applications are the key to the future of their product. In fact, the basic search tool in Splunk has been moved into a pre-installed application, presumably to make it clear that applications are not just add-ons, but the main way you'll interact with Splunk.
The applications in Splunkbase run the gamut, but two examples that we tested give a good idea of the power. Since we have both Sourcefire Snort IPS and Cisco IronPort anti-spam devices in our network, we directed their logs to the Splunk server and installed the free associated Splunk applications. In both cases, the applications helped to more precisely parse the log information coming out of the devices.
For example, without the Snort application, we couldn't run queries on logs to differentiate between source and destination IPs in Snort events. Same for the IronPort logs: without the add-on IronPort application, we couldn't differentiate From and To email addresses in our searches.
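The kind of field extraction involved can be sketched with a Python regular expression. This is an illustration only, not the code in the Snort application: the sample line is modeled on Snort's one-line "fast" alert style, and the field names (proto, src_ip, src_port, dest_ip, dest_port) are our own choice for the example.

```python
import re

# Sample alert modeled on Snort's one-line "fast" alert style (addresses are
# documentation-range placeholders, not real traffic).
SAMPLE_ALERT = (
    "08/28-12:34:56.789012  [**] [1:2003:8] MS-SQL Worm propagation attempt [**] "
    "[Classification: Misc Attack] [Priority: 2] {UDP} "
    "192.0.2.10:1434 -> 198.51.100.7:1434"
)

IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# Matches "{PROTO} src[:port] -> dest[:port]" and names each piece.
ENDPOINTS = re.compile(
    r"\{(?P<proto>\w+)\}\s+"
    rf"(?P<src_ip>{IPV4})(?::(?P<src_port>\d+))?"
    r"\s+->\s+"
    rf"(?P<dest_ip>{IPV4})(?::(?P<dest_port>\d+))?"
)


def parse_endpoints(line):
    """Return a dict of protocol, source, and destination fields, or None."""
    match = ENDPOINTS.search(line)
    return match.groupdict() if match else None
```

Once source and destination are separate named fields, a query can filter or count on either one independently, which is the distinction a plain text search cannot make. Ports come back as strings, or None for protocols such as ICMP that carry no port.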
Smarter searching is one feature of Splunkbase applications, but reporting and graphing is another piece that most of the applications we looked at include. For example, the IronPort application came with reports to give us top senders and top receivers. The Snort application gave us a dashboard with automatically built "top 10" reports, as well as a map-based view showing where the attackers blocked by our IPS were located.
The applications in Splunkbase fill some, but not all, of the gap between the command-line techie-friendly searches you get out-of-the-box and what network managers have come to expect from modern applications. Of course, it all depends on what enterprise applications you have running. For example, Splunk has developed a free Splunkbase application with extensive hooks into Microsoft Exchange, including dashboards, message tracking, performance indicators, and capacity planning. If you're running Exchange, that's great. If you're on Domino, re-creating that value won't be easy.
The lack of a Splunk application that matches your enterprise mix shouldn't be a show-stopper. For example, the Snort application we downloaded had 105 files in it, but 90 of those were related to the mapping part of the application. The IronPort application was just as simple, about 15 files (no maps this time) to do what was needed. It only took us a few hours to dive into these applications, read through the source code, understand what was going on, and modify them for other tools on our network (a Trend Micro anti-spam gateway and a Juniper IPS sensor). Recreating what was done with Splunk's Exchange application, however, would be a different and more difficult can of worms.
The Splunkbase applications are the magic dust that makes Splunk truly stand out from other log storage and analysis tools. Without them, it's a hard-to-use and hard-to-learn search engine. With them, and the extensions that they enable, Splunk can shoulder away both open source and commercial competition and become the standard for log analysis.
Snyder, a Network World Test Alliance partner, is a senior partner at Opus One in Tucson, Ariz. He can be reached at Joel.Snyder@opus1.com.