Why CIOs need to beware of human errors

Today, CIOs are investing considerable resources in systems that provide them with better visibility into, and real-time reporting on, exactly what is happening on their IT systems.

Performance monitoring tools such as Compuware Application Performance Management, CA Performance Monitoring, Nimsoft and many others provide excellent real-time visibility reporting, yet fail to provide a key understanding of many security threats and events.

Log analysis and SIEM (Security Information and Event Management) technologies such as HP ArcSight, LogRhythm, Splunk and many others are becoming a more common means of detecting unusual and suspicious behaviour within a network, but fail to deliver an understanding of actual user activity.

Today, human behaviour is the key factor which many organisations are striving to handle and manage – from common misconfiguration errors to trusted third-party vendors to rogue employees.

As such, CIOs are investing in high-availability systems and performance monitoring solutions, but struggle to follow best-practice procedures for addressing human errors made while working on their organisations' key data assets.

Oddly enough, the simple question, “Who did what, when, where and how on the server?” remains one of the toughest questions for CIOs to answer. This is despite the variety of system management tools in use today. It is simply not enough for administrators to just monitor servers and applications when the number one cause for server downtime is human error!

To achieve efficient operations and rapid remediation when problems arise, CIOs and administrators need access to a holistic view of the entire IT infrastructure, including granular monitoring of every human action on company servers.

The challenge of monitoring human activity

According to Ponemon Institute's 2013 Cost of Data Breach Study: Global Analysis, negligent employees and contractors are the root cause of 35 per cent of data breaches. Although “malicious attacks are more costly globally”, costing companies an average of US$157 per record breached (adding up to millions of dollars per data breach), human error doesn’t fall too far behind, costing an average of US$117 per record breached.

Human errors differ from system errors in an important way. For system errors, the tedious process of log analysis has been somewhat alleviated by the adoption of system monitoring platforms (e.g., SIEM) and software profiling utilities. But IT security administrators typically have no means of discovering the human errors which cause data breaches and system failures.

A further complication when human error causes a data breach or system failure is that smart users make the “smartest” mistakes. They know the nooks and crannies of arcane configuration files that might tweak an extra 5 per cent of performance out of the system. These hidden corners are consequently the most difficult to identify as the cause when something goes wrong.

Another difference between human errors and system errors is repeatability. For system errors, finding the problem is equivalent to fixing the problem. If, for example, your troubleshooting process leads to the conclusion that a network interface card (NIC) is not working, then swapping in a new card closes the issue, and you can sleep well that night.

Human errors are not this direct. If you find a suspiciously modified configuration file and swap it with the correct configuration file, the problem may be solved temporarily. But when you go to bed that night, you’re probably still scratching your head, wondering who or what caused this error, and whether it will happen again tomorrow.

Server surveillance and people auditing

Ultimately, CIOs need to understand the importance of implementing a solution for human activity auditing. Such a solution must provide visibility into all user actions performed on every server, whether the server was accessed through the local console or through any method of remote access (Terminal Services, Citrix, VMware, Remote Desktop Connection, LogMeIn, GoToMyPC, PC Anywhere, etc.).

Ideally, a people auditing solution must provide three levels of activity data:
1. A video recording of all on-screen user actions
2. A summary journal of what each user did (allowing fast review even by non-experts)
3. Searchable text-based activity logs (including the names of windows opened, applications run, URLs viewed, mouse clicks made, text entered or edited and even unseen commands executed by scripts).

Since watching thousands of hours of recorded video is not practical, the summary journal and searchable log capabilities are critical. Furthermore, each journal entry and search result must link directly to the moment of the video where that action occurred, so that administrators and troubleshooters can see exactly what the user did at any point of interest.
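To make the link between searchable logs and the video recording concrete, here is a minimal sketch in Python. It is not taken from any particular auditing product: the `ActivityEvent` fields, the `search` and `video_position` helpers and the sample data are all hypothetical, chosen only to show how a text search result can carry the offset needed to jump to the matching moment in a session recording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityEvent:
    """One text-log entry. All field names are hypothetical."""
    user: str
    server: str
    video_offset_s: int  # seconds from the start of the session recording
    window_title: str
    detail: str          # e.g. text entered or command executed


def search(events, term):
    """Return events whose window title or detail contains term (case-insensitive)."""
    needle = term.lower()
    return [e for e in events
            if needle in e.window_title.lower() or needle in e.detail.lower()]


def video_position(event):
    """Format the event's recording offset as HH:MM:SS for jumping to that moment."""
    hours, rest = divmod(event.video_offset_s, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Made-up sample data, for illustration only.
events = [
    ActivityEvent("alice", "web01", 75, "Notepad - app.config", "edited MaxConnections"),
    ActivityEvent("bob", "web01", 3725, "Command Prompt", "iisreset /restart"),
]

for hit in search(events, "app.config"):
    print(hit.user, hit.server, video_position(hit))
```

In this sketch, a reviewer searching for a configuration file name gets back the user, the server and a timestamp into the recording, which is the kind of direct jump from search result to video described above.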

Watching a video showing exactly what actions a user performed removes all doubt about what might have caused a certain system configuration modification or other change. This provides fast and unambiguous troubleshooting and root cause analysis.

With human error being responsible for 56 per cent of server outages, it is vital that CIOs, IT security staff and administrators have a solution allowing them to quickly and accurately review exactly what users did on servers and system devices. With this knowledge, they can rapidly discover the error, repair the damage, confront the culprit and implement procedures to prevent similar occurrences in the future.

People auditing in a nutshell

Most IT organisations today utilise system monitoring platforms that are efficient for system error troubleshooting, but are ineffective when diagnosing human-generated errors. These human errors represent over half of all downtime and data loss and are best handled by focusing on the root cause: What was done on this critical system, when and by whom?

Answering this root-cause question will bring drastic improvements in troubleshooting effectiveness and will also enhance security and compliance robustness. But most importantly, it will provide CIOs with the understanding and visibility required to make effective decisions and achieve their desired strategic outcomes.

Stories by Boaz Fischer

Latest Videos

  • 150x50

    CSO Webinar: Will your data protection strategy be enough when disaster strikes?

    Speakers: - Paul O’Connor, Engagement leader - Performance Audit Group, Victorian Auditor-General’s Office (VAGO) - Nigel Phair, Managing Director, Centre for Internet Safety - Joshua Stenhouse, Technical Evangelist, Zerto - Anthony Caruana, CSO MC & Moderator

    Play Video

  • 150x50

    CSO Webinar: The Human Factor - Your people are your biggest security weakness

    ​Speakers: David Lacey, Researcher and former CISO Royal Mail David Turner - Global Risk Management Expert Mark Guntrip - Group Manager, Email Protection, Proofpoint

    Play Video

  • 150x50

    CSO Webinar: Current ransomware defences are failing – but machine learning can drive a more proactive solution

    Speakers • Ty Miller, Director, Threat Intelligence • Mark Gregory, Leader, Network Engineering Research Group, RMIT • Jeff Lanza, Retired FBI Agent (USA) • Andy Solterbeck, VP Asia Pacific, Cylance • David Braue, CSO MC/Moderator What to expect: ​Hear from industry experts on the local and global ransomware threat landscape. Explore a new approach to dealing with ransomware using machine-learning techniques and by thinking about the problem in a fundamentally different way. Apply techniques for gathering insight into ransomware behaviour and find out what elements must go into a truly effective ransomware defence. Get a first-hand look at how ransomware actually works in practice, and how machine-learning techniques can pick up on its activities long before your employees do.

    Play Video

  • 150x50

    CSO Webinar: Get real about metadata to avoid a false sense of security

    Speakers: • Anthony Caruana – CSO MC and moderator • Ian Farquhar, Worldwide Virtual Security Team Lead, Gigamon • John Lindsay, Former CTO, iiNet • Skeeve Stevens, Futurist, Future Sumo • David Vaile - Vice chair of APF, Co-Convenor of the Cyberspace Law And Policy Community, UNSW Law Faculty This webinar covers: - A 101 on metadata - what it is and how to use it - Insight into a typical attack, what happens and what we would find when looking into the metadata - How to collect metadata, use this to detect attacks and get greater insight into how you can use this to protect your organisation - Learn how much raw data and metadata to retain and how long for - Get a reality check on how you're using your metadata and if this is enough to secure your organisation

    Play Video

  • 150x50

    CSO Webinar: How banking trojans work and how you can stop them

    CSO Webinar: How banking trojans work and how you can stop them Featuring: • John Baird, Director of Global Technology Production, Deutsche Bank • Samantha Macleod, GM Cyber Security, ME Bank • Sherrod DeGrippo, Director of Emerging Threats, Proofpoint (USA)

    Play Video

More videos

Blog Posts

Market Place