Using AI to multiply the efforts of human InfoSec analysts

A system that mimics a human can be thought of as a system that generates an army of virtual analysts, but armies need leaders to direct and train them

Although vendor-written, this contributed piece does not promote a product or service and has been edited and approved by Network World editors.

The most frustrating fact in InfoSec is that evidence of attacks is there in the data, but today’s systems cannot surface it in time. As a result, they miss attacks and generate a lot of false positives.

Hiring more analysts isn’t the answer because of the costs involved and the difficulty of finding the right talent. The unemployment rate for InfoSec professionals is essentially zero. In fact, Cisco puts the worldwide shortage of InfoSec professionals at 1 million.

The trick is to emulate the human analyst, since we know humans are best at judging whether something is an attack, and emulating a human is fundamentally the domain of Artificial Intelligence.

To mimic a human, a machine needs to learn from a human

There are a lot of machine learning technologies in InfoSec, but the key questions are:  Are they mimicking the analyst? Do they learn from the analyst, and do they predict what an analyst would say when presented with a new behavior? If the answer to these questions is no, then these solutions are part of the problem and not the solution. 

A system that mimics a human can be thought of as a system that generates an army of virtual analysts.  Armies need leaders to direct them and train them. This is the role of human analysts, and it is a crucial role. Working together, the human analyst and army of virtual analysts cover more ground across your entire enterprise and can detect both new and emerging attacks.

To achieve artificial intelligence that can mimic an analyst, we have to combine the computer’s ability to find complex patterns on a massive scale with the analyst’s intuition to distinguish between malicious and benign patterns. This symbiotic relationship helps machines learn from humans when the machines make mistakes, and helps humans see complex patterns across extended time periods. 

The challenge of a thin label space

InfoSec, unlike computer vision, has failed to capitalize on AI because of a lack of training data, also known as labeled data. In other words, there is a ton of data lying around that hasn’t been organized into behaviors and then labeled as either malicious or benign by an InfoSec analyst. It’s what data scientists call a thin label space. Absent labeled data, an AI system cannot learn.

But come to think of it, analysts, who are continuously judging whether behaviors they monitor and investigate are malicious or benign, are generating labels. The problem is that these labels are not being captured today. We need to create a system that continuously captures those labels and then uses them to train new predictive models that can emulate the judgment of an analyst in real time. The predictions from these models are shown to the analyst, and the analyst’s feedback (a label) is captured again. At each iteration of this process, the number of labeled examples available to train the system increases and, as a result, model accuracy improves.
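
The capture, train and predict cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the two-feature events, the `analyst_label` function standing in for a human analyst, the nearest-centroid model and the `budget` of events shown per round are all hypothetical choices made to keep the example short.

```python
import math

def analyst_label(event):
    """Hypothetical stand-in for a human analyst: 1 = malicious, 0 = benign."""
    return 1 if event[0] + event[1] > 1.4 else 0

def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def train(labeled):
    """Fit a nearest-centroid model from (event, label) pairs.

    Assumes at least one example of each class is present.
    """
    benign = [e for e, y in labeled if y == 0]
    malicious = [e for e, y in labeled if y == 1]
    return centroid(benign), centroid(malicious)

def suspicion(model, event):
    """Higher score = nearer the malicious centroid than the benign one."""
    benign_c, malicious_c = model
    return math.dist(event, benign_c) - math.dist(event, malicious_c)

def predict(model, event):
    """Predict what the analyst would say about a new event."""
    return 1 if suspicion(model, event) > 0 else 0

def run_loop(pool, labeled, rounds, budget):
    """Each round: retrain, show the top `budget` events, capture the labels."""
    pool, labeled, history = list(pool), list(labeled), []
    for _ in range(rounds):
        model = train(labeled)
        pool.sort(key=lambda e: suspicion(model, e), reverse=True)
        shown, pool = pool[:budget], pool[budget:]
        # The analyst's verdicts are captured instead of being thrown away.
        labeled += [(e, analyst_label(e)) for e in shown]
        history.append(len(labeled))
    return train(labeled), labeled, history

# Hypothetical events: a 10 x 10 grid of two normalized behavior features.
grid = [(i / 9, j / 9) for i in range(10) for j in range(10)]
seed = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]  # one labeled example per class
seen = {e for e, _ in seed}
pool = [e for e in grid if e not in seen]

model, labeled, history = run_loop(pool, seed, rounds=4, budget=5)
print(history)  # [7, 12, 17, 22]
```

In each round the model ranks the unlabeled events, the analyst reviews only the few most suspicious ones, and the labeled set grows by exactly the budget, which gives the next model more to learn from.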

This analyst/machine interaction creates a loop where the more attacks the AI system predicts, the more feedback/training it receives, which in turn improves the accuracy of future predictions. The primary benefit of the analyst/machine loop is that it reduces the time to detection while working within the limited time the analyst has.

And when a predictive model is learned at one customer, it can be transferred to the entire network, creating a strong network effect. This enables customers to share intelligence at a behavioral level as opposed to an entity level. Entities such as IP addresses or domains are easily gamed by attackers, while behavioral signatures are not.

Given the limitations of current technology and the chronic drought of InfoSec professionals, there is a need for a new approach.  The goals of such a solution are clear: work within the limited time an analyst has; detect both new and emerging attacks; reduce the time to detection; and reduce false positives. AI, achieved through the combination of man and machine, may well be the answer.

Uday Veeramachaneni is co-founder and CEO of PatternEx. Prior to founding PatternEx, Uday led Product Management for Riverbed Stingray and created the first ever L7 SDN Controller that enabled service providers and enterprises to offer elastic web application firewall and L7 services.

Ignacio Arnaldo is Chief Data Scientist at PatternEx. Prior to joining PatternEx, he was a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT where he worked as part of ‘Any Scale Learning For All’ with a focus on designing scalable Artificial Intelligence frameworks for knowledge mining and prediction.

By Uday Veeramachaneni, CEO, and Ignacio Arnaldo, Chief Data Scientist, PatternEx
