Machine learning sets new standard in network-based application whitelisting

Using machine learning to automate the manual process of gaining application visibility brings network visibility and analytics to a new level. By harnessing the power of the semantic Web through machine learning, emerging solutions let you act faster and proactively defend against cyber threats.

The timing is good, because new applications are introduced at incredible rates: More than 300 applications are created each day, and the Apple App Store adds some 40,000 apps per month. While everyone benefits from this application diversity, it also poses a cyber security challenge.

Security teams must dedicate skilled analysts to monitor and analyze their organizations' network activity to determine if the applications are white-listed or authorized. Unauthorized applications must be blocked from entering the enterprise network, since they might consume bandwidth, lower productivity, compromise critical data or pose potential security threats.

Even if the impact of the applications is minimal, it is critical that security organizations have complete visibility. Security organizations that employ static methods (that is, manual reverse engineering) struggle to keep pace with identifying and classifying new applications, especially given the rate at which they are being introduced. They must use machine learning to discover applications, extract signatures and create white lists. Security analysts can then have the incisive intelligence necessary to act early, without using costly and scarce resources.

Manual reverse engineering: DPI's Achilles' heel

While deep packet inspection (DPI) solutions have historically held a place in the cyber security solution landscape, their reliance on analysts to manually reverse-engineer applications is quickly becoming the DPI Achilles' heel. DPI solutions provide visibility into, and enforce traffic policies on, only those traffic flows for which a packet payload signature exists. In a recent study of a Fortune 50 gateway, DPI solutions provided detailed visibility into 19% of network traffic and coarse visibility (i.e., HTTP) into 64% of the traffic captured; the remaining 17% was classified as unknown.

So when faced with unidentified traffic, DPI solutions must employ manual reverse engineering, which requires weeks of investigative work by skilled analysts to identify and generate a signature from a new application, and then appropriately classify it. In the meantime, the unidentified application continues to run on the network, compromising security and operational efficiency.

Innovation Changes the Game

Fortunately, machine learning has emerged to enable automated discovery of all enterprise applications, signature extraction and white list generation. Machine learning provides the ability to add context to Internet traffic based on a superior understanding of relationships among data. And machine learning can accomplish this without the use of highly skilled, highly paid analysts.

Of course, application signature detection, decoding and classification using machine learning is not as simple as replacing humans with machines. The machines must be powered with advanced analytics critical to executing the auto-discovery aimed at identifying the nature of any unknown traffic on the network at any instant. The network traffic might be unknown for a variety of reasons, including:

• A never-before-seen network protocol or user application
• An evolution of a known protocol or application
• The emergence of new internal modes of a known application

By automatically and incrementally learning those signatures, their nature, and their associated evolution over time using advanced analytics, machine learning-based solutions provide security organizations with a clear understanding of the five "Ws":

• "What" (type of application)
• "Why" (purpose of the application)
• "Who" (the application owner/users)
• "Where" (network addresses involved with these applications)
• "When" (point at which new control policies are required to be enforced)

Breaking it Down: The Operational Life Cycle

With this background in mind, we can walk through a machine learning-based application discovery, signature extraction and white list generation solution's operational life cycle.

The process must begin by focusing on all the network traffic sessions for which security analysts want better visibility. Next, that traffic must be grouped cohesively to extract accurate signatures. Because the traffic seen on the wire does not lend itself to grouping based on protocols and applications (since they might not be known yet), multiple levels of filtering and clustering are required to create cohesive groups that can be used to generate reliable signatures.

Guaranteeing a high level of cohesiveness translates to more precise and reliable signatures. Once the solution processes each group independently using advanced statistical algorithms, it can extract precise signatures and their corresponding protocol/application labels (that is, names).
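As a rough illustration of the grouping and extraction steps, the Python sketch below filters flows in two levels and then takes the longest byte prefix shared by each group as its candidate signature. It is a simplified stand-in, not the statistical algorithms a real product would use; the flow fields (proto, dst_port, payload), the two-byte second-level key and the thresholds are all assumptions made for the example.

```python
from collections import defaultdict


def group_key(flow, head_len=2):
    """Two-level grouping key (hypothetical): coarse filter on protocol and
    server port, then a finer split on the first bytes of the payload."""
    return (flow["proto"], flow["dst_port"], flow["payload"][:head_len])


def common_prefix(payloads):
    """Longest byte prefix shared by every payload in a group."""
    shortest = min(payloads, key=len)
    for i, byte in enumerate(shortest):
        if any(p[i] != byte for p in payloads):
            return shortest[:i]
    return shortest


def extract_signatures(flows, min_group=2, min_len=4):
    """Group flows, then keep one candidate signature per cohesive group.

    Groups with too few flows, or with too short a shared prefix, are
    skipped because they would give unreliable signatures.
    """
    groups = defaultdict(list)
    for flow in flows:
        groups[group_key(flow)].append(flow["payload"])

    signatures = {}
    for key, payloads in groups.items():
        if len(payloads) < min_group:
            continue
        prefix = common_prefix(payloads)
        if len(prefix) >= min_len:
            signatures[key] = prefix
    return signatures


# Made-up sample flows for illustration only.
sample_flows = [
    {"proto": "tcp", "dst_port": 9000, "payload": b"XPRT/1 hello"},
    {"proto": "tcp", "dst_port": 9000, "payload": b"XPRT/1 world"},
    {"proto": "tcp", "dst_port": 9000, "payload": b"ZZ-other"},
    {"proto": "udp", "dst_port": 5000, "payload": b"QQ01abc"},
    {"proto": "udp", "dst_port": 5000, "payload": b"QQ01xyz"},
]
signatures = extract_signatures(sample_flows)
```

The lone "ZZ-other" flow produces no signature because its group is too small, which mirrors the point about cohesiveness: a signature drawn from a thin or mixed group is not worth exporting.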

If validation is necessary, once these signatures (which now become a proxy for the identity of the protocol or the application) are known, the security analyst can assess, offline, the validity of the extracted signatures, apply any modifications if required, and run batteries of coverage and collision tests. When the security analyst is satisfied with the outcome, the signature can then be approved, and associated control policies can be exported to the DPI system in place.
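One plausible reading of the coverage and collision tests is sketched below in Python: coverage is the share of known sample traffic from the target application that a signature matches, and a collision is any sample from a different application that the same signature also matches. The prefix-match rule, the function names and the 95% threshold are assumptions for the example, not details given by any particular product.

```python
def matches(signature, payload):
    """Assumed matching rule: the signature is a payload prefix."""
    return payload.startswith(signature)


def coverage(signature, target_payloads):
    """Fraction of the target application's samples that the signature matches."""
    if not target_payloads:
        return 0.0
    hits = sum(1 for p in target_payloads if matches(signature, p))
    return hits / len(target_payloads)


def collisions(signature, other_payloads):
    """Samples from other applications that the signature wrongly matches."""
    return [p for p in other_payloads if matches(signature, p)]


def approve(signature, target_payloads, other_payloads, min_coverage=0.95):
    """Approve only if coverage is high enough and nothing collides."""
    return (coverage(signature, target_payloads) >= min_coverage
            and not collisions(signature, other_payloads))


# Made-up samples for illustration only.
candidate = b"XPRT/1 "
target_samples = [b"XPRT/1 a", b"XPRT/1 b", b"XPRT/2 c"]
other_samples = [b"GET / HTTP/1.1", b"XPRT/1 spoof"]

cov = coverage(candidate, target_samples)
clashes = collisions(candidate, other_samples)
approved = approve(candidate, target_samples, other_samples)
```

Here the candidate would be sent back for modification: it misses one of three target samples and collides with a sample from another application.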

Now that the signatures are known and the protocol or the application has been identified, the organization can use the information in a variety of ways -- to set policy, gain visibility or prevent user access to those applications. The result is a more efficient and secure network, unaffected by the risks unauthorized applications can cause.
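To make the policy step concrete, the following Python sketch shows how approved signatures might drive a white list decision. The Rule fields, the application labels and the action names are hypothetical, and a real DPI system would apply such rules in its own policy format rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    label: str         # application name attached to the signature
    signature: bytes   # approved payload prefix (assumed matching rule)
    whitelisted: bool  # True if the application is authorized


def classify(payload, rules):
    """Return the first rule whose signature prefixes the payload, else None."""
    for rule in rules:
        if payload.startswith(rule.signature):
            return rule
    return None


def decide(payload, rules):
    """Map a flow to (label, action): allow, block, or send back to discovery."""
    rule = classify(payload, rules)
    if rule is None:
        return ("unknown", "flag-for-discovery")
    return (rule.label, "allow" if rule.whitelisted else "block")


# Made-up rules for illustration only.
rules = [
    Rule("corp-chat", b"XPRT/1 ", True),
    Rule("p2p-share", b"QQ01", False),
]
```

Traffic that matches no rule is neither silently allowed nor dropped here; it is flagged so it can be fed back into discovery, which keeps the unknown share of traffic visible.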

To maintain the integrity of networks without limiting productivity or adding operational expenses, administrators and security analysts need a replacement for the time-consuming manual reverse-engineering process. The only option that can meet the security threshold -- while maintaining the flexibility and productivity afforded by modern-day devices and applications -- is the use of machine learning to automate application discovery, signature extraction and white list generation. This approach delivers the visibility, context and control required to keep networks secure.

Stories by Prakash Nagpal, senior vice president, Product Marketing and Marketing, Narus