Machine learning enabling whole-of-data-centre security analysis: Cisco
- 22 June, 2016 15:44
Complex interdependencies between applications and computing environments can make whitelist-based security nearly impossible to enforce in practice, a Cisco executive has argued as the company debuts a data-centre analytics platform that utilises machine-learning techniques to model and predict the security impact of configuration and application changes.
Those changes can wreak havoc with conventional security models built around blacklists, which are difficult to keep up to date as the environment changes, vice president of product marketing Rajeev Bhardwaj told CSO Australia as the company debuted its Tetration Analytics platform, which surveils the data centre to monitor data flows and application interactions.
While many data-centre operators preferred to operate on a zero-trust, whitelist-based security model, that had been difficult to accomplish because of changing interdependencies within the “black box” of the data centre, Bhardwaj said. “The data centre of today has multiple layers of complexity from compute, network, and storage infrastructure as well as virtualisation, firewalls, load balancers and customer applications,” he explained. “If something goes wrong it's extremely hard to find out what happened: you look at the black box and don't know which application is talking with which applications, which ports are open, which applications are protected with a firewall.”
Cisco has built the platform around unsupervised machine-learning techniques that continuously monitor activities and build models of normal activity over time – effectively turning the network into a massive sensor.
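Cisco has not published the algorithms behind the platform, but the general idea of learning a baseline of normal activity and flagging departures from it can be sketched as follows. The flow tuples, role names, and the simple seen/unseen test are all illustrative assumptions, not Tetration's actual model:

```python
from collections import Counter

# Hypothetical flow records observed during a training window:
# (source host, destination host, destination port).
baseline_flows = [
    ("web-1", "app-1", 8080),
    ("web-1", "app-1", 8080),
    ("app-1", "db-1", 5432),
    ("app-1", "db-1", 5432),
]

class FlowBaseline:
    """Learn which flows are 'normal' from an observation window,
    then flag flows never seen during training."""

    def __init__(self):
        self.seen = Counter()

    def train(self, flows):
        for flow in flows:
            self.seen[flow] += 1

    def is_anomalous(self, flow):
        # A flow is anomalous if it never appeared in the baseline window.
        return self.seen[flow] == 0

model = FlowBaseline()
model.train(baseline_flows)
print(model.is_anomalous(("web-1", "db-1", 5432)))  # True: the web tier never talked to the database
print(model.is_anomalous(("app-1", "db-1", 5432)))  # False: normal tier-to-tier traffic
```

A production system would use statistical or clustering models over flow features rather than exact-match lookup, but the unsupervised train-then-flag pattern is the same.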
All network activity is logged at the packet-header level and stored in a journalled log, allowing data-centre operators and forensic analysts to step back through a detailed historical record and identify the specific changes that caused problems.
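A journalled log of this kind is, at its core, an append-only record keyed by timestamp that can be replayed over any window. The sketch below is a minimal illustration of that idea, with invented event strings and timestamps; Tetration's actual storage format is not public:

```python
import bisect

class FlowJournal:
    """Append-only journal of flow events keyed by timestamp,
    supporting 'time machine' replay over any window."""

    def __init__(self):
        self.timestamps = []
        self.events = []

    def record(self, ts, event):
        # Events arrive in time order, so appending keeps the journal sorted.
        self.timestamps.append(ts)
        self.events.append(event)

    def replay(self, start, end):
        # Binary-search the sorted timestamps to slice out the window.
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_right(self.timestamps, end)
        return self.events[lo:hi]

journal = FlowJournal()
journal.record(100, "web-1 -> app-1:8080")
journal.record(160, "app-1 -> db-1:5432")
journal.record(220, "web-1 -> db-1:5432")   # the suspicious flow
print(journal.replay(150, 230))  # the two events inside the window
```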
“Think of it as a time machine for the data centre,” said Bhardwaj. “Through analysis of dependency and applications, it will generate a whitelist policy – and if you start seeing violations of the policy, such as a Web server talking to a data server without permission, you can get field analysts to look into it.” The PVR-styled approach also allows network administrators to trace back to the root cause of network slowdowns and other issues that may be creating performance problems even if they're not generating blatant security violations.
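Bhardwaj's example of a web server talking to a data server without permission amounts to checking observed flows against a generated whitelist. A minimal sketch of that check, using invented role names and a hand-written policy in place of one learned from dependency analysis:

```python
# Hypothetical whitelist policy derived from observed dependencies:
# each entry permits one (source role, destination role, port) combination.
whitelist = {
    ("web", "app", 8080),
    ("app", "db", 5432),
}

def check_flow(src_role, dst_role, port, policy):
    """Return None if the flow is permitted by the policy,
    or a violation message otherwise."""
    if (src_role, dst_role, port) in policy:
        return None
    return f"policy violation: {src_role} -> {dst_role}:{port} not whitelisted"

print(check_flow("app", "db", 5432, whitelist))   # None: permitted traffic
print(check_flow("web", "db", 5432, whitelist))   # the web-to-database violation from the quote
```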
Highly-granular mapping of interdependencies also allows network planners to model the potential impact of new applications or network configuration changes before they happen – helping head off unpredictable issues well before they create problems.
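Modelling the impact of a change from a dependency map is essentially a reachability question: which services directly or transitively depend on the thing being changed? The dependency map below is invented for illustration:

```python
# Hypothetical dependency map: each service -> the services it calls.
deps = {
    "web": ["app"],
    "app": ["db", "cache"],
    "batch": ["db"],
}

def impacted_by(service, dependency_map):
    """Return every service that directly or transitively depends on
    `service`, i.e. everything a change to it could affect."""
    # Invert the map so we can walk from a service to its callers.
    reverse = {}
    for src, dsts in dependency_map.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    impacted, stack = set(), [service]
    while stack:
        for caller in reverse.get(stack.pop(), []):
            if caller not in impacted:
                impacted.add(caller)
                stack.append(caller)
    return impacted

print(sorted(impacted_by("db", deps)))  # ['app', 'batch', 'web']
```

Running the same query before a planned database change would tell a network planner exactly which applications to watch.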
Building an architecture for data gathering had been a significant part of the effort, with Linux and Windows-based software sensors running across the data-centre environment to collect what Cisco claims is up to 1 million unique data flows per second.
This data is aggregated and filtered through the Tetration Analytics appliance, where machine-learning algorithms tear through it to pick out changes as they happen. The large-scale collection and analysis of operational information to identify security issues has become an increasingly popular approach as malicious attackers increasingly play the long game to avoid the glare of security tools.
Visibility into cloud and virtualised platforms, for example, has helped many organisations extend their network-monitoring capabilities across hybrid environments. Machine-learning techniques are also becoming crucial in helping companies stay ahead of the ongoing onslaught of new data.
Tetration Analytics is fundamentally a Hadoop big-data platform that has been bundled into an appliance to simplify the process of adding it to the network. “Not all of our customers have the expertise to implement this capability, so we have taken away the complexity and delivered it in a way that the big-data analytics platform comes completely integrated,” Bhardwaj said, noting the platform's “Google-like interface” that allows easy searching on events and strings. “Customers don't have to understand how big data works or to look at a big-data implementation,” he said. “It's basically self-learning and everything is freely integrated. All of this visibility empowers powerful what-if analysis so if there is a network or application problem, we can very clearly tell you about it in real time.”