Advances in machine learning are making security systems easier to train and more flexible in dealing with changing conditions, but not all use cases are benefitting at the same rate.
Machine learning and artificial intelligence have been getting a lot of attention lately, and there's a lot of justified excitement about the technology.
One of the side effects is that pretty much everything is now being relabeled as "machine learning," making the term extremely difficult to pin down. Just as the word "cloud" has come to mean pretty much anything that happens online, so "artificial intelligence" is rapidly moving to the point where almost anything involving a computer is getting that label slapped on it.
"There is also a lot of hype," said Anand Rao, innovation lead for US analytics at PricewaterhouseCoopers LLC. "People talk about AI becoming super intelligent and will take over humanity and human decision making so on."
One common security task is to determine whether newly downloaded or installed applications are malicious. The traditional approach is a very basic expert system -- does the application's signature match that of known malware?
The downside of this standard antivirus approach, however, is that it needs to be updated constantly as new malware shows up, and it is extremely brittle. A piece of malware that has only minor modifications in it can easily evade detection.
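To make that brittleness concrete, here is a minimal Python sketch of exact-match signature checking. It assumes the "signature" is simply a SHA-256 hash of the file contents, which is a simplification; real antivirus products use richer signatures. The names `KNOWN_BAD` and `is_known_malware` and the placeholder bytes are hypothetical.

```python
import hashlib

def signature(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in for a malware signature."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical "known malware" sample; the bytes are placeholders, not real malware.
original = b"placeholder payload standing in for a known malicious file"
KNOWN_BAD = {signature(original)}

def is_known_malware(data: bytes) -> bool:
    """Exact-match lookup: flags a file only if its digest is already on the list."""
    return signature(data) in KNOWN_BAD

# A variant that differs from the original by a single appended byte.
variant = original + b"\x00"

print(is_known_malware(original))  # the unmodified sample matches
print(is_known_malware(variant))   # the one-byte variant does not
```

Because any change to the input produces a different digest, the variant is missed until its own signature is added to the list.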
One startup, Deep Instinct, is looking to apply deep learning techniques to the problem, taking advantage of the fact that there are now close to 1 billion samples of known malware that can be used for training purposes.
"Deep learning has revolutionized many areas," said Eli David, the company's CTO. "Computer vision has improved 20 to 30 percent a year, to super-human vision in no time. Speech recognition. Why shouldn't that work in cyber security?"
Even a probability-based machine learning system is limited, he said. There are only so many factors that can be identified by experts, weighed and then tuned for optimum results. Meanwhile, uncounted other factors are dismissed as too minor or irrelevant.
"You're throwing away most of the data," he said.
Deep Instinct trains its deep learning system in the laboratory on all the known samples of malware.
The process takes about a day, he said, and requires heavy-duty graphics processing units to analyze the data.
The resulting trained system is about a gigabyte in size, he said, too big for most applications, but then the company prunes it down to about 20 megabytes. It can then be installed on any endpoint device, including mobile, and can analyze incoming threats in a few milliseconds on the slowest machine.
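Going from about a gigabyte to about 20 megabytes means keeping roughly 2 percent of the original size. The article does not say how Deep Instinct prunes its model. One widely used technique is magnitude pruning, which zeroes the weights with the smallest absolute values. The Python sketch below illustrates that idea only, as an assumption; the function name `magnitude_prune` and the sample weights are hypothetical.

```python
def magnitude_prune(weights, keep_fraction):
    """Zero all but the largest-magnitude weights, keeping about keep_fraction of them."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    keep = max(1, round(len(weights) * keep_fraction))
    # Indices of the `keep` largest weights by absolute value.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]

# Hypothetical weights for illustration; a real network has millions of them.
weights = [0.02, -0.9, 0.4, -0.01, 0.05, 0.7, -0.03, 0.001, 0.3, -0.6]
pruned = magnitude_prune(weights, keep_fraction=0.2)
print(pruned)
```

Zeroed weights only save space if the model is then stored in a sparse or otherwise compressed form, and a pruned model generally needs to be re-checked for accuracy.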
"It's under a millisecond for an average file of one megabyte," David said. "We do all the complex stuff with the really sophisticated infrastructure in our laboratory, and what the customers get is a very small brain. They don't see all the complexity."
Meanwhile, back at the lab, new malware samples are added to the data collection, and every three or four months an update goes out to all the brains working away in the endpoint devices.
"But even if the brain is not updated for six months, it can still detect new files," David said. "Deep learning is very good at being agnostic to new changes or mutations."
Most of the millions of new malware samples that appear each day are tiny mutations of existing malware.
"Even brand-new zero-days from advanced threat actors and nation states are still 80 percent the same as the old ones," David said. "Traditional methods won't detect them. Deep learning will easily detect them."
The company is working with independent testing labs to quantify the results, he said, but early testing with Fortune 500 customers has shown a 20 to 30 percent higher malware detection rate compared to existing solutions.
"We recently did a test against 100,000 files at a major bank in the U.S.," he said. "The existing solution was updated the morning of the test, ours was 2 months old. Our solution got 99.9 percent detection, theirs got 40 percent."
Finding the reasons why
One of the downsides of the newest deep learning systems is that they can come up with an answer -- but might not necessarily be able to explain how they did it.
But that's not always the case.
In fact, the main job of Eureqa, a proprietary AI engine from Nutonian, is to find explanations for why things happen.
For example, when pointed at physics data, Eureqa was able to rediscover Newton's Laws, said Michael Schmidt, the company's founder and CTO.
"It would find the simplest, most elegant way to describe what happens, and what that relationship is," he said.
The company has made the engine available for free to researchers, and it has already been helpful in more than 500 journal publications, he said. For example, in medicine, it has helped find new models to aid in diagnosing diseases such as macular degeneration and appendicitis.
And it also has applications in cybersecurity, he said.
"One of the hardest problems is to find out the anatomy of a cyberattack," he said. "One of the applications of AI with Eureqa is to do that process automatically."
Once a customer signs up for the cloud-based system, it can take about an hour to go over the data, and then answers come back very quickly.
"We've been able to reproduce results that took them months or years to get in a matter of minutes," he said.
Local and global training
In cybersecurity, regular updates are important for any kind of machine learning system because the landscape changes so rapidly.
Without regular updates, all systems become obsolete because humans are always coming up with new stuff. Employees start doing new things. Vendors change their applications. Customers change their shopping patterns. And, of course, hackers invent new malware specifically designed to bypass existing systems.
Meanwhile, there's a window of vulnerability until the next update comes out.
In particular, bad guys can buy copies of the security software and test their attacks against them until they find something that works.
"Then they can use that on all of that vendor's customers until the next update comes out," warned Mike Stute, chief scientist at managed networking company Masergy Communications.
One solution, he said, is to move away from the one-size-fits-all approach used by many security system vendors.
"You can work with local patterns, peer patterns, and industry-wide patterns, and update them at different rates," he said.
Masergy uses a certain number of global factors to look for the likelihood that something suspicious is happening, then combines it with unique local indicators.
A global system can only look at a limited number of inputs, he said. "There is only so much space. I look for the features that occur most often."
The additional local focus allows the addition of many more inputs, he said. "In the local model, I don't have to compress them down to the smaller set of features."
That allows not only for uniqueness, but for much better accuracy, as well, he said.
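The idea of blending a small set of global factors with a wider set of local indicators can be sketched in Python as follows. This is an illustrative assumption, not Masergy's actual method: the feature names, the weights, the z-score baseline and the 50/50 blend are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical global weights: a small, fixed feature set shared by every customer.
GLOBAL_WEIGHTS = {"failed_logins": 0.5, "new_outbound_hosts": 0.3, "off_hours": 0.2}

def global_score(event):
    """Weighted sum over the global features, assumed to be scaled to 0..1."""
    return sum(w * event.get(name, 0.0) for name, w in GLOBAL_WEIGHTS.items())

def local_score(event, history):
    """Average capped z-score of every feature against this site's own history."""
    scores = []
    for name, value in event.items():
        past = [h[name] for h in history if name in h]
        if len(past) < 2:
            continue  # not enough local history for this feature
        spread = pstdev(past) or 1.0  # avoid dividing by zero
        z = abs(value - mean(past)) / spread
        scores.append(min(z / 3.0, 1.0))  # three or more std devs counts as 1.0
    return mean(scores) if scores else 0.0

def combined_score(event, history, local_weight=0.5):
    """Blend the global and local views into a single suspicion score."""
    return ((1 - local_weight) * global_score(event)
            + local_weight * local_score(event, history))

# Hypothetical site history; "db_exports" is a local-only feature.
history = [
    {"failed_logins": 0.0, "db_exports": 0.1},
    {"failed_logins": 0.1, "db_exports": 0.2},
    {"failed_logins": 0.0, "db_exports": 0.1},
    {"failed_logins": 0.1, "db_exports": 0.2},
]
typical = {"failed_logins": 0.1, "db_exports": 0.2}
unusual = {"failed_logins": 0.1, "db_exports": 0.9}

print(combined_score(typical, history))
print(combined_score(unusual, history))
```

In this toy data the two events look identical to the global model, and only the local baseline separates them, because the local model can track a feature the global one has no room for.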
The combined local and global approach is also the one used by Acuity Solutions, which makes the BluVector appliance that uses machine learning to detect cyber threats.
Based on an advanced research program for U.S. government agencies, the system starts out with years of good software and learns what benign code looks like.
"Our engine is good at looking at a piece of code and saying, this piece of code has the absence of features that you would expect to see in benign code," said Acuity CEO Kris Lovejoy.
But then it also incorporates new learning from individual customers.
"We have pre-trained our engine before we give it to our customer, and then from that point on, it's almost like the child has left the nest, and it will continue to learn within the customer's environment," she said.
The main engine also gets updates quarterly based on global data, but the unique, customer-specific data isn't shared across the system.
That makes each deployment of the product slightly different, and customized for each particular customer. Even if the attacker buys a copy of the system and finds code that bypasses it, it won't do them any good.
"It's a moving defense, impossible to reverse engineer because those technologies are specific to your environment," said Lovejoy.