Securing big data off to slow start

While so-called "big data" initiatives are not new to a number of industries, such as large financial services firms, pharmaceuticals, and large cloud companies, they are new to most organizations. The low cost and easy availability of the software and hardware needed to build these systems, coupled with an eagerness to unleash any hidden value held within all of those enterprise data, are two trends that have sent adoption of large, next-generation databases soaring.

Unfortunately, efforts to secure these systems haven't soared as high or as fast. Fortunately, that appears to be starting to change.

In many cases, analysts say, big data initiatives began organically, within small enterprise departments or teams, and without much, if any, IT oversight or governance. In a recent survey by IDG Enterprise of more than 750 IT decision makers, almost half (48 percent) of enterprises anticipate big data will be widely used by their enterprise within three years, while another 26 percent expect significant use within a business unit, department, or division.

When it comes to security, big data poses a number of interesting challenges. Some arise for the same reasons that make the consumerization of IT and BYOD trends so challenging for many organizations. "This is a very compelling security story because we're watching small organizations pull down open source tools and, with only a couple of programmers, be able to out-scale the largest Oracle databases in existence," says Adrian Lane, analyst and CTO at information security research firm Securosis.

"We're not talking about millions of dollars of infrastructure; we're not talking about large services teams parachuting people in and spending a couple of million dollars. We're talking agile, cost-effective, scalable modular databases that can be set up quickly by anyone," he says.

Now add to that widespread and inexpensive access to large data sets, the reality that many enterprises don't know how to go about securing these implementations, and the fact that many vendors and open source projects lack the security features organizations need. That is the recipe for large privacy violations or a very large and costly enterprise breach.

It turns out that groups are already using these data. When Lane started surveying organizations, he found that groups within them were actually using these tools. "I was talking to marketing organizations that actually had hired data architects, under their own budgets, because they had interesting data that they wanted to mine. So, some of that went up to the cloud. Some of it was in-house, but there weren't any security controls on it, because that wasn't even part of the project's scope," Lane says.

Many times, these were customer data that internal groups wanted to mine for whatever behaviors and trends they could discern. Both Lane and David Mortman, another security analyst at Securosis, say that, almost universally, these teams believed there weren't any sensitive data in the database, but invariably that was not the case. "I'd ask them what they were doing for security, and they'd tell me they have logins; that was about the extent of it. It simply wasn't a part of the project scope," he says.

Encouragingly, some of the news on the security front is starting to brighten. According to Lane and Mortman, who both recently presented "Bigger Data, Less Security?" at the Secure360 Conference in St. Paul, MN, the applications used to build big data systems are starting to take security into account, as are some of the enterprises implementing them.

Lane and Mortman say that when they were preparing for the same talk a little over a year ago, the security feature set in big data applications was barren.

"What was available from Hadoop and other organizations such as Cloudera, Zettaset, and others was very minimal, while many security vendors hadn't adjusted their products to work well within Hadoop environments," says Lane.

That has started to change in the past year; vendors, as well as enterprises, are starting to take a closer, if painfully slow, look at securing these systems. According to Lane and Mortman, more vendors today are better at integrating identity and access management capabilities into their big data applications. That could include leveraging identity capabilities inherent within Linux, or tighter integration with Kerberos.

Enterprises are starting to take more initiative internally, too. "We're seeing teams look for the best ways to add layers of security around these databases, either to avoid security and privacy risks, or to stay on the right side of government regulatory mandates," says Mortman.

To increase security, some organizations are employing "walled gardens," or relatively closed software systems of the kind that were very common in securing mainframe data. Some of the more agile, smaller development teams are using approaches similar to those seen in web application security: they're wrapping security into the application and user identity layers.

Additionally, Lane and Mortman say that organizations are starting to do a better job of using identity to build access controls around their implementations, including between applications and the users of those applications. They're also turning to block-layer encryption, which improves security while still allowing big data clusters to scale and perform. "That encryption is a very easy way to make sure that the data at rest are secured, and that your platform admins can't get access to the data files," says Mortman.

Unfortunately, there is much left to do when it comes to securing big data and next-generation database implementations. One issue involves database monitoring. Enterprises have been monitoring their networks, applications, and databases for many years, and these practices should most certainly extend to their big data implementations. "There are specific ways of looking at those usage profiles or behavioral profiles, or metadata information to vet good vs. bad queries. We don't have this ability with big data yet," says Mortman.

Fortunately, there are numerous general-purpose logging tools out there that enterprises can use to build their own big data logging solutions. "You're just going to be making your own queries to log everything," says Mortman.

That's better than nothing, and until these toolsets and the security models around big data mature, many enterprises are going to be making their own way along the path to embracing big data securely.

Stories by George V. Hulme
