Can Big Data help universities tackle security, BYOD?

The University of Texas at Austin, the flagship of the University of Texas System, is a prime example of the scope of the challenge. Its 350-acre campus features nearly 200 buildings, all linked by a 10 gigabit fiber optic backbone. At any one time, up to 120,000 individual devices—ranging from servers to switches, wireless access points, desktops, laptops, tablets, smart phones and security cameras—may be connected to its network.

"As with other universities, we have tens of thousands of users representing an even larger population of networked devices," says Cam Beasley, chief information security officer (CISO) of the University of Texas at Austin. "We have a constant need to identify anomalous user account behavior, detect, locate and quarantine compromised systems in real-time, and correlate events across multiple logging environments to more fully understand potential problems or threats."

UT Austin's Information Security Office (ISO) analysts used to rely primarily on intrusion detection/prevention system (IDS/IPS) appliances and custom-developed software tools to monitor for problems. But that approach was slow and unwieldy; moreover, it didn't fully leverage the goldmine the ISO had in the form of its log data.

"We wanted to plug into the many different servers and devices downstream that were coming under attack to correlate our network information with actual system log data," Beasley explains. "We didn't want a big, heavy SIEM [security information and event management] product because we hadn't had much luck with them in the past. We needed a more flexible system that we could adapt to our unique needs."

Jason Pufahl, CISO of the University of Connecticut, faced a similar problem.

"Ultimately, every time we needed to do any kind of data mining, it was half a dozen sources using a variety of different tools," he says. "It could only be done by one or two different people [who had the skills to do it]."

Big Data Analytics Helps Universities Mine Log Data

Like more than 275 universities around the world, UT Austin and UConn turned to Splunk.

"Universities have some of the most complex IT infrastructures in the world, and this makes them extremely vulnerable," says Mark Seward, senior director of security and compliance marketing at Splunk. "It's the ultimate BYOD situation. Security threats are constantly evolving. Splunk collects massive amounts of data and helps users detect unknown and persistent threats."

Splunk bills itself as a provider of real-time operational intelligence software. Essentially, Splunk is a Big Data indexing engine designed to collect, index and harness machine data generated by Web sites, applications, servers, networks and mobile devices. Splunk is the biggest in an evolving field that includes competitors like Sumo Logic, Loggly and LogLogic.

The idea behind Splunk came as co-founders Rob Das and Erik Swan were struggling with a Java application they were writing, Seward explains.

"They were finding a lot of errors in the application," Seward says. "They were looking at Java stack traces that were 100 lines long and fairly unstructured. It took a lot to go through these logs and figure out what errors there were and how to deal with them. Then one of them turned to the other and say, 'Hey, I wish we could Google this.' That's how it got started."

The initial use case was application troubleshooting, but security professionals soon saw that Splunk could give them the capability to make use of the reams of logs constantly generated by the sites, servers, applications and devices they had to monitor.

With Machine Data, the Only Limit Is the Imagination

Once an indexing engine like Splunk has access to that data, Seward says the only real limitation is the imagination of the user. He points to one CISO at a financial services firm who wanted to curtail tailgating, in which people follow an authorized person into a facility without swiping themselves in with their own badges.

By correlating badge swipe data, Active Directory login data and VPN use data, the CISO was able to determine whether users were working remotely or in the facility when they logged in, and then discover whether those who logged in from the facility had swiped a badge to enter. As an added bonus, if a user had not swiped into the facility but logged in locally, the CISO could now ask his staff to make a visual check of the user's desk. If the user was not actually in the facility, it was a good sign the user's machine was infected.
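The correlation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CISO's actual implementation or a Splunk search: the record shape, the field names, the function name and the sample data are all hypothetical, and a real deployment would read these events from the indexed logs.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical record shape; real login events carry many more fields.
@dataclass(frozen=True)
class Login:
    user: str
    day: date
    source: str  # "local" for an on-site directory login, "vpn" for remote


def flag_unbadged_local_logins(logins, badge_swipes):
    """Return (user, day) pairs that logged in locally without a badge swipe.

    badge_swipes is a set of (user, day) tuples taken from the door system.
    """
    flagged = []
    for login in logins:
        if login.source != "local":
            continue  # remote (VPN) logins are not expected to have a swipe
        key = (login.user, login.day)
        if key not in badge_swipes and key not in flagged:
            flagged.append(key)
    return flagged


# Made-up sample data for illustration only.
d = date(2013, 1, 15)
logins = [
    Login("alice", d, "local"),
    Login("bob", d, "local"),
    Login("carol", d, "vpn"),
]
badge_swipes = {("alice", d)}

suspects = flag_unbadged_local_logins(logins, badge_swipes)
print(suspects)
```

In this made-up data set only "bob" is flagged: he logged in locally with no badge swipe, so either he tailgated into the building or, if a desk check shows he is not there, his machine deserves a closer look.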

Power companies have also begun using their log data to gain better intelligence, Seward says. Smart meters now have remote shutoff capabilities, which could lead to illegitimate shutoffs.

"An insider may want to shut off someone's electricity for whatever reason," he explains. "I can pull in information from the billing system and compare the address where the shutoff occurred to the billing information. After all, it could be a billing error or someone who's looking to do harm to someone else. I can then add the GPS information from local utility trucks and maybe note that one of my trucks happens to be parked outside that particular house."

At UConn, Pufahl says the capability to bring disparate data sets into a central location and analyze them rapidly proved its importance almost immediately. Near the beginning of the semester, there was an issue with a primary course-related server that led to an outage.

"Splunk made troubleshooting it and visually describing what the problem was transparent immediately," Pufahl explains. "It stopped any amount of finger pointing. It was obvious who had to handle the problem and it was instantly apparent exactly when the problem occurred."

UConn Leverages Data to Improve Security Posture

Pufahl notes that the technology has helped his office make strides in implementing anti-virus capabilities on a university-wide basis.

"This sounds like a security best practice," he says. "In a corporate where you can manage it centrally, it's probably trivial. Here we have a transient population. It's very difficult to do."

But by using log data, Pufahl's staff is able to audit the environment and see where the trouble spots are, then generate reports and push them to the appropriate administrators to help them communicate with users who need to upgrade or install an anti-virus solution. Pufahl's staff has also used the capability to score every school, college and department on its security.

"We've developed what we're calling the University of Connecticut Security Score," he says. "We measure eight or so different security metrics with weighted values and produce that as a score-anti-virus, OS patches, a few other products that we expect to see running. Depending on the state of those, they'll be given a score and a corresponding report on how to improve that score."

"I think that, quite honestly, every organization is going to have to deal with making use of the valuable data that they've actually got in their institution," Pufahl adds. "It's not just a matter of disparate data on 300 systems. The minute you can take advantage of that data as a central collection, the questions you're able to ask of it really changes. We've found tremendous institutional benefit from being able to place of this data in a single repository and able to use it to make IT decisions."

Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers. Follow Thor on Twitter @ThorOlavsrud.
