Artificial intelligence system can predict data theft by scanning email

The UBIC system searches for signs of disgruntled or cash-strapped workers

Workers who may be tempted to sell confidential corporate data should think twice about what they write in an email -- an AI-based monitoring system could be watching.

Tokyo-based data analysis company UBIC has developed an artificial intelligence system that scans messages for signs of potential plans to purloin data.

The company is adding a risk prediction function to an existing product, Lit i View Email Auditor, which audits email for signs of activity such as price fixing. The product has been used in electronic discovery procedures in U.S. lawsuits.

The artificial intelligence system, dubbed Virtual Data Scientist, can sift through messages and identify senders whose writing suggests they are in financial straits or disgruntled about how their employer treats them.

Such a situation would be classified as a "developing" problem, while messages about data access that's out of the ordinary, for instance, would get a "preparation" classification.
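
To make the two phases concrete, here is a minimal Python sketch of phase labeling. It is not UBIC's method: the company says its system learns from professional auditors rather than relying on keyword matching, so the fixed cue lists, the `Email` type and the `classify_phase` function below are all hypothetical, invented only to illustrate how a message might be mapped to a "developing" or "preparation" label.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cue phrases, invented for illustration only.
# A real system would learn such signals rather than hard-code them.
DEVELOPING_CUES = ("in debt", "can't pay", "underpaid", "fed up", "treated unfairly")
PREPARATION_CUES = ("copy the database", "export the customer list", "after-hours access")


@dataclass
class Email:
    sender: str
    body: str


def classify_phase(email: Email) -> Optional[str]:
    """Return "preparation", "developing", or None for a single message."""
    text = email.body.lower()
    # Unusual data access is checked first because it is the later, more urgent phase.
    if any(cue in text for cue in PREPARATION_CUES):
        return "preparation"
    # Signs of financial strain or resentment toward the employer.
    if any(cue in text for cue in DEVELOPING_CUES):
        return "developing"
    return None


if __name__ == "__main__":
    samples = [
        Email("a@example.com", "I'm fed up with how this company treats us."),
        Email("b@example.com", "Can you help me copy the database tonight?"),
        Email("c@example.com", "Lunch at noon?"),
    ]
    for message in samples:
        print(message.sender, classify_phase(message))
```

A fixed phrase list like this is essentially the manual keyword search that UBIC says its approach improves on; the sketch only shows the shape of the output, not how the signals are found.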

"Cases such as information leaks do not occur all of a sudden," a UBIC spokeswoman wrote in an email.

"The Risk Prediction function can detect which risk phase the company is facing and alerts in advance so that the company can make the crisis prevention before the incident takes place," the spokeswoman wrote.

The system seems a bit like a tool from the science fiction movie "Minority Report," designed to intercept would-be criminals before a crime takes place, but it's built on established human expertise. The Virtual Data Scientist trains itself by studying and emulating the techniques of professional auditors.

It can then bring those techniques to bear by scanning massive volumes of email. UBIC says it's more efficient than traditional manual keyword searches and that even subtle indications of fraud can be detected.

The Japan Patent Office recently decided to issue UBIC a patent for "predictive coding" that identifies behavior that could lead to future misconduct.

The approach links machine learning with analysis of big data and behavioral sciences such as psychology and criminology. The emerging field is known as behavior informatics, and it has its own IEEE task force and other research groups.

UBIC's system currently works in Japanese only, but support for English and other languages is being added, the spokeswoman wrote.

The feature follows the arrest in July of an engineer who allegedly stole personal data on up to 20.7 million customers of Benesse, the parent company of Berlitz language schools in Japan, to sell it for a profit. The incident was one of Japan's largest data leaks.


Stories by Tim Hornyak
