AWS launches Macie: AI-powered DLP for the cloud

Amazon Web Services (AWS) has launched a security service called Macie that uses machine learning to automatically classify sensitive data stored in the cloud and alert customers to unauthorized access or data leaks.

Macie is aimed at customers that are liable for protecting sensitive information within large volumes of data stored in the cloud, such as organizations that handle medical records, intellectual property, legal documents, and personally identifiable information (PII).  

AWS customers can launch Macie from the AWS Management Console to begin classifying data and receive alerts for potential security breaches. The service currently supports data stored in Amazon Simple Storage Service (S3) buckets, but will be rolled out to other AWS data storage services later this year. 

Amazon Macie continuously monitors the data stores it has classified for signs of a security breach, such as downloads of large amounts of source code, insecurely stored credentials, and accidentally exposed sensitive data.

The service profiles a customer’s normal behavior based on where they locate sensitive information and how the data is typically accessed, including user authentication, locations, and times of access. If Macie detects an action that deviates from an established baseline, it will issue an alert that AWS claims will be “highly accurate” and include advice on resolving the issue.
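AWS has not published how Macie builds or compares against its baseline. As a rough illustration of the general idea only, the sketch below flags an observation that deviates sharply from a user's history using a simple z-score. The function name `is_anomalous`, the threshold of 3.0, and the sample figures are all hypothetical and are not taken from Macie.

```python
from statistics import mean, stdev

def is_anomalous(history, observed, threshold=3.0):
    """Return True if `observed` deviates strongly from the baseline in `history`.

    Hypothetical illustration: a z-score test, not Macie's actual method.
    """
    if len(history) < 2:
        # Not enough data to establish a baseline.
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly constant history: any change counts as a deviation.
        return observed != baseline
    return abs(observed - baseline) / spread > threshold

# Made-up daily download counts for one user.
daily_downloads = [10, 12, 11, 9, 13]
print(is_anomalous(daily_downloads, 12))
print(is_anomalous(daily_downloads, 100))
```

A production system would presumably combine several signals, such as location and time of access, rather than a single count.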

Macie uses several methods to classify content, including a range of content types, file extensions, and keyword-based themes, and gives each item a risk score between 1 and 10. For PII, such as credit card numbers, email addresses, and user names, it automatically assigns a high, moderate, or low risk rating to each stored item.
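To make the idea of rule-based scoring concrete, here is a minimal sketch that assigns a 1 to 10 score from a file extension and keywords and keeps the highest match. The rule tables, their weights, and the `risk_score` function are invented for illustration; AWS's actual rules and weights are not described in this article.

```python
import os

# Hypothetical rule tables; the real rules and weights are not given here.
EXTENSION_SCORES = {".pem": 9, ".sql": 7, ".py": 5, ".txt": 1}
KEYWORD_SCORES = {"password": 8, "confidential": 6, "invoice": 3}

def risk_score(filename, text):
    """Return a 1-10 score: the highest score among all matching rules."""
    _, ext = os.path.splitext(filename.lower())
    score = EXTENSION_SCORES.get(ext, 1)  # unknown extensions get the minimum
    lowered = text.lower()
    for keyword, value in KEYWORD_SCORES.items():
        if keyword in lowered:
            score = max(score, value)
    return score

print(risk_score("notes.txt", "My password is hunter2"))
print(risk_score("server.pem", ""))
```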

The machine-learning component of Macie is called the Support Vector Machine classifier, which was trained on a large but undisclosed body of data that has been “optimized to support accurate detection of various content types, including source code, application logs, regulatory documents, and database backups.” 

As AWS notes, the classifier can detect source code even if it doesn’t match the types of source code it was trained to recognize. It is trained to recognize source code written in Bash, C, Cobol, Go, Java, JavaScript, CSS, HTML, and XML, as well as e-books, email, and SEC forms.

The new service is similar to technology AWS gained earlier this year after reportedly acquiring cybersecurity startup harvest.ai for around $20 million. The US-based startup used user behavior analytics for a next-gen data loss prevention (DLP) product called MACIE Analytics, which helped customers protect against user account breaches and data theft. 

Integrations for Macie with Palo Alto Networks, Splunk, and Trend Micro are planned and already underway.

Macie is currently supported only in AWS US East and US West regions. Classification costs $5 per GB after the first GB has been processed. The service also uses CloudTrail events to assess anomalies in the data stores being monitored, and processed events are charged at $4 per million after the first 100,000 events.

Since classification will occur on an initial set of data that’s likely to be added to over time, a larger fee can be expected in the first month, followed by smaller incremental fees in subsequent months. A large merger, of course, could result in a bumper month too.
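The pricing quoted above can be turned into a quick estimate. The sketch below assumes the free first GB and first 100,000 events apply to each billing month and that usage is billed pro rata; the article does not confirm either point, and the usage figures are made up.

```python
FREE_GB = 1.0                    # first GB classified is free (from the article)
PRICE_PER_GB = 5.0               # USD per GB after that
FREE_EVENTS = 100_000            # first 100,000 events are free
PRICE_PER_MILLION_EVENTS = 4.0   # USD per million events after that

def estimate_monthly_cost(gb_classified, events_processed):
    """Estimate one month's bill in USD (assumes per-month free tier, pro rata)."""
    classification = max(0.0, gb_classified - FREE_GB) * PRICE_PER_GB
    billable_events = max(0, events_processed - FREE_EVENTS)
    events = billable_events / 1_000_000 * PRICE_PER_MILLION_EVENTS
    return classification + events

# Hypothetical usage: a large initial backlog, then a small monthly increment.
print(estimate_monthly_cost(501, 2_100_000))  # first month: 2508.0
print(estimate_monthly_cost(21, 2_100_000))   # later month: 108.0
```

Under these assumptions, classifying a 501 GB backlog dominates the first bill, while a later month with 21 GB of new data costs a small fraction of that.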

AWS announced the service at its annual NY Summit today, boasting several marquee Macie customers, including Netflix, Autodesk and Edmunds. 

Stories by Liam Tung
