Data convenience isn’t a crime, but treating it as one should be
- 05 January, 2016 14:00
When a database of some 191 million voter records leaked just before New Year’s, the Internet freaked out over another massive data breach. Lost in the screaming was the fact that all of those millions of data points were already public information. The value to cyberthieves was solely a matter of convenience — a tidy pile of data all in one place.
People, this is a problem. Speaking as someone who cares intensely about both data security and privacy, I find that our tendency to react the same way to anything remotely resembling a data breach serves little useful purpose. But it does undermine serious security and privacy efforts in many ways.
As I’ve noted before, the more frequently the IT community overreacts to non-events in security and privacy, the weaker the response we can muster for significant security and privacy problems.
Let’s first delve into this 191 million-record incident, as it’s a bit complicated. This started with information published by DataBreaches.net, which is a wonderful resource for compilations of all kinds of data collections that have been exposed. It noted that the entries “may contain your first and last name, your home and mailing addresses, your date of birth, gender, and ethnicity, the date you registered to vote, your telephone number, your party affiliation, your e-mail address if you provided one when you registered, your state voter ID, whether you’re a permanent absentee voter, and whether or not you’re on the Do Not Call list.”
This brings us to the key question: “Isn’t all of that public data anyway?” The short answer is yes, but it’s not quite that simple. Actually, it is that simple, but given that many government workers — especially those within municipal and state governments — don’t understand how the Internet works, they craft rules that make no sense. Such is life.
A statement from one company that said it might be involved in the breach sheds a little light, albeit in an impressively squirrelly way. NationBuilder said, “While the database is not ours, it is possible that some of the information it contains may have come from data we make available for free to campaigns. From what we’ve seen, the voter information included is already publicly available from each state government so no new or private information was released in this database.”
That statement got into the “public or not” debate with this comment: “Each state has different restrictions and we make sure that each campaign understands those restrictions before providing them with any data.”
Therein lies the problem. Once the data is released and is being shared, it’s out. The Internet doesn’t respect national boundaries — if an Italian company doesn’t want to sell to someone in Albania, the onus is on that company to block such interactions — so it certainly won’t respect state rules. Also, the people being given this data, as opposed to, say, law enforcement employees or lawyers, are not licensed or managed in any kind of uniform way.
These rules are based on earlier rules, which long predate the Internet and, for that matter, phones with RAM and disk space that rival that era’s mainframes.
The idea is that the data is only supposed to be used for political purposes. But anyone can file to run as an independent candidate. Yes, candidates need to collect some number of verified signatures to get on the ballot, but no state rule makes access to the data conditional on actually being on the ballot.
The New York Times noted that the voter databases for North Carolina, Alaska, Florida and Washington, D.C., are free for all.
As an industry, we have to differentiate between restricted data (think payment card numbers, health records, bank account specifics) and unrestricted data. There honestly cannot be much in between. If data is unrestricted anywhere, it’s unrealistic to consider it protected somewhere else.
In this situation, are we to believe that the voting details of a North Carolina voter are less sensitive than those of a California voter? And how much faith do we realistically have that a political operative working in Iowa isn’t going to combine those records with those from Florida? And once data is combined, are we to believe it will somehow be segregated later so that those files comply with varying state ordinances? Really?
By the way, one example of restricted data that I chose not to cite comes from religious leaders. How long will it be before Catholic priests, for example, save confession details in a desktop database or, better yet, a mobile app (SaveASin?) for easier reference? As a parishioner leaves one congregation, the original priest could email that person’s confession files to the new priest.
The point is that we can’t get all worked up about the disclosure of data that was pretty much already widely disclosed. Technically, we clearly can get so worked up, but not if we want to be able to triage resources to safeguard truly sensitive data.