Big Data and elections: The candidates know you – better than you know them

Most political campaigns emphasize providing information – carefully controlled information – about a candidate to voters. But in the era of Big Data, they are also collecting information about voters – with little or no control, consent or security.

With the presidential nominating conventions looming, the candidates are getting ready to add to the hundreds of millions they’ve already spent to tell you about themselves – but only what they want you to know about themselves.

Meanwhile, they have also been spending millions of dollars collecting information about you – and you have no say in what is collected.

Which means that, in the era of Big Data, if you’re a potential voter, they know a lot more about you than you know about them.

The desire to know what will turn a voter to, or from, a candidate is not new, of course. Campaigns have been chopping up voters into interest groups for decades – minorities, gays, blue-collar workers, soccer moms, the religious right, progressives, boomers, NASCAR dads, union members, retirees, the rich, plus a host of occupational groups ranging from health care to law to the food and beverage industry.

They have been tracking voting history, political contributions and volunteer history as well.

But the information being collected now is much more, as they say, “granular.” It includes social media – everything from “friends” and “likes” on Facebook to YouTube views, LinkedIn profiles, activity on Pinterest, Tumblr, Instagram and Reddit to who a person follows on Twitter, or who they retweet.

It includes magazine subscriptions, the types of cars or boats voters own, where they shop, charitable contribution history, memberships, where they live, whether they rent or own a dwelling, whether they have a vacation home or own a gun, the permits and licenses they hold, and more.

All of which is designed to help candidates “micro-target” their message to groups of voters. They call it better communication, although it has an obvious element of manipulation to it.

“It can be as simple as swapping out a phrase that might have been found to be more appealing to one kind of voter, via focus groups, etc., or more complicated things like changing the visual demographics or traits of people appearing in ads,” said Joseph Lorenzo Hall, chief technologist at the Center for Democracy & Technology.

Josef (Joey) Ansorge, New York attorney and author of “Identify & Sort,” which includes a focus on the political implications of Big Data, said the ZIP code is among the most important pieces of information collected because, “where they live, where they work and where they went to school tell us a lot about individuals.”

When it is correlated with information gathered from contacts with voters, he said, “calls or visits inform the campaign how an individual is tending to vote.”

This, he said, lets campaigns create “micro” groups of voters, the most important of which are those considered “sway-able.” Obviously, that is the group the campaigns will try the hardest to influence.

But such detail about people’s lives, preferences and opinions – even their personal health – also raises both privacy and security concerns. How many people have access to it? How well is it being protected from online attacks? Will it be discarded after the election is over, or kept indefinitely? Could it be used by those who get elected and want to track those who supported their opponent?

Ansorge has a problem with using Big Data to send very different messages to different groups. “There is an elemental universalism to democracy that is undermined by these kind of practices,” he said, adding that he thinks voters ought to be made aware of how campaigns feed them information based on their profiles.

Andrew Hay, CISO of DataGravity, said he is not overly concerned about the collection of voter data itself, or even the tweaking of the message. “Candidates have a lot of information to remember, and the analysis of data simply helps them match the needs and wants of clusters of voters to a particular message,” he said.

But he said data security and governance are crucial. “I'm less concerned about the government keeping a ‘burn list’ of clusters of voters and more concerned with the protection, retention, and destruction of the data collected,” he said. “This includes raw data as well as any derived analysis from said data.”

That is also the view of Brenda Leong, senior counsel and director of operations at the Future of Privacy Forum. Big data analytics offers, “great new ways to engage with voters on the things that really matter to them, which results in more motivated, and hopefully better informed, participants in the electoral process, and likely higher turnouts on election day,” she said.

But she said “proper handling of the data” is not always easy for campaigns that tend to ramp up quickly from nothing to, “multi-million-dollar – even billion-dollar – enterprises, made up with large sections of volunteers or temporary staff. 

“Every campaign needs to treat security and privacy needs seriously, and have meaningful training for workers. We strongly recommend that every campaign have a chief privacy officer to monitor just these issues,” she said.

Ansorge agrees. “These databases have afterlives that are not under the control of the government or the party,” he said. “There is always a risk of abuse, by domestic and foreign actors. Here there is a perfect storm of data collected for a specific purpose potentially being abused for another.”

Unfortunately, there is ample evidence that it is more than just potential. Just three weeks ago, MacKeeper security researcher Chris Vickery discovered that a client of the data brokerage firm L2 was hosting a database with 154 million U.S. voter registration records and, “leaking information on a dizzying array of intimate details, including gun ownership, Facebook profiles, address, age, position on gay marriage, ethnicity, email addresses and whether a voter is ‘pro-life.’”

That wasn’t the only case. Six months earlier, Vickery discovered a “misconfigured” voter database with 191 million voter records – including his – that was, “just sitting in the public, waiting to be discovered by anyone who happens to be looking,” according to CSO’s “Salted Hash” columnist Steve Ragan.

Vickery told Ragan he was outraged to see his own record with, “details that could lead anyone straight to me. How could anyone with 191 million such records be so careless?”

Yet another breach, of 56 million records, included 19 million profiles that had not only voting history but also personal information like “Christian values, Bible study and gun ownership.”

Hall said those cases, along with nation-state hacking of campaign information systems, make it obvious that voters should be concerned about the data collection of modern political campaigning.

“Campaigns only seem to care about the security of data when they're protecting it from their political rivals,” he said. “Voters should be especially concerned because there are zero repercussions for campaigns mistreating or improperly protecting these data. The FTC has no jurisdiction over non-profits – there are serious First Amendment problems with government telling political speakers (campaigns) what to do.

“And there is zero chance that politicians will pass laws that reduce their capacity to micro-target, even if it means more robust protection of voter data.”

Beyond that, political databases are more likely to be hacked because they are shared more than those collected by commercial companies. Leong noted that, “companies routinely promise not to share your data, but campaigns and political advocacy organizations share data as a standard, so reading the disclosures or policies when submitting data is more important than ever.

“If you sign up for a particular cause or issue, that organization is likely telling you that they intend to share that information with ‘like-minded’ organizations, and you will end up on the mailing list for multiple causes,” she said.

Hall agreed. “If you donate to a campaign, one of the first things you see – and will see periodically after that – is a ‘We'd like to get to know you better!’ survey,” he said, adding that they will seek information on things like gun ownership and views on abortion, “that the campaign can't easily infer or purchase from other sources.”

He said even when voters volunteer that information, he is not sure they understand that it is used to get, “highly granular information about the voters for targeting, and in a number of cases this year, to get information about households around a given voter's address that might not be as forthcoming or politically involved, such as, ‘Do you know if any of your neighbors are gun owners too?’”

Ansorge said he thinks it would not be too difficult to create laws to limit data collection, especially governing presidential campaigns. “Candidates would self-discipline and would not want to create the potential scandal of their campaign being identified as law-breaking.”

He said voters could decide to give more of their personal information to the campaign they support – “we could think of it as donating your data,” he said – but the choice would be up to them.

Given the detail of the data collected, there is general agreement that there should be regulations on destroying it after a campaign ends.

Hay recommended that the U.S. adopt something like the General Data Protection Regulation (GDPR) in the EU, “specifically the Right to Erasure (right to be forgotten) language.

“If, as a citizen, I give consent to my data being collected and used in this manner I should also have the right to request what has been collected and the right to have it erased,” he said.

Stories by Taylor Armerding
