AADHAAR: How do you build the world's largest identity system?

Speaking at the 2013 AISA conference, Srikanth Nadhamuni, Advisor to the UID Authority of India and Chief Executive Officer of Khosla Labs, talks about the challenges of building an identity system for India.

Here’s a challenge. Take a nation with a population in excess of 1.2 billion people and create a database that gives each and every person a unique 12-digit identifier and stores their fingerprints, a photograph and iris scans. The purpose is to provide secure authentication and access to services.

According to the Indian government, AADHAAR “... is primarily aimed at ensuring inclusive growth by providing a form of identity to those who do not have any identity".

"It seeks to provide UID numbers to the marginalised sections of society.

"Apart from providing identity, the UID will enable better delivery of services and effective governance."

In a country with widely varying wealth and literacy levels, this initiative is seen as a way of including everyone in a system so that there is equal access to government services.

This is critical in a country where 70 per cent of the population lives in villages and less than a third has a bank account.

Srikanth Nadhamuni, Advisor to UID Authority of India and Chief Executive Officer of Khosla Labs, highlights that there is a counter to what seems like a difficult social and economic situation. “There are over 800 million mobile phones,” he said.

The Indian government spends about $40 billion on subsidies for the poor and disadvantaged. However, between 30 and 40 per cent of that money is wasted through distribution to ghost or duplicate identities.

The challenge is that most records are paper-based and kept in the villages where they can’t be shared. Indian residents don’t have a centralised identity such as the Tax File Number system in Australia or Social Security Number in the United States. Identity documents are not portable or usable between cities.

So, how do you solve this problem? Nadhamuni says that the vision of AADHAAR was to create a “common national identity platform for every resident".

"This had to be biometric identity because we had to eliminate these duplicates and fakes. We can’t do this using names and addresses and leather-bound books and pen.

“We also wanted to create an online form of identification,” he added. “We wanted to jump from partial ID to full digital ID”.

Every enrolled resident’s biometric information is checked against the entire database at the time of enrolment to ensure that no-one enrols more than once. In the case where a resident lacks a full set of fingerprints - a common occurrence in an agrarian economy where fingerprints are worn away after many years of working the land - a second biometric is also collected. This is the iris of the eye.
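The enrolment-time check described above is a one-to-many comparison: each new record is tested against every record already stored. The following is a minimal sketch only. The names `similarity`, `find_duplicate` and `enrol`, the set-based "template" and the 0.8 threshold are all assumptions made for illustration; the article does not describe how AADHAAR's matchers work internally, and the fallback to iris data is not modelled here.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical stand-in for a biometric template: a set of feature labels.
Template = FrozenSet[int]


def similarity(a: Template, b: Template) -> float:
    """Jaccard overlap between two templates (illustrative only)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicate(candidate: Template,
                   repository: Dict[str, Template],
                   threshold: float = 0.8) -> Optional[str]:
    """Compare the candidate against every stored template (1:N search)."""
    for resident_id, stored in repository.items():
        if similarity(candidate, stored) >= threshold:
            return resident_id
    return None


def enrol(resident_id: str, candidate: Template,
          repository: Dict[str, Template]) -> bool:
    """Store the template only if no existing record matches it."""
    if find_duplicate(candidate, repository) is not None:
        return False  # assumed behaviour: reject a suspected duplicate
    repository[resident_id] = candidate
    return True
```

Because every enrolment scans the whole repository, the cost of each check grows with the number of residents already enrolled, which is why the matching volumes quoted later in the article are so large.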

AADHAAR supports the registration of residents into a Central ID Repository and verifies their identity when they access online services. Although literacy rates in India are lower than in many other countries, access to the Internet over mobile devices is widespread.

Enrolling over a billion people is a gargantuan task. This required the establishment of an ecosystem made up of state governments, which have the responsibility for managing projects within India’s federal system, as well as trained personnel for collecting the data, vendors for creating the biometric data collection equipment, verification authorities to ensure that the equipment was made to the correct specifications and many other entities. The states are paid for each identity that they collect.

“We had to build an entire ecosystem,” said Nadhamuni. This involved hardware vendors, training companies and numerous others, all needed to build the systems and supply the equipment for collecting the massive volumes of data.

Much of the technology underlying the AADHAAR system is based on open source tools. When the encrypted data packets for each enrolment are received they are stored using the Hadoop distributed file system. MySQL is also used with MongoDB to support the large data volumes and the need for searches to be executed quickly.

As the data moves through the various stages of validation, RabbitMQ is used as the messaging system for moving the data from stage to stage.
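The stage-to-stage hand-off described above can be pictured with a small sketch. This is not RabbitMQ code: Python's standard `queue.Queue` stands in for the message broker, and the stage names `check_complete` and `mark_validated` are hypothetical, since the article does not list the actual validation stages.

```python
from queue import Queue


def run_pipeline(packets, stages):
    """Pass each packet through the stages in order, one queue per hand-off."""
    inbox = Queue()
    for packet in packets:
        inbox.put(packet)
    for stage in stages:
        outbox = Queue()
        while not inbox.empty():
            result = stage(inbox.get())
            if result is not None:  # None means the stage rejected the packet
                outbox.put(result)
        inbox = outbox
    results = []
    while not inbox.empty():
        results.append(inbox.get())
    return results


def check_complete(packet):
    """Hypothetical stage: require at least one biometric in the packet."""
    return packet if packet.get("fingerprints") or packet.get("iris") else None


def mark_validated(packet):
    """Hypothetical stage: tag the packet as having passed validation."""
    return {**packet, "validated": True}
```

Calling `run_pipeline(packets, [check_complete, mark_validated])` returns only the packets that survive every stage. Placing a queue between stages means each stage can be scaled or restarted on its own, which is the usual reason for putting a message broker in this position.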

Nadhamuni says that “all of this is running on commodity blade servers" and "the idea is to make it scalable; it scales gracefully by throwing blade servers at it”. This is a critical consideration given the volume of data that the system has to manage and process.

Currently, AADHAAR is adding a million new residents each day, resulting in 500 trillion data matches as the biometric data is checked against the database. The current storage of 5 petabytes represents about 25 per cent of what will be needed in the long term. Interestingly, iris matching can be executed about a million times faster than fingerprint matching.
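The figures above can be checked with some back-of-the-envelope arithmetic. The sketch below assumes that each new enrolment is compared once against every existing record and that "trillion" means 10^12; the article does not say how matches are counted, so the implied database size is an inference, not a reported number.

```python
# Figures quoted in the article.
new_enrolments_per_day = 1_000_000
matches_per_day = 500 * 10**12  # 500 trillion, short scale (assumption)

# Assumption: one comparison per new enrolment per existing record.
implied_existing_records = matches_per_day // new_enrolments_per_day

current_storage_pb = 5
fraction_of_long_term_need = 0.25
projected_storage_pb = current_storage_pb / fraction_of_long_term_need

print(implied_existing_records)  # 500000000
print(projected_storage_pb)      # 20.0
```

Under these assumptions the quoted match volume corresponds to roughly 500 million records already enrolled, and the long-term storage requirement works out to about 20 petabytes.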

Three separate vendors are used for the biometric components as this delivers a much higher degree of accuracy when carrying out de-duplication of the biometric data. “When one thinks it has found a duplicate, we send those packets to the other two and thereby we increase the accuracy of the system.

“Currently, this system is two orders of magnitude more accurate than the previous best system before AADHAAR,” according to Nadhamuni. The three vendors also work competitively, so if one of the solutions is less effective than the others, workloads are moved. This impacts the revenues of the less effective service provider.

“Dynamic allocation of workloads was an innovation of this project,” said Nadhamuni. As well as de-risking the project, it motivates all three vendors to maintain high standards and to continually innovate.
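The two ideas in this passage, cross-checking a suspected duplicate with the other vendors and shifting work toward the better performers, can be sketched as follows. The majority-vote rule, the integer quality scores and the proportional split are assumptions chosen for illustration; the article does not state the decision rule or the allocation formula AADHAAR uses.

```python
from typing import Dict, List


def is_confirmed_duplicate(first_vendor_flag: bool,
                           second_opinions: List[bool]) -> bool:
    """Confirm a duplicate only if most vendors agree (assumed rule)."""
    if not first_vendor_flag:
        return False  # nothing was flagged, so nothing to cross-check
    votes = [first_vendor_flag, *second_opinions]
    return sum(votes) > len(votes) / 2


def allocate_workload(scores: Dict[str, int],
                      total_packets: int) -> Dict[str, int]:
    """Split packets in proportion to each vendor's quality score."""
    total = sum(scores.values())
    shares = {vendor: total_packets * score // total
              for vendor, score in scores.items()}
    # Give any rounding remainder to the best-scoring vendor.
    best = max(scores, key=scores.get)
    shares[best] += total_packets - sum(shares.values())
    return shares
```

For example, `allocate_workload({"A": 5, "B": 3, "C": 2}, 1000)` assigns 500, 300 and 200 packets respectively. Re-running the allocation as scores change is one simple way to get the revenue pressure the article describes.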

Security is a critical consideration in the AADHAAR system. All data is encrypted at the source and the number allocated to each resident is random - there is no intelligence attached to the sequence or selection of digits in the 12-digit identifier. When a system requests ID validation from AADHAAR it only receives a yes/no response - no other data is exchanged.
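Two properties named above, an identifier with no embedded meaning and a response that reveals nothing beyond yes or no, are easy to illustrate. This is a sketch under stated assumptions: the function names are hypothetical, the real number format may carry constraints not described in the article, and an exact byte comparison stands in for what would in practice be a fuzzy biometric match.

```python
import secrets
from typing import Dict, Set


def new_identifier(existing: Set[str]) -> str:
    """Draw a random 12-digit string that is not already in use."""
    while True:
        candidate = f"{secrets.randbelow(10**12):012d}"
        if candidate not in existing:
            existing.add(candidate)
            return candidate


def authenticate(identifier: str, presented: bytes,
                 repository: Dict[str, bytes]) -> bool:
    """Answer only yes or no; no stored data is returned to the caller."""
    stored = repository.get(identifier)
    if stored is None:
        return False
    # Exact comparison is a placeholder for a real biometric matcher.
    return secrets.compare_digest(stored, presented)
```

Because the digits are drawn at random, nothing about a resident (region, date of enrolment, and so on) can be inferred from the number itself, and the boolean return type ensures the requesting system learns nothing else about the record.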

All data is encrypted and never transmitted in the clear with data separated into different security zones. Incredibly, much of the data is delivered to the central data centre via physical hard drives. “Never underestimate the bandwidth of FedEx,” commented Nadhamuni.

It’s important to note that AADHAAR is not simply an identity verification system. It’s a platform that can be used as part of a broader security system. For example, if a bank wants customers to verify their identity using a credit card PIN code, this can be used in addition to AADHAAR. This means AADHAAR is not a “closed loop system”, added Nadhamuni.

This ability to leverage AADHAAR has resulted in a number of startups launching and in existing industries being able to reduce inefficiencies in payment and identity systems, so they can either improve existing services or deliver new ones.


Publisher's note: The author suggested in the original article that India had no privacy or data security laws. This is factually incorrect and the article has since been amended by the publisher.
