Give Your Data Strategy a Reality Check

Proof Zero
7 min read · Feb 1, 2021

If you want to receive updates and more information on the cool things we are working on for the data science community, please visit:
https://omq1ez0wxhd.typeform.com/to/JSYqqbyv

You’ve probably heard a lot about digital identity but haven’t experienced material adoption. This article looks at the challenges digital identity faces and proposes a solution to push the field forward.

Oil Spills: Data’s Dirty Secret

Spoiler alert: your personal information is everywhere. If you’ve transacted on the internet, you need to assume your information is public. Privacy breaches are common, and organizations are struggling to stop the attacks. In September 2017, Equifax announced a data breach that exposed the personal information of 147 million people. In 2019, Desjardins Financial Group disclosed a data breach that it later said affected 4.2 million members and 1.8 million credit cardholders. If data is the new oil, spills are out of control.

Data leaks create a challenge for historically strong factors of identity. Your name, social insurance number, mother’s maiden name, and even your pet’s name are likely accessible to identity fraudsters. Even worse, traditional data handling copies data through noisy channels, producing dirty data that is hard to track, low in integrity, and virtually impossible to clean up. And individuals have no way to clean up these leaks themselves (you can’t change your birthday).

Organizations can’t trust data anymore. As a result, organizations are looking towards trustless solutions, like blockchain, to provide a renewable alternative to crude data — a shared, verifiable identity.

The impact on the online customer experience is huge. As recently as two years ago, you could go to your favorite telecom service provider’s website and fill out a form with your personal information to request a new phone and plan. You would receive your device in the mail, and then activate your contract. This was a great customer experience that relied on simple personal data verification — personal data that is now readily available to fraudsters. Telecoms that can’t mitigate this are taking down order pages and making customers validate their identities in-store, like in the 1990s. We’re going backward!

The Rise of Digital Identity: Innovation in “Renewables”

The market has responded to the dirty data problem with a unique collaboration between the private and public sectors. In the US, a bottom-up approach has emerged with dozens of new digital identity projects and companies collaborating on blockchain-based solutions. “Self-Sovereign Identity” (SSI), a framework championed by the Sovrin Foundation among others, aggregates these ideas under the concept of a consumer-owned digital identity. SSI has influenced other standards bodies like DIACC to adopt and develop these concepts into their frameworks, creating a massive business opportunity. The digital identity market, even in its infancy, surpassed USD 13.7 billion in 2019 and is expected to grow to USD 30.5 billion by 2024. Digital identity is climbing the Gartner Hype Cycle, with investment and uptake growing quickly. COVID has only accelerated this response!

The goal of digital identity is to enhance customer experience while maintaining the level of security, traceability, and verifiability required to comply with regulations. Customers surrender privacy for convenience, yet organizations still lack the data integrity to run strong KYC/AML checks. To achieve this goal, you need to simulate a “Hard ID,” like a passport or driver’s license, that has the same properties as traditional, physical identification: hard to copy, easy to understand, and easy to verify (especially against technology like deepfake attacks). That is why blockchain is such a great fit.

Blockchain-based solutions provide private, network-verified identification. However, just like energy renewables, digital identity requires significant investment in infrastructure and complex upstream integrations to succeed. And even if you solve those problems, blockchain solutions run into Metcalfe’s Law: a network’s value grows roughly with the square of the number of connected participants, so a small network is worth very little. In other words, the utility of a digital identity network scales with the number of participants in the network, specifically institutions that verify identity claims.

These small incentives for early participants have kept adoption low.
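To make the scaling argument concrete, here is a minimal sketch that assumes the common statement of Metcalfe’s Law: a network’s value tracks the number of possible pairwise connections, n(n-1)/2. The function names are illustrative, and the counts are abstract connections, not figures from any real identity network.

```python
def pairwise_connections(participants: int) -> int:
    """Number of distinct pairs among n participants: n * (n - 1) / 2."""
    if participants < 0:
        raise ValueError("participants must be non-negative")
    return participants * (participants - 1) // 2


def marginal_connections(participants: int) -> int:
    """New connections created when one more participant joins a network of n."""
    return pairwise_connections(participants + 1) - pairwise_connections(participants)


if __name__ == "__main__":
    # Compare what a newcomer gains at different network sizes.
    for n in (2, 10, 100, 1000):
        print(n, pairwise_connections(n), marginal_connections(n))
```

Under this assumption, an institution joining a two-member network gains two connections, while one joining a thousand-member network gains a thousand. The payoff for joining early is small, which is the incentive problem described above.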

Early Days: Data Tarpits

The size of the network is not the only problem digital identity faces. To go beyond verifiable identity and leverage the full potential of the network, organizations first need to solve upstream data integration and data integrity challenges.

If an organization can resolve its own customer identities across its systems of record, it can reflect them externally on the digital identity network to redefine the customer experience. For many organizations, however, resolving customer identities across internal data silos is a challenge on its own, especially if the organization has grown into multiple verticals organically or through acquisition, spinning off bespoke customer databases.

Traditionally, organizations tried to tackle this problem with data warehouses or data lakes. In many cases, these master data management (MDM) projects suffered from never-ending integration, quality, and integrity issues. What many organizations have been left with is an expensive Data Tarpit that drags on their top and bottom lines, with an estimated “12% potential revenue loss” due to these inherent quality issues. This is where AI and ML projects also tend to fail: bad data in, bad models out.

Oil and Water Don’t Mix: Data Tarpits vs Consumer Data Rights

Over the last few years, we have seen the market put a heavy focus on consumer data rights. This has given rise to new regulations like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). The common theme of these policies is how organizations collect, share, and monetize customer data. Their aim is to push organizations to invest in better, more transparent means of collecting consent and reporting on use.

If your data lake is more like a data tarpit, then complying with continuously evolving regulatory requirements is going to be a challenge. Digital identity systems promise to make this sticky process simple by giving consumers compliant access to their personal data. In other words, organizations can outsource data quality improvement efforts to their customers to create high-fidelity customer information assets.

However, adoption of a Hard Identity solution requires a heavy investment that only the most agile, technology-oriented organizations can afford. This compliance barrier puts consent-oriented walled gardens like Amazon, Facebook, and Google in a position to dominate their markets (this makes sense: CCPA is a California law, and the tech industry lobby is centered in California).

This is why Proof Zero invented a patent-pending “Soft Identity” approach.

No Spills. No Frills: Introducing Soft Identity

A Soft ID is a universal identifier derived from existing systems of record. Soft IDs take an inside-out approach to creating digital identities. Data is indexed to link customer records to internal/external stakeholders through standard interfaces, without needing to migrate data or stand up new infrastructure. This lets organizations leverage existing data assets into a digital identity system without the massive upfront investment required.
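The article does not describe how Proof Zero derives a Soft ID, and its approach is patent-pending, so the following is only a hypothetical illustration of the general idea: computing a stable identifier from fields that already exist in separate systems of record, without moving the data. The field names, the normalization rules, and the use of HMAC-SHA256 are all assumptions made for this sketch.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Lowercase, strip accents and outer whitespace, collapse inner spaces."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def derive_identifier(record: dict, key: bytes,
                      fields=("email", "name", "birth_date")) -> str:
    """Hypothetical identifier: keyed hash over normalized fields of a record.

    Records that agree on the chosen fields after normalization map to the
    same identifier, regardless of which internal system they came from.
    """
    material = "|".join(normalize(str(record.get(f, ""))) for f in fields)
    return hmac.new(key, material.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"demo-key"  # assumption: a secret managed by the organization
    crm = {"email": " Ana.Diaz@Example.com", "name": "Ana  Díaz",
           "birth_date": "1990-04-12"}
    billing = {"email": "ana.diaz@example.com ", "name": "ana diaz",
               "birth_date": "1990-04-12", "plan": "gold"}
    # Both records describe the same customer and yield the same identifier.
    print(derive_identifier(crm, key) == derive_identifier(billing, key))
```

A real system would need fuzzier matching than exact equality after normalization (typos, changed emails, missing fields), so treat this as the simplest possible case.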

Soft IDs are by no means a replacement for Hard IDs; however, they are a step towards the promised land. With Soft IDs, businesses can tackle their data integrity and quality challenges quickly so they can focus on delivering better customer experiences. Organizations that use Soft IDs also experience the benefits of the latest in cryptography and automation with a reduced risk profile through tight data access controls — don’t mitigate data breaches, eliminate them.

Soft IDs help businesses tackle challenges in coalition marketing, consent management, fraud detection, open banking, compliance, and other areas. They are interoperable across organizational trust boundaries, enabling organizations to fight back against data aggregators and walled gardens.

Hedging the Walled Gardens: The First-Party Data Network

A Soft ID not only helps organizations bust internal data silos but is also interoperable across organizations. Crossing those trust boundaries using the Soft ID allows organizations to collaborate privately and securely with customer data while managing consent. Nothing is more powerful or valuable than first-party data. It really is the new oil.

Organizations that exchange first-party data in a controlled way can collaborate to combat institutions that exploit their data and steal business.

This unique competitive position is achievable using a zero-knowledge approach: organizations use Soft IDs to discover data collaboration opportunities among themselves without revealing the underlying customer records, and without needing to engage legal and compliance teams at the discovery stage. Intersecting customer data sets in this way allows organizations to solve hard business problems that would otherwise be out of reach.
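As a rough illustration of discovering overlap without exchanging raw identifiers, here is a minimal sketch using keyed hashes. It is deliberately simpler than a true zero-knowledge or private set intersection protocol and is not Proof Zero’s method: it assumes both parties already share a secret key and already hold comparable identifiers, and anyone with the key could test guesses against the tokens. The names `blind` and `overlap_size` are hypothetical.

```python
import hashlib
import hmac


def blind(identifiers, shared_key: bytes) -> set:
    """Replace each identifier with a keyed hash so raw values are not exchanged."""
    return {
        hmac.new(shared_key, ident.encode("utf-8"), hashlib.sha256).hexdigest()
        for ident in identifiers
    }


def overlap_size(tokens_a: set, tokens_b: set) -> int:
    """Count customers present in both blinded sets."""
    return len(tokens_a & tokens_b)


if __name__ == "__main__":
    shared_key = b"agreed-out-of-band"  # assumption: exchanged securely beforehand
    bank_tokens = blind(["id-001", "id-002", "id-003"], shared_key)
    retailer_tokens = blind(["id-002", "id-003", "id-004"], shared_key)
    # Each side learns how many customers they have in common, not who they are.
    print(overlap_size(bank_tokens, retailer_tokens))
```

The count alone can be enough to decide whether a partnership is worth pursuing before any customer data or consent workflow is involved.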

Inside-Out Approach: Soft ID to Unified ID

Over time, as organizations focus on busting internal data silos and establishing external collaboration partnerships, a network will naturally emerge, addressing the Metcalfe’s Law scaling problem mentioned above. When this network reaches critical mass, organizations will have ready-made digital identity tooling in place and can choose whether and when to deliver a direct-to-consumer model.

A Soft ID is a rich business asset that is compatible with Hard ID networks, allowing the two systems to integrate easily. What this means is the two systems themselves can collaborate, enabling seamless consumer-to-business and business-to-consumer relationships. All that is required is for consumers to choose to link their Soft and Hard IDs to create a Unified ID.

Once linked, a consumer can manage their relationship with their service providers directly, and service providers will be able to provide a rich, trustless online customer experience.

Aligned Incentives Make the Transition Possible

Data tarpits are polluting your ecosystem with dirty data, making it difficult to maintain utility and compliance. Without solving these challenges, it will be difficult to deliver a trustless digital identity experience to the market.

Rather than attacking the problem from the outside, we need to better understand the internal data challenges many organizations are struggling with and provide the right business incentives to make the transition to digital identity seamless.

That is our mission at Proof Zero. Our solution not only enables organizations to solve data quality and integrity problems, but also helps them re-imagine consent management, protect consumer data privacy, and prioritize data-driven decision-making. Proof Zero is a zero-knowledge platform that bakes in security in a way that eliminates the risk of managing private customer information, making it simple for businesses and consumers to collaborate. Ultimately, companies that follow Proof Zero’s best practices in digital identity will be positioned to become leaders in customer experience and shape the future of their respective industries.

Proof Zero’s interoperable Soft ID and consent management solution is how banks are solving open banking, insurance providers are reducing fraud, and retailers are improving loyalty.



Proof Zero

Proof Zero allows data engineers and data scientists to explore, prepare, and maintain clean, high-quality data at scale.