Forging a Social Contract for Data

Mar 2022

The Draft Data Accessibility and Use Policy is silent on the norms, rules, and mechanisms needed to bring its vision to fruition.

In February 2022, the Ministry of Electronics and Information Technology (MEITY) released the draft India Data Accessibility and Use Policy 2022 (or Draft Policy) for public consultation. The Draft Policy aims to provide a robust scaffolding for harnessing public sector data for informed decision-making, citizen-centric delivery of public services, and economy-wide digital innovation. Specifically, it seeks to maximise access to and use of quality non-personal data (NPD) available with the public sector, overcoming a number of historical bottlenecks: slow progress on the Open Government Data (OGD) platform, fragmentation of data sets into departmental silos, absence of data anonymisation tools, insufficient attention to the development of data stewardship models, and lack of data quality standards, licensing, and valuation frameworks to support data-sharing.

Incomplete norms

This GovTech 3.0 approach — to unlock the valuable resource of public sector data — does upgrade the OGD vision of the National Data Sharing and Accessibility Policy (NDSAP), 2012. It seeks to harness data-based intelligence for governance and economic development. However, the Draft Policy’s silence on the norms, rules, and mechanisms to bring to fruition its vision of data-supported social transformation requires attention.

Ensuring greater citizen awareness, participation, and engagement with open data is mentioned as a core objective of the Draft Policy. In imagining such openness, however, the draft confines transparency of public data to non-personal data sets. Any attempt to promote meaningful citizen engagement with data cannot afford to ignore the canons of the Right to Information (RTI), and hence the need for certain citizen data sets with personal identifiers to be in the public domain so that proactive disclosure is meaningful. This does pose ethical and procedural dilemmas in balancing privacy and the risk of data misuse against transparency and accountability considerations. The Draft Policy has, therefore, lost sight of the NDSAP’s unfinished task of bringing coherence between restrictions on the availability of sensitive personal information in the public domain and India’s RTI.

Similarly, with respect to government-to-government data sharing for citizen-centric service delivery, the Draft Policy highlights that approved data inventories will be federated into a government-wide, searchable database. Given that citizen data sets generated during service delivery contain personal identifiers, the assumption here seems to be that adherence to anonymisation standards is a sufficient safeguard against privacy risks. But even in the case of anonymised citizen data sets (that is, data that is no longer personal), downstream processing can pose serious risks to group privacy. Considering that India has no personal data protection law, the convergent data processing proposed through the Draft Policy becomes especially problematic.

The Draft Policy adheres to the NDSAP paradigm of treating government agencies as ‘owners’ of the data sets they have collected and compiled instead of shifting to the trusteeship paradigm recommended by the 2020 Report of the MEITY Committee of Experts on non-personal data governance. Casting government agencies as owners of public sector data sets gives them carte blanche to classify their data holdings as “open, restricted or non-shareable” without any mechanisms for public consultation and citizen accountability. The lack of a data trusteeship framework also gives government agencies unilateral privileges to determine the terms of data licensing. Predecessor policies, moreover, ignored obligations for the regular updating of public data sets. Taking on board a trusteeship-based approach, the proposed Draft Policy must pay attention to data quality, and ensure that licensing frameworks and any associated costs do not pose an impediment to data accessibility for non-commercial purposes, while also protecting public sector data from being captured by large firms, especially transnational Big Tech, for economic innovation.

In the current context, where the most valuable data resources are held by the private sector, it is increasingly evident to policymakers that socioeconomic innovation depends on the state’s ability to catalyse wide-ranging data-sharing from both public and private sector actors across various sectors. The European Union, for instance, has focused on the creation of common, interoperable data spaces to encourage voluntary data-sharing in specific domains such as health, energy and agriculture. These common data spaces provide the governance framework for secure and trust-based access and use, in full compliance with personal data protection, and updated consumer protection and competition laws.

Creating the right conditions for voluntary data-sharing is a necessary, but not sufficient, condition for democratising data innovation. Competition law regulation has proven to be inadequate in the platform economy where first-movers entrench themselves owing to their intelligence advantage. And mandatory public access in exceptional cases such as public emergencies, suggested in the EU’s proposed Data Act (2022), cannot unlock the data enclosed by lead firms for public value creation, in general.

In this regard, the data stewardship model for high-value data sets proposed by MEITY’s Committee of Experts in their Report on Non-Personal Data Governance (2020) is instructive. In this model, a government or not-for-profit organisation may ask the Non-Personal Data Authority or NPDA (an independent institutional mechanism) to create a high-value data set (non-personal data only) in a particular sector. The requesting organisation must demonstrate the specific public interest purpose for which the data is being sought, as well as community buy-in on the basis of an appropriate public consultation process. Once such a request is approved by the NPDA, the data trustee has the right to request data-sharing from all major custodians of data sets corresponding to the high-value data set category in question – both public and private. Private sector custodians have a mandatory duty to comply with such requests for specific raw data fields. They can only claim trade secret protection in inferred data. If a data custodian refuses a data trustee’s request, the NPDA has the final say in resolving the dispute.

While the detailed checks and balances for such mandatory data-sharing arrangements are yet to evolve, the radical idea of high-value data sets as a social knowledge commons over which private data collectors have no de facto claim is vital to balance public use and private innovation.

What we need

What we need is a new social contract for data whereby: a) the social commons of data is governed as an inappropriable commons that belongs to all citizens; b) the government is the custodian or trustee with fiduciary responsibility to promote data use for public good; and c) democratisation of data value is ensured through accountable institutional mechanisms for data governance. The Draft Policy needs to be revisited from this perspective, in order to seize the data opportunity before it is too late.

This article was originally written by Anita Gurumurthy and Nandini Chami for The Hindu on March 26, 2022, and can be accessed here.

IT for Change's inputs on the Draft Data Accessibility and Use Policy can be read here.