Data often finds itself embroiled in analogies describing its nature. From oil to air, many theorists have tried to capture the nature of data when it is used as an economic resource. Recently, a proposal was made to determine the ownership of this economic resource. The Committee of Experts on Non-Personal Data Governance Framework was set up by the Ministry of Electronics and Information Technology and recently submitted its recommendations. They focus on the economics of data use and can fundamentally change the digital economy. They allow some democratic say in how data is governed, and for this reason deserve conditional support.
The Committee recognises that data is increasingly contributing to wealth generation, and that the digital economy is characterised by concentrated markets. With this background, it makes recommendations concerning non-personal data. Recommendations on the protection of personal data were made separately by the Justice Srikrishna Committee in 2018.
The report defines non-personal data as either data that was never about identifiable people (such as data collected through sensors in the soil) or anonymised data (data about people that has gone through anonymisation processes so that it cannot be traced back to them). It recommends creating anonymisation standards and requiring consent before personal data is converted into anonymised non-personal data. It also recommends that aggregated non-personal data derived from sensitive personal data (such as data on political beliefs) should carry the same privacy protections as sensitive personal data.
The report says that certain kinds of non-personal data, even if they are collected by private companies, should not be considered the property of these companies. This means that if Uber collects data while providing a cab matching service, this data in its anonymised form cannot be said to belong to Uber alone. Potentially, other cab companies, Uber drivers, city dwellers and/or the municipal corporation could also have some rights over this data. We shall refer to this recommendation of making non-personal data available to certain groups and/or the public as “mandated non-personal data sharing”. This recommendation is not for all kinds of non-personal data, and neither does non-personal data become automatically shareable or shared.
The report proposes governance structures and infrastructure to facilitate mandated non-personal data sharing and use. Many technology policy experts have written about the need for a separate regulator, the arrangements between data intermediaries, and the mechanisms that trigger sharing of non-personal data. Here, the focus is on the concept itself, assuming that implementation is an open field.
Mandated non-personal data sharing has not been implemented anywhere at this scale. It should be understood as one tool in the new regulatory shed for a digital economy. As digitalisation extends beyond communication to production and economic activity in general, regulation must be updated to reflect this technological change. This article first engages with some of the criticisms of mandated data sharing, and then draws attention to its pitfalls.
The cocktail of monopoly:
Most non-personal data is socially produced and only socially meaningful. Unless the purpose is surveillance, the data produced by me driving around the city is only meaningful and useful when compared to data created by other people. Currently, data that is socially produced or meaningful is not socially owned. The status quo is that data collectors have all commercial rights over non-personal data. Large technology companies have built their market capitalisations on this monopoly control of personal and non-personal data. Digital companies like Amazon and Google grew partly due to network effects: their platforms connect people, and the more people join, the higher their value proposition, and the more people they attract. But network effects have been combined with the use of data, personal and non-personal, to bring these companies to their present heights. More people on a platform means more data, which often means better services (through more accurate algorithms). This is an automatic entry barrier for new firms, which find it difficult to overcome the network effects and data accumulation of incumbents. Network effects and data control constitute a cocktail of monopoly.
The status quo then has led us to private monopolies with all their attendant problems, including overpricing and regulatory capture. There is reason to challenge this status quo through regulatory action. As data started becoming a resource, it was turned into private property without anyone’s consent; it can just as well be afforded more democratic ownership through regulation now that the effects of such control and accumulation are clear.
Many decry mandated data sharing because technology companies invest money to build products, install sensors, etc., to collect data. But are these investments socially efficient? Large investments are made to earn fantastic rents through monopoly data ownership; if monopoly ownership is no longer possible for certain sets of data, these investments will not be made, and that is not necessarily an undesirable outcome. The reduction of economic rent is a net gain to the economy.
The Justice Srikrishna Committee recognised that people have at least one right over data—right to privacy. The Non-Personal Data Committee report goes further and says that people also have economic rights over data—even when such data is not explicitly about them personally, because it is about them collectively, and its use affects their lives and livelihoods.
A common view is that mandated data sharing will reduce innovation in the economy. This is reminiscent of the case made for intellectual property rights (IPR) promoting innovation. The innovation question is really about the counterfactual: how much innovation is currently not happening because data is monopolised? To illustrate, we know technology startups are being bought at an unprecedented rate by Big Tech, so much so that it is now normal for startups to aim only for acquisition.
Mandated data sharing will perhaps mean that the current business models cannot continue. Companies may no longer be able to offer a free service in exchange for data and then use that data to extract rent. But other data business models that are currently being thwarted can be built. Innovation and service improvement have come out of the interaction between network effects and data, not necessarily the monopoly control of data. It is also a bleak and incorrect view of the world to think that the only incentive for innovation is rent. In any event, there can still be profitable businesses that use data and thus there will always be an incentive to collect and use data.
Data as a resource:
One valid critique of mandated data sharing is that it entrenches the use of data as an economic resource. One can argue that the economic use of data has brought only misery to the world, and policy must not further such use. Some privacy activists argue that data can never be fully anonymised and should thus never be used, even in aggregate form. People also worry about the fate of the marginalised when data is used commercially.
These concerns point towards an argument to end or greatly limit all data collection for commercial use. We would then have to shut down Big Tech entirely in the interest of consistency. However, we can imagine many commercial and non-commercial uses for data that are beneficial to society, and that can be realised if non-personal data were to be treated as infrastructure rather than as private property. As recent events have shown, public health can benefit from data use. There is also great social value in figuring out the quickest routes to a hospital from any given location, a functionality that can be built over Google and Uber data. A food delivery service like Swiggy’s is valuable without its subordination of restaurants and exploitation of gig workers. We can imagine a less exploitative cooperative of gig workers competing with Swiggy if some data is mandatorily shared. Artificial intelligence-based speech-to-text solutions help people with disabilities communicate, and they require large data troves. Data is a resource, and its commercial and non-commercial use has harms in addition to benefits. That cannot be an argument against any collective governance of data, and it is certainly not an argument for the status quo.
A call for caution:
Mandated non-personal data sharing can potentially cause a positive disruption in the trend towards data accumulation and monopolisation. It can provide a way out of the senselessness of digital technology markets today, where the control that technology companies have over our lives worries these companies as well. But that does not mean that the concept of mandated non-personal data sharing is without problems.
First, it is inadequate as a regulatory tool on its own. We will still need the Competition Commission of India to actively examine and prevent dangerous mergers and acquisitions. We will still need critical hardware like semiconductor chips to be manufactured domestically. Most importantly, we will still need strong and predictable privacy protections. Non-personal data sharing without a privacy regime in place will be nothing short of a disaster. People will be left vulnerable to de-anonymisation, data leaks, and data misuse. Providing consent to Big Tech to use our personal data already carries little meaning today; a mandated data sharing system without privacy protections will make consent completely meaningless. It puts people in an impossible take-it-or-leave-it situation with digital companies. The ability of law enforcement agencies to access sensitive anonymised data should also be curtailed. Because de-anonymisation is possible, we should be protected against privacy harms, both from state and non-state actors, before anonymised non-personal data is shared. The sequencing is critical, and weak personal data protection legislation will make mandated non-personal data sharing look considerably less attractive.
Second, there is a good possibility that mandatory data sharing is used only as leverage by domestic capital against foreign capital, and nothing more. To be sure, this leverage is necessary. Indian digital start-ups find it difficult to compete with Big Tech. The ones that are able to hold their own do so on the basis of foreign venture capital, which is needed to build a data monopoly and consolidate network effects. In the future, traditional Indian industry will also face up to the threat of Big Tech. It will find that it will have to pay rent to technology giants to effectively digitalise its undertakings. For example, Internet of Things technology, which uses interconnected devices and digital data to engage in the real world, could add several levels of efficiency to manufacturing and commerce. Apple has been making great strides in IoT, and if the trajectory continues, even industrialists will be subordinate to Apple or other technology companies to be able to use this technology. This presents us with a problem: do even industrialists (let alone workers) not have a right over the data generated in their own factories using Apple IoT devices? Can they not use this data on their own to improve production, or must they buy reports based on this data from Apple?
The only competitor to the United States in new technologies is China, and it got there partly through protectionism and a disregard for intellectual property rights. Even the European Union is far behind, and is increasingly implementing protectionist measures to carve out a space for itself. Promoting Indian industry and innovation is a legitimate aim; using mandatory data sharing to do this is a legitimate, if new, method. But it is quite likely that a national Reliance monopoly, happy to collaborate with foreign capital to further its own interests, will become the greatest beneficiary of this policy. It is our job as alert and engaged citizens to ensure that this outcome is prevented and that the public welfare aspects of mandatory data sharing do not fall by the wayside.
We need to be vigilant to ensure that when a union, cooperative, research institution or public interest body requests data, its request is not unfairly denied. It is the same vigilance that the Right To Information Act asks of us. We will need to develop our own ideal frameworks for mandated non-personal data sharing to challenge frameworks that only help a few.
Third, the public welfare objective cannot be fulfilled in practice if there is no capacity-building of public and public interest organisations to use this data. Labour unions and labour departments will need the technical skills that are necessary to analyse, or even to contract out analysis of, data. The public education system must have the imagination to think of ways in which data can benefit students, far from corporatised models of remote education and computer-based assessment. Public services like India Post will have to start thinking about how they can use data generated by their own activities for public welfare, and also how they can make use of logistics data from e-commerce companies if it is mandatorily shared. We thus need to push for policies in other sectors, particularly in education, that will make such capacity possible.
A worthwhile endeavour:
All drawbacks considered, the idea of mandated non-personal data sharing deserves support because it is a more concrete, well-reasoned proposal than the much-touted call to “break up Big Tech”. The problem of data accumulation in interaction with network effects cannot be solved merely by retrospectively splitting technology giants into arbitrary arms. The problem must be tackled at as basic a level as we can manage, and the Non-Personal Data Committee recognises that a part of the problem is the default treatment of non-personal data as private property. Its recommendations should be met with qualified endorsement, and we should ensure that the institutions that implement such an idea are as democratic, transparent and effective as possible.
The author is a technology policy researcher focusing on the economics of platforms in the Global South. The views are personal.