The word statistics means ‘science of the state’. This etymological origin is not fortuitous. Under the nation state, the concept of sovereignty acquired the political meaning of the self-determination of a people over their territory. Thus, citizens’ cognition of the physical, social and economic aspects that shape the national space has become a fundamental precondition for the exercise of sovereignty. Such collective cognition has historically been achieved through the national centralization of the production and distribution of official statistics across states. In a word, there is no national sovereignty without statistical sovereignty.

Official statistics are instruments that both guide the planned action of governments and agents in the national space, and enable the exercise of citizenship by the population. This includes the critical assessment of government performance, through reliable information constructed autonomously in a common political and cognitive space.* Official statistics thus establish a fundamental link between national sovereignty and democracy, ensuring citizens’ right to quality public information.**

Throughout the 20th century, states consolidated statistical systems centralized by National Statistical Offices (NSOs). At the same time, in the international field, efforts to articulate global statistical coordination reached their current configuration with the advent of the United Nations system. On the one hand, this meant the structuring of relatively autonomous public institutions on the border between science and the state. On the other, it produced a knowledge architecture caught in tension between national sovereignty and geopolitical interests. This borderline character, between science and the state and between sovereignty and geopolitics, makes NSOs unique and strategic institutions, especially in the current context of datafication.

By datafication, we understand the double process of converting different aspects of human life, the physical world, the economy, society, goods, and services to digital data format, and generating different forms of value from this data. This is a socio-technical process, encompassing new modalities of automated data processing, quantification, and analysis that transmute human cognition and action into an analyzable form. However, it also constitutes a new political and economic regime driven by the logic of capital accumulation, and characterized by the extraction, enclosure, and conversion of data into goods, assets, and capital.

Our studies show that, in the 21st century, Big Tech has been encroaching on the field of official statistics. In our understanding, this can compromise the statistical sovereignty of nation states, especially for countries in the Global South. Our empirical research demonstrates that this process occurs with the active involvement of companies such as Microsoft and Google in lobbying, financing, research, and development actions that can lead to changes in global statistical production and coordination structures, including impacts on national statistical systems. These actions are implemented through the ‘Big Data for official statistics’ and ‘data revolution for development’ agendas, which involve International Organizations, Big Tech, NGOs and other private companies, and National Statistical Offices from the Global North and Global South.

In our times, the financial, political and ecological crises point to an exhaustion of the neoliberal model of government. Such crises are accompanied by the emergence of a new data-driven economy and a mode of capital accumulation increasingly guided by Big Tech and digital platforms. In this context, the datafication process involves two unprecedented factors of tension for the official statistical field.

First, the economic and political model of Big Tech domination is guided by the logic of data enclosure and commoditization. This new political economy of data directly contradicts the traditional conception of data as public goods, a fundamental principle of official statistics. Second, there is an epistemological shift underway. New data sources controlled by private companies are a byproduct of digital user interactions. Unlike statistical sources structured to represent a certain population, Big Data has an unstructured and non-representative character. Thus, the data science that emerges, linked to private corporations, is based on another statistical interpretation, inclined towards induction, modeling, and prediction. This is, therefore, in contrast with traditional statistical practices—aimed at explaining phenomena, based on sampling theory and the deductive approach to the design of representative surveys.

As new informational modalities produced by Big Tech and a set of intermediaries begin to threaten the relevance of official statistics, NSOs are being pressured to ‘adapt’ or ‘modernize’ by engaging with Big Data. However, private control has prevented NSOs from freely accessing new data sources. At the same time, the production and circulation of official statistics constitute a potential niche market for private agents. Datafication thus represents a new field of tension between nation states and corporations.

In this disputed terrain, a discourse of ‘modernization of official statistics’ through Big Data emerges. This movement, which originated in the USA and some European countries in the second decade of the 21st century, began to stimulate public-private partnerships in the statistical sector and the creation of a new generation of data scientists at NSOs. These professionals have profiles that are more entrepreneurial and more receptive to innovation and collaboration with market agents than those of state statisticians, who are traditionally guided by the ethos of public service and established methods.

The pro-market movement in the official statistical field gained global scale in 2014, with its incorporation into the agenda of the UN Statistical Commission. This happened in a context in which the organization had become more dependent on private financial support after the 2008 crisis, which favored the leading role of the business sector in the 2030 global development agenda. The Sustainable Development Goals (SDGs) at the same time established unrealizable statistical demands on nation states (according to the latest SDG report, data was not provided by all countries for around one-third of the indicators) and drove the Big Data agenda at NSOs, through what was dubbed a ‘data revolution for sustainable development’.

Notably, as the European Union gradually moved away from the agenda of public-private partnerships in Big Data and sought to protect its statistical systems through regulation, this agenda was disseminated to the statistical sector of the Global South, upholding a ‘colonialist’ approach to data. Microsoft and Google are the main promoters of the ‘data revolution’ through permanent lobbying at the UN Statistical Commission and Division, materialized in the NGO ‘Global Partnership for Sustainable Development Data’, financed by foundations of these companies.

The UN Big Data Project, which results from high-level coordination with Big Tech, constitutes a global nucleus for experimentation and dissemination of new datafication practices for official statistics. In this context, the development of a platform project aimed at sharing data, methods, and technologies between NSOs and private corporations stands out. The UN Global Platform comprised the development of a political and technological structure managed by the NGO Global Partnership, which resulted in the implementation of four regional Big Data hubs at NSOs from the Global South: in Asia (Indonesia), Africa (Rwanda), the Middle East (United Arab Emirates) and finally Latin America (Brazil), where the regional hub was implemented at the Instituto Brasileiro de Geografia e Estatística (IBGE) in 2021, during the Bolsonaro government.

The evidence gathered in our research demonstrates that the Global Platform adopts a neocolonial conception of data that intensifies technological asymmetries, giving countries in the Global South a status of data providers and passive receivers of methods and technologies developed by countries and corporations in the Global North. We assess that the penetration of private corporations into national statistical systems, mediated by the Platform, represents a risk to the statistical sovereignty of countries in the Global South.

The data mobilized and stored by national statistical systems and the potential for consumption of new data sources, methods, and technologies by governments are key elements in understanding the interest of Big Tech in the statistical sector. Projects like the Global Platform are experiments that enable corporations to test new business models and data markets in the Global South. In doing so, they mainly seek: i) the standardization of methods for producing official statistics, dependent on data extracted and processed by private corporations; ii) the creation of data markets with the commercialization of data stored by private parties for the production of official statistics; iii) the creation of distributed cloud computing infrastructures to monetize access to data; and iv) the use of statistical data for AI training.

These models are still being experimented with. Their implementation faces resistance mainly linked to the current state, regulated, public, and national nature of official statistics, a resistance that materializes in protective counter-movements within the official statistical field. In Europe, under pressure from the European Statistical System, new legislation is being proposed that attempts to regulate compulsory access to data stored by private parties for statistical purposes, such as the Data Act and national laws. In Brazil, our research detected similar demands for regulation from the IBGE technical staff. We also identify the autonomous development of promising initiatives using new data sources (such as the use of web data for price and e-commerce statistics) and sober and responsible positions on the integration of new sources and methods, that is, integration carried out in an autonomous, sovereign, and self-determined way, while preserving traditional sources and methods.

In these counter-movements, it is important to highlight the role of ‘state statisticians’ who hold specialized scientific knowledge and are invested in a public ethos, guided by a commitment to the quality, confidentiality, and representativeness of data and the publicity, precision, and reliability of methods. The actions of soft power in the official statistical sector financed by Big Tech, such as Data Festivals, Datathons, exchanges, etc., must be understood as part of a strategy to gradually overcome national and internal resistance from NSOs to the advancement of private corporations in the statistical sector, through the breeding of a new generation of data scientists with a more entrepreneurial and disruptive profile, open to innovation and cooperation with market agents.

At this juncture, it is not a matter of giving up on the use of new data sources, methods, and technologies. On the contrary, the task imposed on the statistical sectors of the Global South is to appropriate the technological development of other dynamic centers, adapting it and creating their own capabilities, guided by the principles of sovereignty, transparency, and democracy. This requires above all an understanding of the political and economic determinations involved in the current datafication process.

To this end, we recommend: i) the legal and political reaffirmation of the strategic nature of official statistics for national sovereignty and democracy, as well as the centrality of the NSOs in this process; ii) a legal guarantee of the public nature of data for statistical and geo-scientific purposes; iii) regulation of NSOs’ access to data stored by private parties; iv) the review of NSOs’ international agreements involving the participation of Big Tech; v) the review of training processes for statisticians in data science, taking into account both the needs for technological development and the social, political, and economic aspects underlying new technologies and statistical epistemologies; vi) the political and technological structuring of National Statistical Systems in order to guarantee better coordination and cooperation between public entities, as well as security and sovereignty in the storage, sharing, and processing of data.

 

* The notion of a political and cognitive national space of equivalence conventions that sustains the statistical practices was developed by Alain Desrosières: https://books.openedition.org/pressesmines/901

** This link is the basis of the fundamental principles of official statistics ratified by the United Nations: https://unstats.un.org/unsd/dnss/gp/fundprinciples.aspx

This text was prepared for presentation at the National Conference of Data Users and Producers (CONFEST/CONFEGE) of the Brazilian Institute of Geography and Statistics (IBGE) at the round-table ‘Digital transversal communication and the risks and opportunities for producers and users in the Digital Era’ held on August 1, 2024.