The datafication of social life has led to a profound transformation in how society is ordered, how decisions are made, and how citizens are governed.
This is particularly apparent in the use of data analytics in the public sector, where the collection and processing of large quantities of data are an increasingly integral part of government practice. Public services are allocated and state interventions are triggered on the basis of data analytics that assess, categorize, rate and rank people and predict their behavior and their ‘risks’. The use of scoring systems that combine data from a variety of both online and offline activities is an emerging practice that significantly affects state-citizen relations and our understanding of civic rights.
In the commercial realm, credit scoring is a common practice in the financial sector to assess an individual’s creditworthiness. A wider range of consumer scores are now being applied across different economic sectors to predict consumption patterns as well.
The use of data scores has also reached governmental and public services and is thus applied to assess our performance as citizens, not just consumers. The most comprehensive scoring system currently developed is the Chinese Social Credit Score. The system works by integrating the rating of citizens’ financial creditworthiness with a wide range of social and consumer behavior to assess people’s overall trustworthiness and allow, deny, and privilege services accordingly. It combines different types of data, including online consumption, use of services, legal and educational records, and social media activity. With few legal limits on data collection and use, it has been criticized as enabling a digital totalitarian state and as a tool for social control.
However, elements of this are emerging in Western countries and the wider Global North, too. Predictive analytics are being used in policing, fraud detection, welfare eligibility, the health sector and child protective services, among others. In criminal justice systems, risk assessment algorithms are used to produce ‘risk scores’ on defendants to estimate their likelihood of re-offending and thus inform sentencing. In border control, data-driven profiling based on a cross-set of aggregated data is increasingly used for ‘vetting’ the ‘threat’ of migrants and refugees to society.
Often, this is done with the aim of making public administration more efficient and evidence-based and of improving necessary state services, for example, identifying children in need of support.
However, there is growing concern about how such practices can lead to the profiling and labeling of citizens; the entrenchment of established forms of discrimination by using past patterns for future predictions; the disproportionate targeting of already marginalized communities, as with predictive policing; increased surveillance through data collection and sharing; and unaccountable forms of risk assessment.
The lack of transparency on where and how governments are making use of data systems means that citizens are assessed, treated and targeted without their knowledge. These systems are based on criteria we are not aware of, and there is little possibility to engage or to object.
This raises significant questions about the role of democracy in increasingly datafied and automated states.
The Data Justice Lab at Cardiff University conducted a one-year research project in 2018 to find out more about the use of data scores and other forms of data analytics in the UK. As part of our project ‘Data Scores as Governance’, we focused on mapping and analyzing local government uses of data analytics. The project led to: 1) a list and map of data analytics systems across local authorities, 2) a research report that details concrete examples of the different types of analytics systems being used as well as a survey of civil society concerns, and 3) an interactive online tool to facilitate greater research and debate.
Our findings showed the extent to which scoring systems and predictive analytics are used in government. We identified 53 local councils that use data analytics for issues such as child welfare, predictive policing, criminal justice and fraud detection, although it is highly likely that further systems in use were missed in the analysis, given the lack of transparency in this field.
The roll-out of data analytics systems is highly exploratory. Different systems are used from place to place, and there are no standard procedures for how data systems are implemented or for what purpose. Finally, there is simply no common understanding of what constitutes predictive analytics in government.
While effectiveness and efficiency gains are typically highlighted as rationales for implementing data analytics, our research shows that the austerity context of UK government policy and the resulting funding constraints faced by local authorities are the prime reasons for introducing or enhancing those systems. Local authorities are responding to cuts by using data to try to target resources better.
Many local authorities use commercial data analytics systems, provided by companies like IBM and Experian. This means that decisions on key public services are affected by commercial systems whose exact workings are often poorly understood by those who operate them. Furthermore, as commercial systems, their algorithms are trade secrets and thus outside public scrutiny.
Some of these systems aggregate and integrate a vast set of data types – socio-demographic data, consumer data, transactional data, social data, etc. Public service decision-making and ‘risk assessments’ of citizens may thus incorporate data on consumption habits, health, ethnicity, occupation, mobility patterns, social media activity, etc. They may include a variety of data types that were never intended for this purpose.
Data analytics systems are supposed to inform and assist human decision-making, but with their increasing integration in governance processes and service provision, the space for human decisions is shrinking. When supposedly objective data analytics lead to a certain outcome – for example, that parents are likely to neglect their child or a previous offender is likely to re-offend – it becomes more difficult for case workers to question these results and choose a different course of action. This is particularly so in times of shrinking budgets and increased pressures on time and resources.
As many data scores are in fact ‘risk scores’, they foster an interpretation of human behavior as ‘risk’ and serve as a particular lens through which the world is perceived. Moreover, they individualize that ‘risk’. Data scores and other data analytics systems typically point to the individual dimension of health issues, family problems, criminal justice issues, etc., rather than the societal, political and economic context. These therefore become individual problems that are ‘solved’ at the level of the individual person or household.
This dynamic resonates with established neoliberal approaches and is entrenched in the public sector through the seemingly objective practice of data analytics. An emphasis on risk assessment will lead to a broader shift in state operations – citizens will be viewed less as co-creators of the societies they are part of and more as potential risks needing management.
Civic participation in the development, implementation and use of data analytics systems is therefore crucial. So far, there is little effort to encourage citizen engagement and intervention. Yet, if datafied societies are to be democratic societies, data scores and other predictive analytics need to be closely scrutinized by citizens, through new forms of participation and civic control.