Algorithmic Regulation: A Critical Interrogation

Karen Yeung
Professor of Law, Centre for Technology, Ethics, Law & Society (TELOS), King's College London

Article in Regulation & Governance, July 2017. DOI: 10.1111/rego.12158. This paper has been accepted for publication by Regulation & Governance; the published version will appear in a forthcoming issue, available at http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1748-5991. Please do not distribute without the author's permission.

Abstract: Innovations in networked digital communications technologies, including the rise of 'Big Data', ubiquitous computing and cloud storage systems, may be giving rise to a new system of social ordering known as algorithmic regulation. Algorithmic regulation refers to decision-making systems that regulate a domain of activity in order to manage risk or alter behaviour through continual computational generation of knowledge, by systematically collecting data (in real time, on a continuous basis) emitted directly from numerous dynamic components pertaining to the regulated environment, in order to identify and, if necessary, automatically refine (or prompt refinement of) the system's operations to attain a pre-specified goal. The paper provides a descriptive analysis of algorithmic regulation, classifying these decision-making systems as either reactive or pre-emptive, and offers a taxonomy that identifies eight different forms of algorithmic regulation based on their configuration at each of the three stages of the cybernetic process: at the level of standard setting (adaptive vs fixed behavioural standards); at the level of information-gathering and monitoring (historic data vs predictions based on inferred data); and at the level of sanction and behavioural change (automatic execution vs recommender systems). It maps the contours of several emerging debates surrounding algorithmic regulation, drawing upon insights from regulatory governance studies, legal critiques, surveillance studies and critical data studies to highlight various concerns about the legitimacy of algorithmic regulation.

Keywords: big data, algorithms, surveillance, enforcement, automation

1. Introduction

"It's time for government to enter the age of big data. Algorithmic regulation is an idea whose time has come" (Tim O'Reilly, CEO of O'Reilly Media Inc).

A so-called Big Data revolution is currently underway, which many claim will prove as disruptive to society in the 21st century as Henry Ford's system of mass production was in the early 20th century (boyd and Crawford 2012).
Although 'Big Data' has been variously defined, I use the term to refer to a socio-technical ensemble that combines a technology (a configuration of information-processing hardware and software capable of sifting and sorting vast quantities of data in very short times) with a process (through which algorithmic techniques are applied to mine a large volume of digital data for patterns and correlations, distilling those patterns into predictive analytics, and applying the analytics to new data) (Cohen 2012: 1919). The excitement surrounding Big Data is rooted in its capacity to identify patterns and correlations that could not be detected by human cognition, converting massive volumes of data (often in unstructured form) into a particular, highly data-intensive form of knowledge, and thus creating a new mode of knowledge production (Cohen 2012: 1919). Industries, academics and governments are enthusiastically embracing these technologies, all seeking to harness their tremendous potential to enhance the quality and efficiency of many activities, including the task of regulation, which this paper interrogates by critically examining the phenomenon of 'algorithmic regulation'. It draws upon selective insights from the legal and social scientific literature, highlighting emerging critiques of algorithmic power and the rise of automated data-driven systems to inform decision-making and regulate behaviour. My primary aim is to map the contours of emerging debates, raising questions for further research rather than offering definitive answers, proceeding as follows. Part I offers a working definition of algorithmic regulation. Part II then constructs a taxonomy of algorithmic regulatory systems based on their configuration at each of the three stages of the cybernetic process: at the level of standard setting (adaptive vs simple, fixed behavioural standards); information-gathering and monitoring (historic data vs predictions based on inferred data); and at the level of sanction and behavioural change (automatic execution vs recommender systems). Parts III, IV and V provide a critical analysis of algorithmic regulation, identifying concerns about its legitimacy drawn selectively from several strands of academic literature, including regulatory governance and public administration, legal scholarship, surveillance studies and critical data studies. Part VI concludes, sketching the contours of a broad research agenda anchored within legal and regulatory governance scholarship.

2. The mechanisms and forms of algorithmic regulation

2.1 What is algorithmic regulation?

Although Silicon Valley entrepreneur Tim O'Reilly exhorts governments to embrace algorithmic regulation to solve policy problems, he does not define algorithmic regulation, but merely points to various technological systems[1] which he claims share four features: (a) a deep understanding of the desired outcome; (b) real-time measurement to determine if that outcome is being achieved; (c) algorithms (i.e. a set of rules) that make adjustments based on new data; and (d) periodic, deeper analysis of whether the algorithms themselves are correct and performing as expected (O'Reilly 2013). Because greater precision and rigour are required for critical analysis, I begin by offering a definition of algorithmic regulation, exploring what it means to describe something as 'algorithmic', and then explaining how I will understand the term 'regulation'.

[1] He refers to motor vehicle fuel emissions systems, airline automatic pilot systems, credit card fraud detection systems, drug dosage monitoring by medical professionals, internet spam filters and general internet search engines: O'Reilly (2013).
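Taken together, O'Reilly's four features describe a closed feedback loop. The following minimal Python sketch renders that loop under stated assumptions: the goal, the sensor and the tuning constants are hypothetical placeholders rather than details drawn from O'Reilly's examples.

```python
import random

# A minimal sketch of O'Reilly's four features as a feedback loop:
# (a) a pre-specified goal, (b) real-time measurement, (c) rule-based
# adjustment from new data, and (d) periodic deeper analysis of how the
# rule is performing. All names and numbers are illustrative assumptions.

TARGET_FLOW = 100.0                 # (a) the desired outcome (arbitrary units)

def read_sensor(control: float) -> float:
    """(b) Stand-in for real-time measurement: the regulated environment
    responds to the current control setting, with some noise."""
    return 0.8 * control + random.uniform(-5.0, 5.0)

control, errors = 50.0, []
for step in range(1, 101):
    observed = read_sensor(control)
    # (c) the adjustment rule: a simple proportional correction.
    control += 0.5 * (TARGET_FLOW - observed)
    errors.append(abs(TARGET_FLOW - observed))
    if step % 25 == 0:
        # (d) periodic deeper analysis: is the rule still performing?
        print(f"step {step}: mean recent error = {sum(errors[-25:]) / 25:.1f}")
```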
In their broadest sense, algorithms are encoded procedures for solving a problem by transforming input data into a desired output (Gillespie 2013; 2014). Although algorithms need not be implemented in software, computers are fundamentally algorithm machines, designed to store and read data, apply mathematical procedures to data in a controlled fashion, and offer new information as the output. Even when confined to software, the term 'algorithm' may be variously understood. Software engineers are likely to adopt a technical understanding of algorithms, referring to the logical series of steps for organising and acting on a body of data to achieve a desired outcome quickly, which occurs after the generation of a 'model', i.e. the formalisation of the problem and the goal in computational terms (Gillespie 2013; 2014; Dourish 2016). But social scientists typically use the term as an adjective to describe the sociotechnical assemblage that includes not just algorithms, but also the computational networks in which they function, the people who design and operate them, the data (and users) on which they act, and the institutions that provide these services, all connected to a broader social endeavour and constituting part of a family of authoritative systems for knowledge production. Accordingly, Gillespie suggests that, when describing something as 'algorithmic', our concern is with the insertion of procedure that is produced by, or related to, a socio-technical information system that is intended by its designers to be functionally and ideologically committed to the computational generation of knowledge. For him, 'what is central is the commitment to procedure, and the way procedure distances its human operators from both the point of contact with others and the mantle of responsibility for the intervention they make' (Gillespie 2014).

Although computational algorithms include those which encode simple mathematical functions, the excitement surrounding Big Data is largely attributable to sophisticated machine learning algorithms, fed by massive (and often unstructured) data sets, that operate computationally and depart from traditional techniques of statistical modelling (Dourish 2016: 7). Traditional statistical modelling requires the analyst to specify a mathematical function containing selected explanatory variables and, through regression analysis, enables identification of the goodness of fit between the data and these analytic choices. In contrast, machine learning does not require a priori specification of functional relationships between variables. Rather, the algorithms operate by mining the data, using various techniques[2] to identify patterns and correlations which are used to establish a working model of the relationships between inputs and outputs.

[2] Five well-used techniques are logistic regression models, the naïve Bayes classifier, k-nearest neighbors, decision trees and neural networks, all of which exemplify predictive modelling: Mackenzie (2015) 432-433.
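To make the contrast concrete, here is a brief illustrative Python sketch using synthetic data (all parameter choices are assumptions). The 'traditional' path fits a pre-specified linear function by least squares, while the machine-learning path, a k-nearest neighbors predictor (one of the techniques listed in note 2), assumes no functional form at all.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): a nonlinear relationship.
x = rng.uniform(0, 10, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(200)

# Traditional statistical modelling: the analyst specifies the functional
# form in advance (here y = a + b*x) and regression finds the best fit.
X = np.column_stack([np.ones_like(x), x])
a, b = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"pre-specified linear model: y = {a:.2f} + {b:.2f}x")

# Machine-learning style: no functional form is assumed. A k-nearest
# neighbors predictor averages the outputs of the k most similar past
# inputs, letting patterns in the data define the relationship.
def knn_predict(query: float, k: int = 5) -> float:
    nearest = np.argsort(np.abs(x - query))[:k]
    return y[nearest].mean()

print(f"kNN prediction at x=1.5: {knn_predict(1.5):.2f} (true ~ {np.sin(1.5):.2f})")
```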
This working model is gradually improved by iterative 'learning', that is, by testing its predictions and correcting them when wrong, until the system identifies something like what is understood in conventional statistics as a 'line of best fit', generating a model that provides the strongest predictive relationship between inputs and outputs.[3] Accordingly, this methodological approach is sometimes described as 'letting the data speak'[4] (Mayer-Schonberger and Cukier 2013: 6). While conventional statistical regression models worked with ten or so different variables (such as gender, age, income, occupation and educational level) and perhaps sample sizes of thousands, the machine learning algorithms that drive the kind of predictive analytic tools now commonly in use are designed to work with hundreds (and sometimes tens of thousands) of variables ('features') and sample sizes of millions or billions (Mackenzie 2015: 434).

[3] Machine learning algorithms can be broadly split into three categories based on how they learn. Supervised learning requires a training data set with labelled data, or data with a known output value. Unsupervised learning techniques do not use a training set, but find patterns or structure in the data by themselves. Semi-supervised learning uses mainly unlabelled data together with a small amount of labelled input data; using a small amount of labelled data can greatly increase the efficiency of otherwise unsupervised learning tasks, although the model must learn the structure needed to organise the data as well as make predictions (NESTA 2015: 5).

[4] There is a growing literature in 'Critical Data Studies' (or 'Critical Algorithm Studies') which seeks to explore data as situated in complex 'data assemblages' of action (Kitchin 2014b: 24-26), referring to the vast systems comprised not just of database infrastructures, but also the 'technological, political, social and economic apparatuses that frames their nature, operation and work', including processes from data collection and categorisation through to subsequent cleaning, storing, processing, dissemination and application (Kitchin et al 2015). A growing body of research examines the labour and political economies entailed in the reproduction of these assemblages, using a wide range of disciplinary lenses including STS (Ziewitz 2016; Beer 2017; MacKenzie 2015; Cheney-Lippold 2011) and focusing on data from a variety of sources, including meteorological data, data produced by for-profit education companies, financial trading data and biomedical data. This literature exposes the fallacy of understanding data as an objective set of facts that exist prior to ideology, politics or interpretation, by seeking to understand data as situated within the socio-technical systems that surround its production, processing, storing, sharing, analysis and reuse, thereby demonstrating that the production of data assemblages is not a neutral, technical process, but a normative, political and ethical one that is contingent and often contested, with consequences for subsequent analysis, interpretation and action (Kitchin 2014a). An illustration of the wide range of disciplinary perspectives, questions and approaches emerging within this field can be found in Big Data & Society (2016), Special Issue on Critical Data Studies, available at http://bds.sagepub.com/content/critical-data-studies (accessed 11 November 2016).
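The test-and-correct loop that converges on something like a line of best fit can be illustrated with a minimal gradient-descent sketch in Python (synthetic data; every number is an illustrative assumption, not anything drawn from the sources cited above).

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data with a known underlying line (an assumption for illustration).
x = rng.uniform(0, 10, 500)
y = 3.0 * x + 2.0 + rng.standard_normal(500)

# Iterative 'learning': predict, measure the error, correct, and repeat
# until the parameters settle on a conventional line of best fit.
slope, intercept, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    pred = slope * x + intercept          # test the current predictions
    err = pred - y                        # how wrong are they?
    slope -= lr * (err * x).mean()        # correct each parameter in the
    intercept -= lr * err.mean()          # direction that reduces the error

print(f"learned: y = {slope:.2f}x + {intercept:.2f} (true: y = 3.00x + 2.00)")
```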
Algorithmic decision-making refers to the use of algorithmically generated knowledge systems to execute or inform decisions, which can vary widely in simplicity and sophistication. Algorithmic regulation refers to regulatory governance systems that utilise algorithmic decision-making. Although the scope and meaning of the terms 'regulation' and 'regulatory governance' are contested (Baldwin et al 2010), I adopt the definition offered by leading regulatory governance scholar Julia Black, who defines regulation (or regulatory governance) as intentional attempts to manage risk or alter behaviour in order to achieve some pre-specified goal (Black 2014). Several features of this understanding of regulation are worth highlighting. First, although regulation is widely regarded as a critical task of governments, regulation is also pursued by non-state actors and entities (Black 2008). Just as a public transport authority may regulate vehicle movement to optimise traffic flow, a social media company such as Facebook might likewise regulate the posting and viewing behaviour of users to optimise its financial returns. Secondly, the size of the regulated 'population' is highly variable. It may range from the intentional actions of one person who adopts some system or strategy to regulate an aspect of her own behaviour (such as an individual who uses a fitness tracking device to help ensure that she attains a minimum level of daily physical activity) through to regulatory systems that seek to direct and influence the behaviour of a large number of people or entities, such as the algorithmic systems employed by digital car-sharing platforms such as Uber, which enable drivers to offer motor vehicle transport services to individuals at a pre-specified fee without any previous relationship between the parties. Thirdly, because regulation is above all an intentional activity directed at achieving a pre-specified goal, any regulatory system must have some kind of system 'director' (or 'regulator') to determine the overarching goal of the regulatory system. Accordingly, I refer to algorithmic regulation as decision-making systems that regulate a domain of activity in order to manage risk or alter behaviour through continual computational generation of knowledge from data emitted and directly collected (in real time, on a continuous basis) from numerous dynamic components pertaining to the regulated environment, in order to identify and, if necessary, automatically refine (or prompt refinement of) the system's operations to attain a pre-specified goal.

2.2 Forms of Algorithmic Regulation: A Taxonomy

Algorithmic regulation has antecedents in the interdisciplinary science of cybernetics that emerged in the aftermath of World War II. Cybernetic analysis sought to move away from a linear understanding of cause and effect and towards the investigation of control through circular causality, or feedback (Medina 2015). The logic underpinning algorithmic regulation, and the 'smartification' of everyday life which it makes possible, rests on the continuous collection and analysis of primary data combined with metadata, which logs the frequency, time and duration of device usage and which, via direct machine-to-machine communication over digital networks, allows the combined data to be algorithmically mined in order to trigger an automated response (Morozov 2014).
By understanding regulation as a cybernetic process involving the three core components of any control system, namely ways of setting standards, goals or targets ('standard-setting'), ways of gathering information ('information-gathering'), and ways of enforcing those standards once deviation is identified in order to change behaviour so that it meets the requisite standards ('enforcement and behaviour modification'), various forms of algorithmic regulation can be identified (Hood et al 2001: 23; Morgan and Yeung 2007: 3). My taxonomy identifies two alternative configurations for each component, thereby generating a total of eight different forms (see Table 1).

First, at the level of standard setting, the behavioural norm which the system enforces may be a simple, fixed (yet reprogrammable) standard of behaviour. This is the most basic form of algorithmic intervention, exemplified by the use of password protection systems to authorise access to digital content. Alternatively, the behavioural standard may be adaptive, facilitating the attainment of whatever fixed, overarching (yet reprogrammable) system goal the regulatory system is designed to optimise in order to produce system stability. These latter systems are often described as 'complex' or 'intelligent', such as intelligent transportation systems that effectively teach themselves how to identify the most reliable predictor of traffic flow through machine learning processes that apply trial and error to continuously updated real-time traffic data (Lv et al 2015). Although these systems allow behavioural variation, for example in vehicle speed limits and/or the duration, timing and frequency of traffic light cycles, depending upon traffic volume and distribution, the overarching objective is pre-specified and fixed by the system director: put simply, to optimise traffic flow within the system.[5]

[5] Algorithmic systems of this kind also underpin growing labour market practices, including the use of 'zero hour contracts' which subject workers to variable scheduling, concentrating paid work hours at times of high demand and thus shifting the risk of changing demand onto workers while increasing work intensity (Wood 2016). Similarly, algorithmic workplace performance management techniques rely on micro-level surveillance of call centre workers to provide feedback to employers and employees aimed at optimising worker productivity (Kuchler 2014; Edwards and Edwards 2016).

Second, at the level of information gathering and monitoring, the system may operate on a reactive basis, configured to automatically mine historic performance data in real time to detect violations. Simple reactive systems include automated speeding detection systems that utilise speed cameras to provide real-time identification of vehicles exceeding prescribed speed limits, while complex reactive systems include credit card fraud detection systems that utilise machine learning techniques to profile the spending patterns of credit card holders, aimed at detecting suspicious transactions when they occur and immediately alerting the credit provider and/or card-holder to take action. Alternatively, algorithmic systems may be configured to detect violations on a pre-emptive basis, applying machine learning algorithms to historic data to infer and thereby predict future behaviour. Simple predictive systems include digital text auto-complete systems, while complex pre-emptive systems make possible new forms of personalised pricing, applying machine learning algorithms to data collected from on-line tracking and measurement of user behaviour at a highly granular level to generate consumer profiles and varying the price of goods offered to individuals on-line based on algorithmic evaluations of the user's willingness and ability to pay (Miller 2014).
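Because each of the three cybernetic components admits two configurations, the eight forms summarised in Table 1 can be generated mechanically. The following short Python sketch enumerates them; the labels are my own shorthand for the alternatives described above.

```python
from itertools import product

# The three cybernetic components, each with its two alternative
# configurations (shorthand labels of my own).
STANDARD_SETTING = ("fixed", "adaptive")
MONITORING = ("reactive (historic data)", "pre-emptive (inferred predictions)")
ENFORCEMENT = ("automated execution", "recommender system")

# Two options per component across three components: 2 x 2 x 2 = 8 forms,
# listed in the same order as the rows of Table 1.
forms = list(product(STANDARD_SETTING, MONITORING, ENFORCEMENT))
for i, (std, mon, enf) in enumerate(forms, start=1):
    print(f"{i}. standard: {std} | monitoring: {mon} | enforcement: {enf}")
```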
Third, at the level of enforcement and behaviour modification, the system may automatically administer a specified sanction or decision without any need for human intervention beyond user input of relevant data (or data tokens), as in simple reactive systems that automatically block access to web content if the user fails to enter an authorised password. These systems constitute a form of action-forcing (or coercive) design (Yeung and Dixon-Woods 2010), thus offering the promise of immediate 'perfect enforcement' (Zittrain 2009).[6] Such systems may also operate pre-emptively, based on algorithmically determined predictions of a candidate's future behaviour, as in systems that automatically evaluate applications from individuals seeking access to services such as loan finance, insurance cover and employment opportunities (O'Neil 2016). Although a human operator might be authorised to review and override the automated decision at a later stage, automation is relied upon to make and implement a decision that has real, consequential effects for the individual.[7] Alternatively, both simple and complex systems may be configured to provide automated 'assistance' or 'recommendations' to a human decision-maker, by prioritising candidates from within the larger regulated population. These 'recommender systems' are intended to direct or guide an individual's decision-making processes in ways identified by the underlying software algorithm as optimal, offering prompts that focus a human user's attention on a particular set of entities within the data set, with the human user retaining formal decision-making authority, exemplified by on-line shopping recommendation engines (Yeung 2017).

[6] Although the self-executing capacity of these systems holds considerable allure by rendering human enforcement agents redundant, the legitimacy of 'perfect enforcement' has been questioned by cyberlawyer Jonathan Zittrain, who highlighted the dangers of the smart devices (which he termed 'tethered appliances') emerging in an earlier internet age, because they 'invite regulatory intervention that disrupts a wise equilibrium that depends upon regulators acting with a light touch, as they traditionally have done within liberal societies' (Zittrain 2009: 103). Moreover, the promise of 'perfect' enforcement is illusory, given the inevitable impossibility of defining 'perfect' standards capable of anticipating every possible future event that may be relevant to the operation of the regulatory system's goals (Yeung 2008: 92-93).

[7] It is the use of these kinds of algorithmic decision-making systems that has given rise to increasing concerns about errors in the underlying data, their application to particular individuals, and their potential discriminatory effects. Concerns of this kind have spawned a rising chorus of calls for mechanisms that can secure algorithmic accountability, discussed more fully in section 5 below.
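The enforcement-stage distinction can be made concrete with a toy Python sketch: the same scoring logic either executes a decision automatically or merely ranks candidates for a human decision-maker. The applicant names, scores and threshold are invented for illustration.

```python
# A toy illustration of the enforcement-stage alternatives: the same risk
# score drives either automatic execution or a recommender system that
# leaves the decision to a human. All data and thresholds are invented.

applicants = {"A": 0.91, "B": 0.42, "C": 0.77}    # hypothetical approval scores
THRESHOLD = 0.75                                   # assumed policy threshold

# Automatic execution: the system administers the decision itself.
decisions = {name: ("approve" if score >= THRESHOLD else "reject")
             for name, score in applicants.items()}
print("automated decisions:", decisions)

# Recommender system: the system only prioritises candidates, focusing the
# human reviewer's attention; formal decision authority stays with them.
ranked = sorted(applicants, key=applicants.get, reverse=True)
print("suggested review order for the human decision-maker:", ranked)
```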
Each of these forms of algorithmic regulation can be employed by state and non-state institutions. Some systems are applied to regulate the conduct of many millions of individuals, such as Facebook's News Feed system (Luckerson 2015), while others may be limited to managing relationships within a small group. In particular, when used to manage the relationship between parties to a contract, they have been referred to as computer-mediated contracts: arrangements between contracting parties which harness the capacity of networked communication systems to undertake continuous, real-time digital monitoring of behaviour to detect, monitor and enforce performance of the terms of the contract, thereby overcoming a significant limitation of conventional contracts: the need for the principal to monitor the behaviour of the agent to guard against the agent's temptation to 'shirk', that is, to act in a self-interested manner contrary to the interests of the principal (Williamson 1975). Hal Varian, Google's Chief Economist, provides two examples of computer-mediated contracts that significantly reduce these costs for the principal: first, remote vehicle monitoring systems that verify whether driver behaviour conforms with the desired standard, thereby enabling car rental companies to continuously monitor and verify whether a driver is honouring his or her contractual obligation to operate the car in a safe manner; and secondly, vehicle monitoring systems that enable automated remote enforcement, allowing a lender easily to repossess a car purchased on loan finance by a buyer who fails to make two consecutive monthly repayments, by automatically immobilising the car (Varian 2014). A simplified sketch of this second example follows Table 1.

Table 1: A taxonomy of algorithmic regulatory systems

#  | Standard setting | Monitoring                              | Enforcement/Sanction              | Description
1. | Fixed            | Real-time reactive violation detection | Automated sanction administration | Simple real-time systems
2. | Fixed            | Real-time reactive violation detection | Recommender system                | Simple real-time warning systems
3. | Fixed            | Pre-emptive ...                         | Automated sanction ...             | Simple pre-emptive ...
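As a concrete rendering of Varian's second example, the following deliberately simplified Python sketch encodes the automated remote-enforcement rule (two consecutive missed monthly repayments trigger immobilisation). The class and field names are hypothetical stand-ins, not a description of any real lender's system.

```python
from dataclasses import dataclass

@dataclass
class FinancedVehicle:
    """Hypothetical computer-mediated contract: the lender's system
    monitors repayments and enforces the loan terms automatically."""
    vehicle_id: str
    missed_in_a_row: int = 0
    immobilised: bool = False

    def record_payment(self, paid: bool) -> None:
        # Continuous monitoring: each monthly cycle updates the record.
        self.missed_in_a_row = 0 if paid else self.missed_in_a_row + 1
        # Automated remote enforcement, per Varian's example: two
        # consecutive missed monthly repayments trigger immobilisation.
        if self.missed_in_a_row >= 2 and not self.immobilised:
            self.immobilised = True
            print(f"{self.vehicle_id}: immobilised pending repossession")

car = FinancedVehicle("car-42")
for month, paid in enumerate([True, False, True, False, False], start=1):
    car.record_payment(paid)
    print(f"month {month}: paid={paid}, immobilised={car.immobilised}")
```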
