
City boundaries and the universality of scaling laws



Elsa Arcaute∗1, Erez Hatna2, Peter Ferguson1, Hyejin Youn3, Anders Johansson4, and Michael Batty1

1 Centre for Advanced Spatial Analysis (CASA), University College London, UK
2 Center for Advanced Modeling, The Johns Hopkins University, USA
3 Santa Fe Institute, USA
4 Department of Civil Engineering, University of Bristol, UK

∗ Corresponding author, email: [email protected]

arXiv:1301.1674v1 [physics.soc-ph] 8 Jan 2013

Keywords: Power-laws; Scaling laws; urban indicators; city boundaries

Abbreviations: E&W, England and Wales; LUZ, Larger Urban Zone; US, United States of America; MSA, Metropolitan Statistical Area; UK, United Kingdom; ONS, Office for National Statistics; CI, Confidence Interval

Abstract

This paper investigates the universality and robustness of scaling laws for urban systems, according to the work by Bettencourt, Lobo and West among others [7, 8], using England and Wales as a case study. Initial results employing the demarcations for cities from the European Statistical Commission digress from the expected patterns. We therefore develop a method for producing multiple city definitions based on both morphological and functional characteristics, determined by population density and commuting to work journeys. For each of these realisations of cities, we construct urban attributes by aggregating high resolution census data. The approach produces a set of more than twenty thousand possible definitions of urban systems for England and Wales. We use these as a laboratory to explore the behaviour of the scaling exponent for each configuration. The analysis of a large set of urban indicators for the full range of system realisations shows that the scaling exponent is notably sensitive to boundary change, particularly for indicators that have a nonlinear relationship with population size. These findings highlight the crucial role of system description when attempting to identify patterns of behaviour across cities, and the need for consistency in defining boundaries if a theory of cities is to be devised.

Introduction

Every city has evolved under a unique set of geographical, political and cultural conditions [21] but despite heterogeneity in their historical trajectories, there appear to be certain characteristics common to all cities regardless of their location. Such characteristics include fractal properties [6, 1], Zipf distributions of city sizes [38] and population growth laws [19, 18, 16, 29, 32]. In the search for appropriate paradigms, a single characteristic of urban systems, population size, reveals scaling laws quantifying urban attributes ranging from innovation, income and employment rates, to household electrical consumption and road surface area amongst many others [7, 9]. These are encompassed in the following relationship

    Y(t) = Y_0 N(t)^{\beta}    (1)

where Y(t) and N(t) represent the urban indicator and the population size of a city at time t respectively, and Y_0 is a time dependent normalisation constant. Evidence of such laws has been observed in the US, Germany and China [10, 8] among other countries. The exponent β has been found to lie within three universal categories [7]: (i) β < 1 (sublinear), economies of scale associated with infrastructure and services, e.g. road surface area; (ii) β ≈ 1 (linear), associated with individual human needs, e.g. housing and household electrical consumption; and (iii) β > 1 (superlinear), associated with outcomes from social interactions, e.g. number of patents and income. A summary of the exponents found in [7, 10] is given in Fig. 1A.

Scaling laws of this sort have proven to be very successful in biology [36, 31], where size alone provides sufficient information to predict many properties of an animal, such as its life span, metabolism and heart rate. Consequently, identifying equivalent universal scaling laws for cities could be extremely important for furthering our understanding of urban dynamics, and may help to manage many contemporary global challenges that concern cities, such as the effect of transport and industrial emissions on climate change, the use of natural resources and the growth of urban poverty [14].

Nevertheless, many challenges need to be overcome before universal laws can be established [25]. For example, questions remain as to how to generate reliable results both within and across national datasets. The problem of ambiguity in the definition and measurement of urban indicators gives rise to incongruent comparisons between countries. In addition, it is not always possible to specify the generative mechanism of an urban indicator, in order to unequivocally assign it as the outcome of either social interactions, human needs or services. Furthermore, the quality of the data might not permit a full characterisation of the exponent in any one of the three regimes: sublinear, linear and superlinear.

These challenges bring forward a more fundamental aspect in the development of a theory of cities. This refers to the robustness of the model, assessed through the sensitivity of the exponent to different city definitions and sample distributions. In complex systems, it is well known that an observed power law is usually only valid for the tail of the distribution containing the largest set of events [24, 34, 26, 27]. This is the case for the Zipf distribution of city sizes, as has been pointed out in [16, 13, 23, 2]. Hence it is crucial to identify how city demarcations, and minimum population size cutoffs, give rise to large fluctuations in the value of the exponent.

In this work, we analyse the scaling behaviour of a range of indicators using census data on cities in England and Wales. After initially adopting a predefined set of standard city delineations, we find that the observed scaling relationships depart from the expected behaviour. We hypothesise that the unanticipated results may be due to the given definition of city boundaries. In response, we explore new methodologies to generate more realistic city boundaries from more disaggregate data. Our method identifies city extent by clustering very small scale geographic units according to population density and journey to work commuting trips, removing the need for an a priori assumption about boundary demarcation. This enables us to define cities based on both their morphological and functional extent. A series of urban areas is generated by clustering adjacent neighbourhoods that lie within a defined density threshold. Then the effective commuting hinterland of each core is identified by calculating the proportion of journey to work trips that are destined for each core. We can then probe the whole parameter space of density and commuting thresholds to obtain more than 20×10^3 realisations of cities. When the scaling laws are tested for all these system descriptions, we find that the exponent is consistent for variables in the linear regime, but that variables in the non-linear regime are highly sensitive to boundary definition and sample distribution. The results emphasise the need for a consistent methodology to define urban systems, in order to derive congruent comparisons between cities. They also reveal the importance of understanding the origin of deviations from a model, in order to construct a theory of cities.

Figure 1: Exponents with 95% CI for different urban indicators, colour-coded according to the category given in [7]. A) Results for the US, Germany and China, taken from table in Bettencourt et al 2007, B) Results for E&W (Larger Urban Zones), Table S2 contains details of variables. [Both panels plot the scaling exponent β against the urban indicators, with the sublinear, linear and superlinear regimes marked.]

Results

Scaling laws in England and Wales (E&W)

The complex, multi-layered nature of cities means that more than one reasonable definition of their extent can be produced depending on whether the political, economic [...] [3, 4] [...] in England and Wales (E&W), we ignore the challenge of boundary delineation by adopting the definition employed in [7], where cities are considered as integrated economic and social units. In the case of the US this definition corresponds to Metropolitan Statistical Areas (MSAs) [35], and an analogous set of areas for the European Union are Larger Urban Zones (LUZs) [17]. We select a range of observed socio-demographic indicators produced by the UK Office for National Statistics (ONS) from the 2001 census for E&W, and aggregate high resolution census units up to the LUZ classification. We exclude Scotland from the analysis since the National Records of Scotland applies a different methodology for data collation than the ONS in England and Wales (details of all data sources are provided in the SI Text in the 'Individual Data Tables' section). Analysis of scaling laws is then undertaken by classifying each indicator in terms of the above mentioned three regimes: sublinear, linear and superlinear.

When attempting to classify the resulting scaling behaviour from this initial exploration, many of the indicators could not be placed within a unique domain. See Table S1 and discussion in SI Text. Notwithstanding, there are some urban indicators in the list that can be clearly categorised in one of the three regimes. A summary of these results is given in Fig. 1B, where β is the exponent in eq. (1). These exponents pertain to cities defined in terms of LUZs for E&W. Details of the variables and the values for β, R^2, and the confidence intervals can be found in Table S2.

Although we observe many variables that do behave as expected in the sublinear and linear regime, given the confidence intervals associated with each exponent, further verification is clearly required for some of them. On the other hand, the values for the scaling exponent in the superlinear regime do not corroborate the expected ones. This is particularly surprising for three variables that are clearly outcomes of social and economic interactions: number of patents, household income and crime incidents. The latter can be regarded as one kind of social activity, which is therefore expected to increase superlinearly with city size as discussed in [8, 20].

Let us now investigate if all these discrepancies are due to an inappropriate definition of cities. Fig. S2 gives a representation of cities in E&W in terms of LUZs and their size distribution. The map shows a high degree of arbitrariness in the selection of cities and the delimitation of their boundaries. The size distribution is more or less Zipf. However, the next biggest cities after London seem to be underestimated according to this plot. The misrepresentation of these cities out of a total of 21, in addition to the absence of important cities such as Oxford and Reading, questions the soundness of the LUZ representation.
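The exponents and confidence intervals discussed above come from fitting eq. (1) to pairs of population and indicator values. Taking logarithms gives log Y = log Y_0 + β log N, so β can be read off as a slope. The sketch below is only an illustration of that idea: it assumes ordinary least squares on log-transformed data and a large-sample normal quantile (1.96) for the 95% interval, neither of which is stated in the text. The function names `fit_scaling_exponent` and `classify_regime`, and the rule of classifying by whether the interval contains 1, are hypothetical choices made for this example.

```python
import math


def fit_scaling_exponent(populations, indicators):
    """Fit log Y = log Y0 + beta * log N by ordinary least squares.

    Returns (beta, y0, (ci_low, ci_high)). The interval uses the normal
    quantile 1.96, an assumption that is only reasonable for large samples.
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in indicators]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = y_mean - beta * x_mean
    # Residual variance with m - 2 degrees of freedom (slope and intercept).
    residuals = [y - (intercept + beta * x) for x, y in zip(xs, ys)]
    s2 = sum(r * r for r in residuals) / (m - 2)
    half_width = 1.96 * math.sqrt(s2 / sxx)
    return beta, math.exp(intercept), (beta - half_width, beta + half_width)


def classify_regime(ci):
    """Hypothetical rule: an exponent is 'linear' if its interval contains 1."""
    low, high = ci
    if high < 1.0:
        return "sublinear"
    if low > 1.0:
        return "superlinear"
    return "linear"


# Synthetic cities generated exactly from Y = 2 * N^1.15 (illustrative only).
cities = [10_000, 50_000, 200_000, 1_000_000, 8_000_000]
values = [2.0 * n ** 1.15 for n in cities]
beta, y0, ci = fit_scaling_exponent(cities, values)
print(beta, y0, classify_regime(ci))
```

With real data the interval is often wide enough to straddle 1, which is the situation the text describes when indicators "could not be placed within a unique domain".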
In the next section we look into new ways of elucidating cities in E&W in an attempt to answer the question about the influence of boundary definition on observed scaling behaviour. Our first aim is to look for contours of cities that are consistent with the built environment and their economic flows. We achieve this by using population density as a fundamental urban property to construct the initial settlements. This is followed by an expansion of the boundaries driven by commuting to work flows. Our second aim is to explore scaling laws in a comprehensive set of systems of cities, so that we can analyse the behaviour of the scaling exponent under these different definitions. This will give us insight into the stability or sensitivity of such laws to city boundaries.

Redefining city boundaries through density

In order to redefine cities, we construct a clustering algorithm parametrised by population density. A similar algorithm can be found in [30]. The unit of agglomeration is a ward, which is the smallest geographical unit in the census data across many variables (see SI Text, 'Unit of Geography' section for details). Let ρ be our density parameter. We cluster all adjacent wards with density ρ_w such that ρ_w ≥ ρ. If a ward i has a density ρ_{w_i} < ρ, but is surrounded by wards such that ρ_w ≥ ρ, then the ward is also included in the cluster. This is done in order to avoid cities with holes. For example, if a ward contains a big park, such as in Richmond in Greater London, its density will be much lower than its adjacent wards. If left out of the cluster, the city will not only have a hole, but will be missing an important functional area. Cities are hence considered as continuous entities in this first approach.

In detail, the parameter ρ is varied within the interval [1;40] persons/hectare. The result is 40 different realisations of systems of cities for E&W, varying from very large clusters containing various settlements, to clusters containing only the core of cities for the highest density values. Maps for some of these cases are featured in the SI Text (see Fig. S3). When we follow the growth of cluster sizes resulting from the change in density from high to low values, we observe a sharp transition of the rank 3 cluster between ρ=13 and ρ=12, see Fig. 2A. This transition corresponds to the joining of Liverpool and Manchester. The biggest cluster encompasses London, and this grows steadily including small settlements as the density lowers, but does not merge with another big city within the interval considered, and this is why no transition is observed. If we select a density threshold before the merging of Liverpool and Manchester takes place, but near the transition since these two cities are very close spatially, we reproduce a system of cities that is very similar to the one prescribed by the built environment. This corresponds to ρ = 14, and we see from Fig. 2C that it recreates almost exactly the urbanised areas defined using the CORINE land cover dataset [15]. In addition, cities follow a Zipf distribution. Fig. 2B shows the cumulative distribution, Pr(X ≥ x), with an exponent of 2.07 and a very high p-value of 0.8, using the method for fitting a power-law distribution proposed in [12].

Figure 2: Clusters of cities for cutoff of 14 prs/ha. A) Transition in cluster size (normalised area and population of the rank 3 cluster against density threshold in prs/ha); B) Zipf distribution of city size (cumulative distribution for the population of cities, α=2.07, p=0.80); C) Corine satellite map of E&W: red corresponds to the built area and the black contours are the clusters defined for ρ=14 prs/ha.

Extending boundaries to include commuters

At this point, we expand our definition of the city from a purely morphological description to include some sense of its functional extent through an incorporation of commuting flow data into the clustering algorithm. Data on the total journey to work flows of commuters from every ward in E&W is provided by the ONS and is used to define the commuting hinterland for the original 40 cluster systems. The procedure operates as follows. For each realisation for ρ ∈ [1;40], we select only clusters whose population size N is such that N ≥ N_0, where N_0 ∈ {0, 10, 50, 100, 150}×10^3 individuals. Each ward is then added to the cluster into which the largest percentage τ of its people commute, provided that τ > τ_0, where τ_0 ∈ [0;100]. No continuity condition is imposed on the new clusters. The extreme value of τ_0 = 100 reproduces the original system. This procedure leads to a comprehensive list of 20.2×10^3 realisations of systems of cities. See the SI Text for a visual representation of some of these composites (Fig. S4). Exploring the full parameter space is useful to assess the behaviour of the scaling exponent, bearing in mind that for the extreme values of ρ and low values of τ, the sets of aggregates move further and further away from realistic descriptions of cities.

Sensitivity of the power law exponent

We make use of heatmaps to represent the values of β in eq. (1), for each of the five initial population cutoffs pre-commuting clustering: N ≥ N_0, in the whole parameter space for τ_0 ∈ [0;100] and ρ ∈ [1;40].

Figure 3: Heatmap for income (total weekly income) for different minimum population size thresholds: no cutoff, 10^4, 10^5 and 150×10^3. [Axes: population density cutoff ρ against commuting flow cutoff τ; colour scale: β.]

Figure 4: Heatmap for observables that belong to the linear regime (all household spaces; all people aged 16-74 in employment). The exponent remains consistent in the whole parameter space.
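The two-stage construction described in the preceding subsections (density clustering with hole filling, then extension by commuting flows) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: wards are represented by hypothetical string identifiers, `neighbours` is an assumed symmetric adjacency map, a ward counts as "surrounded" when every one of its neighbours meets the density threshold, and the commuting percentage τ is taken relative to all recorded trips leaving a ward. The function names `density_clusters` and `add_commuters` are invented for this example.

```python
from collections import defaultdict


def density_clusters(density, neighbours, rho):
    """Group adjacent wards with density >= rho into clusters.

    A ward below the threshold is still included when all of its neighbours
    are at or above it (assumed reading of 'surrounded', to avoid holes).
    """
    def dense(ward):
        return density[ward] >= rho

    included = {
        w for w in density
        if dense(w) or (neighbours.get(w) and all(dense(v) for v in neighbours[w]))
    }
    clusters, seen = [], set()
    for start in sorted(included):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], set()
        while stack:  # depth-first walk over included, adjacent wards
            ward = stack.pop()
            members.add(ward)
            for v in neighbours.get(ward, ()):
                if v in included and v not in seen:
                    seen.add(v)
                    stack.append(v)
        clusters.append(members)
    return clusters


def add_commuters(clusters, flows, population, n0, tau0):
    """Keep clusters with population >= n0, then attach outside wards.

    A ward joins the cluster receiving its largest share of commuters,
    provided that share (in percent) is strictly greater than tau0.
    """
    cores = [set(c) for c in clusters if sum(population[w] for w in c) >= n0]
    core_of = {w: i for i, core in enumerate(cores) for w in core}
    out_total = defaultdict(float)
    into_core = defaultdict(lambda: defaultdict(float))
    for (origin, dest), trips in flows.items():
        out_total[origin] += trips
        if dest in core_of:
            into_core[origin][core_of[dest]] += trips
    extended = [set(core) for core in cores]
    for ward, per_core in into_core.items():
        if ward in core_of or out_total[ward] == 0:
            continue
        best, trips = max(per_core.items(), key=lambda item: item[1])
        if 100.0 * trips / out_total[ward] > tau0:
            extended[best].add(ward)
    return extended


# Toy example (all values invented): 'b' is a low-density park ward between
# two dense wards, 'd' is a commuter ward, 'e' is a small dense settlement.
density = {"a": 20, "b": 5, "c": 25, "d": 3, "e": 15, "f": 2}
neighbours = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
              "d": {"c", "e", "f"}, "e": {"d"}, "f": {"d"}}
population = {"a": 60_000, "b": 5_000, "c": 50_000,
              "d": 4_000, "e": 8_000, "f": 1_000}
flows = {("d", "c"): 300, ("d", "e"): 100, ("d", "d"): 100,
         ("f", "a"): 10, ("f", "d"): 90, ("e", "a"): 40, ("e", "e"): 60}

cores = density_clusters(density, neighbours, rho=14)
cities = add_commuters(cores, flows, population, n0=10_000, tau0=50)
print(cores, cities)
```

Sweeping `rho` over 1..40, `n0` over the five population cutoffs and `tau0` over 0..100 would enumerate a parameter space of the kind the heatmaps summarise; because the comparison with `tau0` is strict, `tau0=100` leaves the density clusters unchanged, in line with the remark that this extreme value reproduces the original system.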
Linear relationships of variables with population size are generally not affected by the different definitions of cities presented above. For this reason, the whole parameter space tends to be homogeneous over the map for these variables. On the other hand, non-linear dependencies coming from aggregation effects will exhibit variations in the scaling exponent if initial conditions, given by the city limits and population aggregates through commuting patterns, are changed.

A first inspection indicates high variability between heatmaps for the same urban indicator but different population size cutoffs. The effect of imposing a minimum size on a settlement in order to consider it a city is the reduction in the number of cities that are included in the system to test for scaling laws, see Fig. S4. For the extreme scenarios, i.e. very high density and minimum population size of 150×10^3, the number of cities included in the distribution can vary greatly, from 429 with no cutoff, to only 5 cities if a large population size cutoff is imposed. Variations between the different heatmaps for the extreme values of the density parameter are therefore mainly due to the small sample size in the distribution, and no statistically sound conclusion can really be drawn from these extreme cases. This is particularly the case for Income as shown in Fig. 3. For the extreme scenarios, i.e. where there is only a very small number of cities included in the distribution, the weight of London becomes important, biasing the exponent to superlinearity. London is a positive outlier, and relevant economic agglomeration effects are clearly present, but do not seem to exist for the next biggest cities. We also observe that in this case no effects are recorded by including commuters into the analysis. For properties that belong to the linear regime, and where London is not an outlier, such as number of households and of people employed, the exponent remains consistent across the whole parameter space, see Fig. 4.

Conversely, the exponent for the observables in the sublinear regime can be greatly affected by variations in the threshold for the percentage of commuters. See Fig. 5 for employment in agriculture. In this case, extending the area of cities to include commuters makes the exponent shift from the sublinear to the superlinear regime. This effect is pronounced if in addition settlements are removed through the constraint of minimum population size of 10^5. Furthermore, for the range where clusters are just the central cores of cities, given by ρ > 30, the exponent also increases if population size constraints are applied. For variables such as distance to work, see Fig. S6, the exponent becomes superlinear, although as stated earlier, for these extreme cases no statistical validity can be obtained. There are also many variables that should present economies of scale, such as infrastructure variables, e.g. area of roads; and some employment categories such as elementary occupations and manufacturing, that nevertheless show a linear exponent if no cutoff on population size is imposed, see Fig. S7. If on the other hand only the tail of the distribution containing the large cities is taken into account, the exponent tends to sublinearity, although very weakly, and only for some of these variables.

Figure 5: Heatmap for employment in agriculture, hunting and fishing for different minimum population size thresholds: no cutoff, 10^4, 10^5 and 150×10^3.

The expected agglomeration effects for variables that are the outcome of social and economic interactions are in general not observed. Most of the employment categories corresponding to this regime have linear exponents, see Fig. S8, with the exception of Financial Intermediation. Nevertheless the value of the exponent is nowhere close to the expected 1.15 from supercreative employment or 1.30 from R&D employment. In addition, once the cities are extended as integrated economic entities by including commuters, the effect on the exponent is to lower its value towards linearity, instead of increasing it. The heatmaps show how these non-linear effects are highly sensitive to city definition and sample distribution, see Fig. 6. Different cutoffs on population size give very different results. And once again, the effect of London as a positive outlier becomes important for a small sample size.

Figure 6: Heatmap for employment in financial intermediation for different minimum population size thresholds: no cutoff, 10^4, 10^5 and 150×10^3.

Figure 7: Heatmap for area of non-domestic buildings (1000 m^2) for different minimum population size thresholds: no cutoff, 10^4, 10^5 and 150×10^3.

Other variables displaying superlinearity are area of non-domestic buildings and patents. For the former the superlinearity, which is considerable if no constraint on population size is imposed, is completely washed out if a minimum population size of 50×10^3 individuals is imposed, see Fig. 7. Patents on the other hand present the highest volatility, and each heatmap for different population size cutoffs gives a very different result, see Fig. 8. For this variable we need to impose a minimum population size threshold of 10^4 in order to obtain a dataset that does not contain many zeroes from the small settlements.

In conclusion, our results show that cities in E&W do not present economic agglomeration effects, with the exception of London, which is a positive outlier. Furthermore, if non-linear effects are present, the scaling exponent is highly sensitive to city delimitation. Therefore the value of the exponent for a single definition cannot be taken as a proxy to draw comparisons between cities if no consistent way to construct cities has been devised. On the other hand, we were also able to confirm that all the urban indicators that have a linear dependency with population size are robust to city demarcations. Any discrepancies found between observed linear dependencies and expected non-linear ones according to [7, 8] are therefore not due to a poor definition of cities, since more than 20×10^3 different configurations were explored, and linearity persisted. The main incongruity between observed and expected outcomes is the lack of superlinearity for exponents belonging to observables that are the product of economic and social dynamics, such as income and some employment categories requiring particular skills.

Our methodology provides a tool to construct multiple city limits in a systematic way, enabling us to define consistent systems of cities. Moreover, it emphasises the sensitivity to boundary definition for urban indicators that do not show a linear dependency with population size. This is specifically highlighted for patents, whose exponent is highly volatile across the whole parameter space. The sensitivity of non-linear exponents to boundaries indicates that comparisons drawn between cities based on the value of the exponent can be misleading.

Discussion

This work shows that the search for patterns in urban indicators is more intricate than previously thought. The specific demarcation and definition of cities play a crucial role in the distribution and measurement of urban attributes. Any dependencies found between the latter and population size are strongly affected by city selection and definition. The argument could however be reversed if universality existed. Contours of cities could be constructed according to expected statistical or scaling laws. This can only be done if these are not too sensitive to borders, since otherwise one would face the dilemma of what comes first: the boundary upon which the theory rests, or the theory that defines the boundary.

The lack of superlinear exponent values for indicators driven by social and economic interactions sets a significant discrepancy between results for E&W, and results found for the US and China in the literature. This raises many important questions. On the one hand, the distinct results obtained from different population size cutoffs indicate that maybe E&W is too small a system from which agglomeration effects can be measured properly. This also prompts us to investigate the necessary conditions that need to be fulfilled in order to observe the expected scaling exponents. In addition to the relatively diminutive scale of England and Wales compared to the US and China, the spatial distribution of its cities might be responsible for obscuring superlinear dependencies. Individual cities within the UK are perhaps too close to each other to be easily treated as discrete entities, or maybe London is simply too big relative to the rest of the system. This can be investigated further, by looking for agglomeration effects in other countries that contain a primate city such as London [22].

Alternatively, the main discrepancies might be due to the fact that the UK has experienced a process of de-industrialisation for longer than most western countries. The combined impact of globalisation and de-industrialisation may be causing a slowdown in the growth of the largest cities outside of London, in turn affecting the value of the exponent. This effect might be particularly noticeable in the results, as the exponent of a power law is driven by the largest events in the distribution and the weight of London is not big enough to skew the exponent. This hypothesis can also be put to the test by calculating the exponent for various indicators at the height of the British industrial revolution, during the growth phase of the above mentioned industrial cities. The scaling behaviour during this time period could be more in line with the expected scaling laws drafted in [7, 8]. If this is the case, perhaps there is a timeline that all countries follow after industrialisation that alters expected scaling behaviour?

On the other hand, the maturity of the UK with respect to industrialisation makes it a unique integrated urban system [5], in which, following governmental policies of regionalisation and decentralisation [11], critical functionalities were removed from the core of main cities and placed in other areas, smoothing away any agglomeration effects from economic output. Such regionalisation policies were applied by the government from the 1920s onwards, with respect to reducing congestion of employment in London, stemming the so called 'drift to the south' and attempting to reinvigorate all regions [...] relates as much to a global organisation of trade and interaction. This characteristic is reflected in the regressions, as London is a positive outlier for attributes that are expected to have superlinear dependencies.

The performance of cities such as London should possibly be evaluated [...] [33, 37, 28] [...] Following Sornette's [...] cities could be adopted, in which these global hubs are evaluated separately to their domestic counterparts. Sornette refers to the former as dragon-kings. A two-system theory of cities might then emerge: a regime for cities driving international dynamics, the dragon-kings, and a regime for the remaining cities composing a country.

Figure 8: Heatmap for total number of patents from 2000-2011 for different minimum population size thresholds: no cutoff, 50×10^3, 10^5 and 150×10^3.

The methodology employed in this paper had the purpose of recreating several representations of cities in England and Wales in order to explore the sensitivity of the scaling exponent, and to distinguish between linear and non-linear effects, by looking at the fluctuations between different realisations. Across more than twenty thousand descriptions, we were able to record and characterise the behaviour of the scaling exponent over the whole set. It is our intention to apply this methodology to other European countries, distinguishing between systems with and without primate cities. In addition, we intend to re-analyse the US by applying clustering algorithms to more disaggregate data than MSAs, and to assess historic datasets for the UK to evaluate the stability of the scaling exponent over time as well as space. The challenge of studying cities in a consistent manner is clearly considerable due to spatial and qualitative differences in data from location to location. It is however a necessary step for identifying truly universal patterns of behaviour. It is our hope that the approach described here goes some way to meeting this challenge, progresses the debate on scaling and city size and moves us closer to a better understanding of systems of cities.

Materials and Methods

Most of the variables come from the 2001 UK census dataset, produced by the Office for National Statistics. The data is of high spatial resolution, and is given at the level of wards. This is aggregated for each of the different realisations of cities described in the text. Each of the tables from which the indicators were obtained is described in detail in the SI Text. Data on patents was provided by the Intellectual Property Office at the postcode level, for the years 2000 to 2011. The dataset on household income was taken from UK census experimental statistics for 2001/02, and it was produced using a model-based process. Crime data was obtained from the Home Office, and is the average between the years 2003 and 2011. Finally, infrastructure data, such as the area of roads, paths and buildings, come from the 2001 Generalised Land Use Database. Details for all the variables are provided in the SI Text.

Acknowledgements

EA, EH, PF, AJ and MB acknowledge the support of ERC Grant 249393-ERC-2009-AdG. Useful discussions with Luis Bettencourt and Geoffrey West of the Santa Fe Institute, and José Lobo of Arizona State University helped clarify many issues. HY acknowledges the support of grants from the Rockefeller Foundation and the James McDonnell Foundation (no. 220020195).

References

[1] M. Batty. The size, scale and shape of cities. Science, 319:769, 2008.
[2] M. Batty. Visualizing spacetime dynamics in scaling systems. Complexity, 16(2):51–63, 2010.
[3] M. Batty. Cities, prosperity, and the importance of being large. Environment and Planning B: Planning and Design, 38(3):385–387, 2011.
[4] M. Batty. Commentary: When all the world's a city. Environment and Planning A, 43(4):765–772, 2011.
[5] M. Batty and P. Ferguson. Defining city size. Environment and Planning B: Planning and Design, 38(5):753–756, 2011.
[6] M. Batty and P. Longley. Fractal Cities: A Geometry of Form and Function. Academic Press, San Diego, CA and London, 1994.
[7] L. M. A. Bettencourt, J. Lobo, D. Helbing, C. Kühnert, and G. B. West. Growth, innovation, scaling, and the pace of life in cities. Proc. Natl. Acad. Sci. USA, 104(17):7301–7306, 2007.
[8] L. M. A. Bettencourt, J. Lobo, D. Strumsky, and G. B. West. Urban scaling and its deviations: Revealing the structure of wealth, innovation and crime across cities. PLoS ONE, 5(11):e13541, 2010.
[9] L. M. A. Bettencourt and G. B. West. A unified theory of urban living. Nature, 467:912–913, 2010.
[20] A. Gomez-Lievano, H. Youn, and L. M. A. Bettencourt. The statistics of urban scaling and their connection to Zipf's law. PLoS ONE, 7(7):e40393, 2012.
[21] P. Hall. Cities in Civilisation: Culture, Innovation and Urban Order. Weidenfeld and Nicholson, London, 1998.
[22] M. Jefferson. The law of the primate city. Geographical Review, 29:226–232, 1939.
[23] B. Jiang and T. Jia. Zipf's law for all the natural cities in the united states: a geospatial perspective. Int. J. of Geo. Inf. Sci., 25(8):1269–1281, 2011.
[24] M. E. J. Newman. Power laws, pareto distributions and Zipf's law. Contemp. Phys., 46(5):323–351, 2005.
[25] A. Noulas, S. Scellato, R. Lambiotte, M. Pontil, and C. Mascolo. A tale of many cities: Universal patterns in human urban mobility. PLoS ONE, 7(5):e37027, 2012.
[26] R. Perline. Weak and false inverse power laws. Statistical Science, 20(1):68–88, 2005.
[27] C. M. A. Pinto, A. Mendes Lopes, and J. A. Tenreiro Machado. A review of power laws in real life phenomena. Commun Nonlinear Sci Numer Simulat, 17:3558–3578, 2012.
[28] V. F. Pisarenko and D. Sornette. Robust statistical tests of dragon-kings beyond power law distributions. European Physical Journal, Special Topics (special issue on power laws and dragon-kings), 205:95–115, 2012.
[29] H. D. Rozenfeld, D. Rybski, J. S. Andrade, Jr., M. Batty, H. E. Stanley, and H. A. Makse. Laws of population growth. Proc. Natl. Acad. Sci. USA, 105(48):18702–18707, DEC 2 2008.
[30] H. D. Rozenfeld, D. Rybski, X. Gabaix, and H. A. Makse. The area and population of cities: new
insights from a different perspective on cities. Am. [10] L. M. A. Bettencourt, J. Lobo, and G. B. West. Econ. Rev., 101:2205–2225, 2011. Why are large cities faster? Universal scaling and [31] K. Schmidt-Nielsen. Scaling in Biology - Conse- self-similarityinurbanorganizationanddynamics. quencesofsize.J.Exp.Zool.,194(1):287–308,1975. Eur. Phys. J. B, 63(3):285–293, 2008. [32] F.Sembolini. Hierarchy,citiessizedistributionand [11] T. Champion, M. Coombes, S. Raybould, and Zipf’s law. Eur. Phys. J. B., 63:295–301, 2008. C. Wymer. Migration and Socioeconomic Change: [33] D. Sornette and G. Ouillon. Dragon-kings: mecha- A 2001 Census Analysis of Britain’s Larger Cities. nisms, statistical methods and empirical evidence. Policy Press, London, 2007. European Physical Journal, Special Topics (special [12] A. Clauset, C. R. Shalizi, and M. E. J. Newman. issue on power laws and dragon-kings), 205:1–26, Power-law distributions in empirical data. SIAM 2012. Review, 51(4):661–703, 2009. [34] M. P. H. Stumpf and M. A. Porter. Critical truths [13] M. Cristelli, M. Batty, and L. Pietronero. There is about power laws. Science, 335:665–666, 2012. more than a power law in Zipf. Sci. Rep., 2:812, [35] USCB. Geographic areas reference 2012. manual. united states census bureau. [14] M. Davis. Planet of Slums. Verso, London, 2006. http://www.census.gov/geo/reference/garm.html, [15] EEA. Corine land cover update 2000, technical 1994. guidelines. European Environment Agency. ISBN: [36] G. B. West, J. H. Brown, and B. J. Enquist. A 92-9167-511-3, Copenhagen, 2002. general model for the origin of allometric scaling [16] J. Eeckhout. Gibrat’s law for (All) cities. Am. laws in biology. Science, 276(122), 1997. Econ. Rev., 94(5):1429–1451, 2004. [37] V. I. Yukalov and D. Sornette. Statistical outliers [17] EUROSTAT. Urban Audit, Methodological Hand- and dragon-kings as bose-condensed droplets. Eu- book. 
Office for Official Publications of the Euro- ropean Physical Journal, Special Topics (special is- pean Communities. ISBN 92-894-7079-8, Luxem- sue on power laws and dragon kings), 205:53–64, bourg, 2004. 2012. [18] X.Gabaix. Zipf’slawandthegrowthofcities. The [38] G. K. Zipf. Human Behavior and the Principle American Economic Review, 89(2):129–132, 1999. of Least Effort. Addison-Wesley, Cambridge, MA, [19] X. Gabaix. Zipf’s law for cities: An explanation. 1949. The Quarterly Journal of Economics, 114(3):739– 767, 1999. 7 Supporting Information Abbreviations: CAS, Census Area Statistics; ST, householdspaces. Ahouseholdspaceistheaccommoda- Standard Table; SIC, Standard Industrial Classification tion occupied by an individual household or, if unoccu- pied,availableforanindividualhousehold. Thepopula- SI Text tion of this table is therefore all household spaces. The Unit of Geography category used for regression was all household spaces and therefore included all spaces whether they were oc- The underlying spatial unit for all city cluster aggre- cupied or unoccupied. gations is the Census Area Statistics (CAS) ward def- inition produced by the UK Office for National Statis- Travel to Work (Table KS15) tics. Ward boundaries reflect the political geography DataontraveltoworkdistanceswastakenfromtheUK of the UK at a fine resolution and due to the need to census table KS15. The table shows both the length maintainequalityofrepresentationinpoliticalelections, and the means of travel to work used for the longest havesimilarpopulations. CAS(CensusAreaStatistics) part, by distance, of the usual journey to work. For ward boundaries in particular have been the standard the purposes of this table, public transport is defined format for the release of ward level census information as Underground, metro, light rail or tram, train and since 2003. They reflect electoral ward boundaries pro- bus, minibus or coach. 
The distance travelled to work mulgated as at 31/12/2002 and contain 8850 separate is the distance in kilometres of a straight line between wards for England and Wales. the residence postcode and workplace postcode. The Much of the 2001 census data was initially aggregated distance is not calculated for people working mainly at into a previous definition of ward boundaries known or from home, people with no fixed workplace, people as Standard Table (ST) wards which are closer to the working on an offshore installation or people working original electoral ward boundaries. These original ward outsidetheUK.Thepopulationofthetableisallpeople boundaries contain 18 wards with fewer than 100 resi- aged 16 to 74 in employment. dents or 40 households so these small wards have been merged with otherwardsto protect data confidentiality Industry of Employment (Table UV34) and create the CAS ward definition. Further informa- Dataontheindustryofemploymentofresidentemploy- tiononwardboundarydefinitionsandfortheconversion ees was taken from UK census table UV34. The table tablebetweentheolderSTwarddefinitionandthecon- showstheusualresidentpopulationaged16to74inem- temporary CAS ward definition can be found at [1, 2]. ployment by the industry they work in. The industry General information on census geography including in- in which a person works is determined by the response formation on the geographic hierarchy of the vari- to the 2001 census question asking for a description of ous units including Lower Layer Super Output Areas thebusinessofthepersonsemployer(orownbusinessif (LSOAs), Middle Layer Super Output Areas (MSOAs), self-employed). Theresponseswerecodedtoamodified Wards,LocalAuthorities(LAs),GovernmentOfficeRe- version of the UK Standard Industrial Classification of gion (GOR) and geometric centroid definitions can be EconomicActivities1992UKSIC(92). Thepopulation found at [1, 2]. in each category calculated for all people aged 16 to 74 in employment. 
Census data used in the study was only provided for England and Wales as the process for collating data Inthe2001census,industryofemploymentinformation in Scotland and the definition of geographic boundaries was collected for usual residents. A usual resident was meant that equivalent datasets could not be produced. generally defined as someone who spent most of their More information on census geography for Scotland for time at a specific address. It included: people who usu- the 2001 census can be found at [3]. ally lived at that address but were temporarily away (on holiday, visiting friends or relatives, or temporar- Individual Data Tables ily in a hospital or similar establishment); people who worked away from home for part of the time; students, The original data and associated metadata for tables if it was their term-time address; a baby born before 30 UV02, UV53, KS15, UV34, KS12 and Income can be April 2001 even if it was still in hospital; and people foundunderthetopicssectionoftheUKneighbourhood present on census day, even if temporarily, who had no statistics website [1]. otherusualaddress. However,itdidnotincludeanyone Population (Table UV02) presentoncensusdaywhohadanotherusualaddressor The data on population was taken from UK census ta- anyonewhohadbeenlivingorintendedtoliveinaspe- ble UV02. Population data was taken from a data ta- cial establishment, such as a residential home, nursing ble on population density at the CAS ward level which home or hospital, for six months or more. provided separate statistics for total population, ward The category name for industry of employment used area and a result in population density figure. The to- in the study were the following: Agriculture, hunt- tal population figure was used for all regressions with ing and forestry; Manufacturing; Construction; Hotels socio-demographic variables used in the study. 
andRestaurants;FinancialIntermediation;RealEstate, renting and business activities; Public administration, Housing Stock (Table UV53) defence and social security; Education. Data on household dwelling numbers comes from cen- sus table UV53. The table provides information on the Occupational Groups (Table KS12a) number of households, occupied or unoccupied, within DataonoccupationalgroupswastakenfromUKcensus each ward. Unoccupied household spaces are split into table KS12. The information on this table comes from second residences/holiday accommodation, and vacant responsestoquestionsaskingforthefulltitleofthemain 8 jobanddescriptionofwhatisdoneinthatjobfromthe Resources Survey (FRS) for the same year (2001/02)1. 2001 census. The population of the table is all people The total sample size for the 2001 survey was 42,000 aged 16 to 74 in employment and the values see were addresses taken from across the UK. The FRS provides the absolute count values as opposed to the percentage four variables that can then be generated for the whole values also provided. country: For employment related data, any person who carried 1) Average weekly household total income (unequiv- out paid work in the week before the census, whether alised). self- employed or an employee, is described as em- ployed or in employment. ’Paid work’ includes casual 2) Average weekly household net income (unequiv- or temporary work, even if only for one hour; being on alised). a government- sponsored training scheme; being away from a job/business ill, on maternity leave, on holiday 3) Average weekly household net income before hous- or temporarily laid off; or doing paid or unpaid work ing costs (equalised by McClements equivalence for their own or family business. A person’s occupation scale)2. is coded from the responses to the questions asking for the full title of the main job (the job in which a person 4) Averageweeklyhouseholdnetincomeafterhousing usually works the most hours). 
costs(equalisedbyMcClementsequivalencescale)2. ResponsesarecodedtotheStandardOccupationalClas- Total income gross earnings, investments, pension pro- sification 2000 (SOC 2000). Where possible census re- visions and welfare payments is the closest representa- sults are presented using standard classifications. The tion of the gross earning power of a given ward. This category used for regression were the following: Man- is the income variable associated with the regressions agersandSeniorOfficials;Professionaloccupations;As- discussed in the results section. sociate Professional and Technical operations; Skilled trades occupations; Administrative and Secretarial Oc- Crime Data cupations; PersonalServiceOccupations; Salesandcus- The data on annual incidence of crime was obtained tomer service occupations; Process; plant and machine from the Home Office web site [7]. This data consists operatives; Elementary Occupations (examples of ele- of the number of occurrences of different categories of mentaryoccupationsinclude FarmWorkers, Labourers, crime (excluding homicides) at the local authority level Kitchen Assistants and Bar Staff). annually between 31st March 2003 to 31st March 2011. The data analysed is the average number over the pe- Patents riod, covering England and Wales. Patent information was provided by the intellectual property office with postcode level reference that was Land Use Statistics subsequently aggregated to the CAS ward level. Data (Generalised Land Use Database), 2001 was provided for the years 2000 to 2011 inclusive to en- The Generalised Land Use Database (GLUD) figures sure sufficient quantity to avoid null values for individ- show the areas of different land types for census Out- ual wards. The total number of patents in the dataset put Areas (OAs), Lower Layer Super Output Areas that could be identified in E&W was 66,270. 
The val- (LSOAs), Middle Layer Super Output Areas (MSOAs), ues used for regression were simply the gross number of Local Authorities (LAs), and Government Office Re- patents registered to a particular postcode whether it gions (GORs) in England as at 1st November 2001. be business or home address. Output level data was aggregated to the ward level for comparative analysis with population. More information on patent information from the UK FortheGLUD,aclassificationhasbeendevelopedwhich IPO can be found at [4]. allocates all identifiable land features on the UK Ord- Household Income nance Survey MasterMap national mapping product into nine simplified land categories and an additional The dataset on household income was taken from UK ’unclassified’ category. census experimental statistics for 2001/02 and is pro- These are: vided at a fine geographic resolution for the whole of England and Wales. The original data and associated 1. Domestic buildings; metadataforHouseholdIncomecanbefoundunderthe topics section of the UK neighbourhood statistics web- 2. Non-domestic buildings; site [1]. The income data was produced using a model-based 3. Roads; process which involves finding a relationship between 4. Paths; survey data (data available on income) and other data drawn from administrative and census data sources. A 1The FRS is produced by the UK Department for Work and model fitting process is used to select co-variates with Pensions (DWP) to ensure a large sample sizes when collating a consistently strong relationship to the survey data. information on household expenditure. Information on the FRS The strength of the relationship with these covariates for 2001 can be found on the research section of the dWP.gov is used to provide estimates on income for those wards website[5],themethodologysection(section8)oftheFRSsum- where survey data on income is not available. 
More in- maryreportforthatyearat[6]andtheassociatedtechnicalreport formation on the provenance of the income data can be availablethroughtheONS[2] 2The McClements equivalence scale adjusts income according foundontheappropriatepageoftheUKneighbourhood to the relative advantage or disadvantage associated with house- statistics census access site. holdsofdifferentsizes,primarilytotakeintoaccounttheeffectof The survey data on income was taken from the Family economiesofscaleonhouseholdbudgetsofhouseholdsize. 9 5. Rail; Urban indicators for Larger Urban Zones in E&W 6. Gardens (domestic); We point out that that the classification of urban indi- cators in terms of three unique categories is not always 7. Greenspace; possible. For example, one would expect that highly skilled employment categories would lie in the superlin- 8. Water; ear regime and conversely, basic employment categories would be expected to lie in the sublinear regime [9]. 9. Other land uses (largely hardstanding); and However, it is clear from Table S1 that this is not the case for the LUZ definition. Two very similar employ- 10. Unclassified. ment categories, i) public administration and defence, social security and in ii) administration and secretar- The statistics are created by identifying different land ial occupations belong to two different regimes, sublin- parcels and buildings on an Ordnance Survey digital ear/linear and superlinear ones respectively. Other em- map product, and records their type and area. Each ployment categories that are predicted to belong to the land parcel is then assigned to a specific Output Area superlinear regime, such as education, and others that based on its central point, and the information is ag- aredifficulttojustifyeitherway,suchaspersonalservice gregated to higher geographies. The building blocks for occupations andsales and customer service occupations, the statistics are a combination of objects in the elec- lie in the linear regime. 
The issue is not unique to the tronic Ordnance Survey MasterMap product and Ad- employmentdataaswewouldalsoexpecttofinddiffer- dressPoint business information(see below). The com- entexponentsfortheareaofdomesticandnon-domestic bination of MasterMap attributes, contextual analysis, buildings. Theformerispartofthebasicinfrastructure and Address-Point(TM) information provides the ba- of a city, so its sublinear exponent complies with the sis for the nine generalised land classes. Each polygon predicted regime. The latter on the other hand reflects on MasterMap has attributes associated with it, and the economic activity of a city, and could be argued to these provide information about the type of land cov- belong to the superlinear regime. However it can be ered, which can be used as a basis for generating a land seenfromthetablethatthisisnotthecase. Thismight use classification. betheresultoftherentincreasingmuchfasterthanthe OrdnanceSurveyMasterMapisalargescaledigitalmap economicoutcome,constrainingtheincreaseoflanduse. for use in geographical information systems (GIS) and Inalltheseexamples,argumentsforandagainstthehy- database systems. Real world objects are represented pothesised regimes can be found, but the very fact that as explicit features, by polygons, each identified by a they are contestable, makes assignment of an expected unique number called the TOID (topographic identi- regime very difficult for some indicators. fier) Each polygon also has digital attributes such as a Table S2 contains the details of the variables in Fig. 2. theme,whichenablequeryingandsearchingforspecified These are urban indicators measured according to the features. The TOID enables linking to other datasets, geographical delimitation of British cities given by the including Ordnance Survey Address-Point (TM), via in European statistical bureau in terms of LUZ. 
The table this case, a linkable dataset called the National Build- givesthevaluesforthescalingexponent,theconfidence ings DataSet (NBDS). This release of Generalised Land interval and the precision of the regression. UseDatabasestatisticsusestheoriginalNovember2001 Variables that are the outcome of social and economic version of MasterMap. interactions are predicted to have a superlinear expo- Ordnance Survey Address-Point is a dataset that nent. Patents, income and crime incidents show a lin- uniquely defines and locates residential, business and ear relationship with population size. Often, data on publicpostaladdresses. Itiscreatedbymatchinginfor- patent registration can be affected by sparse records, mation from OS digital map databases with more than howeverthedatausedinthisstudywasaccumulatedfor 25 million addresses recorded in the Royal Mail Post- an eleven year period between 2000 and 2011 to ensure code Address File (PAF). The Generalised Land Use theresultingexponentwouldnotbeaffectedbymissing Database for England uses both Address-Point(TM) values, so it is unlikely that the unexpected exponent (version2002.2.1,Dated27May2002)andtheNational forpatentsisaresultofpoordatainthiscase. Dataon Buildings Dataset (NBDS) (July 2002). The NBDS is householdincomeismodel-based(seeprevioussection), used as a link dataset to match Address-Point records and assigned to the resident location rather than the to MasterMap records. workplace location where the income was earned. This All building TOIDs were classified as ’Domestic Build- may affect results as individuals may choose to live far ings’,unlessanyoneormoreofthefollowingconditions from the source of their income, and so the wealth gen- were met: eratedbyacitymaynotbecapturedwithinitsphysical a)itwasseentobeadjacenttoanareaofhard-standing bounds. 
We investigate this further by using a differ- (such as a tarred car park or estate road) which was ent data source for income allocated at the work place morethan300squaremetres;b)itcontainedanaddress fromtheOfficialLabourMarketStatisticsfor2009. The point with a business or organisation name; or c) it had Annual Survey of Hours and Earnings provides weekly an area greater than 1,000 square metres and did not gross income at the local authority level, which is ag- containanyaddresspoint. BuildingTOIDsfulfillingany gregated at the LUZ level. We find that the linearity one or more of these criteria were recorded and classed persists, seeFig.S1, indicatingthatthisbehaviourmay as ’Non-Domestic Buildings’. The category ’Other’ is notbeduetothespecificassignment. Thesamedataset largely areas of hardstanding such as car parks, estate containsaresidentialallocationforweeklyhouseholdin- roads and hard tennis courts. come and this also displays linearity implying that the 10
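
As an illustration of how the quantities reported for each indicator (scaling exponent, confidence interval and precision of the regression) could be obtained, the following is a minimal sketch that fits the scaling relation Y = Y0 N^β by ordinary least squares on the logarithms of population and indicator. It is not taken from the paper: the function names `fit_scaling_exponent` and `classify_regime`, the normal-approximation factor of 1.96 for a 95% interval, and the rule that assigns a regime from the interval are all assumptions made for this example, and the estimation procedure actually used may differ.

```python
import math


def fit_scaling_exponent(populations, indicators):
    """Fit log(Y) = log(Y0) + beta * log(N) by ordinary least squares.

    Returns beta, an approximate 95% confidence interval for beta,
    log(Y0) and R^2. The interval uses a normal approximation (1.96
    standard errors), which is an assumption of this sketch.
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in indicators]
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three cities to estimate an interval")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)

    beta = sxy / sxx
    log_y0 = mean_y - beta * mean_x

    # Residual sum of squares, used for both R^2 and the standard error.
    ss_res = sum((y - (log_y0 + beta * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / syy if syy > 0 else float("nan")
    se_beta = math.sqrt(ss_res / (n - 2) / sxx)

    half_width = 1.96 * se_beta
    return {
        "beta": beta,
        "ci_low": beta - half_width,
        "ci_high": beta + half_width,
        "log_y0": log_y0,
        "r_squared": r_squared,
    }


def classify_regime(ci_low, ci_high):
    """Hypothetical rule: call the regime linear unless the whole
    confidence interval lies on one side of beta = 1."""
    if ci_high < 1.0:
        return "sublinear"
    if ci_low > 1.0:
        return "superlinear"
    return "linear"


if __name__ == "__main__":
    # Synthetic cities generated from an exact power law, for illustration only.
    pops = [1e4, 5e4, 1e5, 5e5, 1e6, 8e6]
    vals = [2.0 * p ** 1.15 for p in pops]
    fit = fit_scaling_exponent(pops, vals)
    print(fit["beta"], classify_regime(fit["ci_low"], fit["ci_high"]))
```

Deciding the regime from the confidence interval rather than from the point estimate alone means that an indicator is only labelled sublinear or superlinear when β = 1 lies outside the interval, which is one possible way to read the "sublinear/linear" assignments discussed above.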
