Discovering and Characterizing Mobility Patterns in Urban Spaces: A Study of Manhattan Taxi Data∗

Lisette Espín-Noboa, GESIS - Leibniz Institute for the Social Sciences
Florian Lemmerich, GESIS - Leibniz Institute for the Social Sciences
Philipp Singer, GESIS - Leibniz Institute for the Social Sciences
Markus Strohmaier, GESIS - Leibniz Institute for the Social Sciences & University of Koblenz-Landau

∗ Please cite the WWW'16 version of this paper.
arXiv:1601.05274v2 [cs.SI] 9 Feb 2016

ABSTRACT
Nowadays, human movement in urban spaces can be traced digitally in many cases. It can be observed that movement patterns are not constant, but vary across time and space. In this work, we characterize such spatio-temporal patterns with an innovative combination of two separate approaches that have been utilized for studying human mobility in the past. First, by using non-negative tensor factorization (NTF), we are able to cluster human behavior based on spatio-temporal dimensions. Second, for understanding these clusters, we propose to use HypTrails, a Bayesian approach for expressing and comparing hypotheses about human trails. To formalize hypotheses, we utilize data that is publicly available on the Web, namely Foursquare data and census data provided by an open data platform. By applying this combination of approaches to taxi data in Manhattan, we can discover and characterize different patterns in human mobility that cannot be identified in a collective analysis. As one example, we can find a group of taxi rides that end at locations with a high number of party venues (according to Foursquare) on weekend nights. Overall, our work demonstrates that human mobility is not one-dimensional but rather contains different facets both in time and space, which we explain by utilizing online data. The findings of this paper argue for a more fine-grained analysis of human mobility in order to make more informed decisions for, e.g., enhancing urban structures, tailored traffic control and location-based recommender systems.

Keywords: Human Mobility; Tensor Factorization; HypTrails

1. INTRODUCTION
Human mobility can be studied from several perspectives utilizing different kinds of data from the online (e.g., Twitter or Foursquare) and the offline (e.g., taxi rides or bike trips) world. A large body of work has focused on identifying general mechanisms that guide and explain human mobility behavior on an individual [13, 34] or collective level [1, 25]. For example, previous research has shown that human mobility is highly predictable [34] and shows temporal and spatial regularity [13]. At the same time, spatio-temporal heterogeneity exists, as discussed, for example, in previous studies [3, 23]. For instance, daily routines such as going from home to work (space) in the morning (time) and from work to home in the evening can be observed. This argues for a more fine-grained analysis that goes beyond the universal (mobility) patterns, which tend to ignore several aspects of human mobility such as time, weather, race of people or means of transportation. Towards that end, we propose to discover and characterize mobility patterns in human behavior in urban space.

Material and approach. In this work, we expand existing research [3, 23] on studying human mobility with a case study using taxi data of Manhattan. For identifying behavioral differences in terms of time and space, previous research [29, 36] has suggested utilizing tensor decomposition [5, 14]. However, interpreting results from tensor decomposition can be difficult and has been based on personal intuitions in previous approaches. On the other hand, recent research [1, 33] proposed methods for understanding human sequences by comparing hypotheses about how the mobility trails of interest are produced. Yet, this approach is limited in the sense that it can only explain global behavior without being able to provide more detailed insights. To circumvent these limitations, we propose a unique and original combination of tensor decomposition and HypTrails to understand human behavior on a spatio-temporal level. In particular, we first utilize non-negative tensor factorization (NTF) [5, 14] for automatically identifying clusters of mobility behavior. Second, for characterizing these clusters, we utilize HypTrails [33], a Bayesian approach for expressing and comparing hypotheses, i.e., transitional assumptions, about human trails.

Findings and contributions. In summary, our main contributions are three-fold: First, we present an innovative combination of two methodologies, i.e., non-negative tensor factorization and HypTrails, in order to characterize heterogeneous human mobility behavior. Second, we incorporate existing human mobility patterns into a hypothesis-based schema built upon human beliefs. Third, we demonstrate the benefits of online data in our attempt to explain human behavior on a spatio-temporal level. As one example, we can discover a group of taxi rides that have drop-off locations with a high number of party venues (according to Foursquare) on weekend nights. Results of this study could improve, e.g., the planning of future events or reconstructions, traffic control and location-based recommender systems, and could enable public transportation companies to adapt their capacities to the demand at certain hours or in certain areas.

2. DATASETS
We focus on Manhattan, one of the most densely populated areas in the world, since it has a huge amount of governmental and private data publicly available on the Web. In detail, we study taxi ride data in this work. While this data is a prominent representative of human mobility, the methodology presented in this paper can be applied to other kinds of mobility data. We represent human mobility as user trails, which can be seen as single transitions between pick-up and drop-off locations of taxi rides. In order to construct a rich set of hypotheses for explaining the movement of users, we additionally retrieved information on local venues (e.g., parks, schools, churches) from Foursquare and public census data (i.e., demographics, land-use and socio-economics).

Table 1: Centroids of places of interest. This table lists geo-coordinates and corresponding tracts of the geographical center of Manhattan, the Flatiron Building and Times Square.

| Place | lat, lon | Tract ID |
|---|---|---|
| Manhattan (geo-center) | 40.79090, -73.96640 | 018100 |
| Flatiron Building | 40.74111, -73.98972 | 005600 |
| Times Square | 40.75773, -73.98570 | 011900 |
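The trail representation described above, where each taxi ride is a single transition between a pick-up state and a drop-off state, can be sketched as a matrix of transition counts. This is only a minimal illustration under stated assumptions: rides are assumed to be already mapped to tract IDs, the function name `transition_counts` and the listed rides are hypothetical and invented for this example, and only the three tract IDs are taken from Table 1.

```python
# Toy rides as (pickup_tract, dropoff_tract) pairs.
# The tract IDs come from Table 1; the rides themselves are made up.
rides = [
    ("018100", "011900"),
    ("005600", "011900"),
    ("011900", "005600"),
    ("018100", "011900"),
    ("005600", "005600"),  # starts and ends in the same tract
]


def transition_counts(rides):
    """Count transitions between discrete states (tracts).

    Returns the sorted list of states and a square matrix (list of lists)
    whose entry [i][j] is the number of rides from states[i] to states[j].
    """
    states = sorted({tract for ride in rides for tract in ride})
    index = {tract: pos for pos, tract in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for pickup, dropoff in rides:
        counts[index[pickup]][index[dropoff]] += 1
    return states, counts


states, counts = transition_counts(rides)
for state, row in zip(states, counts):
    print(state, row)
```

Each row of the resulting matrix describes where rides starting in one tract end, which is the kind of transition data that a first-order Markov chain model over tracts operates on.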
Next, we briefly describe the various datasets in greater detail.

Taxi rides. We use publicly available data¹ of 173,179,759 NYC taxi rides in 2013. It consists of anonymized records registering when and where a taxi ride started and ended, and various features such as the number of passengers or the total fare of the trip. From all records, we removed taxi rides outside the area (polygon) of Manhattan and some inconsistencies such as records with trip_distance ≤ 0, trip_time_in_secs ≤ 0 and passenger_count ≤ 0; our final dataset consists of 143,064,684 rides. A description of all attributes and data types can be found in the TLC Taxi Data - API Documentation².

Census data. Instead of working directly with longitude and latitude data, our methodological approach (see Section 3) operates on a discrete tract state space. Tracts are small subdivisions of a county that provide a stable set of geographic units for the presentation of statistical data³. We extracted the specifications of all 288 tracts in Manhattan from the NYC Planning portal⁴ as a shapefile from the 2010 census. Having this information, we mapped the GPS coordinates of all taxi rides to one of the 288 tracts. Besides the area and location of tracts, we queried the NYC Open Data⁵ and the American Fact Finder⁶ databases for relevant online data such as census and land-use information. Moreover, we calculated the overlap between land-use types, such as residential and commercial zones, and each tract to obtain information on a tract level.

Foursquare venues. According to previous works, human mobility can be explained by the transition of visited places [29] or by intervening opportunities [25, 35]. For studying these and similar theories, we gathered information about physical places such as churches, houses and parks situated in Manhattan by querying the Foursquare Search API⁷ to extract places in each tract for 10 different categories⁸ (e.g., Residence, Work Place or Nightlife Spot). Overall, we collected 153,694 unique places within Manhattan. Since every venue is described (among other attributes) by a GPS location, all venues were mapped to their respective tract.

Centroids. Typically, popular places are most likely to be visited at any time. For this reason, we considered three candidate places as centroids to study whether people visit them or not. Table 1 shows the geographical coordinates and respective tracts of the center of Manhattan (approximated as the geographical center of Manhattan), the iconic Flatiron Building and the well-known Times Square.

¹ http://www.andresmh.com/nyctaxitrips/
² https://dev.socrata.com/foundry/data.cityofnewyork.us/gkne-dk5s
³ https://www.census.gov/geo/reference/gtc/gtc_ct.html
⁴ http://www.nyc.gov/dcp
⁵ https://nycopendata.socrata.com/
⁶ http://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml
⁷ https://developer.foursquare.com/docs/venues/search
⁸ https://developer.foursquare.com/categorytree

3. METHODOLOGY
The goal of this work is to discover and characterize mobility patterns in taxi data, in order to better understand people's travel behavior from one place to another within the city. To that end, we propose an innovative combination of two methodologies. First, we suggest using non-negative tensor factorization (NTF) [5] for automatically clustering human mobility behavior. Research has shown that NTF can detect latent features of human mobility in different dimensions such as space and time, cf. for example [36]. Second, to characterize these clusters intuitively, we utilize HypTrails [33], a Bayesian approach for expressing and comparing hypotheses about human trails. In the following, we outline both methodological components of this work, but refer to the original publications for details.

Clustering mobility patterns. For clustering the data, we utilize NTF, which decomposes a given n-way tensor X into n components (matrices) that approximate the original tensor when multiplied with each other. Each matrix contains information on r factors (clusters). In this paper, we define a three-way tensor of taxi rides whose dimensions capture human transitions from one place to another at a certain time: pick-up tract, drop-off tract and pick-up time (hour of week), respectively; thus, we cluster in terms of both space and time. Each element of every component determines the scale of mobility flow (weight) with respect to the corresponding factor. In other words, the higher the weight, the more dominant that instance is in that cluster. Similar to other clustering methods, defining the number of clusters is arbitrary. However, there exist some guidelines to determine a good value of r, see e.g., [12]. In this work, we do not focus on finding the most appropriate number of behavioral components, but on being able to characterize different behaviors, as detailed next.

Characterizing clustered mobility patterns. For characterizing the clustered human mobility behavior, we utilize HypTrails [33], a Bayesian approach for expressing and comparing hypotheses about human trails. Technically, HypTrails models the data with first-order Markov chain models, where the state space contains all 288 tracts of Manhattan. Fundamentally, hypotheses are represented as matrices Q expressing beliefs in Markov transitions; Section 4 describes the hypotheses about human mobility used in this work. Elements $q_{i,j}$ indicate the belief in the corresponding transition probability between states i and j; higher values refer to higher belief. The main idea of HypTrails is to incorporate these hypotheses as Dirichlet priors into the Bayesian inference process. HypTrails automatically elicits these priors from expressed hypotheses; an additional parameter k (weighting factor) needs to be provided to steer the overall belief in a hypothesis. To then compare the relative plausibility of hypotheses, HypTrails utilizes the marginal likelihood (evidence) of the Bayesian framework, which describes the probability of the data given a hypothesis. We can infer the partial ordering based on the plausibility of a given set of hypotheses by ranking their evidences from the largest to the smallest for a specific value of k.

4. HYPOTHESES
In this section, we describe the hypotheses of interest that are used to explain the individual clusters with HypTrails. These are expressed as hypothesis matrices Q, where the elements $q_{i,j}$ capture a belief in people transitioning to state (tract) j when currently located at state (tract) i (see Section 3).

Our hypotheses are mostly based on existing theories. In this regard, the most prevalent human mobility model is the gravitational law [42], which explains mobility by an attraction force between two places that is determined by some weight for each place (e.g., density of venues/points of interest) and the inverted shortest distance between them. However, due to some limitations found in this model [32], many other solutions have emerged to circumvent such problems. For example, the rank model [25] has been proposed as a variation of the radiation model; it indicates that the number of people traveling to a given location is inversely proportional to the number of places surrounding the source location. Similarly, the so-called intervening opportunities model [35] additionally includes the number of opportunities at a given distance. Cosine similarity can also be used under the assumption that people prefer to visit places which are similar (according to some metric) to the departure place. Table 2 summarizes the 10 universal theories considered in this work.

Table 2: Universal theories. List of 10 universal theories on human mobility. Following the Markov chain assumption, indexes i and j refer to start and end states, respectively. Functions dist(.,.), venues(.) and checkins(.) denote the geographical distance between two states, all Foursquare venues in a given state, and the number of check-ins in Foursquare from a given venue, respectively. Vectors A and B represent categorical information of the start and end state, respectively (e.g., race, poverty, land-use).

| # | Mobility theory | Formula for $q_{i,j}$ | Description |
|---|---|---|---|
| 1 | Uniform | $1$ | The next location is chosen randomly, independently of the current location. |
| 2 | Inverse Geographic Distance | $\frac{1}{\mathrm{dist}(i,j)}$ | The closer the target location, the more likely it is to be visited, see [15]. |
| 3 | Gaussian Distribution | $\frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{\mathrm{dist}(i,j)^2}{2\sigma^2}}$ | The probability of visiting a place j is given by a two-dimensional Gaussian distribution with standard deviation σ around the place i; the smaller the σ, the closer the visited places are to i. Place i can either be the current location or a fixed landmark in the city, see [1]. |
| 4 | Density | $\lvert \mathrm{venues}(j) \rvert$ | People get attracted by the number of venues at the destination place. |
| 5 | Popularity | $\sum_{V \in \mathrm{venues}(j)} \mathrm{checkins}(V)$ | Venues with a higher number of check-ins attract more people, see [26]. |
| 6 | Gravitational Mass | $\frac{\mathrm{venues}(i) \cdot \mathrm{venues}(j)}{\mathrm{dist}(i,j)}$ | People get attracted by the product of forces from the current and close target locations, see [42]. |
| 7 | Gravitational Target | $\frac{\mathrm{venues}(j)}{\mathrm{dist}(i,j)}$ | People get attracted only by a force in a target location which is close to the current place. This is a variant of (6). |
| 8 | Rank Distance | $\frac{1}{\lvert \{ w : \mathrm{dist}(i,w) < \mathrm{dist}(i,j) \} \rvert}$ | The probability of moving to a place increases if there are few venues w around the current location, see [25]. |
| 9 | Intervening Opportunities | $\frac{\lvert \{ w_1 : \mathrm{dist}(i,w_1) = \mathrm{dist}(i,j) \} \rvert}{\lvert \{ w_2 : \mathrm{dist}(i,w_2) < \mathrm{dist}(i,j) \} \rvert}$ | The number of persons going a given distance is directly proportional to the number of opportunities at that distance and inversely proportional to the number of intervening opportunities, see [35]. |
| 10 | Cosine Similarity | $\frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$ | Measures how similar the current and next location are, see [37]. |

For practical applications, these theories require additional data, e.g., to determine the weight of places for the gravitational law. In our attempt to explain mobility, we also utilize online data from Foursquare venues and census data. We categorize our hypotheses into three types: Distance-based, Foursquare and Census hypotheses, see Table 3. Additionally, we use the uniform hypothesis as a baseline to express the belief that all tracts are equally likely for the next stop of a taxi. For all hypotheses, we set the diagonal of Q to 0 to avoid self-loop transitions (i.e., taxi rides that start and end in the same tract), which account for only 1.5% of all taxi rides and, in this work, do not contribute to mobility.

Table 3: Tract properties. These three categories contain all tract indicators, i.e., statistics about tracts, used in combination with universal theories to construct hypotheses. For instance, the hypothesis church expresses that the probability of visiting a certain location is proportional to its number of churches (using, for example, the density theory).

| Category | Properties per tract |
|---|---|
| Distance-based | Fixed points of interest: Geographical Center, Flatiron Building, Times Square. |
| Foursquare | Number of venues: Arts & Entertainment, Education (i.e., colleges, universities, elementary schools and high schools), Food, Nightlife Spot, Outdoors & Recreation, Work (i.e., auditoriums, buildings, convention centers, event space, factories, government buildings, libraries, medical centers, military base, non-profit, office, post office, prison, radio station, recruiting agency, TV station, and warehouse), Residence, Shop & Service, Travel & Transport, and Church. |
| Census | Population size, and Tract area. Percentage of: White people, Black people, People in labor force, Unemployed people, People below poverty level, People above poverty level. Number of places: Libraries, Art Galleries, Theaters, Museums, WiFi Hotspots, and Places of Interest. Moreover, the occupied area of: Residential Zoning, Commercial Zoning, Manufacturing Zoning, Park properties, Historic Districts and Empower zones. |

Distance-based hypotheses. Based on the geographic distance [15], we can construct very simple and intuitive hypotheses: the proximity hypothesis and the centroid hypothesis. The proximity hypothesis assumes that places near the current location are more likely to be visited next, whereas the centroid hypothesis suggests that locations near the city center (specified by fixed geographic coordinates, see Section 2) are more likely for the next stop. According to these hypotheses, the location of the next visited place follows a two-dimensional Gaussian distribution that is centered at the current location or the city center, respectively; see [1] for a more detailed description. For the parametrization of the distribution, we included several variations of the hypotheses with different values for the standard deviation σ, i.e., σ ∈ {0.01, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0} km.

Foursquare hypotheses. To build more informed hypotheses, we leverage Foursquare venues by measuring the density of a place, which consists of counting the number of all venues (regardless of their categories) in a given location. Similarly, the number of check-ins can be used to measure the popularity of a given place. These can be combined with the geographical distance in various ways according to mobility theories, see theories No. 6, 7, 8 and 9 in Table 2. Additionally, we group places according to their category (e.g., residence, shop or church) based on venues of a single type. In other words, we use categories as filters of the available venues. Thus, every Foursquare category induces a subset of all venues per state (i.e., tract). Table 3 shows all 10 categories included in this study. To avoid an excessive number of hypotheses, we only use the gravitational target theory in combination with the category-based hypotheses. Furthermore, we construct a similarity-based hypothesis that suggests that transitions are more likely between two states that have a similar category distribution of venues based on cosine similarity, see Table 2.

Census hypotheses. Similar to the Foursquare hypotheses, the Census hypotheses add relevant information to every state (i.e., tract). This information on demographics (e.g., % of white people), land-use (e.g., residential zoning) or socio-economics (e.g., # of people above poverty level) can be used instead of the number of venues (i.e., density) of a given state. Table 3 shows all 20 indicators used to formulate this kind of hypotheses. Similarity measures (i.e., cosine similarity) were obtained under three different categories: Race Group (i.e., white, black, american indian, asian, hawaiian and other pacific islander, other race, two races), Poverty Level (i.e., below, above) and Employment status (i.e., employed, unemployed, in labor force).

Overall, we defined 70 hypotheses: (a) 1 uniform, (b) 29 distance-based (i.e., geographical distance, plus proximity and 3 centroids, each for 7 values of σ), (c) 17 from Foursquare (i.e., all venues integrated in formulas 4 to 9 from Table 2, gravitational target for each Foursquare category and 1 similarity), and (d) 23 from Census data (i.e., 20 from the gravitational target with each census indicator and 3 from similarities).

5. EXPERIMENTS
This section reports on experimental results obtained by applying the presented methods to the Manhattan taxi rides dataset.

5.1 Configuration
As described in Section 3, we started by using NTF in order to identify different clusters in the taxi data. We worked with a three-way tensor whose dimensions represent the pick-up hour of week, the tract of the pick-up of the taxi ride and the tract of the drop-off of the taxi ride. For the hour of week, the first state corresponds to the time from 12:00 a.m. to 12:59 a.m. on Monday, and the last of the 168 states to the time from 11:00 p.m. to 11:59 p.m. on Sunday. We did not include the time of the drop-off as an extra dimension because most of the rides last less than an hour. Since there are 288 different tracts, the tensor for the decomposition has size 168×288×288. After experimenting with different parameters, we set the number of clusters to r=7 to cluster the data into seven different groups, as this number subjectively best captured all behavioral components.

5.2 NTF: Mobility patterns
Since our data tensor has three dimensions, the decomposition returned three different components which determine the scale of mobility flow in each dimension for every cluster: time (hour of week) and space (departure and arrival tracts). Thus, individual clusters represent groups of taxi rides in different places at different periods of time on a weekday-hour scale.

The time component is shown in Figure 1a, whereas location components for pick-ups (departures) and drop-offs (arrivals) are shown on the right-hand side of Figure 1. Due to space limitations, we show only the first three clusters. From Figure 1a, it can be observed that all clusters show strong daily regularities, which can be interpreted as daily routines in human mobility behavior. Clusters C1, C2, C4, C5 and C7 capture all behaviors on workdays (almost all peaks are within the first five days of the week), whereas cluster C3 is strongly dominated by weekends, especially by Friday and Saturday nights. Cluster C6, on the other hand, shows a more periodical behavior across the entire week; however, its peaks are around 6pm from Monday to Saturday and 2pm on Saturdays and Sundays.

The location components shown in Figures 1b to 1g, together with their respective time periods, provide initial context about when and where people move within the city. For instance, cluster C1 represents morning taxi rides around 9am (see Fig. 1a) which go to the south-east of Manhattan (see Fig. 1e). Cluster C2, on the other hand, concentrates its taxi rides in the evening around 6pm near Central Park (see Fig. 1f). Finally, cluster C3 includes all taxi rides on Friday and Saturday around 1am in the lower part of Manhattan (see Fig. 1g).

The following section presents characterizations for such behaviors by comparing a list of 70 hypotheses using HypTrails.

[Figure 1 panels: (a) Temporal distribution of clusters; (b) Departures C1; (c) Departures C2; (d) Departures C3; (e) Arrivals C1; (f) Arrivals C2; (g) Arrivals C3]
Figure 1: Spatio-temporal behavior obtained by tensor factorization. This figure illustrates the results obtained when applying NTF on a three-way tensor. (a) Each row represents a behavioral component with respect to pick-up time (hour of week). Maps shown at the right of this figure show lower and midtown Manhattan; the color of each tract determines how representative that location is in that cluster: the darker, the more dominant. (b,e) Cluster C1 stands for taxi rides at 9am around the south-east of Manhattan; (c,f) Cluster C2 contains all short evening taxi rides at 6pm around the center of Manhattan; and (d,g) Cluster C3 represents all weekend night taxi rides around the lower and middle part of Manhattan.

5.3 HypTrails: Ranking of hypotheses
Since NTF only assigns a weight to every instance of each factor (i.e., it does not explicitly partition the transition input), we first need to identify all transitions for each cluster in order to run HypTrails. Generally [29, 36], interpreting clusters returned by NTF requires extracting the top-N most representative states from each factor to determine the instances that are part of every cluster. In our setting, we extracted the top-10 pick-up weekday-hours and drop-off tracts, and queried all taxi rides fulfilling these conditions. Note that we did not include the top pick-up tracts, because we are interested in places where people go to, rather than places where they come from. We then applied HypTrails and computed the rankings for the weighting factor k=10 (see Section 6 for a discussion).

Exemplary results of the characterization step for a hand-selected subset of hypotheses are displayed in Table 4. It shows for each hypothesis the respective rank in the cluster; lower numbers imply a higher rank and therefore a better explanation. We show the results of the Foursquare and Census hypotheses only applied to the gravitational target formula (see Section 4). Gravitational laws (i.e., gravitational target and gravitational mass) perform the same in all cases due to the normalizations applied in the HypTrails approach. We therefore refer to both as gravitational from here on. For instance, the hypothesis Gravitational (% White people) expresses a belief in people going to nearby tracts with a high percentage of white people living there. In the column Overall of Table 4, we show the ranking of hypotheses evaluated over the whole dataset, to compare with the results obtained for the individual clusters. The uniform hypothesis is a baseline which allows us to verify whether a hypothesis can be a good explanation of human mobility or not. Green cells in the table indicate that a hypothesis performed better than the uniform hypothesis in that cluster.

To characterize the different patterns in human mobility (i.e., the cluster results of the NTF step), we inspect the obtained rankings for the different clusters. In particular, we are interested in which hypotheses perform exceptionally well (i.e., have a top rank indicated by a small number) in the cluster, both on an absolute scale and in comparison to the ranking obtained from the overall dataset. For example, consider cluster C3, which captures time frames on weekend nights. For this cluster, we can observe very high ranks for the gravitational hypotheses Party, Popularity (check-ins), Food and Recreation.

In summary, clusters C1, C2 and C3 (shown in Figure 1) can be characterized as follows. Cluster C1 predominantly represents taxi rides around 9am on workdays. People in this cluster prefer to go to nearby tracts containing popular places such as work places and restaurants near Times Square within a radius of 0.5 km. Cluster C2 groups taxi rides going to big tracts containing art galleries, museums and parks. Transitions in this group usually leave tracts with a small number of venues on workdays around 6pm. Finally, cluster C3 identifies all taxi rides on Friday and Saturday nights around 1am. People in this cluster usually go to nearby tracts containing very popular places such as nightlife spots and restaurants. Below, we discuss the characteristic properties of all clusters, summarized by the types of hypotheses.

Distance-based. As mentioned in Section 4, these hypotheses require the standard deviation (σ) of a two-dimensional Gaussian distribution. In Table 4, we show the best result for parametrized hypotheses and their respective value of σ in parentheses. In the overall data as well as in clusters C2 and C3, taxi rides are more likely to visit proximate places within a radius of 3 km, 0.5 km and 1 km, respectively. Clusters C1, C4, C6 and C7 show a preference for visiting the surroundings of Times Square within a radius of 0.5 km, 0.5 km, 0.01 km and 0.01 km, respectively. Finally, taxi rides in cluster C5 tend to visit places near the Flatiron Building within a radius of 0.5 km.

Foursquare. Taxi rides in the overall dataset are more likely to visit very dense areas containing work places, restaurants, discos and bars. Similarly, clusters C1, C4 and C5, which happen to be morning rides around 7 to 9am, prefer to go to tracts containing work places such as buildings and offices. Taxi rides in cluster C6 go to tracts dominated by recreation places such as parks and other outdoor venues in the evenings around 6pm. Cluster C3 can be best characterized by the party hypothesis, which means that taxi rides tend to visit tracts containing nightlife spots (on weekend nights, as inferred with NTF). Likewise, in cluster C7 people tend to visit tracts containing popular places such as restaurants and nightlife spots. Note that all hypotheses in this group perform better than the uniform hypothesis in all clusters, demonstrating the overall high explanatory power of such data sources with respect to human mobility.

Census. From the overall data, we can infer that users tend to visit close-by tracts with a high percentage of white people living in them. We can also observe that, in general, taxi rides are not going to residential areas but to commercial zones, which is also the case for clusters C1, C3 and C7, as opposed to clusters C2, C4 and C5, where it is more likely to visit tracts containing art galleries and museums. In cluster C6, we can deduce that people tend to visit tracts containing parks rather than residential zones.

Table 4: Ranking of hypotheses. This table shows the ranking of 23 out of 70 hypotheses evaluated with HypTrails over 3 different groups. Overall represents all 143M taxi rides in Manhattan in 2013; clusters Ci are clusters identified by NTF. Numeric cells represent the ranks of the hypotheses in the respective clusters. For the distance-based hypotheses, we only show results for the best parameter of the standard deviation σ (parameter in parentheses). Green cells highlight all hypotheses that outperform the uniform hypothesis.
| Hypotheses | Overall (2013) | C1 (Workdays, 9am) | C2 (Workdays, 6pm) | C3 (Weekends, 1am) | C4 (Workdays, 7am) | C5 (Workdays, 9am, 6pm) | C6 (Mo-Sa 6pm, Sa-Su 2pm) | C7 (Workdays, 6pm) |
|---|---|---|---|---|---|---|---|---|
| Baseline | | | | | | | | |
| Uniform | 42 | 56 | 56 | 56 | 56 | 55 | 62 | 59 |
| Distance-based (σ) | | | | | | | | |
| Proximity | 14 (3.0) | 7 (1.0) | 2 (0.5) | 14 (1.0) | 10 (1.0) | 19 (1.0) | 10 (0.01) | 13 (1.0) |
| Centroid (Geographical Center) | 38 (5.0) | 50 (5.0) | 25 (1.0) | 58 (5.0) | 52 (5.0) | 58 (5.0) | 51 (3.0) | 51 (5.0) |
| Centroid (Flatiron Building) | 29 (5.0) | 32 (2.0) | 51 (5.0) | 17 (1.0) | 2 (0.01) | 4 (0.5) | 44 (3.0) | 20 (0.5) |
| Centroid (Times Square) | 22 (3.0) | 1 (0.5) | 43 (3.0) | 46 (3.0) | 1 (0.5) | 43 (2.0) | 2 (0.01) | 1 (0.01) |
| Foursquare | | | | | | | | |
| Gravitational (All venues) | 1 | 12 | 14 | 10 | 14 | 10 | 14 | 9 |
| Gravitational (Check-ins) | 9 | 3 | 30 | 2 | 11 | 5 | 4 | 3 |
| Gravitational (Work) | 2 | 5 | 12 | 24 | 8 | 6 | 13 | 11 |
| Gravitational (Food) | 5 | 4 | 31 | 4 | 12 | 15 | 11 | 4 |
| Gravitational (Party) | 7 | 17 | 37 | 1 | 19 | 9 | 20 | 5 |
| Gravitational (Recreation) | 15 | 21 | 10 | 9 | 17 | 13 | 7 | 33 |
| Venue Similarity | 39 | 53 | 53 | 53 | 53 | 52 | 58 | 53 |
| Census | | | | | | | | |
| Gravitational (Population) | 21 | 61 | 28 | 20 | 59 | 46 | 42 | 25 |
| Gravitational (Tract Area) | 23 | 34 | 8 | 26 | 24 | 20 | 24 | 38 |
| Gravitational (% White people) | 6 | 24 | 13 | 28 | 28 | 27 | 35 | 27 |
| Gravitational (Residential zoning) | 50 | 65 | 19 | 35 | 65 | 61 | 67 | 49 |
| Gravitational (Commercial zoning) | 13 | 8 | 32 | 22 | 9 | 24 | 19 | 15 |
| Gravitational (Art Galleries) | 46 | 23 | 1 | 38 | 5 | 2 | 52 | 54 |
| Gravitational (Museums) | 54 | 13 | 3 | 40 | 6 | 7 | 26 | 58 |
| Gravitational (Parks) | 63 | 62 | 4 | 44 | 63 | 59 | 6 | 64 |
| Race Similarity | 32 | 48 | 50 | 52 | 50 | 50 | 59 | 50 |
| Poverty Similarity | 37 | 55 | 52 | 54 | 55 | 53 | 61 | 56 |
| Employment Similarity | 40 | 57 | 54 | 55 | 58 | 54 | 60 | 58 |

6. DISCUSSION
In this work, we have shown that spatio-temporal dynamics in human behavior can potentially be better explained by considering parts of the data separately. We identified clusters of taxi rides and utilized openly available data from the Web to explain them. However, there are some aspects that need to be taken into account for the current approach.

Concentration parameter k. As discussed in [33], the HypTrails approach requires setting a parameter k to elicit Dirichlet priors from hypotheses. Higher values of k express stronger beliefs in the respective hypotheses. Technically, larger values of k imply higher values of the hyperparameters (pseudo counts) of the Dirichlet distributions. In our experiments, we tried several values of k from 0 to 100; overall, the results were very similar. The results reported in this paper use an intermediate value of k=10.

Correlations in the explanations. Using HypTrails to explain clusters does not identify causes of movement patterns, but only correlations. As an example, for cluster C2 in our case study the Art Galleries hypothesis performs best. This of course does not mean that taxi trips in that cluster prominently have art galleries as destinations, but that people go to places that also have nearby art galleries. In that direction, we also intend to integrate a correlation analysis between the used hypotheses in future work in order to identify explanations that are similar to each other.

Clustering method. In this paper, we use HypTrails to explain clusters obtained by non-negative tensor factorization. While NTF is a reliable and established method in this line of research, the clustering approach is exchangeable and could be replaced by any other clustering technique.

State space. Since our approach requires a discrete state space, it is necessary to aggregate pick-up and drop-off locations of the taxi rides in an area. The choice of these aggregational units (i.e., the states in our state space) can potentially influence the results. This is known in the literature as the modifiable areal unit problem [28]. In this paper, we chose tracts as the level of our analysis, as this allowed for the direct integration of information from census data. Experiments with different state spaces are subject of future work.

Multiple dimensions. In this paper, we cluster taxi trips with respect to time, pick-up and drop-off location. However, our approach can be extended with additional information, e.g., the number of passengers. In this case, a higher-dimensional tensor would be used by NTF, but the resulting clusters could also be explained with the help of the HypTrails approach. The scale of these dimensions could also suggest more fine-grained mobility behaviors. For instance, in this work we defined the time dimension as all 168 hours of a week in order to distinguish patterns on workdays and weekends.

Normalization. The external online data we use for constructing mobility hypotheses might not be evenly distributed across all tracts (e.g., # of churches vs. # of restaurants). Therefore, the HypTrails approach normalizes each belief matrix to be able to compare different hypotheses to each other.

7. RELATED WORK
Human mobility research. Human mobility is a phenomenon that has attracted the attention of governments and researchers from different fields. Studying the movement of people (e.g., migration or commuting) from a social science perspective has helped us to understand who moves, where and why [2, 24, 35, 40], as well as what consequences such movement carries, by means of, e.g., demographic, socio-economic and land-use factors. In the literature, these studies are also referred to as activity-based analyses. Natural sciences, on the other hand, have shown us that universal patterns exist and are modelled by movement-based techniques which can predict human dynamics [25, 32]. To the best of our knowledge, there exists little research on explaining human mobility by combining both movement-based approaches and activity-based analyses. Furthermore, they are not able to capture occasional or seasonal behaviors that occur only at certain periods of time.

Ubiquitous data. Due to the lack of open and updated information at global scale (e.g., surveys or census data), and thanks to the rise of ubiquitous technologies such as mobile phones and GPS, researchers can get access to human trails, which facilitates the study of human movements. There exist several studies that have revealed spatio-temporal patterns in different cities based on mobile phone call detail records (CDR) [17, 38], taxi trips [3, 6, 23] and bike rides [19, 31]. The rapid emergence of social networks has also benefited the study of human mobility based on geo-tagged data. For instance, the work by Jurdak et al. [18] studies Twitter as a proxy of human movement by using universal indicators, such as displacement distribution and gyration radius distribution, that measure how far individuals typically move based on geo-located tweets. Similarly, the authors in [27] proposed a network of places built upon Foursquare's venues and modeled human mobility by considering temporal and network dynamics inferred from users' check-ins. Gabrielli et al. [10] proposed a technique to analyze human trajectories of residents and tourists by semantically labeling source and destination spots. Based on time-evolving networks, the work in [11] identifies and ranks collective features for epidemic spread. This is a more intrusive way of tracking human movement, since it requires the use of wearable sensors.

Activity-based human behavior. Compared to this paper, other works have identified and explained periodical movement-based patterns as activity-based human behaviors. [...] correlating mobility patterns and energy consumption to provide better sustainable urban forms [20], and by measuring their influence in the world on attracting visitors [21], and by quantifying their resilience to extreme events [7]. In [4, 9], the authors implemented visualization tools to understand urban flows across time and space from taxi rides and social media.

The novelty of our approach relies on three factors: (1) a multi-dimensional pattern recognition process using NTF [5] to identify different mobility behaviors in taxi data, (2) the expansion of the activity-based human mobility behavior into a hypothesis-based schema built upon human beliefs and (3) quantifying the plausibility of beliefs for every mobility behavior using HypTrails [33], a Bayesian approach for expressing and comparing hypotheses about human trails.

8. CONCLUSIONS
In this paper, we have presented an innovative approach for discovering and characterizing patterns in human mobility behavior. It combines (i) the clustering of transition data utilizing non-negative tensor factorization (NTF) with both time and space dimensions and (ii) characterizing these clusters using the Bayesian HypTrails method. In our experiments on taxi data from Manhattan, we were able to identify several interesting facets of human mobility and characterize them using census data and additional information collected from Foursquare. As one example, we discovered a group of taxi rides that end at locations with a high density of party venues on weekend nights. The strength of this approach lies in the fact that the clustering results can be easily characterized with high-level hypotheses using HypTrails.

Our work extends recent research concerned with a better understanding of human mobility. We have demonstrated that human mobility is not one-dimensional but rather contains different facets including (but not limited to) time and space. Future research can benefit from the methodological and experimental concepts presented in this work. A more fine-grained view on human mobility can also support, e.g., city planners, traffic control, location-based recommender systems or event planning.

In the future, we aim to generalize our findings by studying similar
In [41],theauthors data(e.g.,biketripsorgeo-taggedtweets)availableforNewYork proposedamodeltorepresentthetransitionprobabilityoftravel andothercities.Indoingso,wecouldnotonlyunveilnovelgeneral demandsduringatimeintervalandsuggestedthattraveldemands patternsofmobility,butalsodiscoversimilaritiesanddifferences canbeassociatedwithfixedlocationsundersomecircumstances. betweencities. Jiangetal.[16]explainedwhen,whereandhowindividualsinteract Acknowledgments. ThisworkwaspartiallyfundedbyDFGGer- with places in metropolitan areas based on activity survey data manScienceFundresearchprojects“KonSKOE”and“PoSTsII”. in Chicago. The work shows daily patterns as eigenvectors and employsK-meansclusteringtoidentifygroupsofindividualsbased References ontheirdailyactivitiesonweekdaysandweekends.Fromtaxitrips inShanghai,theworkin[29]showshowtodetectbasispatternsfor [1] M.Becker,P.Singer,F.Lemmerich,A.Hotho,D.Helic,andM.Strohmaier. collectivetrafficflowandcorrelatesthemwithtripcategoriesand Photowalkingthecity:Comparinghypothesesabouturbanphototrailson Flickr.InInt.ConferenceonSocialInformatics,2015. temporalactivitiessuchascommutingto/fromworkinthemornings [2] C.BrettellandJ.Hollifield.MigrationTheory:TalkingAcrossDisciplines. andeveningsorgoingoutatnight. Linearcombinationsareused Taylor&Francis,2014. todescribemacropatternsandnon-negativematrixfactorization [3] G.Chen,X.Jin,andJ.Yang.Studyonspatialandtemporalmobilitypatternof (NMF)fordetectinghowmanydifferentpatternsexistinaday. urbantaxiservices.InInt.ConferenceOnIntelligentSystemsandKnowledge Engineering,2010. Appliedurbancomputing.Mobilityhasalsobeenstudiedinmany [4] A.Chua,E.Marcheggiani,L.Servillo,andA.V.Moere.FlowSampler:Visual otherdifferenturbancontexts:Salnikovetal.[30]studiedtaxiride AnalysisofUrbanFlowsinGeolocatedSocialMediaData.InInt.Conference dynamicsinordertorecommendpeoplethecheapestwayofcom- onSocialInformatics.Springer,2014. 
[5] A.Cichocki,R.Zdunek,A.H.Phan,andS.-i.Amari.Nonnegativematrixand mutingbasedontwodifferenttaxicompanies.In[8,39],theauthors tensorfactorizations:applicationstoexploratorymulti-waydataanalysisand proposedtwodifferenttechniquestoinferland-useortaxonomy blindsourceseparation.JohnWiley&Sons,2009. ofplacesbasedongeo-taggedpostsfromTwitterandFoursquare [6] L.Ding,H.Fan,andL.Meng.UnderstandingTaxiDrivingBehaviorsfrom respectively.Taxidriverscanalsobenefitfromstudyingtheirdriv- MovementData.InAGILE2015,pages219–234.Springer,2015. ingbehaviorwhilepassive(nopassengers)[6]andbypredicting [7] B.DonovanandD.B.Work.Usingcoarsegpsdatatoquantifycity-scale transportationsystemresiliencetoextremeevents.arXiv:1507.06011,2015. the number of passengers they could potentially get in a certain [8] D.Falcone,C.Mascolo,C.Comito,D.Talia,andJ.Crowcroft.Whatisthis hotspotinthenexttimeinterval[22].Citiescanbenefitaswellby place?Inferringplacecategoriesthroughuserpatternsidentificationin geo-taggedtweets.InInt.ConferenceonMobileComputing,Applicationsand [25] A.Noulas,S.Scellato,R.Lambiotte,M.Pontil,andC.Mascolo.Ataleof Services,2014. manycities:universalpatternsinhumanurbanmobility.PloSOne, [9] N.Ferreira,J.Poco,H.T.Vo,J.Freire,andC.T.Silva.Visualexplorationof 7(5):e37027,2012. bigspatio-temporalurbandata:Astudyofnewyorkcitytaxitrips. [26] A.Noulas,S.Scellato,C.Mascolo,andM.Pontil.AnEmpiricalStudyof TransactionsOnVisualizationandComputerGraphics,19(12):2149–2158, GeographicUserActivityPatternsinFoursquare.2011. 2013. [27] A.Noulas,B.Shaw,R.Lambiotte,andC.Mascolo.TopologicalPropertiesand [10] L.Gabrielli,S.Rinzivillo,F.Ronzano,andD.Villatoro.Fromtweetsto TemporalDynamicsofPlaceNetworksinUrbanEnvironments.InInt. semantictrajectories:mininganomalousurbanmobilitypatterns.InCitizenin ConferenceonWorldWideWebCompanion,2015. SensorNetworks,pages26–35.Springer,2014. 
[28] S.Openshaw.Themodifiablearealunitproblem.GeoBooksNorwich,UK, [11] L.Gauvin,A.Panisson,A.Barrat,andC.Cattuto.Revealinglatentfactorsof 1983. temporalnetworksformesoscaleinterventioninepidemicspread. [29] C.Peng,X.Jin,K.-C.Wong,M.Shi,andP.Liò.Collectivehumanmobility arXiv:1501.02758,2015. patternfromtaxitripsinurbanarea.PloSOne,7(4):e34487,2012. [12] L.Gauvin,A.Panisson,andC.Cattuto.Detectingthecommunitystructureand [30] V.Salnikov,R.Lambiotte,A.Noulas,andC.Mascolo.OpenStreetCab: activitypatternsoftemporalnetworks:anon-negativetensorfactorization ExploitingTaxiMobilityPatternsinNewYorkCitytoReduceCommuter approach.PloSOne,9(1),2014. Costs.arXiv:1503.03021,2015. [13] M.C.Gonzalez,C.A.Hidalgo,andA.-L.Barabasi.Understandingindividual [31] A.Sarkar,N.Lathia,andC.Mascolo.ComparingcitiesâA˘Z´cyclingpatterns humanmobilitypatterns.Nature,453(7196):779–782,2008. usingonlinesharedbicyclemaps.Transportation,42(4):1–19,2015. [14] N.Hao,L.Horesh,andM.Kilmer.NonnegativeTensorDecomposition.In [32] F.Simini,M.C.González,A.Maritan,andA.-L.Barabási.Auniversalmodel CompressedSensing&SparseFiltering,pages123–148.Springer,2014. formobilityandmigrationpatterns.Nature,484(7392):96–100,2012. [15] F.Ivis.Calculatinggeographicdistance:conceptsandmethods.InProceedings [33] P.Singer,D.Helic,A.Hotho,andM.Strohmaier.HypTrails:ABayesian ofthe19thConferenceofNortheastSASUserGroup,2006. ApproachforComparingHypothesesAboutHumanTrailsontheWeb.InInt. [16] S.Jiang,J.Ferreira,andM.C.González.Clusteringdailypatternsofhuman ConferenceonWorldWideWeb,2015. activitiesinthecity.DataMiningandKnowledgeDiscovery,25(3):478–510, [34] C.Song,Z.Qu,N.Blumm,andA.-L.Barabási.Limitsofpredictabilityin 2012. humanmobility.Science,327(5968):1018–1021,2010. [17] S.Jiang,J.FerreiraJr,andM.C.González.Activity-BasedHumanMobility [35] S.A.Stouffer.Interveningopportunities:atheoryrelatingmobilityand PatternsInferredfromMobilePhoneData:ACaseStudyofSingapore.InInt. distance.AmericanSociologicalReview,5(6):845–867,1940. 
WorkshoponUrbanComputing. [36] K.Takeuchi,R.Tomioka,K.Ishiguro,A.Kimura,andH.Sawada. [18] R.Jurdak,K.Zhao,J.Liu,M.AbouJaoude,M.Cameron,andD.Newth. Non-negativemultipletensorfactorization.InInt.ConferenceonDataMining, UnderstandingHumanMobilityfromTwitter.arXiv:1412.2154,2014. 2013. [19] A.Kaltenbrunner,R.Meza,J.Grivolla,J.Codina,andR.Banchs.Bicycle [37] P.-N.Tan,M.Steinbach,V.Kumar,etal.Introductiontodatamining,volume1. cyclesandmobilitypatterns-Exploringandcharacterizingdatafroma PearsonAddisonWesleyBoston,2006. communitybicycleprogram.arXiv:0810.4187,2008. [38] J.L.Toole,M.Ulm,M.C.González,andD.Bauer.Inferringlandusefrom [20] F.LeNéchet.Urbanspatialstructure,dailymobilityandenergyconsumption:a mobilephoneactivity.InInt.WorkshoponUrbanComputing,2012. studyof34europeancities.Cybergeo:EuropeanJournalofGeography,2012. [39] C.K.Vaca,D.Quercia,F.Bonchi,andP.Fraternali.Taxonomy-Based [21] Lenormand,MaximeandGonçalves,BrunoandTugores,Antòniaand DiscoveryandAnnotationofFunctionalAreasintheCity.InInt.Conference Ramasco,JoséJ.Humandiffusionandcityinfluence.arXiv:1501.07788,2015. onWebandSocialMedia,2015. [22] X.Li,G.Pan,Z.Wu,G.Qi,S.Li,D.Zhang,W.Zhang,andZ.Wang. [40] K.Willis.Introduction:mobility,migrationanddevelopment.Int.Development Predictionofurbanhumanmobilityusinglarge-scaletaxitracesandits PlanningReview,32(3-4):i–xiv,2010. applications.FrontiersofComputerScience,6(1):111–121,2012. [41] L.Wu,Y.Zhi,Z.Sui,andY.Liu.Intra-urbanhumanmobilityandactivity [23] X.Liu,L.Gong,Y.Gong,andY.Liu.Revealingdailytravelpatternsandcity transition:evidencefromsocialmediacheck-indata.PloSOne,9(5),2014. structurewithtaxitripdata.arXiv:1310.6592,2013. [42] G.K.Zipf.TheP1P2/Dhypothesis:Ontheintercitymovementofpersons. [24] P.Merriman.Mobility,Space,andCulture.Routledge,2012. AmericanSociologicalReview,11(6):677–686,1946.
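As a supplementary illustration of the concentration parameter k and the normalization of belief matrices discussed in this paper, the following minimal Python sketch shows how a row-normalized hypothesis could be turned into Dirichlet pseudo counts. It is not the authors' implementation: the function names, the toy venue counts and the rule alpha = 1 + k * belief are assumptions made for illustration only, and the exact elicitation procedure of HypTrails [33] may differ in detail.

```python
def normalize_rows(belief):
    """Scale each row of a non-negative belief matrix so that it sums to 1.

    Rows carrying no belief mass fall back to a uniform distribution
    (an assumption of this sketch).
    """
    normalized = []
    for row in belief:
        total = sum(row)
        if total > 0:
            normalized.append([value / total for value in row])
        else:
            normalized.append([1.0 / len(row)] * len(row))
    return normalized


def dirichlet_pseudo_counts(belief, k):
    """Hypothetical elicitation rule: alpha = 1 + k * normalized belief.

    With k = 0 every row is a flat Dirichlet prior; larger k puts more
    pseudo counts on the transitions that the hypothesis believes in.
    """
    return [[1.0 + k * p for p in row] for row in normalize_rows(belief)]


# Toy hypothesis over three tracts (made-up numbers): entry [i][j] is the
# raw belief in a trip from tract i to tract j, e.g. venue counts at j.
venue_counts = [
    [0, 8, 2],
    [5, 0, 5],
    [0, 0, 0],  # no information for this origin tract
]

weak_prior = dirichlet_pseudo_counts(venue_counts, k=0)
strong_prior = dirichlet_pseudo_counts(venue_counts, k=10)

for row in strong_prior:
    print([round(alpha, 2) for alpha in row])
```

Because every row is normalized before scaling, hypotheses built from differently sized data sources (e.g., few churches vs. many restaurants) receive the same total number of pseudo counts per row for a given k, which is what makes them comparable under this simplified rule.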