SIP(2018),vol.7,e4,page1of17©TheAuthors,2018. ThisisanOpenAccessarticle,distributedunderthetermsoftheCreativeCommonsAttributionlicence(http://creativecommons.org/licenses/by/4.0/),whichpermits unrestrictedre-use,distribution,andreproductioninanymedium,providedtheoriginalworkisproperlycited. doi:10.1017/ATSIP.2018.4 industrial technology advances Spatio-temporal multidimensional collective data analysis for providing comfortable living anytime and anywhere naonoriuedaandfutoshinaya Machinelearningisapromisingtechnologyforanalyzingdiversetypesofbigdata.TheInternetofThingserawillfeaturethe collectionofreal-worldinformationlinkedtotimeandspace(location)fromallsortsofsensors.Inthispaper,wediscussspatio- temporalmultidimensionalcollectivedataanalysistocreateinnovativeservicesfromsuchspatio-temporaldataanddescribethe coretechnologiesfortheanalysis.Wedescribecoretechnologiesaboutsmartdatacollectionandspatio-temporaldataanalysis andpredictionaswellasanovelapproachforreal-time,proactivenavigationincrowdedenvironmentssuchaseventspacesand urbanareas.Ourchallengeistodevelopareal-timenavigationsystemthatenablesmovementsofentiregroupstobeefficiently guidedwithoutcausingcongestionbymakingnear-futurepredictionsofpeopleflow.Weshowtheeffectivenessofournavigation approachbycomputersimulationusingartificialpeople-flowdata. Keywords: Spatio-temporaldataanalysis,IoT,Smartcities,Proactivenavigation,Machinelearning Received8August2017;Revised19January2018;Accepted19January2018 I. INTRODUCTION As envisioned by MLC, the research and development visionsforbigdataanalysistechnologiesintheIoTeraare Machinelearningisapromisingtechnologyforanalyzing shown in Fig. 1. From this perspective, regression analy- diversetypesofbigdata.However,sinceasinglecoretech- sis, which is representative of conventional data analysis, nologyisinsufficientforachievinginnovativeservicesthat aimstodescribeobjectivevariablesusingmultipleexplana- exploitbigdata,acompositetechnologythatcombineskey tory variables. That is, it has an analysis technology for technologiesisessential.Furthermore,toconducttechnol- determining whether an objective variable (e.g. sales) can ogy trials in the field and uncover business needs, close be expressed as a function of explanatory variables (fac- discussions are needed with specialists in applied fields. tors).IntheIoTera,however,thereisaneedfortechnology Against this background, the Machine Learning and Data thatcanextractlatentinformationspanningheterogeneous ScienceCenter(MLC)wasestablishedinApril2013atNTT multidimensional datasets that cannot be discovered by Laboratoriesasaninter-laboratorycollaborativeorganiza- individually analyzing different sets of data. To this end, tionfocusedonbigdataanalysis[1]. we developed a technique called “multidimensional data It has been more than a decade since the concept of analysis” that enables multiple heterogeneous sets of data “bigdata”wasfirstproposed.Unfortunately,definitionsof to be simultaneously analyzed whose usefulness has been big data and its analysis were unclear from the start. In demonstratedthroughactualfieldtrials[2–4]. time, though, advances in sensor technologies will enable With the improvement of sensing technology and the the use of sensors in all sorts of fields including social rapid spread of smartphone applications and IoT devices, infrastructure, medicine and healthcare, transportation, various big data on car and object movements, human and agriculture. An environment is currently evolving in behavior,andenvironmentalchangescanbemeasuredand which massive amounts of data can be collected and ana- collectedeverywhereinrealtime.Bymakingmovingenti- lyzedinrealtime:thebirthofaconceptcalledtheInternet ties such as people, cars, and things function as sensors, ofThings(IoT).ThroughIoTthetruenatureofbigdatais wecanefficientlycollectavarietyoffine-graineddatacol- finallybeingrevealed. lectionofreal-worldinformationlinkedtotimeandspace (location)frommanykindsofsensors.Thistypeofinfor- NTT Communication Science Laboratories, NTT Corporation, 2-4 Hikaridai, mation is called “spatio-temporal data”. In this paper, we Seika-cho,Soraku-gun,Kyoto,Japan.Phone:+81774935108 alsodescribeourtrialsthatcollectedanddetectedcity-wide Correspondingauthor: eventsbygovernmentvehiclesinFujisawacity,Kanagawa, N.Ueda Japanasmobileenvironmentalsensors[5–8]. Email:[email protected] 1 https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press 2 naonori ueda and futoshi naya (cid:129) Navigationshouldcopewiththeavoidanceofcongestion, while also takinginto account the intentionsof individ- uals as much as possible. That is, it must enable many peopletoreachtheirdestinationsasquicklyaspossible. Our challenge is to develop a real-time navigation system that satisfies the above ideal navigation characteristics. In thispaper,wedescribeournavigationsystemandshowits effectivenessbycomputersimulationusingartificialpeople- flowdata. Therestofthepaperisorganizedasfollows.InSection II, we describe our smart data collection system to detect Fig.1. CoretechnologiesforeraofbigdataandInternetofThings. city-wide events. In Section III, we briefly explain our machinelearningtechnologiesforspatio-temporalmultidi- mensionalcollectivedataanalysis.InSectionIV,weexplain ReferringtoFig.1,time-seriesanalysismodelsthetem- real-time,proactivenavigation,whichisacoretechnology poral interaction or cause-and-effect relationship among forprovidingcomfortablelivinganytimeandanywhere. data, but spatio-temporal analysis constructs models that alsoconsiderthespatialdynamicsamongdata.Toanalyze II. SMART CITY-WIDE DATA thespatio-temporalbehaviorofpeopleandthingsandpre- COLLECTION USING dict when, where, and what in real time, we established a GOVERNMENT VEHICLES research theme called “spatio-temporal multidimensional collective data analysis”. This form of analysis addresses In the following, we describe our trials that collected and timeandspacealongmultidimensionalaxesandusespast detected city-wide events using government vehicles in datafromacertainperiodoftimetolearnaboutthemutual Fujisawacityasmobileenvironmentalsensors. relationships between time and space with respect to the “flow”ofpeople,things,information,etc. Theglobaltrendofurbanizationhasincreasedthecon- A) City-widedatacollectionbysensorized centrationofpeopleinmetropolitanregions[9].Thistrend governmentvehicles willintensifycongestionatmasseventsaswell.Pedestrians One of the most important concerns of citizens in urban oftengetlostinunfamiliarurbanenvironmentsorinlarge, areasisthehealthimpactofairpollution.TheWorldHealth indoor spaces. Even if people are familiar with their sur- Organizationestimatesthat4.3milliondeathsoccurannu- roundings,theywilllikelyhavedifficultyevacuatingduring allyfromexposuretoindoorairpollutionand3.7million emergencies. During Japan’s most recent natural disaster, to outdoor pollution [12]. Although cities in developing 2011’sGreatEastJapanEarthquake,manypeoplecouldnot countriesaregenerallymorepollutedthanfirst-worldcities, return to their homes due to heavy congestion caused by millionssufferworldwidefromallergiestocedarpollenand damage to the infrastructure of highways and train lines. PM concentration.Accumulatingfine-grainedairquality Insuchsituations,eitheroutdoorsorindoors,guidanceon 2.5 informationanddeliveringittothepublicareimportantto demand must be provided that is individually tailored to assessurbanairconditionstoprotectthehealthofcitizens. humanneeds. However,acquiringdetailedairqualityinformationisusu- As one critical application of spatio-temporal collec- allydifficultbothspatiallyandtemporallybecauseexpen- tive data analysis, we describe our novel approach for sive atmospheric observation stations are limited even in real-time and proactive navigation in such crowded envi- cityareasandthesamplingratefrequencyisaslowasjust ronments as event spaces and urban areas where many onceashour. peoplearesimultaneouslymovingtowardtheirdestinations Tocopewiththeaboveissue,wehavebeeninvestigating [10,11].Theidealnavigationsystemrequiresthefollowing city-wideeventdetectiontechnologyusingenvironmental characteristics: data collected by car-mounted sensors. We have installed dozens of sensors on garbage trucks that operate daily all (cid:129) Navigation should be optimized not for individuals, but overFujisawacity[5,6].Figure2showsasensorizedgarbage foroverallhumannavigationsothatnewcongestionwill truckandablockdiagram.Thetruckisequippedwithvar- notbecausedbyindependentindividualnavigations. ious environmental sensors, such as NO , CO, O , PM , 2 3 2.5 (cid:129) Navigation should be performed in real time because pollen,temperature,humidity,luminance,UV,twotypesof countermeasuresmustbefacilitatedbasedonthepeople- air contaminants, and a microphone for measuring ambi- flowchangesfrommomenttomoment,dependingonthe entnoise.Thesesensordataaresampledeveryfewseconds people-flowcondition. bythesensortype,andmanysensorswithglobalposition- (cid:129) Navigation should be proactive; it should be achieved ingsystem(GPS)geolocationinformationaretransmitted before congestion occurs by predicting near-future toaremoteserverbyamobileInternetconnectionservice people-flowconditions. every30s.Suchfine-grainedenvironmentaldatahelpdetect https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press spatio-temporal multidimensional collective data analysis 3 Fig.2. Sensorizedgarbagetruckandblockdiagram. Fig.3. HeatmapsofNO2andambientnoiselevelsofFujisawacity. spatio-temporaleventsinmoredepth,e.g.theemergenceof services in the future. Even though developing countries airpollutionhotspotsandthegenerationofambientnoise. are transitioning toward better waste management, they Figure3showstheheatmapsofNO andtheambientnoise have insufficient collection and inefficient waste disposal. 2 levelsofFujisawacitybydatacollectedfrom1month.Some Sincerapidpopulationgrowththroughurbanizationcreates partsofthemainroadsarenoisierthanotherroadsaswell a huge amount of solid waste, appropriate waste manage- assomeparticularhotspots. ment is required based on future population projections Normally,fordesigningsuchasensingsystem,thedata for each region. To this end, we have been conducting sampling intervals and communication frequencies are experimentstoevaluateregionalsolid-wasteproductionby fixed.However,theseparametersforcapturingchangesin motionsensorsmountedongarbagetrucks[8]. sensordatawithfinergranularityareoftenunknownuntil Inconventionalwastemanagement,theamountofsolid thedataarecollectedandanalyzedonce.Theyalsodepend wasteforagivenareaissummarizedbyweighingthetotal ontheservicerequirementsfromcitizensandlocalmunic- waste delivered to incineration plants by garbage trucks ipalities. To address this, we developed a remote program assigned to the region. In fact, one garbage truck collects relocation function that allows us to remotely change the solidwasteinmultipleseparatedareastobalanceworkloads program on the sensor nodes on demand. Detailed infor- amongmultipletrucks.Therefore,estimatingtheweightof mationisavailable[6,7]. regionalwasteisdifficultbasedonthesummarizedweight; it is roughly estimated based on human intuition, which includes sizable error. From interviews with municipality B) Smartwastemanagement officials, we concluded that a method that accurately esti- Another important issue in cities across the world is matestheamountofwasteforeachareaisrequiredforplan- waste management. Efficient waste management is essen- ninggarbagecollectionschemesbasedonregionalseasonal tialforlocalmunicipalitiestosustainstablewastecollection variations in the amount of solid waste. This information https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press 4 naonori ueda and futoshi naya Fig.4. Accelerationmeasurementpatternsandtheirspectrogramswhilecollectinggarbage,driving,andidling. willenablemunicipalitiestominutelyestimatefuturesolid- blue, green) correspond to estimated garbage collection wasteamounts,thusincreasingtheefficiencyofwasteman- spots by different operator companies and/or municipal- agement.Theregionalgarbageamountinformationisalso ity sections. Currently, 60 of 100 garbage trucks are sen- important to promote citizen awareness and reduce the sorized. The bottom-left of Fig. 5 shows choropleth maps amountofwaste. of the total amount of solid waste summarized weekly by Tomeettheaboverequirements,weproposedamethod town region. By accumulating long-term spatio-temporal forestimatingtheregionalamountsofsolidwastebasedon garbage amount data, we can analyze spatial and seasonal thegarbagecollectiondurationineachareaandthewaste tendenciesalongwithcensusstatisticsofthecity.Further- weightmeasuredatincinerationplants.Thegarbagecollec- more,bycreatingasystemtoprovidefeedbackoftheabove tion duration at garbage collection place in each house or information, we can expect to quantitatively evaluate the apartment building is estimated from the garbage truck’s effectivenessofourapproachforwastereduction. vibration that is caused when it scoops up garbage with a rotating plate. We mounted motion sensors on garbage III. SPATIO-TEMPORAL trucks to measure and detect specific vibration patterns MULTIDIMENSIONAL DATA whencollectinggarbage,driving,idling,andsoon.Figure4 ANALYSIS showsexamplesoftheaccelerationsensordataofthehor- izontal axis parallel to the truck’s traveling direction and In this section, we describe our two representative results their spectrograms. Peak acceleration frequencies differ ofspatio-temporalmultidimensionaldataanalysis.Thefirst dependingonthesituationofthegarbagetrucks. isamethodforpatterndiscoveryandmissingvaluecom- We annotated the sensor data with labels based on pletionfromspatio-temporalcountingdata.Thesecondis their situations by listening to the sounds recorded by amethodforestimatingpeopleflowfromspatio-temporal microphones attached to the garbage trucks. By applying populationdata. astandardsupervisedlearningmethod,suchasasupport vectormachine,totheextractedspectrogramfeaturesofthe A) Spatio-temporalpatterndiscoveryand accelerometerdata,wecanclassifyandspottemporaldura- missingvaluecompletion tionsthatcorrespondtogarbagecollection.Theestimated temporal durations of the garbage collection are recorded Variouskindsofspatio-temporaldata,suchastrafficflow, with the truck’s locations from a GPS receiver. When the purchasingrecords,andweatherdata,areobservedbysen- garbage truck arrives at the incineration plant, the total sormonitoringsystemsinsmartcities.Thesedataareuti- weight of its garbage is measured. We then calculate the lizedtounderstandthedynamicsofacity’shumanactivities amountofsolidwastebyregionbydistributingtheweight anditsenvironmentalconditions[13].However,asthenum- fromtheoperatingdurationoftherotatingplate.Herewe berofsensornodesincreases,itbecomeshumanlyimpos- assumethattheoperatingdurationoftherotatingplateis sible to check every dimension of these spatio-temporal proportionaltotheamountofsolidwaste. data.Currentmonitoringsystemsarenotalwaysstableand Figure 5 shows a screenshot of our regional garbage often fail to observe data, because of problems with sen- amount visualization system. Its right side shows a map sor nodes and data transmission errors. Thus, in practice that compares the estimated regional solid-waste collect- we need to deal with missing values that are included in ing spots for two different weeks. The color dots (orange, spatio-temporaldata. https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press spatio-temporal multidimensional collective data analysis 5 Fig.5. Screenshotofregionalgarbageamountvisualizationsystem. Non-negativetensorcompletion(NTC)[14],whichwas trafficdataprovidedbyCityPulsecollectedinAarhus,Den- proposed for missing value completion, is an extension mark [16]. The horizontal and vertical axes in Fig. 6(a) of non-negative tensor factorization (NTF) that simulta- correspondtothetimesandthevehiclecounts.Figure6(a) neously extracts latent factors from observed values and shows the daily periodicity of the vehicle counts, and we infers the missing values. Tensor is a generalization of a identifiedatemporalcontinuityforthembetweentwoadja- matrix that can represent a high-dimensional data array centtimes.ThearrowsinFig.6(b)correspondtothestart without any loss of data attributes. NTF decomposes a and the end points of each monitoring site. The red and non-negativevaluetensorintosparseandreasonablyinter- bluearrows,respectively,correspondtothelargeandsmall pretablelatentfactorsandiswidelyusedformanyapplica- vehiclecountsatthesites.FromFig.7(b),weconfirmedthe tions [15]. However, the conventional NTC does not fully spatialcontinuityofthevehiclecountsamongneighboring utilize such spatio-temporal data properties as the orders siteswithsimilardirections. oftimepointsandthetopologyofsensors,althoughthese Examples of extracted spatial and temporal factor seg- structuresshouldbeutilizedtoregularizethelatentfactors. ments from CityPulse of our proposed method and NTC Toaddressthisproblem,weproposedanewNTCmethod withtheEucliddistanceareshowninFig.7.Byutilizingour [4]. graph-based regularizer, our proposed method extracted Asadiscrepancymetricandspatio-temporalstructures, interpretablefactors.Withdayandtimemodefactors,dif- the generalized Kullback–Leibler (gKL) divergence was ferent temporal dynamics were extracted in Figs 7(a) and employed as regularizers. We used a gKL that fits partic- 7(b). With the site mode factors, different spatial flows of ularly well if the data are non-negative integer values. To vehicles were extracted in Figs 7(c), 7(d), and 7(e), show- incorporate spatio-temporal structures, we introduced a ing that factor 2 revealed a traffic pattern that occurred graphLaplacian-basedregularizeranduseditforinducing onweekdaymorningswherevehicleswereheadingtoward latentfactorstorepresentauxiliarystructures.Thismodifi- downtownAarhus. cationenabledustoextractmoremeaningfullatentfactors The above result only reflects the count data, but the thantheconventionalmethodswithoutconsideringspatio- method is extendable to multidimensional data. Figure 8 temporalstructures.Theformalizationdetailsaregivenin illustrates the conceptual idea for multidimensionalcases. Appendix.Moreover,wedevelopedacomputationallyeffi- Variouskindsofinputdataaresimultaneouslyanalyzedto cient learning algorithm that only needs to solve linear find the latent structure over different kinds of observed equations[4]. data.Suchanalysisisimportantinthesensethatitcandis- Figure 6(a) shows the total vehicle counts for 2 weeks, coverlatentfactorsthatcouldnothavebeenobtainedfrom and Fig. 6(b) shows them at each site of the automobile individualdata. https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press 6 naonori ueda and futoshi naya Fig.6. CityPulsedataset. Fig.7. ExtractedlatentfactorsforCityPulsedataset. B) People-flowestimationfrom population figures in 500m grid squares in Japan’s urban spatio-temporalpopulationdata areas, calculated from mobile network operational data. DuetotheprevalenceofmobilephonesandGPSdevices, Toeffectivelyutilizepopulationdata,weestimatethepeo- spatio-temporalpopulationdatacannowbeobtainedeas- ple flow from the data. Of course, if we could obtain the ily.Forexample,mobilespatialstatistics[17]containhourly tracking data of individuals, then we would not need to https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press spatio-temporal multidimensional collective data analysis 7 Fig.8. Latentstructureextractionbymultidimensionalcomplexdataanalysis. Fig.9. Spatio-temporalpopulationdata:(a)TokyoonJuly1,2013and(b)OsakaonAugust8,2013.Darkercolorsrepresenthigherpopulationdensitiesineachgrid cell. estimate the people flow. However, such privacy data are of aggregated information about many individuals, are often hard to obtain. Therefore, estimating people flow aggregatedtopreserveprivacyorbecauseofthedifficultyof onlyfrompopulationdataischallenging.Estimatedpeople trackingindividualsovertime.Forinstance,mobilespatial flowscanbeusedforawidevarietyofapplications,which statistics [17] are preprocessed for privacy protection, and include simulating people movements during a disaster, no one can follow a particular user, which allows mobile detectinganomalouspeoplemovements,predictingfuture phoneoperatingcompaniestopublishpopulationdatathat spatialpopulationsgiventhecurrentspatialpopulation,and are calculated based on information of about 60 million designingtransportationsystems. mobile phone users. If the trajectories for each individual Thespatio-temporalpopulationdataweuseasinputare are given, people flow can be estimated straightforwardly the populations in each grid cell over the time shown in by counting the number of people who move between Fig.9.Thespatio-temporalpopulationdata,whichconsist gridcells,thatis,thetransitionpopulation.However,with https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press 8 naonori ueda and futoshi naya Fig.10. Estimatedpeopleflowsforeachtime-of-dayclusterforTokyoandOsakadata.Arrowsdenoteadirection. aggregated data, we can directly determine the size of the cellitself.Figure10showspeopleflowsestimatedbythepro- transition population. This is why we call this “collective posedmethod.WiththeTokyodata,frommidnightto6:00, dataanalysis”. theflowsaresparse,whichisreasonablesincemostpeople To overcome this problem of modeling individual aresleeping.From8:30to11:00,peoplemovetothecenterof behavior given aggregated data, we proposed a mixture Tokyofromthesuburbsforwork.Flowscontinuetothecity of collective graphical models [18]. Our proposed model centerfrom14:30to15:00.From18:30to23:00,flowsfrom assumes that individuals move by transition probabilities thecitycentertothesuburbswereestimatedsinceatthat that depend on their locations and time points. Since the timepeoplearereturninghomefromtheiroffices.Withthe transitionpopulationsarenotgiven,wetreatthemashid- Osakadata,similartemporalflowpatternswereextracted. denvariables.Thehiddentransitionpopulationsarerelated Wealsoevaluatedthepredictionoferrorsforpeopleflows totheobservedpopulationsineachcell;thepopulationina thatwasaveragedoverallthetimepointsandfoundthatour cellequalsthesumoftransitionpopulationsfromit,andthe proposed method outperformed the conventional related populationinacellinthenexttimepointequalsthesumof methods[18]. thetransitionpopulationstoit.Byusingtheserelationsthat Although our results are encouraging, our framework representflowconservationasconstraints,thehiddentran- can be improved in a number of ways. For example, we sitionpopulationsandtransitionprobabilitiesareinferred should incorporate the spatial correlation of flows using simultaneously. location-dependent mixture models. Moreover, we want The proposed model can handle people-flow changes to extend the proposed model to include information over time by segmenting time-of-day points into multiple about the day of the week and study its effectiveness clusters, where different clusters have different transition withdifferentgridsizes,timeintervals,andneighborhood probabilities. For our task that models people flow, incor- settings. porating time information is crucial. For example, people move from the suburbs to the city center for work in the morning,returntothesuburbsintheeveningafterwork, IV. REAL-TIME AND PROACTIVE andrarelytravelinthemiddleofthenight.Theproposed NAVIGATION BASED ON SPATIO- modelisanextensionofcollectivegraphicalmodels[19,20] TEMPORAL PREDICTIONS for handling changes over time. Since it does not use tra- jectoryinformation,whenwehaveenoughpopulationdata A) Spatio-temporalpredictions thatapproximatetheactualpopulationdistribution,wecan estimatethepeopleflow. To achieve the proactive navigation mentioned in Section We evaluated the proposed method using real-world I, our approach detects future congestion with a spatio- spatio-temporalpopulationdatasetsobtainedinTokyoand temporal statistical method that predicts people flow. The Osaka(Fig.9).Inallofthedatasets,thegridcellsaresquare, peopleflow(orvelocity)canbeusedtomeasurecongestion and the neighbors are the surrounding eight cells and the by extrapolating velocities to determine where congestion https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press spatio-temporal multidimensional collective data analysis 9 Fig.11. Processingflowforfutureprediction. willoccurinthenearfuture.Weassumethatasetoftrajec- above spatio-temporal kriging approach to obtain dense toriesisobservedinthecurrenttimeslotsbyapositioning statistics (Fig. 11(b)). Then we estimated a kernel regres- sensorsuchasGPS.Thepositioningtrajectorydataaresim- sionfunctionoverthenear-pasttimedomainbyfittingaset pletimeseriesofcoordinatevalueswithwhichweconstruct ofkernelfunctionstoeachofthenear-pastdensestatistics spatio-temporal data. The observed spatio-temporal data (Fig.11(c)).Sinceweusedaradialbasisfunctionasakernel areusuallyverysparse,andtheratioofthenumberofobser- function, the obtained regression function is a mixture of vations to the time stamps over a certain time interval is radialbasisfunctions. minute for each region. Therefore, we need to interpolate Oncewehavethetime-consecutivekernelfunctions,we velocitiesthatarenotobservedovertime. estimatethenear-futurekernelparametersfromthenear- More specifically, for each region we interpolate and pastkernelparametersbylinearregression.Wefoundthat extrapolateeight-directionalvelocitiesduringacertaintime kernel parameter prediction is more robust and accurate interval.Interpolationmeansanestimationoftheunknown than the direct prediction of statistics because the kernel pastvelocities,whileextrapolationmeansthepredictionof function is smoother than the statistics themselves. Since near-futurevelocities.Forthispurpose,wefirst employed thekernelfunctioniscontinuous,wecanalsoestimatethe aspatio-temporalstatisticalapproachbasedonthekriging statisticsforanyplaces. approach,whichiscommonlyutilizedinfieldslikegeology Since this approach does not assume a second-order and climatology [21] and is equivalent to Gaussian pro- stationarycondition,itcanprovidemoreaccurateextrapo- cess prediction in machine learning. As for extrapolation, lationresults.Weappliedourmethodtothepopulationdata wealsocomparethekrigingapproachwiththerepresenta- collected in an actual event venue (about 110m×360m), tivetime-seriesanalysis,whichisthevectorauto-regression whichmorethan20000peoplevisitperday.Thepopula- (VAR)model[10].Inourexperiment,thekrigingapproach tiondatawerecollectedonceaminuteby24Wi-Fi1 access slightlyoutperformedVAR[22]. pointsinstalledinthevenue.Thepredictionperformance However, we found that these methods are inadequate comparison between an existing method (Gaussian pro- for cases where people flows rapidly change because the cess)andtheproposedmethodisshowninFig.12.Thepro- observed values are assumed to be second-order station- posedmethodobtainedbetterpredictionresults,especially aryrandomvariablesintheconventionalmethods.Thatis, forthe20and30minpredictions. the mean and the variance of the observed values should be constant, which is unreasonable for actual people-flow B) Learningmulti-agentsimulation data. To solve this problem, we developed a new method basedonamachinelearningapproach.Asetofkernelfunc- Weexaminedtheeffectivenessofourspatio-temporalpre- tions is trained using observed people-flow data for each dictionmethodusingactualpeople-flowdata.Byexploiting timestepforthespatialinterpolation,andthenthetrained ourmethod,wecanpredictthenear-futurecongestionrisk time-serieskernelparametersareusedforextrapolation. justfromnear-past,people-flowobservations.Thenextstep Figure 11 shows the processing flow for our spatio- isdesigninganoptimalnavigationplanforagroupofpeo- temporal predictions. Figure 11(a) shows an example of pletoavoidsuchcongestionrisksinrealtime.Totacklethis observedstatisticssuchaspeoplecounts.Inpractice,since problem, we have been developing a learning multi-agent observationlocationsarelimitedandoftensparseinspatio- temporal domains, we performed interpolation using the 1Wi-FiisaregisteredtrademarkofWi-FiAlliance. https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press 10 naonori ueda and futoshi naya Fig.12. Predictionperformancecomparisonandheatmapofpeople-flowprediction. simulation(MAS).Startingwithreal-spaceinformationon commonlyusedmodelconsiderstheaveragewalkingspeed people or vehicles collected from real-time observations, (e.g.4km/h)relatedtosuchattributesofindividualspecta- we input this information online into a simulation envi- torsasageandgenderaswellastheattenuationofthatwalk- ronmentincyberspace,modelitsspatio-temporalfeatures, ingspeedinproportiontothecongestionconditions(crowd instantaneouslypredicttheimmediatecongestionrisk,and density).Crowdcontrolmanuals[23]statethatovertaking pre-emptively and optimally navigate the crowd to avoid becomesdifficultandwalkingspeedbeginstodecreaseata thatrisk. crowddensityof1.2persons/m2andthatmovementgrinds MASisawidelyusedtechniqueformodelingtheindi- to a halt at 4persons/m2. Crowd density is also called the vidual behavior of such autonomous entities (=agents) as “servicelevel”[24],andtheroadwidth,space,andflowrate people,cars,andanimalsbymodelingtheirmicrointerac- canbesecuredinpedestrianlanesthatmaintainacertain tionswiththesurroundingenvironmentandanalyzingand servicelevelforsatisfyingsafetystandards. predictingmacrophenomenathatdevelopfromtheinter- InconventionalanalysismethodsusingMAS,itiscom- actionsamongmultipleagentsandtheirinteractionswith montomanuallydesignsuchparametersaswalkingspeed theirenvironment.AnalyticaltechniquesusingMAShave andnavigationplans(includingpedestrianroutesandflow been widely investigated in fields called complex systems, rate)byconductingsimulationstoevaluatetheireffectsin where such overall macro behaviors as social activities, advance. Such methods have been applied when actually brainactivities,andweatherphenomenacannotbebroken implementing crowd control (Fig. 13(a)). However, these downintothebehavioroftheindividualelementsofpeople parameters and navigation plans are limited to a few pre- withwhichsocietyiscomprised,neuronsthatconstitutethe determinedcombinationsanddonotnecessarilymatchthe neuralnetworksinthebrain,andmoleculesandatomsin observation results of actual person movement or naviga- the atmosphere. These techniques are recently being used tionoperations. in sensor networks, smart grids, and intelligent transport Today,however,whereIoTandsensingtechnologiesare systems as well as in evacuation guidance simulations for rapidlyadvancing,localpeople-flowandcongestioncondi- disastercountermeasures. tionsintherealworldcanbemeasuredinrealtimeusing As an example, consider a navigation scene of tens of a variety of positioning means, such as surveillance cam- thousandsofspectatorsexitingastadium.Inthiscase,the eras, GPSs, Wi-Fi, and beacons. With this in mind, NTT individual spectators are agents who are leaving the sta- laboratories continue to develop a learning-MAS system diumfromtheircurrentlocationsandmovingtowardtheir thattransferseventsintherealworld,suchaspeopleflow trainstationdestinations.Inthemovementofspectators,a observedinrealtimetoasimulationenvironmentincyber https://doi.org/10.1017/ATSIP.2018.4 Published online by Cambridge University Press
Description: