Naoki Katoh · Yuya Higashikawa · Hiro Ito · Atsuki Nagao · Tetsuo Shibuya · Adnan Sljoka · Kazuyuki Tanaka · Yushi Uno
Editors

Sublinear Computation Paradigm
Algorithmic Revolution in the Big Data Era

Editors

Naoki Katoh, Graduate School of Information Science, University of Hyogo, Kobe, Hyogo, Japan
Yuya Higashikawa, Graduate School of Information Science, University of Hyogo, Kobe, Hyogo, Japan
Hiro Ito, School of Informatics and Engineering, University of Electro-Communications, Chofu, Tokyo, Japan
Atsuki Nagao, Department of Information Science, Ochanomizu University, Bunkyo, Tokyo, Japan
Tetsuo Shibuya, Human Genome Center, University of Tokyo, Minato, Tokyo, Japan
Adnan Sljoka, Center for Advanced Intelligence Project, RIKEN, Chuo, Tokyo, Japan
Kazuyuki Tanaka, Graduate School of Information Science, Tohoku University, Sendai, Miyagi, Japan
Yushi Uno, Graduate School of Engineering, Osaka Prefecture University, Sakai, Osaka, Japan

ISBN 978-981-16-4094-0
ISBN 978-981-16-4095-7 (eBook)
https://doi.org/10.1007/978-981-16-4095-7

© The Editor(s) (if applicable) and The Author(s) 2022. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this book are included in the book’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

This book gives an overview of cutting-edge work on a new paradigm called the “sublinear computation paradigm,” which was proposed in the large multiyear academic research project “Foundations of Innovative Algorithms for Big Data” in Japan. In today's rapidly evolving age of big data, the massive growth of data has opened up many new opportunities and uncharted areas of exploration, but it has also brought new challenges. To handle the unprecedented explosion of big data sets in research, industry, and other areas of society, there is an urgent need to develop novel methods and approaches for big data analysis. To meet this need, we are pursuing innovative changes in algorithm theory for big data.
For example, polynomial-time algorithms have thus far been regarded as “fast,” but if we apply an O(n^2)-time algorithm to a petabyte-scale or larger big data set, we will encounter problems in terms of computational resources or running time. To deal with this critical computational and algorithmic bottleneck, we require linear, sublinear, and constant-time algorithms. In this project, which ran from October 2014 to September 2021, we proposed the sublinear computation paradigm in order to support innovation in the big data era. We created a foundation of innovative algorithms by developing computational procedures, data structures, and modeling techniques for big data. The project was organized into three teams that focused on sublinear algorithms, sublinear data structures, and sublinear modeling. Our work has provided high-level academic research results of strong computational and algorithmic interest, which are presented in this book.

This book consists of five parts. Part I consists of a single chapter introducing the concept of the sublinear computation paradigm; Parts II, III, and IV review results on sublinear algorithms, sublinear data structures, and sublinear modeling, respectively; and Part V presents some application results.

We are deeply grateful to the members of this project and everyone else who was involved. This project was conducted as a subproject of the research project “Advanced Core Technologies for Big Data Integration,” which was supervised by Prof. Masaru Kitsuregawa. We would like to express our gratitude to him and everyone involved in that project. We also thank the editorial office of Springer for the opportunity to publish this book.

Kobe, Japan    Naoki Katoh
Tokyo, Japan   Hiro Ito
Kobe, Japan    Yuya Higashikawa

Contents

Part I Introduction
1 What Is the Sublinear Computation Paradigm?
  Naoki Katoh and Hiro Ito

Part II Sublinear Algorithms
2 Property Testing on Graphs and Games
  Hiro Ito
3 Constant-Time Algorithms for Continuous Optimization Problems
  Yuichi Yoshida
4 Oracle-Based Primal-Dual Algorithms for Packing and Covering Semidefinite Programs
  Khaled Elbassioni and Kazuhisa Makino
5 Almost Linear Time Algorithms for Some Problems on Dynamic Flow Networks
  Yuya Higashikawa, Naoki Katoh, and Junichi Teruyama

Part III Sublinear Data Structures
6 Information Processing on Compressed Data
  Yoshimasa Takabatake, Tomohiro I, and Hiroshi Sakamoto
7 Compression and Pattern Matching
  Takuya Kida and Isamu Furuya
8 Orthogonal Range Search Data Structures
  Kazuki Ishiyama and Kunihiko Sadakane
9 Enhanced RAM Simulation in Succinct Space
  Taku Onodera

Part IV Sublinear Modelling
10 Review of Sublinear Modeling in Probabilistic Graphical Models by Statistical Mechanical Informatics and Statistical Machine Learning Theory
  Kazuyuki Tanaka
11 Empirical Bayes Method for Boltzmann Machines
  Muneki Yasuda
12 Dynamical Analysis of Quantum Annealing
  Anthony C. C. Coolen, Theodore Nikoletopoulos, Shunta Arai, and Kazuyuki Tanaka
13 Mean-Field Analysis of Sourlas Codes with Adiabatic Reverse Annealing
  Shunta Arai

Part V Applications
14 Structural and Functional Analysis of Proteins Using Rigidity Theory
  Adnan Sljoka
15 Optimization of Evacuation and Walking-Home Routes from Osaka City After a Nankai Megathrust Earthquake Using Road Network Big Data
  Atsushi Takizawa and Yutaka Kawagishi
16 Stream-Based Lossless Data Compression
  Shinichi Yamagiwa

Part I Introduction

Chapter 1
What Is the Sublinear Computation Paradigm?

Naoki Katoh and Hiro Ito

Abstract This chapter introduces the “sublinear computation paradigm.” A sublinear-time algorithm is an algorithm that runs in time sublinear in the size of the instance (input data). In other words, the running time is o(n), where n is the size of the instance. This century marks the start of the era of big data. For managing big data, polynomial-time algorithms, which are usually considered efficient, may sometimes be inadequate because they require too much time or too many computational resources. In such cases, sublinear-time algorithms are expected to work well. We call this idea the “sublinear computation paradigm.” A research project named “Foundations of Innovative Algorithms for Big Data (ABD),” in which this paradigm is the central concept, was started under the CREST program of the Japan Science and Technology Agency (JST) in October 2014 and concluded in September 2021. This book mainly introduces the results of this project.

1.1 We Are in the Era of Big Data

The twenty-first century can be called the era of Big Data. The number of webpages on the Internet was estimated to be more than 1 trillion (=10^12) in 2008 [22], and the number of websites has grown tenfold over the last 10 years [21]. Thus the number of webpages is now estimated to be more than 10 trillion (=10^13).
If we assume that 10^6 bytes (≈10^7 bits) of data is contained in a single webpage on average,¹ then the total amount of the data stored on the Internet would be more than 100 exabits (=10^20 bits)! The various actions that everyone performs are collected by our smartphones and are stored in the memory of storage devices around the world. The remarkable development of computer memory has made it possible to store this information.

¹ Note that one 1080 × 1920 pixel digital photo consists of more than 2 × 10^6 pixels.

N. Katoh
University of Hyogo, 8-2-1 Gakuennishi-machi, Nishi-ku, Kobe, Hyogo 651-2197, Japan

H. Ito
The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo 182-8585, Japan

© The Author(s) 2022
N. Katoh et al. (eds.), Sublinear Computation Paradigm, https://doi.org/10.1007/978-981-16-4095-7_1
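The order-of-magnitude estimate in Section 1.1 can be reproduced with a few lines of arithmetic. The following is only an illustrative sketch: the function name `internet_bits` and the constant `EXA` are hypothetical helpers introduced here, and the inputs are the figures quoted in the text (10^12 webpages in 2008, tenfold growth, 10^6 bytes per page).

```python
# Back-of-the-envelope check of the Section 1.1 estimate.
# All inputs are the rough figures quoted in the text, not measured data.

EXA = 10**18  # one exabit = 10^18 bits


def internet_bits(pages: int, bytes_per_page: int, bits_per_byte: int = 8) -> int:
    """Total bits = number of pages * bytes per page * bits per byte."""
    return pages * bytes_per_page * bits_per_byte


pages_2008 = 10**12            # more than 1 trillion webpages in 2008
pages_now = pages_2008 * 10    # tenfold growth, giving 10^13 webpages

# With exactly 8 bits per byte, 10^6 bytes is 8 * 10^6 bits per page.
exact_bits = internet_bits(pages_now, 10**6)

# The text rounds 8 * 10^6 bits up to about 10^7 bits per page.
rounded_bits = pages_now * 10**7

print(f"exact:   {exact_bits // EXA} exabits")    # prints "exact:   80 exabits"
print(f"rounded: {rounded_bits // EXA} exabits")  # prints "rounded: 100 exabits"
```

With the rounding used in the text (10^6 bytes ≈ 10^7 bits) the product is 10^20 bits, that is, 100 exabits; with exactly 8 bits per byte it is 8 × 10^19 bits, which is the same order of magnitude.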