
An invitation to statistics in Wasserstein space PDF

157 Pages·2020·4.876 MB·English

SPRINGER BRIEFS IN PROBABILITY AND MATHEMATICAL STATISTICS

Victor M. Panaretos
Yoav Zemel

An Invitation to Statistics in Wasserstein Space

SpringerBriefs in Probability and Mathematical Statistics

Editor-in-Chief
Gesine Reinert, University of Oxford, Oxford, UK
Mark Podolskij, University of Aarhus, Aarhus C, Denmark

Series Editors
Nina Gantert, Technische Universität München, Münich, Nordrhein-Westfalen, Germany
Tailen Hsing, University of Michigan, Ann Arbor, MI, USA
Richard Nickl, University of Cambridge, Cambridge, UK
Sandrine Péché, Université Paris Diderot, Paris, France
Yosef Rinott, Hebrew University of Jerusalem, Jerusalem, Israel
Almut E. D. Veraart, Imperial College London, London, UK
Mathieu Rosenbaum, Université Pierre et Marie Curie, Paris, France
Wei Biao Wu, University of Chicago, Chicago, IL, USA

SpringerBriefs present concise summaries of cutting-edge research and practical applications across a wide spectrum of fields. Featuring compact volumes of 50 to 125 pages, the series covers a range of content from professional to academic. Briefs are characterized by fast, global electronic dissemination, standard publishing contracts, standardized manuscript preparation and formatting guidelines, and expedited production schedules.

Typical topics might include:
- A timely report of state-of-the-art techniques
- A bridge between new research results, as published in journal articles, and a contextual literature review
- A snapshot of a hot or emerging topic
- Lecture or seminar notes making a specialist topic accessible for non-specialist readers

SpringerBriefs in Probability and Mathematical Statistics showcase topics of current relevance in the field of probability and mathematical statistics.

Manuscripts presenting new results in a classical field, new field, or an emerging topic, or bridges between new results and already published works, are encouraged. This series is intended for mathematicians and other scientists with interest in probability and mathematical statistics. All volumes published in this series undergo a thorough refereeing process.

The SBPMS series is published under the auspices of the Bernoulli Society for Mathematical Statistics and Probability.

More information about this series at http://www.springer.com/series/14353

Victor M. Panaretos • Yoav Zemel
An Invitation to Statistics in Wasserstein Space

Victor M. Panaretos
Institute of Mathematics
EPFL
Lausanne, Switzerland

Yoav Zemel
Statistical Laboratory
University of Cambridge
Cambridge, UK

ISSN 2365-4333
ISSN 2365-4341 (electronic)
SpringerBriefs in Probability and Mathematical Statistics
ISBN 978-3-030-38437-1
ISBN 978-3-030-38438-8 (eBook)
https://doi.org/10.1007/978-3-030-38438-8

© The Editor(s) (if applicable) and The Author(s) 2020. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To our families

Preface

A Wasserstein distance is a metric between probability distributions μ and ν on a ground space X, induced by the problem of optimal mass transportation or simply optimal transport. It reflects the minimal effort that is required in order to reconfigure the mass of μ to produce the mass distribution of ν. The 'effort' corresponds to the total work needed to achieve this reconfiguration, where work equals the amount of mass at the origin times the distance to the prescribed destination of this mass. The distance between origin and destination can be raised to some power other than 1 when defining the notion of work, giving rise to correspondingly different Wasserstein distances. When viewing the space of probability measures on X as a metric space endowed with a Wasserstein distance, we speak of a Wasserstein space.

Mass transportation and the associated Wasserstein metrics/spaces are ubiquitous in mathematics, with a long history that has seen them catalyse core developments in analysis, optimisation, and probability. Beyond their intrinsic mathematical richness, they possess attractive features that make them a versatile tool for the statistician. They frequently appear in the development of statistical theory and inferential methodology, sometimes as a technical tool in asymptotic theory, due to the useful topology they induce and their easy majorisation; and other times as a methodological tool, for example, in structural modelling and goodness-of-fit testing. A more recent trend in statistics is to consider Wasserstein spaces themselves as a sample and/or parameter space and treat inference problems in such spaces. It is this more recent trend that is the topic of this book and is coming to be known as 'statistics in Wasserstein spaces' or 'statistical optimal transport'.

From the theoretical point of view, statistics in Wasserstein spaces represents an emerging topic in mathematical statistics, situated at the interface between functional data analysis (where the data are functions, seen as random elements of an infinite-dimensional Hilbert space) and non-Euclidean statistics (where the data satisfy non-linear constraints, thus lying on non-Euclidean manifolds). Wasserstein spaces provide the natural mathematical formalism to describe data collections that are best modelled as random measures on R^d (e.g. images and point processes). Such random measures carry the infinite-dimensional traits of functional data, but are intrinsically non-linear due to positivity and integrability restrictions. Indeed, contrary to functional data, their dominating statistical variation arises through random (non-linear) deformations of an underlying template, rather than the additive (linear) perturbation of an underlying template. This shows optimal transport to be a canonical framework for dealing with problems involving the so-called phase variation (also known as registration, multi-reference alignment, or synchronisation problems). This connection is pursued in detail in this book and linked with the so-called problem of optimal multitransport (or optimal multicoupling).

In writing our monograph, we had two aims in mind:

1. To present the key aspects of optimal transportation and Wasserstein spaces (Chaps. 1 and 2) relevant to statistical inference, tailored to the interests and background of the (mathematical) statistician. There are, of course, classic texts comprehensively covering this background.¹ But their choice of topics and style of exposition are usually adapted to the analyst and/or probabilist, with aspects most relevant for statisticians scattered among (much) other material.
2. To make use of the 'Wasserstein background' to present some of the fundamentals of statistical estimation in Wasserstein spaces, and its connection to the problem of phase variation (registration) and optimal multicoupling. In doing so, we highlight connections with classical topics in statistical shape theory, such as Procrustes analysis. On these topics, no book/monograph appears to yet exist.

The book focusses on the theory of statistics in Wasserstein spaces. It does not cover the associated computational/numerical aspects. This is partially due to space restrictions, but also due to the fact that a reference entirely dedicated to such issues can be found in the very recent monograph of Peyré and Cuturi [103]. Moreover, since this book is meant to be a rapid introduction for non-specialists, we have made no attempt to give a complete bibliography. We have added some bibliographic remarks at the end of each chapter, but these are in no way meant to be exhaustive. For those seeking reference works, Rachev [106] is an excellent overview of optimal transport up to 1985. Other recent reviews are Bogachev and Kolesnikov [26] and Panaretos and Zemel [101]. The latter review can be thought of as complementary to the present book and surveys some of the applications of optimal transport methods to statistics and probability theory.

¹ E.g. by Rachev and Rüschendorf [107], Villani [124, 125], Ambrosio and Gigli [10], Ambrosio et al. [12], and more recently by Santambrogio [119].

Structure of the Book

The material is organised into five chapters.
• Chapter 1 presents the necessary background in optimal transportation. Starting with Monge's original formulation, it presents Kantorovich's probabilistic relaxation and the associated duality theory. It then focusses on quadratic cost functions (squared normed cost) and gives a more detailed treatment of certain important special cases. Topics of statistical concern such as the regularity of transport maps and their stability under weak convergence of the origin/destination measures are also presented. The chapter concludes with a consideration of more general cost functions and the characterisation of optimal transport plans via cyclical monotonicity.
• Chapter 2 presents the salient features of (2-)Wasserstein space starting with topological properties of statistical importance, as well as metric properties such as covering numbers. It continues with geometrical features of the space, reviewing the tangent bundle structure of the space, the characterisation of geodesics, and the log and exponential maps as related to transport maps. Finally, it reviews the relationship between the curvature and the so-called compatibility of transport maps, roughly speaking when can one expect optimal transport maps to form a group.
• Chapter 3 starts to shift attention to issues more statistical and treats the problem of existence, uniqueness, characterisation, and regularity of Fréchet means (barycenters) for collections of measures in Wasserstein space. This is done by means of the so-called multimarginal transport problem (a.k.a. optimal multitransport or optimal multicoupling problem). The treatment starts with finite collections of measures, and then considers Fréchet means for (potentially uncountably supported) probability distributions on Wasserstein space and associated measurability concerns.
• Chapter 4 considers the problem of estimation of the Fréchet mean of a probability distribution in Wasserstein space, on the basis of a finite collection of i.i.d. elements from this law observed with 'sampling noise'. It is shown that this problem is inextricably linked to the problem of separation of amplitude and phase variation (a.k.a. registration) of random point patterns, where the focus is on estimating the maps yielding the optimal multicoupling rather than the Fréchet mean itself. Nonparametric methodology for solving either problem is reviewed, coupled with associated asymptotic theory and several illustrative examples.
• Chapter 5 focusses on the problem of actually constructing the Fréchet mean and/or optimal multicoupling of a collection of measures, which is a necessary step when using the methods of Chap. 4 in practice. It presents the steepest descent algorithm based on the geometrical features reviewed in Chap. 2 and a convergence analysis thereof. Interestingly, it is seen that the algorithm is closely related to Procrustes algorithms in shape theory, and this connection is discussed in depth. Several special cases are reviewed in more detail.

Each chapter comes with some bibliographic notes at the end, giving some background and suggesting further reading. The first two chapters can be used independently as a crash course in optimal transport for statisticians at the MSc or PhD level depending on the audience's background. Proofs that were omitted from the main text due to space limitations have been organised into an online supplement accessible at www.somewhere.com

Acknowledgements

We wish to thank three anonymous reviewers for their thoughtful feedback. We are especially indebted to one of them, whose analytical insights were particularly useful. Any errors or omissions are, of course, our own responsibility. Victor M. Panaretos gratefully acknowledges support from a European Research Council Starting Grant. Yoav Zemel was supported by Swiss National Science Foundation Grant # 178220. Finally, we wish to thank Mark Podolskij and Donna Chernyk for their patience and encouragement.

Lausanne, Switzerland    Victor M. Panaretos
Cambridge, UK    Yoav Zemel
