
Sparse Modeling: Theory, Algorithms, and Applications PDF

250 pages · 2014 · 8.499 MB · English

Preview Sparse Modeling: Theory, Algorithms, and Applications

SPARSE MODELING: Theory, Algorithms, and Applications
Chapman & Hall/CRC Machine Learning & Pattern Recognition Series

SERIES EDITORS
Ralf Herbrich, Amazon Development Center, Berlin, Germany
Thore Graepel, Microsoft Research Ltd., Cambridge, UK

AIMS AND SCOPE
This series reflects the latest advances and applications in machine learning and pattern recognition through the publication of a broad range of reference works, textbooks, and handbooks. The inclusion of concrete examples, applications, and methods is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of machine learning, pattern recognition, computational intelligence, robotics, computational/statistical learning theory, natural language processing, computer vision, game AI, game theory, neural networks, computational neuroscience, and other relevant topics, such as machine learning applied to bioinformatics or cognitive science, which might be proposed by potential contributors.

PUBLISHED TITLES
BAYESIAN PROGRAMMING
  Pierre Bessière, Emmanuel Mazer, Juan-Manuel Ahuactzin, and Kamel Mekhnacha
UTILITY-BASED LEARNING FROM DATA
  Craig Friedman and Sven Sandow
HANDBOOK OF NATURAL LANGUAGE PROCESSING, SECOND EDITION
  Nitin Indurkhya and Fred J. Damerau
COST-SENSITIVE MACHINE LEARNING
  Balaji Krishnapuram, Shipeng Yu, and Bharat Rao
COMPUTATIONAL TRUST MODELS AND MACHINE LEARNING
  Xin Liu, Anwitaman Datta, and Ee-Peng Lim
MULTILINEAR SUBSPACE LEARNING: DIMENSIONALITY REDUCTION OF MULTIDIMENSIONAL DATA
  Haiping Lu, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos
MACHINE LEARNING: An Algorithmic Perspective, Second Edition
  Stephen Marsland
SPARSE MODELING: THEORY, ALGORITHMS, AND APPLICATIONS
  Irina Rish and Genady Ya. Grabarnik
A FIRST COURSE IN MACHINE LEARNING
  Simon Rogers and Mark Girolami
MULTI-LABEL DIMENSIONALITY REDUCTION
  Liang Sun, Shuiwang Ji, and Jieping Ye
REGULARIZATION, OPTIMIZATION, KERNELS, AND SUPPORT VECTOR MACHINES
  Johan A. K. Suykens, Marco Signoretto, and Andreas Argyriou
ENSEMBLE METHODS: FOUNDATIONS AND ALGORITHMS
  Zhi-Hua Zhou

SPARSE MODELING: Theory, Algorithms, and Applications
Irina Rish, IBM, Yorktown Heights, New York, USA
Genady Ya. Grabarnik, St. John's University, Queens, New York, USA

CRC Press, Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
© 2015 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Version Date: 20141017
International Standard Book Number-13: 978-1-4398-2870-0 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

To Mom, my brother Ilya, and my family – Natalie, Alexander, and Sergey. And in loving memory of my dad and my brother Dima.
To Fany, Yaacob, Laura, and Golda.

CONTENTS

List of Figures  xi
Preface  xvii

1 Introduction  1
  1.1 Motivating Examples  4
    1.1.1 Computer Network Diagnosis  4
    1.1.2 Neuroimaging Analysis  5
    1.1.3 Compressed Sensing  8
  1.2 Sparse Recovery in a Nutshell  9
  1.3 Statistical Learning versus Compressed Sensing  11
  1.4 Summary and Bibliographical Notes  12

2 Sparse Recovery: Problem Formulations  15
  2.1 Noiseless Sparse Recovery  16
  2.2 Approximations  18
  2.3 Convexity: Brief Review  19
  2.4 Relaxations of (P0) Problem  20
  2.5 The Effect of lq-Regularizer on Solution Sparsity  21
  2.6 l1-norm Minimization as Linear Programming  22
  2.7 Noisy Sparse Recovery  23
  2.8 A Statistical View of Sparse Recovery  27
  2.9 Beyond LASSO: Other Loss Functions and Regularizers  30
  2.10 Summary and Bibliographical Notes  33

3 Theoretical Results (Deterministic Part)  35
  3.1 The Sampling Theorem  36
  3.2 Surprising Empirical Results  36
  3.3 Signal Recovery from Incomplete Frequency Information  39
  3.4 Mutual Coherence  40
  3.5 Spark and Uniqueness of (P0) Solution  42
  3.6 Null Space Property and Uniqueness of (P1) Solution  45
  3.7 Restricted Isometry Property (RIP)  46
  3.8 Square Root Bottleneck for the Worst-Case Exact Recovery  47
  3.9 Exact Recovery Based on RIP  48
  3.10 Summary and Bibliographical Notes  52

4 Theoretical Results (Probabilistic Part)  53
  4.1 When Does RIP Hold?  54
  4.2 Johnson-Lindenstrauss Lemma and RIP for Subgaussian Random Matrices  54
    4.2.1 Proof of the Johnson-Lindenstrauss Concentration Inequality  55
    4.2.2 RIP for Matrices with Subgaussian Random Entries  56
  4.3 Random Matrices Satisfying RIP  59
    4.3.1 Eigenvalues and RIP  60
    4.3.2 Random Vectors, Isotropic Random Vectors  60
  4.4 RIP for Matrices with Independent Bounded Rows and Matrices with Random Rows of Fourier Transform  61
    4.4.1 Proof of URI  64
    4.4.2 Tail Bound for the Uniform Law of Large Numbers (ULLN)  67
  4.5 Summary and Bibliographical Notes  69

5 Algorithms for Sparse Recovery Problems  71
  5.1 Univariate Thresholding is Optimal for Orthogonal Designs  72
    5.1.1 l0-norm Minimization  73
    5.1.2 l1-norm Minimization  74
  5.2 Algorithms for l0-norm Minimization  76
    5.2.1 An Overview of Greedy Methods  79
  5.3 Algorithms for l1-norm Minimization (LASSO)  82
    5.3.1 Least Angle Regression for LASSO (LARS)  82
    5.3.2 Coordinate Descent  86
    5.3.3 Proximal Methods  87
  5.4 Summary and Bibliographical Notes  92

6 Beyond LASSO: Structured Sparsity  95
  6.1 The Elastic Net  96
    6.1.1 The Elastic Net in Practice: Neuroimaging Applications  100
  6.2 Fused LASSO  107
  6.3 Group LASSO: l1/l2 Penalty  109
  6.4 Simultaneous LASSO: l1/l∞ Penalty  110
  6.5 Generalizations  111
    6.5.1 Block l1/lq-Norms and Beyond  111
    6.5.2 Overlapping Groups  112
  6.6 Applications  114
    6.6.1 Temporal Causal Modeling  114
    6.6.2 Generalized Additive Models  115
    6.6.3 Multiple Kernel Learning  115
    6.6.4 Multi-Task Learning  117
  6.7 Summary and Bibliographical Notes  118

7 Beyond LASSO: Other Loss Functions  121
  7.1 Sparse Recovery from Noisy Observations  122
  7.2 Exponential Family, GLMs, and Bregman Divergences  123
    7.2.1 Exponential Family  124
    7.2.2 Generalized Linear Models (GLMs)  125
    7.2.3 Bregman Divergence  126
  7.3 Sparse Recovery with GLM Regression  128
  7.4 Summary and Bibliographic Notes  136

8 Sparse Graphical Models  139
  8.1 Background  140
  8.2 Markov Networks  141
    8.2.1 Markov Network Properties: A Closer Look  142
    8.2.2 Gaussian MRFs  144
  8.3 Learning and Inference in Markov Networks  145
    8.3.1 Learning  145
    8.3.2 Inference  146
    8.3.3 Example: Neuroimaging Applications  147
  8.4 Learning Sparse Gaussian MRFs  151
    8.4.1 Sparse Inverse Covariance Selection Problem  152
    8.4.2 Optimization Approaches  153
    8.4.3 Selecting Regularization Parameter  160
  8.5 Summary and Bibliographical Notes  165

9 Sparse Matrix Factorization: Dictionary Learning and Beyond  167
  9.1 Dictionary Learning  168
    9.1.1 Problem Formulation  169
    9.1.2 Algorithms for Dictionary Learning  170
  9.2 Sparse PCA  174
    9.2.1 Background  174
    9.2.2 Sparse PCA: Synthesis View  176
    9.2.3 Sparse PCA: Analysis View  178
  9.3 Sparse NMF for Blind Source Separation  179
  9.4 Summary and Bibliographical Notes  182

Epilogue  185

Appendix: Mathematical Background  187
  A.1 Norms, Matrices, and Eigenvalues  187
    A.1.1 Short Summary of Eigentheory  188
  A.2 Discrete Fourier Transform  190
    A.2.1 The Discrete Whittaker-Nyquist-Kotelnikov-Shannon Sampling Theorem  191
  A.3 Complexity of l0-norm Minimization  192
  A.4 Subgaussian Random Variables  192
  A.5 Random Variables and Symmetrization in Rn  197
