TOPICS IN CONVEX OPTIMIZATION: INTERIOR-POINT METHODS, CONIC DUALITY AND APPROXIMATIONS

François Glineur

Service de Mathématique et de Recherche Opérationnelle,
Faculté Polytechnique de Mons,
Rue de Houdain, 9, B-7000 Mons, Belgium.
[email protected]
http://mathro.fpms.ac.be/~glineur/

January 2001

Co-directed by
Jacques Teghem
Tamás Terlaky

Contents

Table of Contents
List of figures
Preface
Introduction

I INTERIOR-POINT METHODS

1 Interior-point methods for linear optimization
  1.1 Introduction
    1.1.1 Linear optimization
    1.1.2 The simplex method
    1.1.3 A first glimpse on interior-point methods
    1.1.4 A short historical account
  1.2 Building blocks
    1.2.1 Duality
    1.2.2 Optimality conditions
    1.2.3 Newton's method
    1.2.4 Barrier function
    1.2.5 The central path
    1.2.6 Link between central path and KKT equations
  1.3 Interior-point algorithms
    1.3.1 Path-following algorithms
    1.3.2 Affine-scaling algorithms
    1.3.3 Potential reduction algorithms
  1.4 Enhancements
    1.4.1 Infeasible algorithms
    1.4.2 Homogeneous self-dual embedding
    1.4.3 Theory versus implemented algorithms
    1.4.4 The Mehrotra predictor-corrector algorithm
  1.5 Implementation
    1.5.1 Linear algebra
    1.5.2 Preprocessing
    1.5.3 Starting point and stopping criteria
  1.6 Concluding remarks

2 Self-concordant functions
  2.1 Introduction
    2.1.1 Convex optimization
    2.1.2 Interior-point methods
    2.1.3 Organization of the chapter
  2.2 Self-concordancy
    2.2.1 Definitions
    2.2.2 Short-step method
    2.2.3 Optimal complexity
  2.3 Proving self-concordancy
    2.3.1 Barrier calculus
    2.3.2 Fixing a parameter
    2.3.3 Two useful lemmas
  2.4 Application to structured convex problems
    2.4.1 Extended entropy optimization
    2.4.2 Dual geometric optimization
    2.4.3 l_p-norm optimization
  2.5 Concluding remarks

II CONIC DUALITY

3 Conic optimization
  3.1 Conic problems
  3.2 Duality theory
  3.3 Classification of conic optimization problems
    3.3.1 Feasibility
    3.3.2 Attainability
    3.3.3 Optimal duality gap

4 l_p-norm optimization
  4.1 Introduction
    4.1.1 Problem definition
    4.1.2 Organization of the chapter
  4.2 Cones for l_p-norm optimization
    4.2.1 The primal cone
    4.2.2 The dual cone
  4.3 Duality for l_p-norm optimization
    4.3.1 Conic formulation
    4.3.2 Duality properties
    4.3.3 Examples
  4.4 Complexity
  4.5 Concluding remarks

5 Geometric optimization
  5.1 Introduction
  5.2 Cones for geometric optimization
    5.2.1 The geometric cone
    5.2.2 The dual geometric cone
  5.3 Duality for geometric optimization
    5.3.1 Conic formulation
    5.3.2 Duality theory
    5.3.3 Refined duality
    5.3.4 Summary and examples
  5.4 Concluding remarks
    5.4.1 Original formulation
    5.4.2 Conclusions

6 A different cone for geometric optimization
  6.1 Introduction
  6.2 The extended geometric cone
  6.3 The dual extended geometric cone
  6.4 A conic formulation
    6.4.1 Modelling geometric optimization
    6.4.2 Deriving the dual problem
  6.5 Concluding remarks

7 A general framework for separable convex optimization
  7.1 Introduction
  7.2 The separable cone
  7.3 The dual separable cone
  7.4 An explicit definition of Kf
  7.5 Back to geometric and l_p-norm optimization
  7.6 Separable convex optimization
  7.7 Concluding remarks

III APPROXIMATIONS

8 Approximating geometric optimization with l_p-norm optimization
  8.1 Introduction
  8.2 Approximating geometric optimization
    8.2.1 An approximation of the exponential function
    8.2.2 An approximation using l_p-norm optimization
  8.3 Deriving duality properties
    8.3.1 Duality for l_p-norm optimization
    8.3.2 A dual for the approximate problem
    8.3.3 Duality for geometric optimization
  8.4 Concluding remarks

9 Linear approximation of second-order cone optimization
  9.1 Introduction
  9.2 Approximating second-order cone optimization
    9.2.1 Principle
    9.2.2 Decomposition
    9.2.3 A first approximation of L2
    9.2.4 A better approximation of L2
    9.2.5 Reducing the approximation
    9.2.6 An approximation of Ln
    9.2.7 Optimizing the approximation
    9.2.8 An approximation of second-order cones optimization
    9.2.9 Accuracy of the approximation
  9.3 Computational experiments
    9.3.1 Implementation
    9.3.2 Truss-topology design
    9.3.3 Quadratic optimization
  9.4 Concluding remarks

IV CONCLUSIONS

Concluding remarks and future research directions

V APPENDICES

A An application to classification
  A.1 Introduction
  A.2 Pattern separation
  A.3 Maximizing the separation ratio
  A.4 Concluding remarks

B Source code

Bibliography
Summary
About the cover

List of Figures

2.1 Graphs of functions r_1 and r_2.
3.1 Epigraph of the positive branch of the hyperbola x_1 x_2 = 1.
4.1 The boundary surfaces of L(5) and L(2) (in the case n = 1).
4.2 The boundary surfaces of L(45) and L(5) (in the case n = 1).
5.1 The boundary surfaces of G2 and (G2)∗.
9.1 Approximating B_2(1) with a regular octagon.
9.2 The sets of points P_3, P_2, P_1 and P_0 when k = 3.
9.3 Constraint matrices for L_15 and its reduced variant.
9.4 Linear approximation of a parabola using L_k for k = 1, 2, 3, 4.
9.5 Size of the optimal approximation versus accuracy (left) and dimension (right).
A.1 A bidimensional separation problem.
A.2 A separating ellipsoid.
A.3 A simple separation problem.
A.4 A pair of ellipsoids with ρ equal to 3/2.
A.5 The optimal pair of separating ellipsoids.
A.6 The final separating ellipsoid.

Preface

This work is dedicated to my wife, my parents and my grandfather, for the love and support they gave me throughout the writing of this thesis.

First of all, I wish to thank my advisor Jacques Teghem, who understood early that the field of optimization would provide a stimulating and challenging area for my research. Both his guidance and support were crucial in the completion of this doctoral degree. He also provided me with very valuable feedback during the final writing of this thesis.

Many of the ideas presented in this thesis were originally developed during a research stay at the Delft University of Technology, which took place in the first half of 1999. I am very grateful to Professors Kees Roos and Tamás Terlaky for their kind hospitality. They welcomed me in their Operations Research department, which provided me with a very stimulating research environment to work in.

Professor Tamás Terlaky agreed to co-direct this thesis. I wish to express my deep gratitude to him for the numerous and fruitful discussions we had about my research.
Many other researchers contributed directly or indirectly to my current understanding of optimization, sharing with me on various occasions their knowledge and insight about this field. Let me mention Professors Martine Labbé, Michel Goemans, Van Hien Nguyen, Jean-Jacques Strodiot and Philippe Toint, who introduced me to some of the most interesting topics in optimization during my first year as a doctoral student, as well as Professor Yurii Nesterov, who served as an advisor on my thesis committee.

I also wish to express special thanks to the entire staff of the Mathematics and Operations Research department at the Faculté Polytechnique de Mons, for their constant kindness, availability and support.

I conducted this research as a research fellow supported by a grant from the F.N.R.S. (Belgian National Fund for Scientific Research), which also funded a trip to attend the International Mathematical Programming Symposium 2000 in Atlanta. My research stay at the Delft University of Technology was made possible with the aid of a travel grant awarded by the Communauté Française de Belgique, which also supported a trip to the INFORMS Spring 2000 conference in Salt Lake City.

Mons, December 2000.