CONVEXITY AND OPTIMIZATION IN $\mathbb{R}^n$

LEONARD D. BERKOVITZ
Purdue University

A Wiley-Interscience Publication
JOHN WILEY & SONS, INC.

For ordering and customer service, call 1-800-CALL-WILEY.

Library of Congress Cataloging-in-Publication Data:
Berkovitz, Leonard David, 1924-
Convexity and optimization in R[superscript n] / Leonard D. Berkovitz.
p. cm. -- (Pure and applied mathematics)
"A Wiley-Interscience publication."
Includes bibliographical references and index.
ISBN 0-471-35281-0 (cloth : alk. paper)
1. Convex sets. 2. Mathematical optimization. I. Title. II. Pure and applied mathematics (John Wiley & Sons : Unnumbered)
QA640 .B46 2001
516'.08--dc21 2001045391

Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1

To my wife, Anna

CONTENTS

Preface

I Topics in Real Analysis
   1. Introduction
   2. Vectors in $\mathbb{R}^n$
   3. Algebra of Sets
   4. Metric Topology of $\mathbb{R}^n$
   5. Limits and Continuity
   6. Basic Property of Real Numbers
   7. Compactness
   8. Equivalent Norms and Cartesian Products
   9. Fundamental Existence Theorem
   10. Linear Transformations
   11. Differentiation in $\mathbb{R}^n$

II Convex Sets in $\mathbb{R}^n$
   1. Lines and Hyperplanes in $\mathbb{R}^n$
   2. Properties of Convex Sets
   3. Separation Theorems
   4. Supporting Hyperplanes: Extreme Points
   5. Systems of Linear Inequalities: Theorems of the Alternative
   6. Affine Geometry
   7. More on Separation and Support

III Convex Functions
   1. Definition and Elementary Properties
   2. Subgradients
   3. Differentiable Convex Functions
   4. Alternative Theorems for Convex Functions
   5. Application to Game Theory

IV Optimization Problems
   1. Introduction
   2. Differentiable Unconstrained Problems
   3. Optimization of Convex Functions
   4. Linear Programming Problems
   5. First-Order Conditions for Differentiable Nonlinear Programming Problems
   6. Second-Order Conditions

V Convex Programming and Duality
   1. Problem Statement
   2. Necessary Conditions and Sufficient Conditions
   3. Perturbation Theory
   4. Lagrangian Duality
   5. Geometric Interpretation
   6. Quadratic Programming
   7. Duality in Linear Programming

VI Simplex Method
   1. Introduction
   2. Extreme Points of Feasible Set
   3. Preliminaries to Simplex Method
   4. Phase II of Simplex Method
   5. Termination and Cycling
   6. Phase I of Simplex Method
   7. Revised Simplex Method

Bibliography
Index

PREFACE

This book presents the mathematics of finite-dimensional optimization, featuring those aspects of convexity that are useful in this context. It provides a basis for the further study of convexity, of more general optimization problems, and of numerical algorithms for the solution of finite-dimensional optimization problems.

The intended audience consists of beginning graduate students in engineering, economics, operations research, and mathematics and strong undergraduates in mathematics. This was the audience in a one-semester course at Purdue, MA 521, from which this book evolved.

Ideally, the prerequisites for reading this book are good introductory courses in real analysis and linear algebra. In teaching MA 521, I found that while the mathematics students had the real analysis prerequisites, many of the other students who took the course because of their interest in optimization did not have this prerequisite. Chapter I is for those students and readers who do not have the real analysis prerequisite; in it I present those concepts and results from real analysis that are needed. Except for the Weierstrass theorem on the existence of a minimum, the "heavy" or "deep" theorems are stated without proof. Students without the real variables prerequisite found the material difficult at first, but most managed to assimilate it at a satisfactory level. The advice to readers for whom this is the first encounter with the material in Chapter I is to make a serious effort to master it and to return to it as it is used in the sequel.
To address as wide an audience as possible, I have not always presented the most general result or argument. Thus, in Chapter II I chose the "closest point" approach to separation theorems, rather than more generally valid arguments, because I believe it to be more intuitive and straightforward for the intended audience. Readers who wish to get the best possible separation theorem in finite dimensions should read Sections 6 and 7 of Chapter II. In proving the Fritz John Theorem, I used a penalty function argument due to McShane rather than more technical arguments involving linearizations. I limited the discussion of duality to Lagrangian duality and did not consider Fenchel duality, since the latter would require the development of more mathematical machinery.

The numbering system and reference system for theorems, lemmas, remarks, and corollaries is the following. Within a given chapter, theorems, lemmas, and remarks are numbered consecutively in each section, preceded by the section number. Thus, the first theorem of Section 1 is Theorem 1.1, the second, Theorem 1.2, and so on. The same applies to lemmas and remarks. Corollaries are numbered consecutively within each section without a reference to the section number. Reference to a theorem in the same chapter is given by the theorem number. Reference to a theorem in a chapter different from the current one is given by the theorem number preceded by the chapter number in Roman numerals. Thus, a reference in Chapter IV to Theorem 4.1 in Chapter II would be Theorem II.4.1. References to lemmas and remarks are similar. References to corollaries within the same section are given by the number of the corollary. References to corollaries in a different section of the same chapter are given by prefixing the section number to the corollary number; references in a different chapter are given by prefixing the chapter number in Roman numerals to the preceding.

I thank Rita Saerens and John Gregory for reading the course notes version of this book and for their corrections and suggestions for improvement. I thank Terry Combs for preparing the figures.
I also thank Betty Gick for typing seemingly innumerable versions and revisions of the notes for MA 521. Her skill and cooperation contributed greatly to the success of this project.

Leonard D. Berkovitz
West Lafayette, Indiana

Convexity and Optimization in $\mathbb{R}^n$. Leonard D. Berkovitz
Copyright © 2002 John Wiley & Sons, Inc.
ISBN: 0-471-35281-0

I TOPICS IN REAL ANALYSIS

1. INTRODUCTION

The serious study of convexity and optimization problems in $\mathbb{R}^n$ requires some background in real analysis and in linear algebra. In teaching a course based on notes from which this text evolved, the author and his colleagues assumed that the students had an undergraduate course in linear algebra but did not necessarily have a background in real analysis. The purpose of this chapter is to provide the reader with most of the necessary background in analysis. Not all statements will be proved. The reader, however, is expected to understand the definitions and the theorems and is expected to follow the proofs of statements whenever the proofs are given. The bad news is that many readers will find this chapter to be the most difficult one in the text. The good news is that careful study of this material will provide background for many other courses and that subsequent chapters should be easier. If necessary, the reader should return to this chapter when encountering this material later on.

2. VECTORS IN $\mathbb{R}^n$

By euclidean n-space, or $\mathbb{R}^n$, we mean the set of all n-tuples $x = (x_1, \ldots, x_n)$, where the $x_i$, $i = 1, \ldots, n$, are real numbers. Thus, $\mathbb{R}^n$ is a generalization of the familiar two- and three-dimensional spaces $\mathbb{R}^2$ and $\mathbb{R}^3$. The elements $x$ of $\mathbb{R}^n$ are called vectors or points. We will often identify the vector $x = (x_1, \ldots, x_n)$ with the $n \times 1$ matrix

\[
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\]

and abuse the use of the equal sign to write

\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
\]

In this case we shall call $x$ a column vector. We shall also identify $x$ with the $1 \times n$ matrix $(x_1, \ldots, x_n)$ and write $x = (x_1, \ldots, x_n)$. In this case we shall call $x$ a row vector. It will usually be clear from the context whether we consider $x$ to be a row vector or a column vector. When there is danger of confusion, we will identify $x$ with the column vector and use the transpose symbol, which is a superscript $t$, to denote the row vector. Thus $x^t = (x_1, \ldots, x_n)$.

We define two operations, vector addition and multiplication by a scalar. If $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ are two vectors, we define their sum $x + y$ to be

\[
x + y = (x_1 + y_1, \ldots, x_n + y_n).
\]

For any scalar, or real number, $\alpha$ we define $\alpha x$ to be

\[
\alpha x = (\alpha x_1, \ldots, \alpha x_n).
\]

We assume that the reader is familiar with the properties of these operations and knows that under these operations $\mathbb{R}^n$ is a vector space over the reals.

Another important operation is the inner product, or dot product, of two vectors, denoted by $\langle x, y \rangle$ or $x^t y$ and defined by

\[
\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i.
\]

Again, we assume that the reader is familiar with the properties of the inner product. We use the inner product to define the norm $\| \cdot \|$ or length of a vector $x$ as follows:

\[
\|x\| = \langle x, x \rangle^{1/2} = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.
\]

This norm is called the euclidean norm. In $\mathbb{R}^2$ and $\mathbb{R}^3$ the euclidean norm reduces to the familiar length.
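The definitions of this section can be tried out numerically. The following is a minimal sketch, assuming vectors are stored as Python tuples of floats; the function names add, scale, inner, and norm are illustrative choices and do not come from the text.

```python
import math


def add(x, y):
    """Vector sum: (x_1 + y_1, ..., x_n + y_n)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return tuple(xi + yi for xi, yi in zip(x, y))


def scale(alpha, x):
    """Multiplication by a scalar: (alpha * x_1, ..., alpha * x_n)."""
    return tuple(alpha * xi for xi in x)


def inner(x, y):
    """Inner product: the sum of x_i * y_i for i = 1, ..., n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm: the square root of the inner product of x with itself."""
    return math.sqrt(inner(x, x))


x = (3.0, 4.0)
y = (1.0, -2.0)

print(add(x, y))      # (4.0, 2.0)
print(scale(2.0, x))  # (6.0, 8.0)
print(inner(x, y))    # 3*1 + 4*(-2) = -5.0
print(norm(x))        # sqrt(9 + 16) = 5.0
```

The last line illustrates the remark about the familiar length: in the plane, the vector (3, 4) is the hypotenuse of a right triangle with legs 3 and 4, so its euclidean norm is 5.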
It is straightforward to show that the norm has the following properties:

\[
\|x\| \geq 0 \quad \text{for all } x \text{ in } \mathbb{R}^n, \tag{1}
\]

\[
\|x\| = 0 \quad \text{if and only if } x \text{ is the zero vector } 0 \text{ in } \mathbb{R}^n, \tag{2}
\]

\[
\|\alpha x\| = |\alpha| \, \|x\| \quad \text{for all real numbers } \alpha \text{ and vectors } x \text{ in } \mathbb{R}^n. \tag{3}
\]

The norm has two additional properties, which we will prove: For all vectors $x$ and $y$ in $\mathbb{R}^n$

\[
|\langle x, y \rangle| \leq \|x\| \, \|y\| \tag{4}
\]

