CONVEXITY AND OPTIMIZATION IN R^n
LEONARD D. BERKOVITZ
Purdue University
A Wiley-Interscience Publication
JOHN WILEY & SONS, INC.
This book is printed on acid-free paper.
Copyright © 2002 by John Wiley & Sons, Inc., New York, NY. All rights reserved.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise,
except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without
either the prior written permission of the Publisher, or authorization through payment of the
appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA
01923, (978) 750-8400, fax (978) 750-4744. Requests to the Publisher for permission should be
addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York,
NY 10158-0012, (212) 850-6011, fax (212) 850-6008, E-Mail: PERMREQ@WILEY.COM.
For ordering and customer service, call 1-800-CALL-WILEY.
Library of Congress Cataloging-in-Publication Data:
Berkovitz, Leonard David, 1924-
Convexity and optimization in R^n / Leonard D. Berkovitz.
p. cm. -- (Pure and applied mathematics)
"A Wiley-Interscience publication."
Includes bibliographical references and index.
ISBN 0-471-35281-0 (cloth : alk. paper)
1. Convex sets. 2. Mathematical optimization. I. Title. II. Pure and applied
mathematics (John Wiley & Sons : Unnumbered)
QA640 .B46 2001
516'.08--dc21
2001045391
Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1
To my wife, Anna
CONTENTS
Preface
I Topics in Real Analysis
  1. Introduction
  2. Vectors in R^n
  3. Algebra of Sets
  4. Metric Topology of R^n
  5. Limits and Continuity
  6. Basic Property of Real Numbers
  7. Compactness
  8. Equivalent Norms and Cartesian Products
  9. Fundamental Existence Theorem
  10. Linear Transformations
  11. Differentiation in R^n
II Convex Sets in R^n
  1. Lines and Hyperplanes in R^n
  2. Properties of Convex Sets
  3. Separation Theorems
  4. Supporting Hyperplanes: Extreme Points
  5. Systems of Linear Inequalities: Theorems of the Alternative
  6. Affine Geometry
  7. More on Separation and Support
III Convex Functions
  1. Definition and Elementary Properties
  2. Subgradients
  3. Differentiable Convex Functions
  4. Alternative Theorems for Convex Functions
  5. Application to Game Theory
IV Optimization Problems
  1. Introduction
  2. Differentiable Unconstrained Problems
  3. Optimization of Convex Functions
  4. Linear Programming Problems
  5. First-Order Conditions for Differentiable Nonlinear Programming Problems
  6. Second-Order Conditions
V Convex Programming and Duality
  1. Problem Statement
  2. Necessary Conditions and Sufficient Conditions
  3. Perturbation Theory
  4. Lagrangian Duality
  5. Geometric Interpretation
  6. Quadratic Programming
  7. Duality in Linear Programming
VI Simplex Method
  1. Introduction
  2. Extreme Points of Feasible Set
  3. Preliminaries to Simplex Method
  4. Phase II of Simplex Method
  5. Termination and Cycling
  6. Phase I of Simplex Method
  7. Revised Simplex Method
Bibliography
Index
PREFACE
This book presents the mathematics of finite-dimensional optimization, featuring
those aspects of convexity that are useful in this context. It provides a basis
for the further study of convexity, of more general optimization problems, and
of numerical algorithms for the solution of finite-dimensional optimization
problems. The intended audience consists of beginning graduate students in
engineering, economics, operations research, and mathematics and strong
undergraduates in mathematics. This was the audience in a one-semester
course at Purdue, MA 521, from which this book evolved.
Ideally, the prerequisites for reading this book are good introductory
courses in real analysis and linear algebra. In teaching MA 521, I found that
while the mathematics students had the real analysis prerequisites, many of the
other students who took the course because of their interest in optimization
did not have this prerequisite. Chapter I is for those students and readers who
do not have the real analysis prerequisite; in it I present those concepts and
results from real analysis that are needed. Except for the Weierstrass theorem
on the existence of a minimum, the "heavy" or "deep" theorems are stated
without proof. Students without the real variables prerequisite found the
material difficult at first, but most managed to assimilate it at a satisfactory
level. The advice to readers for whom this is the first encounter with the
material in Chapter I is to make a serious effort to master it and to return to
it as it is used in the sequel.
To address as wide an audience as possible, I have not always presented the
most general result or argument. Thus, in Chapter II I chose the "closest point"
approach to separation theorems, rather than more generally valid arguments,
because I believe it to be more intuitive and straightforward for the intended
audience. Readers who wish to get the best possible separation theorem in
finite dimensions should read Sections 6 and 7 of Chapter II. In proving the
Fritz John Theorem, I used a penalty function argument due to McShane
rather than more technical arguments involving linearizations. I limited the
discussion of duality to Lagrangian duality and did not consider Fenchel
duality, since the latter would require the development of more mathematical
machinery.
The numbering and reference system for theorems, lemmas, remarks,
and corollaries is as follows. Within a given chapter, theorems, lemmas, and
remarks are numbered consecutively in each section, preceded by the section
number. Thus, the first theorem of Section 1 is Theorem 1.1, the second,
Theorem 1.2, and so on. The same applies to lemmas and remarks. Corollaries
are numbered consecutively within each section without a reference to the
section number. Reference to a theorem in the same chapter is given by the
theorem number. Reference to a theorem in a chapter different from the current
one is given by the theorem number preceded by the chapter number in Roman
numerals. Thus, a reference in Chapter IV to Theorem 4.1 in Chapter II would
be Theorem II.4.1. References to lemmas and remarks are similar. References
to corollaries within the same section are given by the number of the corollary.
References to corollaries in a different section of the same chapter are given by
prefixing the section number to the corollary number; references in a different
chapter are given by prefixing the chapter number in Roman numerals to the
preceding.
I thank Rita Saerens and John Gregory for reading the course notes version
of this book and for their corrections and suggestions for improvement. I thank
Terry Combs for preparing the figures. I also thank Betty Gick for typing
seemingly innumerable versions and revisions of the notes for MA 521. Her
skill and cooperation contributed greatly to the success of this project.
LEONARD D. BERKOVITZ
West Lafayette, Indiana
I
TOPICS IN REAL ANALYSIS
1. INTRODUCTION
The serious study of convexity and optimization problems in $\mathbb{R}^n$ requires some
background in real analysis and in linear algebra. In teaching a course based
on notes from which this text evolved, the author and his colleagues assumed
that the students had an undergraduate course in linear algebra but did not
necessarily have a background in real analysis. The purpose of this chapter is
to provide the reader with most of the necessary background in analysis. Not
all statements will be proved. The reader, however, is expected to understand
the definitions and the theorems and is expected to follow the proofs of
statements whenever the proofs are given. The bad news is that many readers
will find this chapter to be the most difficult one in the text. The good news is
that careful study of this material will provide background for many other
courses and that subsequent chapters should be easier. If necessary, the reader
should return to this chapter when encountering this material later on.
2. VECTORS IN R^n
By euclidean $n$-space, or $\mathbb{R}^n$, we mean the set of all $n$-tuples $x = (x_1, \dots, x_n)$,
where the $x_i$, $i = 1, \dots, n$, are real numbers. Thus, $\mathbb{R}^n$ is a generalization of the
familiar two- and three-dimensional spaces $\mathbb{R}^2$ and $\mathbb{R}^3$. The elements $x$ of $\mathbb{R}^n$
are called vectors or points. We will often identify the vector $x = (x_1, \dots, x_n)$
with the $n \times 1$ matrix
$$\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$
and abuse the use of the equal sign to write
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
In this case we shall call $x$ a column vector. We shall also identify $x$ with the
$1 \times n$ matrix $(x_1, \dots, x_n)$ and write $x = (x_1, \dots, x_n)$. In this case we shall call $x$
a row vector. It will usually be clear from the context whether we consider $x$
to be a row vector or a column vector. When there is danger of confusion, we
will identify $x$ with the column vector and use the transpose symbol, which is
a superscript $t$, to denote the row vector. Thus $x^t = (x_1, \dots, x_n)$.
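The identification of a vector with a column or a row matrix can be illustrated by a minimal Python sketch. The helper names `as_column`, `as_row`, and `transpose` are hypothetical and are introduced only for this illustration; matrices are represented, by assumption, as lists of rows.

```python
def as_column(x):
    """Return the n x 1 matrix (a list of n one-entry rows) identified with x."""
    return [[xi] for xi in x]


def as_row(x):
    """Return the 1 x n matrix (a list holding one row) identified with x."""
    return [list(x)]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


x = [1, 2, 3]
print(as_column(x))             # [[1], [2], [3]]
print(transpose(as_column(x)))  # [[1, 2, 3]]
```

In this representation, transposing the column form of x yields its row form, which mirrors the convention that the superscript t turns the column vector into the row vector.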
We define two operations, vector addition and multiplication by a scalar. If
$x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ are two vectors, we define their sum $x + y$
to be
$$x + y = (x_1 + y_1, \dots, x_n + y_n).$$
For any scalar, or real number, $\alpha$ we define $\alpha x$ to be
$$\alpha x = (\alpha x_1, \dots, \alpha x_n).$$
We assume that the reader is familiar with the properties of these operations
and knows that under these operations $\mathbb{R}^n$ is a vector space over the reals.
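The two componentwise operations just defined can be sketched in a few lines of Python. This is only an illustration under the assumption that a vector is stored as a list of floats; the function names `vec_add` and `scalar_mul` are hypothetical.

```python
def vec_add(x, y):
    """Componentwise sum of two vectors of the same dimension."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return [xi + yi for xi, yi in zip(x, y)]


def scalar_mul(alpha, x):
    """Multiply every component of x by the scalar alpha."""
    return [alpha * xi for xi in x]


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(vec_add(x, y))       # [5.0, 7.0, 9.0]
print(scalar_mul(2.0, x))  # [2.0, 4.0, 6.0]
```

The dimension check reflects the fact that the sum is defined only for two vectors in the same space.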
Another important operation is the inner product, or dot product, of two
vectors, denoted by $\langle x, y \rangle$ or $x^t y$ and defined by
$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i.$$
Again, we assume that the reader is familiar with the properties of the inner
product. We use the inner product to define the norm $\| \cdot \|$ or length of a vector
$x$ as follows:
$$\|x\| = \langle x, x \rangle^{1/2} = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.$$
This norm is called the euclidean norm. In $\mathbb{R}^2$ and $\mathbb{R}^3$ the euclidean norm
reduces to the familiar length. It is straightforward to show that the norm has
the following properties:
$$\|x\| \ge 0 \quad \text{for all } x \text{ in } \mathbb{R}^n, \tag{1}$$
$$\|x\| = 0 \quad \text{if and only if } x \text{ is the zero vector } 0 \text{ in } \mathbb{R}^n, \tag{2}$$
$$\|\alpha x\| = |\alpha| \, \|x\| \quad \text{for all real numbers } \alpha \text{ and vectors } x \text{ in } \mathbb{R}^n. \tag{3}$$
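As a numerical illustration of the inner product, the euclidean norm, and properties (1) to (3), consider the following Python sketch. It assumes vectors are stored as lists of floats; the function names `inner` and `norm` are hypothetical, and the comparison for property (3) uses a floating-point tolerance rather than exact equality.

```python
import math


def inner(x, y):
    """Inner product: the sum of the products of corresponding components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm: the square root of the inner product of x with itself."""
    return math.sqrt(inner(x, x))


x = [3.0, 4.0]
alpha = -2.0
scaled = [alpha * xi for xi in x]

print(norm(x))                                           # 5.0
print(norm(x) >= 0.0)                                    # property (1): True
print(norm([0.0, 0.0]) == 0.0)                           # property (2): True
print(math.isclose(norm(scaled), abs(alpha) * norm(x)))  # property (3): True
```

Such a check on particular vectors does not prove the properties; it only shows what they assert in a concrete case.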
The norm has two additional properties, which we will prove:
For all vectors $x$ and $y$ in $\mathbb{R}^n$
$$|\langle x, y \rangle| \le \|x\| \, \|y\| \tag{4}$$