Sven O. Krumke

Interior Point Methods
with Applications to Discrete Optimization

Draft: July 20, 2004
These course notes are based on my lecture »Interior Point Methods« at the University of Kaiserslautern.

I would be happy to receive feedback, in particular suggestions for improvement and notifications of typos and other errors.

Sven O. Krumke
krumke@mathematik.uni-kl.de
File: nlp-lecture.tex   Revision: 1.30   Date: 2004/07/20 16:06:47 GMT
Contents

1 Preliminaries
  1.1 Some History
  1.2 Newton's Method

2 Linear Programs
  2.1 Preliminaries
  2.2 The Central Path
  2.3 Newton's Method for the Primal-Dual System
  2.4 Orthogonal Projections
  2.5 Orthogonal Projections and the Newton Step
  2.6 Analysis of a Single Newton Step
  2.7 A Primal-Dual Short-Step Method
  2.8 From Near-Optimality to Optimality
  2.9 Initialization

3 Semidefinite Programs
  3.1 Basics and Notation
  3.2 Semidefinite Programs and Duality
  3.3 The Max-Cut Problem
    3.3.1 Formulating the Relaxation as a Semidefinite Program
    3.3.2 An Approximation Algorithm
  3.4 The Central Path
  3.5 A Primal-Dual Algorithm
    3.5.1 Choice of the Newton Direction

Bibliography
1 Preliminaries
1.1 Some History
For a long time, the Simplex Method was judged to be the most efficient method to solve Linear Programs. Although it requires exponential time in the worst case, this worst case seems to occur rarely "in practice". Khachian [Kha79] was the first to present a polynomial time algorithm for Linear Programming, and his result even made it to the front page of the New York Times. Although theoretically important, the ellipsoid method (which had been known for a long time before Khachian's result, but which nobody considered to be efficient) was a practical disappointment. The research interest in alternative techniques was sparked by Karmarkar's seminal paper [Kar84], which showed that using projective transformations one could obtain an algorithm with polynomial running time. Since then many "interior point" methods have been investigated, and the techniques have been shown to carry over to more general optimization problems such as quadratic or semidefinite problems.
In general, constrained optimization (finding the best value of an objective function subject to constraints) has undergone a sweeping change, often called the "interior point revolution", starting with Karmarkar's announcement of a genuinely fast polynomial-time algorithm for Linear Programming. Interior point methods have since provided a fascinating mixture of old and new ideas, with applications ranging from Linear Programming to approximation algorithms for NP-hard problems. One of the most important concepts in the development of polynomial time algorithms was the notion of "self-concordance" introduced by Nesterov and Nemirovski [NN94]. The book [NN94] contains a wealth of results, but in my humble opinion, the presentation of the material keeps the results from being accessible to a larger community.
These lecture notes are intended as an introduction to interior point methods. We start with Linear Programs in Chapter 2 before we discuss the currently very active field of Semidefinite Programming in Chapter 3. We also show how the methods can be used to obtain approximation algorithms for NP-hard optimization problems from the area of discrete optimization.
1.2 Newton’s Method
Newton's method lies at the heart of many interior point methods. It is a method for solving problems of the form

(1.1)  g(z) = 0,
where g: Rⁿ → Rⁿ is a (nonlinear) smooth function. We assume in the sequel that g(z̄) = 0 for some (unknown) vector z̄ ∈ Rⁿ, that is, there exists in fact a solution of (1.1).

Suppose that we are given an approximation z to z̄ (and g(z) ≠ 0, since otherwise we would be done). We wish to determine a step ∆z such that z⁺ := z + ∆z is a better approximation to z̄. To this end, we replace g by its linearization at z (cf. Taylor's Theorem, see e.g. [Rud76]) and require the linearization to vanish:

    g(z + ∆z) ≈ g(z) + Dg(z)∆z = 0.
If Dg(z)⁻¹ exists, then the Newton step ∆z at z is given by

    ∆z := −Dg(z)⁻¹ g(z),

which is nonzero if g(z) ≠ 0 and leads to the iterate

    z⁺ := z + ∆z = z − Dg(z)⁻¹ g(z).

The resulting algorithm, Newton's method, starts from an iterate z^(0) and then iteratively computes

(1.2)  z^(k+1) := z^(k) − Dg(z^(k))⁻¹ g(z^(k)).
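Iteration (1.2) can be sketched in a few lines of numpy. The function names and the two-dimensional test system below are illustrative, not from the notes; also, in practice one solves the linear system Dg(z)∆z = −g(z) rather than forming the inverse Dg(z)⁻¹ explicitly:

```python
import numpy as np

def newton(g, Dg, z0, tol=1e-12, max_iter=50):
    """Newton's method for g(z) = 0, following iteration (1.2)."""
    z = np.asarray(z0, dtype=float)
    for _ in range(max_iter):
        gz = g(z)
        if np.linalg.norm(gz) <= tol:
            break
        dz = np.linalg.solve(Dg(z), -gz)  # Newton step: solve Dg(z) dz = -g(z)
        z = z + dz
    return z

# Illustrative system: g(z) = (z1^2 + z2^2 - 4, z1 - z2),
# with root z̄ = (sqrt(2), sqrt(2)).
g = lambda z: np.array([z[0]**2 + z[1]**2 - 4.0, z[0] - z[1]])
Dg = lambda z: np.array([[2.0 * z[0], 2.0 * z[1]], [1.0, -1.0]])
z_star = newton(g, Dg, [1.0, 2.0])
```

The Jacobian Dg is nonsingular at the root here, which is exactly the situation covered by the convergence theorem below.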
Before we consider the convergence properties of Newton's method, let us briefly recall Taylor's Theorem:
Theorem 1.1 (Taylor's Theorem in Rⁿ) Let D ⊆ Rⁿ be open and g: D → Rᵏ with g ∈ C²(D). Let x₀ ∈ D and δ > 0 be such that B̄_δ(x₀) ⊆ D. Then, there exists M = M_δ > 0 such that for all h with ‖h‖∞ ≤ δ we have

    g(x₀ + h) = g(x₀) + Dg(x₀)h + r(h),   with ‖r(h)‖∞ ≤ M ‖h‖∞².

Proof: See e.g. [Rud76, JS04]. □
In a nutshell, a smooth function behaves locally like a quadratic function.
Theorem 1.2 Let g: Rⁿ → Rⁿ be a function with g ∈ C³(Rⁿ) and g(z̄) = 0 for some unknown point z̄ ∈ Rⁿ. Let z^(0) ∈ Rⁿ be some starting point, and (z^(k))_k be the sequence generated by Newton's Method (see (1.2)).

Suppose that det(Dg(z̄)) ≠ 0. Then there exist δ > 0 and c = c_δ > 0 such that for Newton's method with any starting point z^(0) satisfying ‖z^(0) − z̄‖₂ ≤ δ the inequality

    ‖z^(k+1) − z̄‖₂ ≤ c ‖z^(k) − z̄‖₂²

holds for all large k, that is, the sequence (z^(k))_k converges locally quadratically to z̄.
Proof: Since g ∈ C¹(Rⁿ) it follows that det Dg(z) depends continuously on z. In particular, det Dg(z) ≠ 0 for all z ∈ B_ε(z̄) = {z : ‖z − z̄‖ < ε} for some small ε > 0. Clearly g(z̄) = 0 if and only if z̄ is a fixed point of the iteration function

    Φ(z) := z − Dg(z)⁻¹ g(z)

and det Dg(z̄) ≠ 0. Using the fact that the inverse of a matrix depends analytically on the matrix entries and that det Dg(z) ≠ 0 for all z ∈ B_ε(z̄), it follows that Φ is twice continuously differentiable in B_ε(z̄). We have

    DΦ(z) = I − Dg(z)⁻¹ Dg(z) − D(Dg(z)⁻¹) g(z)

and hence DΦ(z̄) = 0. Taylor's Theorem states that there exists M > 0 such that for all z ∈ B̄_{ε/2}(z̄)

    Φ(z) = Φ(z̄) + DΦ(z̄)(z − z̄) + r(z − z̄),

where ‖r(z − z̄)‖ ≤ M ‖z − z̄‖². Using the fact that DΦ(z̄) = 0 we see that

    ‖Φ(z) − Φ(z̄)‖ = ‖r(z − z̄)‖ ≤ M ‖z − z̄‖².

With z := z^(k), z^(k+1) = Φ(z^(k)) and z̄ = Φ(z̄) the claim follows if we set δ := min{ε/2, 1/(2M)}. □
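The quadratic error bound of Theorem 1.2 is easy to observe numerically. The following scalar example (my own, for illustration) applies iteration (1.2) to g(z) = z² − 2 with root z̄ = √2 and records the errors ‖z^(k) − z̄‖:

```python
import numpy as np

# Illustrative check of local quadratic convergence (Theorem 1.2)
# on the scalar equation g(z) = z^2 - 2 with root sqrt(2).
g = lambda z: z**2 - 2.0
Dg = lambda z: 2.0 * z

z = 1.5  # starting point close to the root
errors = []
for _ in range(5):
    errors.append(abs(z - np.sqrt(2.0)))
    z = z - g(z) / Dg(z)  # Newton iteration (1.2) in dimension one
```

Each error is roughly the square of the previous one: the ratios errors[k+1] / errors[k]² stay bounded by a constant, matching the role of c in the theorem.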
2 Linear Programs
2.1 Preliminaries
In this chapter we consider Linear Programs in standard form:

(2.1a)  (LP)  min cᵀx
(2.1b)        Ax = b
(2.1c)        x ≥ 0,

where A is an m × n-matrix, b ∈ Rᵐ and c ∈ Rⁿ. The Linear Programming dual of (2.1) is given by

(LD)  max bᵀy
      Aᵀy ≤ c,

which, after introduction of slack variables s ∈ Rⁿ₊, can be written equivalently as

(2.2a)  (LD)  max bᵀy
(2.2b)        Aᵀy + s = c
(2.2c)        s ≥ 0.
Any vector x ∈ Rⁿ which is feasible for (2.1) will be called a primal feasible solution. Similarly, (y, s) ∈ Rᵐ × Rⁿ which is feasible for (2.2) is termed a dual feasible solution. If x is primal feasible and (y, s) is dual feasible, then we have

(2.3)  0 ≤ xᵀs = xᵀ(c − Aᵀy) = cᵀx − (Ax)ᵀy = cᵀx − bᵀy.

Thus, cᵀx ≥ bᵀy. The famous Duality Theorem of Linear Programming states that, in fact, "usually" the optimal solution values of (2.1) and (2.2) coincide:
Theorem 2.1 (Duality Theorem of Linear Programming) If both the primal problem (2.1) and the dual problem (2.2) have feasible solutions, then both problems also have optimal solutions x* and (y*, s*), respectively. Moreover, in this case we have cᵀx* = bᵀy*, that is, the optimal values of both problems coincide.

Proof: See e.g. [Chv83]. □
From the Duality Theorem and (2.3) we can easily derive the so-called complementary slackness optimality conditions for the pair (2.1) and (2.2) of dual Linear Programs: If x is primal feasible and (y, s) is dual feasible, then x is optimal for (2.1) and (y, s) is optimal for (2.2) if and only if

(2.4)  xᵢ sᵢ = 0,  for i = 1, ..., n.

This follows, since in (2.3) we have xᵀs = 0 for x ≥ 0 and s ≥ 0 if and only if (2.4) holds.
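Weak duality (2.3), the Duality Theorem, and condition (2.4) can all be checked on a tiny hand-solved example. The data below are mine, not from the notes: min x₁ + 2x₂ subject to x₁ + x₂ = 1, x ≥ 0, with primal optimum x* = (1, 0), dual optimum y* = 1 and slacks s* = (0, 1):

```python
import numpy as np

# Hand-picked standard-form data: min c^T x  s.t.  Ax = b, x >= 0
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])

x_star = np.array([1.0, 0.0])    # primal optimal solution
y_star = np.array([1.0])         # dual optimal solution
s_star = c - A.T @ y_star        # dual slacks, here s* = (0, 1)

gap = c @ x_star - b @ y_star    # duality gap c^T x - b^T y, cf. (2.3)
comp = x_star * s_star           # componentwise products x_i s_i, cf. (2.4)
```

Here the duality gap is zero, as Theorem 2.1 guarantees, and every product xᵢsᵢ vanishes, as (2.4) requires at an optimal pair.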
As a shorthand, by

    P = {x ∈ Rⁿ : Ax = b, x ≥ 0}
    D = {(y, s) ∈ Rᵐ × Rⁿ : Aᵀy + s = c, s ≥ 0}

we denote the sets of feasible solutions of the LP (2.1) and its dual (2.2). We also use the following shorthands for the sets of strictly feasible solutions for (2.1) and (2.2):

    P⁺ = {x ∈ Rⁿ : Ax = b, x > 0}
    D⁺ = {(y, s) ∈ Rᵐ × Rⁿ : Aᵀy + s = c, s > 0}
Throughout this chapter we make the following assumptions:

Assumption 2.2 (i) The matrix A has full row rank, that is, rank A = m.

(ii) Both problems (2.1) and (2.2) have strictly feasible solutions, that is, P⁺ ≠ ∅ and D⁺ ≠ ∅.

Assumption 2.2 (i) can always be enforced by Gaussian elimination. We will show later that Assumption 2.2 (ii) does not impose a loss of generality. Recall that by the Duality Theorem, Assumption 2.2 (ii) implies that both problems (2.1) and (2.2) have optimal solutions.
In this chapter an uppercase letter denotes the diagonal matrix corresponding to the respective vector, e.g., if x ∈ Rⁿ, then

    X = diag(x) =
        [ x₁             ]
        [     x₂         ]
        [          ⋱     ]
        [             xₙ ]

Moreover, by e we denote the vector consisting of all ones of appropriate dimension, that is,

    e = (1, ..., 1)ᵀ.
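An illustrative aside in numpy terms (the vectors below are arbitrary): X = diag(x) satisfies Xe = x, and for any s the product Xs is just the componentwise product of x and s, which is how expressions like Xs in the optimality conditions below should be read:

```python
import numpy as np

x = np.array([3.0, 1.0, 2.0])
X = np.diag(x)           # the diagonal matrix corresponding to x
e = np.ones(len(x))      # the all-ones vector e = (1, ..., 1)^T

# X e recovers x; X s is the componentwise product x * s.
s = np.array([0.5, 2.0, 0.0])
```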
2.2 The Central Path

By the Duality Theorem of Linear Programming (Theorem 2.1), x ∈ Rⁿ and (y, s) ∈ Rᵐ × Rⁿ are optimal for (2.1) and (2.2), respectively, if and only if they satisfy

(2.5a)  Ax = b
(2.5b)  Aᵀy + s = c
(2.5c)  Xs = 0
(2.5d)  x ≥ 0, s ≥ 0.
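System (2.5) is the object to which Newton's method will later be applied. As a sketch (the helper name kkt_residuals and the tiny example are mine, not from the notes), the residuals of (2.5a)-(2.5c) can be evaluated directly; interior point methods drive all three to zero while keeping x and s positive:

```python
import numpy as np

def kkt_residuals(A, b, c, x, y, s):
    """Residuals of the optimality system (2.5); at an optimal
    primal-dual pair all three vanish (together with x, s >= 0)."""
    r_primal = A @ x - b        # (2.5a): Ax = b
    r_dual = A.T @ y + s - c    # (2.5b): A^T y + s = c
    r_comp = x * s              # (2.5c): Xs = 0, componentwise
    return r_primal, r_dual, r_comp

# Check on a tiny hand-solved LP: min x1 + 2 x2, x1 + x2 = 1, x >= 0
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])
rp, rd, rc = kkt_residuals(A, b, c,
                           np.array([1.0, 0.0]),   # x*
                           np.array([1.0]),        # y*
                           np.array([0.0, 1.0]))   # s*
```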