Lecture Notes in
Mathematics
A collection of informal reports and seminars
Edited by A. Dold, Heidelberg and B. Eckmann, Zürich
257
Richard B. Holmes
Purdue University, Lafayette, IN/USA
A Course on
Optimization and
Best Approximation
Springer-Verlag
Berlin • Heidelberg • New York 1972
AMS Subject Classifications (1970): 41-02, 41A50, 41A65, 46B99, 46N05, 49-02, 49B30, 90C25
ISBN 3-540-05764-1 Springer-Verlag Berlin • Heidelberg • New York
ISBN 0-387-05764-1 Springer-Verlag New York • Heidelberg • Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned,
specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine
or similar means, and storage in data banks.
Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher,
the amount of the fee to be determined by agreement with the publisher.
© by Springer-Verlag Berlin • Heidelberg 1972. Library of Congress Catalog Card Number 70-189753. Printed in Germany.
Offsetdruck: Julius Beltz, Hemsbach/Bergstr.
PREFACE
The course for which these notes were originally prepared was a
one-semester graduate level course at Purdue University, dealing
with optimization in general and best approximation in particular.
The prerequisites were modest: a semester's worth of functional
analysis together with the usual background required for such a
course. A few prerequisite results of special importance have been
gathered together for ease of reference in Part I.
My general aim was to present an interesting field of application
of functional analysis. Although the tenor of the course is
consequently rather theoretical, I made some effort to include a
few fairly concrete examples, and to bring under consideration
problems of genuine practical interest. Examples of such problems
are convex programs (§'s 11-13), calculus of variations (§17),
minimum effort control (§21), quadrature formulas (§24), construction
of "good" approximations to functions (§'s 26 and 29), optimal
estimation from inadequate data (§33), solution of various ill-posed
linear systems (§'s 34-35). Indeed, the bulk of the notes is devoted
to a presentation of the theoretical background needed for the study
of such problems.
No attempt has been made to provide encyclopedic coverage of
the various topics. Rather I tried only to show some highlights,
techniques, and examples in each of the several areas studied.
Should a reader be stimulated to pursue a particular topic further,
he will hopefully find an adequate sample of the pertinent literature
included in the bibliographies. (Note that in addition to the main
bibliography between Parts IV and V, each section in Part V has its
own special set of references appended.)
The first three parts of these notes constitute a slightly
fleshed-out arrangement of the material actually covered in the Purdue
course. That course also involved the solution of numerous problems;
about 50 of those problems have been included here and Part IV
contains hints and/or complete solutions to most of them. Thus this
portion of the notes is reasonably self-contained, modulo the
indicated prerequisites (minor exceptions to this assertion occur on
pages 28, 81 and 89). Part V is a bit more loosely written; in
particular, it contains a few references without proof to rather
deep results. I feel that all the topics in Part V might have
legitimately been included in the course had time permitted. The
order of §'s 32 and 33 is somewhat arbitrary and could have been
reversed. §'s 34 and 35 provide some applications of metric
projections by illustrating their natural occurrence in attempts to
handle ill-posed linear equations.
It is my hope that the present notes can serve as the basis for
other courses besides the original; for example, a two-quarter
course covering essentially everything, a one-quarter course on best
approximation covering Part III, §'s 31, 32, and perhaps 19 and 35,
or a one-quarter course on convexity and optimization covering
Part II, §33 (note that 33b) contains a proof of Valadier's formula
for the subdifferential of a supremum of convex functions), and
perhaps some of the early material in Part III.
As far as format goes, sections are divided into sub-sections; each
sub-section contains at most one theorem, at most one definition,
etc. (the sole exception to this being 33e)). A reference to (sub-
section) 15b), say, is unambiguous; a reference to b), say, refers
to sub-section b) of the current section.
Because of typographical limitations, the symbol "φ" has been
used in two ways, which hopefully are distinguishable by context:
it denotes on occasion the empty set, and at other times, it
denotes a linear functional.
Some acknowledgments are now in order. Professor Frank Deutsch
generously made available to me a copy of his own lecture notes on
best approximation, and these proved quite useful in the arrangement
of some of the material in Part III. Mr. Philip Smith provided
several helpful comments about Chebyshev centers in §33. Professor
Paul Halmos kindly recommended the inclusion of the manuscript in
the Springer Lecture Notes Series. Finally, it is a pleasure to
thank Mrs. Nancy ~berle and Mrs. Judy Snider for their competent and
cheerful assistance in the preparation of the manuscript.
West Lafayette, Indiana
November, 1971
CONTENTS
Part I. Preliminaries
§1. Notation
§2. The Hahn-Banach Theorem
§3. The Separation Theorems
§4. The Alaoglu-Bourbaki Theorem
§5. The Krein-Milman Theorem
Part II. Theory of Optimization
§6. Convex Functions
§7. Directional Derivatives
§8. Subgradients
§9. Normal Cones
§10. Subdifferential Formulas
§11. Convex Programs
§12. Kuhn-Tucker Theory
§13. Lagrange Multipliers
§14. Conjugate Functions
§15. Polarity
§16. Dubovitskii-Milyutin Theory
§17. An Application
§18. Conjugate Functions and Subdifferentials
§19. Distance Functions
§20. The Fenchel Duality Theorem
§21. Some Applications
Part III. Theory of Best Approximation
§22. Characterization of Best Approximations
§23. Extremal Representations
§24. Application to Gaussian Quadrature
§25. Haar Subspaces
§26. Chebyshev Polynomials
§27. Rotundity
§28. Chebyshev Subspaces
§29. Algorithms for Best Approximation
§30. Proximinal Sets
Part IV. Comments on the Problems
Bibliography
Part V. Selected Special Topics
§31. E-spaces
§32. Metric Projections
§33. Optimal Estimation
§34. Quasi-Solutions
§35. Generalized Inverses
Part I
Preliminaries
§1. Notation
Throughout these notes we will be dealing with linear spaces
X, Y, ..., and various mappings defined on them. The underlying
scalar field may be either the real or complex number field, unless
one or the other is explicitly singled out. We list below some of the
abbreviations and/or symbols to be employed throughout the text. Al-
though not all used right away, it is convenient to have them collected
together for ease of reference. Notation of less frequent usage will
be introduced as the need arises.
We write:
ls - for linear space;
tls - for topological linear space;
lcs - for locally convex (Hausdorff) space;
nls - for normed linear space;
$\theta$ - for the zero vector in a ls;
$X_r$ - for the real restriction of a complex ls;
$X'$ - for the algebraic dual of a ls;
$X^*$ - for the continuous dual of a tls;
$U(X)$ - for the unit ball $\{x \in X : \|x\| \le 1\}$ of a nls $X$;
$S(X)$ - for the unit sphere $\{x \in X : \|x\| = 1\}$ of a nls $X$;
$L(X,Y)$ - for the space of all continuous linear maps from a tls $X$
into a tls $Y$;
$R^n$ - for real Euclidean $n$-space;
$e_i$ - for the $i$th standard unit vector in $R^n$;
$\bar{z}$ - for the conjugate of a complex number $z$;
sgn($z$) - for the signum $\bar{z}/|z|$ of a non-zero complex number $z$
(with sgn(0) = 0);
span($A$) - for the linear hull of a set $A$;
co($A$) - for the convex hull of a set $A$;
int($A$) - for the interior of a set $A$;
rel-int($A$) - for the relative interior of a set $A$;
cl($A$) (or sometimes $\bar{A}$) - for the closure of a set $A$;
$f|A$ - for the restriction of a function $f$ to a subset $A$ of
its domain;
wrt - for "with respect to";
nas - for "necessary and sufficient";
$C(\Omega)$ - for the space of continuous scalar-valued functions on a
compact Hausdorff space $\Omega$;
rca($\Omega$) - for the space of regular Borel measures on such a space;
$\ell^p(n)$, $c_0$, $\ell^p$, $L^p(\mu)$ - for the usual Banach spaces.
A subscript R attached to the symbol for a function space, as
in $C_R(\Omega)$ or $L^p_R(\mu)$, means that the functions involved are real-
valued; otherwise the scalars may be either real or complex.
Finally, the symbol "$\equiv$" is to be read "equals by definition".
§2. The Hahn-Banach Theorem
In this section we recall without proof some variants of the Hahn-
Banach extension theorems. These results all assert the existence of
linear functionals with certain properties. Together with their geo-
metrical versions to be given in §3 below, they constitute the corner-
stone of the existence and duality theory to be developed later in
these notes.
a) Theorem. Let $M$ be a linear subspace of a real ls $X$ and
$f \in M'$. Let $p$ be a real-valued sublinear function on $X$ such that
$f \le p|M$. Then $\exists\, F \in X'$ satisfying $F \le p$ and $F|M = f$.
Thus the linear functional f has a linear extension F to all
of X and this extension remains dominated (pointwise) by p. Using
a separation theorem (§3) Weston [77] has shown that the above result
remains true if p is replaced by a (finite) convex function on X.
b) Corollary. Let $X$ be a complex ls and let $f$, $M$ have the
same meaning as in a). If $p$ is a semi-norm on $X$ such that
$|f(\cdot)| \le p|M$, then $\exists\, F \in X'$ such that $|F(\cdot)| \le p$ and $F|M = f$.
c) Corollary. Let $X$ be a nls, $M$ a linear subspace of $X$,
and $f \in M^*$. Then $\exists\, F \in X^*$ such that $\|F\| = \|f\|$ and $F|M = f$.
This result may be viewed in particular as asserting the exist-
ence of a continuous linear extension of f with minimal norm.
Clearly $f$ has "many" extensions $F$ with $\|F\| > \|f\|$. It is less
clear a priori whether or not an extension of minimal norm is unique.
This question has some interesting connections with approximation and
moment problems; the reader may consult [19, 26, 58, 73] for further
details.
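To illustrate the uniqueness question, here is a small worked example that is not part of the original notes; the choice of space, subspace and functional is an assumption made purely for illustration. In the plane with the $\ell^1$ norm every extension below with $|b| \le 1$ has minimal norm, whereas with the Euclidean norm only $b = 0$ does.

```latex
% Illustrative example (assumed setting, not from the notes).
% X = R^2, M = \{(t,0) : t \in R\}, f(t,0) = t.
% Every linear extension of f to X has the form F_b(x,y) = x + b y.
\begin{align*}
\text{With } \|(x,y)\|_1 = |x| + |y|: \quad
  & \|f\| = 1, \qquad \|F_b\| = \max\{1, |b|\}, \\
  & \text{so } \|F_b\| = \|f\| \iff |b| \le 1
    \quad \text{(infinitely many minimal-norm extensions)}. \\
\text{With } \|(x,y)\|_2 = (x^2 + y^2)^{1/2}: \quad
  & \|f\| = 1, \qquad \|F_b\| = (1 + b^2)^{1/2}, \\
  & \text{so } \|F_b\| = \|f\| \iff b = 0
    \quad \text{(a unique minimal-norm extension)}.
\end{align*}
```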
We note also that the proofs of b) and c) above establish some
information about linear functionals on a complex nls. Namely, let $X$
be such a space and $f \in X^*$. Define $(\mathrm{re}\, f)(x) \equiv \mathrm{re}\, f(x)$ and
$(\mathrm{im}\, f)(x) \equiv \mathrm{im}\, f(x)$. Then re $f$ and im $f$ belong to $X_r^*$ (where $X_r$
denotes $X$ regarded as a real ls), $(\mathrm{im}\, f)(x) = -(\mathrm{re}\, f)(ix)$, and
$\|\mathrm{re}\, f\| = \|f\|$. And conversely, if $f \in X_r^*$ and $F$ is defined by
$F(x) = f(x) - i f(ix)$, $x \in X$,
then $F \in X^*$ and $\|F\| = \|f\|$.
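The converse construction can be checked directly. The following short computation, added here as a sketch rather than quoted from the notes, verifies that $F$ is complex-homogeneous and has real part $f$, using only the real-linearity of $f$.

```latex
% Sketch: f is real-linear on X_r, and F(x) = f(x) - i f(ix).
\begin{align*}
F(ix) &= f(ix) - i\, f(i \cdot ix) = f(ix) - i\, f(-x) \\
      &= f(ix) + i\, f(x) = i\,\bigl(f(x) - i\, f(ix)\bigr) = i\, F(x), \\
\mathrm{re}\, F(x) &= f(x), \qquad
\mathrm{im}\, F(x) = -f(ix) = -(\mathrm{re}\, F)(ix).
\end{align*}
```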
d) Corollary. Let $M$ be a linear subspace of the nls $X$ and
$x_0 \in X \setminus M$. Then $\exists\, f \in S(X^*)$ such that $f(x) = 0$ $\forall x \in M$ and
$f(x_0) = d(x_0, M)$.
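A concrete instance of d) may help fix ideas. The setting below (the Euclidean plane and a coordinate axis) is an assumption chosen for illustration and does not appear in the notes.

```latex
% Illustrative instance of Corollary d) (assumed setting).
% X = R^2 with the Euclidean norm, M = \{(t,0) : t \in R\},
% x_0 = (a,b) with b \neq 0, so that x_0 \in X \setminus M.
\begin{align*}
d(x_0, M) &= \inf_{t \in R} \|(a - t,\, b)\|_2 = |b|, \\
f(x,y) &\equiv \operatorname{sgn}(b)\, y, \qquad \|f\| = 1, \\
f(t,0) &= 0 \ \ \forall\, t \in R, \qquad
f(x_0) = \operatorname{sgn}(b)\, b = |b| = d(x_0, M).
\end{align*}
```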
Proofs of all the preceding results, along with further corollar-