Lecture Notes in Mathematics
A collection of informal reports and seminars
Edited by A. Dold, Heidelberg and B. Eckmann, Zürich

257

Richard B. Holmes
Purdue University, Lafayette, IN/USA

A Course on Optimization and Best Approximation

Springer-Verlag Berlin · Heidelberg · New York 1972

AMS Subject Classifications (1970): 41-02, 41A50, 41A65, 46B99, 46N05, 49-02, 49B30, 90C25

ISBN 3-540-05764-1 Springer-Verlag Berlin · Heidelberg · New York
ISBN 0-387-05764-1 Springer-Verlag New York · Heidelberg · Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.

© by Springer-Verlag Berlin · Heidelberg 1972. Library of Congress Catalog Card Number 70-189753. Printed in Germany.
Offsetdruck: Julius Beltz, Hemsbach/Bergstr.

PREFACE

The course for which these notes were originally prepared was a one-semester graduate level course at Purdue University, dealing with optimization in general and best approximation in particular. The prerequisites were modest: a semester's worth of functional analysis together with the usual background required for such a course. A few prerequisite results of special importance have been gathered together for ease of reference in Part I.

My general aim was to present an interesting field of application of functional analysis. Although the tenor of the course is consequently rather theoretical, I made some effort to include a few fairly concrete examples, and to bring under consideration problems of genuine practical interest.
Examples of such problems are convex programs (§'s 11-13), calculus of variations (§17), minimum effort control (§21), quadrature formulas (§24), construction of "good" approximations to functions (§'s 26 and 29), optimal estimation from inadequate data (§33), solution of various ill-posed linear systems (§'s 34-35). Indeed, the bulk of the notes is devoted to a presentation of the theoretical background needed for the study of such problems.

No attempt has been made to provide encyclopedic coverage of the various topics. Rather I tried only to show some highlights, techniques, and examples in each of the several areas studied. Should a reader be stimulated to pursue a particular topic further, he will hopefully find an adequate sample of the pertinent literature included in the bibliographies. (Note that in addition to the main bibliography between Parts IV and V, each section in Part V has its own special set of references appended.)

The first three parts of these notes constitute a slightly fleshed-out arrangement of the material actually covered in the Purdue course. That course also involved the solution of numerous problems; about 50 of those problems have been included here and Part IV contains hints and/or complete solutions to most of them. Thus this portion of the notes is reasonably self-contained, modulo the indicated prerequisites (minor exceptions to this assertion occur on pages 28, 81 and 89). Part V is a bit more loosely written; in particular, it contains a few references without proof to rather deep results. I feel that all the topics in Part V might have legitimately been included in the course had time permitted. The order of §'s 32 and 33 is somewhat arbitrary and could have been reversed. §'s 34 and 35 provide some applications of metric projections by illustrating their natural occurrence in attempts to handle ill-posed linear equations.
It is my hope that the present notes can serve as the basis for other courses besides the original; for example, a two-quarter course covering essentially everything, a one-quarter course on best approximation covering Part III, §'s 31, 32, and perhaps 19 and 35, or a one-quarter course on convexity and optimization covering Part II, §33 (note that 33b) contains a proof of Valadier's formula for the subdifferential of a supremum of convex functions), and perhaps some of the early material in Part III.

As format goes, sections are divided into sub-sections; each sub-section contains at most one theorem, at most one definition, etc. (the sole exception to this being 33e)). A reference to (sub-section) 15b), say, is unambiguous; a reference to b), say, refers to sub-section b) of the current section. Because of typographical limitations, the symbol "φ" has been used in two ways, which hopefully are distinguishable by context: it denotes on occasion the empty set, and at other times, it denotes a linear functional.

Some acknowledgments are now in order. Professor Frank Deutsch generously made available to me a copy of his own lecture notes on best approximation, and these proved quite useful in the arrangement of some of the material in Part III. Mr. Philip Smith provided several helpful comments about Chebyshev centers in §33. Professor Paul Halmos kindly recommended the inclusion of the manuscript in the Springer Lecture Notes Series. Finally, it is a pleasure to thank Mrs. Nancy Eberle and Mrs. Judy Snider for their competent and cheerful assistance in the preparation of the manuscript.

West Lafayette, Indiana
November, 1971

CONTENTS

Part I. Preliminaries
  §1. Notation
  §2. The Hahn-Banach Theorem
  §3. The Separation Theorems
  §4. The Alaoglu-Bourbaki Theorem
  §5. The Krein-Milman Theorem

Part II. Theory of Optimization
  §6. Convex Functions
  §7. Directional Derivatives
  §8. Subgradients
  §9. Normal Cones
  §10. Subdifferential Formulas
  §11. Convex Programs
  §12. Kuhn-Tucker Theory
  §13. Lagrange Multipliers
  §14. Conjugate Functions
  §15. Polarity
  §16. Dubovitskii-Milyutin Theory
  §17. An Application
  §18. Conjugate Functions and Subdifferentials
  §19. Distance Functions
  §20. The Fenchel Duality Theorem
  §21. Some Applications

Part III. Theory of Best Approximation
  §22. Characterization of Best Approximations
  §23. Extremal Representations
  §24. Application to Gaussian Quadrature
  §25. Haar Subspaces
  §26. Chebyshev Polynomials
  §27. Rotundity
  §28. Chebyshev Subspaces
  §29. Algorithms for Best Approximation
  §30. Proximinal Sets

Part IV. Comments on the Problems

Bibliography

Part V. Selected Special Topics
  §31. E-spaces
  §32. Metric Projections
  §33. Optimal Estimation
  §34. Quasi-Solutions
  §35. Generalized Inverses

Part I
Preliminaries

§1. Notation

Throughout these notes we will be dealing with linear spaces X, Y, ..., and various mappings defined on them. The underlying scalar field may be either the real or complex number field, unless one or the other is explicitly singled out. We list below some of the abbreviations and/or symbols to be employed throughout the text. Although not all used right away, it is convenient to have them collected together for ease of reference. Notation of less frequent usage will be introduced as the need arises. We write:

ls - for linear space;
tls - for topological linear space;
lcs - for locally convex (Hausdorff) space;
nls - for normed linear space;
$\theta$ - for the zero vector in a ls;
$X_r$ - for the real restriction of a complex ls;
$X'$ - for the algebraic dual of a ls;
$X^*$ - for the continuous dual of a tls;
$U(X)$ - for the unit ball $\{x \in X : \|x\| \le 1\}$ of a nls $X$;
$S(X)$ - for the unit sphere $\{x \in X : \|x\| = 1\}$ of a nls $X$;
$L(X,Y)$ - for the space of all continuous linear maps from a tls $X$ into a tls $Y$;
$R^n$ - for real Euclidean n-space;
$e_i$ - for the ith standard unit vector in $R^n$;
$\bar{z}$ - for the conjugate of a complex number $z$;
sgn$(z)$ - for the signum $\bar{z}/|z|$ of a non-zero complex number $z$ (with sgn$(0) = 0$);
span$(A)$ - for the linear hull of a set $A$;
co$(A)$ - for the convex hull of a set $A$;
int$(A)$ - for the interior of a set $A$;
rel-int$(A)$ - for the relative interior of a set $A$;
cl$(A)$ (or sometimes $\bar{A}$) - for the closure of a set $A$;
$f|A$ - for the restriction of a function $f$ to a subset $A$ of its domain;
wrt - for "with respect to";
nas - for "necessary and sufficient";
$C(\Omega)$ - for the space of continuous scalar-valued functions on a compact Hausdorff space $\Omega$;
rca$(\Omega)$ - for the space of regular Borel measures on such a space;
$\ell^p(n)$, $c_0$, $\ell^p$, $L^p(\mu)$ - for the usual Banach spaces.

A subscript R attached to the symbol for a function space, as in $C_R(\Omega)$ or $L^p_R(\mu)$, means that the functions involved are real-valued; otherwise the scalars may be either real or complex. Finally, the symbol "$\equiv$" is to be read "equals by definition".

§2. The Hahn-Banach Theorem

In this section we recall without proof some variants of the Hahn-Banach extension theorems. These results all assert the existence of linear functionals with certain properties. Together with their geometrical versions to be given in §3 below, they constitute the cornerstone of the existence and duality theory to be developed later in these notes.

a) Theorem. Let $M$ be a linear subspace of a real ls $X$ and $f \in M'$. Let $p$ be a real-valued sublinear function on $X$ such that $f \le p|M$. Then $\exists\, F \in X'$ satisfying $F \le p$ and $F|M = f$.

Thus the linear functional $f$ has a linear extension $F$ to all of $X$ and this extension remains dominated (pointwise) by $p$. Using a separation theorem (§3) Weston [77] has shown that the above result remains true if $p$ is replaced by a (finite) convex function on $X$.

b) Corollary. Let $X$ be a complex ls and let $f$, $M$ have the same meaning as in a). If $p$ is a semi-norm on $X$ such that $|f(\cdot)| \le p|M$, then $\exists\, F \in X'$ such that $|F(\cdot)| \le p$ and $F|M = f$.

c) Corollary.
Let $X$ be a nls, $M$ a linear subspace of $X$, and $f \in M^*$. Then $\exists\, F \in X^*$ such that $\|F\| = \|f\|$ and $F|M = f$.

This result may be viewed in particular as asserting the existence of a continuous linear extension of $f$ with minimal norm. Clearly $f$ has "many" extensions $F$ with $\|F\| > \|f\|$. It is less clear a priori whether or not an extension of minimal norm is unique. This question has some interesting connections with approximation and moment problems; the reader may consult [19, 26, 58, 73] for further details.

We note also that the proofs of b) and c) above establish some information about linear functionals on a complex nls. Namely, let $X$ be such a space and $f \in X^*$. Define $(\mathrm{re}\, f)(x) \equiv \mathrm{re}\, f(x)$ and $(\mathrm{im}\, f)(x) \equiv \mathrm{im}\, f(x)$. Then $\mathrm{re}\, f$ and $\mathrm{im}\, f$ belong to $X_r^*$ (where $X_r$ denotes $X$ regarded as a real ls), $(\mathrm{im}\, f)(x) = -(\mathrm{re}\, f)(ix)$, and $\|\mathrm{re}\, f\| = \|f\|$. And conversely, if $f \in X_r^*$ and $F$ is defined by $F(x) \equiv f(x) - i f(ix)$, $x \in X$, then $F \in X^*$ and $\|F\| = \|f\|$.

d) Corollary. Let $M$ be a linear subspace of the nls $X$ and $x_0 \in X \setminus M$. Then $\exists\, f \in S(X^*)$ such that $f(x) = 0$ $\forall x \in M$ and $f(x_0) = d(x_0, M)$.

Proofs of all the preceding results, along with further corollar-