Springer Series in Information Sciences 10

Springer-Verlag Berlin Heidelberg GmbH

Springer Series in Information Sciences
Editors: Thomas S. Huang, Teuvo Kohonen, Manfred R. Schroeder

30 Self-Organizing Maps. By T. Kohonen, 3rd Edition
31 Music and Schema Theory: Cognitive Foundations of Systematic Musicology. By M. Leman
32 The Maximum Entropy Method. By N. Wu
33 A Few Steps Towards 3D Active Vision. By T. Vieville
34 Calibration and Orientation of Cameras in Computer Vision. Editors: A. Gruen and T. S. Huang
35 Computer Speech: Recognition, Compression, Synthesis. By M. R. Schroeder

Volumes 1-29 are listed at the end of the book.

B. Roy Frieden

Probability, Statistical Optics, and Data Testing: A Problem Solving Approach

Third Edition

With 115 Figures

Springer

Professor B. Roy Frieden
University of Arizona, Optical Research Center, Tucson, AZ 85721, USA

Series Editors:

Professor Thomas S. Huang
Department of Electrical Engineering and Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA

Professor Teuvo Kohonen
Helsinki University of Technology, Neural Networks Research Centre, Rakentajanaukio 2 C, 02150 Espoo, Finland

Professor Dr. Manfred R. Schroeder
Drittes Physikalisches Institut, Universität Göttingen, Bürgerstrasse 42-44, 37073 Göttingen, Germany

ISSN 0720-678X
ISBN 978-3-540-41708-8

Library of Congress Cataloging-in-Publication Data
Frieden, B. Roy, 1936-
Probability, statistical optics, and data testing: a problem solving approach / Roy Frieden. - 3rd ed.
(Springer series in information sciences, ISSN 0720-678X; 10)
Includes bibliographical references and index.
ISBN 978-3-540-41708-8    ISBN 978-3-642-56699-8 (eBook)
DOI 10.1007/978-3-642-56699-8
1. Probabilities. 2. Stochastic processes. 3. Mathematical statistics. 4. Optics - Statistical methods. I. Title. II. Series.
QA273 .F89 2001
519.2-dc21
2001020879

This work is subject to copyright.
All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag Berlin Heidelberg GmbH. Violations are liable for prosecution under the German Copyright Law.

http://www.springer.de

© Springer-Verlag Berlin Heidelberg 1983, 1991, 2001
Originally published by Springer-Verlag Berlin Heidelberg New York in 2001

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: LE-TeX Jelonek, Schmidt & Voeckler GbR, Leipzig
Cover design: design & production GmbH, Heidelberg
Printed on acid-free paper
SPIN: 10794554    56/3141/YL - 5 4 3 2 1 0

To Sarah and Miriam

Preface to the Third Edition

The overall aim of this edition continues to be that of teaching the fundamental methods of probability and statistics. As before, the methods are developed from first principles, and the student is motivated by solving interesting problems in optics, engineering and physics. The more advanced of these problems amount to further developments of the theory, and the student is guided to the solutions by carefully chosen hints. In three decades of teaching, I have found that a student who plays such an active role in developing the theory gets to understand it more fundamentally. It also fosters a sense of confidence in the analytical abilities of the student and, hence, encourages him/her to strike out into unknown areas of research.
Meanwhile, the passage of ten years since the second edition has given the author, as well, lots of time to build confidence and learn more about statistics and its ever-broadening domains of application. Our immediate aim is to pass on this increased scope of information to the student. Important additions have been made to the referencing as well. This facilitates learning by the student who wants to know more about a given effect. Of course, all known typographical errors in the previous editions have also been corrected.

Additional problems that are analyzed range from the simple, such as winning a state lottery jackpot or computing the probability of intelligent life in the universe, to the more complex, such as modelling the bull and bear behavior of the stock market or formulating a new central limit theorem of optics. A synopsis of these follows.

A new central limit theorem of optics is developed. This predicts that the sum of the position coordinates of the photons in a diffraction point spread function (PSF) follows a Cauchy probability law. Also, the output PSF of a relay system using multiply cascaded lenses obeys the Cauchy law. Of course the usual central limit theorem predicts a normal or Gaussian law, but certain of its premises are violated in incoherent diffraction imagery. Other limiting forms are found to follow from a general approach to central limit theorems based upon the use of an invariance principle.

Other specifically optical topics that are newly treated are the Mandel formula of photoelectron theory and the concept of coarse graining. The topic of maximum probable estimates of optical objects has been updated to further clarify the scope of application of the MaxEnt approach.

The chapter on Monte Carlo calculations has been extended to include methods of generating jointly fluctuating random variables. Also, methods of artificially generating photon-depleted, two-dimensional images are given.
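As a rough illustration of what such Monte Carlo methods involve (not the book's own algorithms), the following Python sketch draws jointly fluctuating Gaussian pairs from a Cholesky factor of an assumed covariance matrix, and simulates a photon-depleted image by Poisson-sampling an illustrative smooth intensity map:

```python
import numpy as np

rng = np.random.default_rng(0)

# Jointly fluctuating (correlated) Gaussian pair via a Cholesky factor.
# Target covariance: unit variances, correlation rho (an assumed value).
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
L = np.linalg.cholesky(cov)            # cov = L @ L.T
z = rng.standard_normal((2, 100_000))  # independent unit normals
x, y = L @ z                           # each column is one correlated draw

sample_rho = np.corrcoef(x, y)[0, 1]   # should be close to rho

# Photon-depleted image: treat a smooth intensity map as the mean photon
# count per pixel and draw independent Poisson counts (shot noise).
intensity = np.outer(np.hanning(64), np.hanning(64)) * 5.0
photon_image = rng.poisson(intensity)
```

The Cholesky construction generalizes to any number of jointly Gaussian variables; the Poisson step is the standard shot-noise model for low-light imagery.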
The treatment of functions of random variables has been extended to include functions of combinations of random variables, such as quotients or products. For example, it is found that the quotient of two independent Gaussian random variables obeys a Cauchy law. A further application gives the amount by which a probability law is distorted due to viewing its event space from a relativistically moving frame.

Fractal processes are now included in the chapter on stochastic processes. This includes the concepts of the Hausdorff dimension and self-similarity. The ideas of connectivity by association, and Erdos numbers, are also briefly treated.

The subject of parameter estimation has been broadened in scope, to include the Bhattacharyya bound, receiver operating characteristics and the problem of estimating multiple parameters.

It is shown that systems of differential equations such as the Lotka-Volterra kind are amenable to probabilistic solutions that complement the usual analytical ones. The approach also permits classical trajectories to be assigned to quantum mechanical particles. The trajectories are not those of the physicist D. Bohm, since they are constructed in an entirely different manner from these.

The Heisenberg uncertainty principle is closely examined. It is independently derived from two differing viewpoints: the conventional viewpoint that the widths of a function and its Fourier transform cannot both be arbitrarily small; and a measurement viewpoint which states that the mean-square error in estimation of a parameter and the information content in the data cannot both be arbitrarily small. Interestingly, the information content is that of Fisher, and not Shannon. The uncertainty principle is so fundamental to physics that its origin in Fisher information prompts us to investigate whether physics in general is based upon the concepts of measurement and Fisher information. The surprising answer is "yes", as developed in the new Sect. 17.3 of Chap. 17.
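The quotient result is easy to verify numerically. The sketch below (an illustration, not the text's derivation) forms the ratio of two independent standard normals and compares its sample quartiles with those of a standard Cauchy law, whose quartiles are exactly ±1 since its CDF is 1/2 + arctan(z)/π:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
x = rng.standard_normal(n)
y = rng.standard_normal(n)
z = x / y  # ratio of two independent standard normals

# A standard Cauchy law has median 0 and quartiles at -1 and +1, so its
# interquartile range is exactly 2; the sample should match closely.
q25, q75 = np.quantile(z, [0.25, 0.75])
iqr = q75 - q25
median = np.median(z)
```

Note that quartiles, not the sample mean and variance, are the right summary here: the Cauchy law has no finite mean or variance, which is exactly why the usual central limit theorem fails for it.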
An alternative statement of the uncertainty principle is Hirschman's inequality. This uses the concept of entropy instead of variances of error. It is shown that the entropy of the data and of its Fourier space cannot both be arbitrarily small.

The general use of invariance principles for purposes of finding unknown probability laws is extensively discussed and applied. A simple example shows that, based upon invariance to change of units, the universal physical constants should obey a 1/x or reciprocal probability law. This hypothesis is tested by use of a chi-square test, as an example of the use of this test upon given data. Agreement with the hypothesis at the confidence level α = 0.95 is found.

A section has been added on the diverse measures of information that are being used to advantage in the physical and biological sciences, and in engineering. Examples are information measures of Shannon, Renyi, Fisher, Kullback-Leibler, Hellinger, Tsallis, and Gini-Simpson.

The fundamental role played by Fisher information I, in particular in deriving the Heisenberg uncertainty principle (see above), motivated us to further study I for its mathematical and physical properties. A calculation of I for the case of correlated additive noise shows the possibility of perfect processing of data. Also, I is found to be a measure of the level of disorder of a system. It also obeys a property of additivity, and a monotonic decrease with time, dI/dt ≤ 0, under a wide range of conditions. The latter is a restatement of the Second Law of Thermodynamics with I replacing the usual entropy term. These properties imply that I may be used in the development of thermodynamics from a non-entropic point of view that emphasizes measurement and estimation error in place of heat. A novel concept that arises from this point of view is a Fisher temperature of estimation error, in place of the usual Kelvin temperature of heat.
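The chi-square test of the reciprocal law mentioned above can be sketched numerically. The following fragment (an illustration using synthetic data, not the book's table of physical constants) draws samples from p(x) ∝ 1/x on [1, 10] by inverse transform sampling and computes the goodness-of-fit statistic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Inverse transform sampling: if U ~ Uniform(0,1), then X = 10**U obeys
# the reciprocal law p(x) = 1/(x ln 10) on [1, 10].
n = 2000
x = 10 ** rng.uniform(0.0, 1.0, n)

# Bin into the nine decades [1,2), [2,3), ..., [9,10] and compare observed
# counts with those the 1/x law predicts: P([a,b]) = ln(b/a) / ln(10).
edges = np.arange(1, 11)
observed, _ = np.histogram(x, bins=edges)
expected = n * np.log(edges[1:] / edges[:-1]) / np.log(10)

chi2 = np.sum((observed - expected) ** 2 / expected)
# With 9 bins there are 8 degrees of freedom; the 0.95 critical value is
# about 15.5, so a statistic below that is consistent with the hypothesis.
```

Replacing the synthetic samples with tabulated physical constants (suitably scaled) turns this sketch into the test the preface describes.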
A remaining property of Fisher information is its invariance to unitary transformation of whatever type (e.g. Fourier transformation). This effect is recognized in Chap. 17 to be a universal invariance principle that is obeyed by all physically valid probability laws. The universal nature of this invariance principle allows it to be used as a key element in a new knowledge-based procedure that finds unknown probability laws. The procedure is called that of "extreme physical information", or EPI for short. The EPI approach is developed out of a model of measurement that incorporates the observer into the observed phenomenon. The observer collects information about the aim of the measurement, an unknown parameter. The information is uniquely Fisher information, and an analysis of its flow from the observed phenomenon to the observer gives the EPI principle. This mathematically amounts to a Lagrangian problem whose output is the required probability law. A zeroing of the Lagrangian gives rise, as well, to a probability law, and both the extremum and zero solutions have physical significance. A method of constructing Lagrangians is one of the chief goals of the overall approach. Since EPI is based upon the use of Lagrangians, an added Appendix G supplies the theory needed to understand how Lagrangians form the differential equations that are the solutions to given problems.

EPI is applied to various measurement problems, some via guided exercises, and is shown to give rise to many of the known physical effects that define probability laws: the Schroedinger wave equation, the Dirac equation, the Maxwell-Boltzmann distribution. One of the strengths of the approach is that it also gives rise to valid effects that we ordinarily regard (incorrectly) as not describing probability laws, such as Maxwell's equations. Another strength is that the dimensionality of the unknown probability law can be left as a free parameter.
The theoretical answer holds for any number of dimensions. However, there is an effective dimensionality, which is ultimately limited by that of the user's chosen data space. A third advantage is that EPI rests upon the concept of information, and not a specialized concept from physics such as energy. This means that EPI is applicable to more than physical problems. For example, it applies to genetics, as shown in one of the guided exercises.

An aspect of EPI that is an outgrowth of its thermodynamic roots is its game aspect. The EPI mathematical procedure is equivalent to the play of a zero-sum game, with information as the prize. The players are the observer and "nature." Nature is represented by the observed phenomenon. Both players choose optimal strategies, and the payoff of the game is the unknown probability law. In that dI/dt ≤ 0 (see above), the observer always loses the game. However, he gains perfect knowledge of the phenomenon's probability law. As an example, the game aspect is discussed as giving rise to the Higgs mass phenomenon. Here the information prize is the acquisition of mass by one of two reactant particles at the expense of the field energy of the other. This is thought to be the way mass originates in the universe.

Tucson, December 2000
B. Roy Frieden

Preface to the Second Edition

This new edition incorporates corrections of all known typographical errors in the first edition, as well as some more substantive changes. Chief among the latter is the addition of Chap. 17, on methods of estimation. As with the rest of the text, most applications and examples cited in the new chapter are from the optical perspective. The intention behind this new chapter is to empower the optical researcher with a yet broader range of research tools. Certainly a basic knowledge of estimation methods should be among these.
In particular, the sections on likelihood theory and Fisher information prepare readers for the problems of optical parameter estimation and probability law estimation. Physicists and optical scientists might find this material particularly useful, since the subject of Fisher information is generally not covered in standard physical science curricula. Since the words "statistical optics" are prominent in the title of this book, their meaning needs to be clarified. There is a general tendency to overly emphasize the statistics of photons as the sine qua non of statistical optics. In this text a wider view is taken, which equally emphasizes the random medium that surrounds the photon, be it a photographic emulsion, the turbulent atmosphere, a vibrating lens holder, etc. Also included are random interpretations of ostensibly deterministic phenomena, such as the Hurter-Driffield (H and D) curve of photography. Such a "random interpretation" sometimes breaks new ground, as in Chap. 5, where it is shown how to produce very accurate ray trace-based spot diagrams, using the statistical theory of Jacobian transformation. This edition, like the first, is intended to be first and foremost an introductory text on methods of probability and statistics. Emphasis is on the linear (and sometimes explicitly Fourier) theory of probability calculation, chi-square and other statistical tests of data, stochastic processes, the information theories of both Shannon and Fisher, and estimation theory. Applications to statistical optics are given, in the main, so as to give a reader with some background in either optics or linear communications theory a running start. As a pedagogical aid to understanding the mathematical tools that are developed, the simplest possible statistical model is used that fits a given optical phenomenon. 
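The Jacobian-transformation idea behind the spot-diagram method can be illustrated with a one-dimensional toy example (the mapping y = u² is hypothetical, not one of the book's lens models). Rays uniformly filling a pupil coordinate u are mapped to image heights y, and the ray density in the image plane follows from p_Y(y) = p_U(u)/|dy/du|:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

# Rays uniformly fill a one-dimensional "pupil" coordinate u on (0, 1); a
# hypothetical aberration maps each ray to image height y = u**2.  The
# Jacobian rule p_Y(y) = p_U(u)/|dy/du| gives p_Y(y) = 1/(2*sqrt(y)), so
# the probability of a ray landing in a bin [a, b] is sqrt(b) - sqrt(a).
u = rng.uniform(0.0, 1.0, n)
y = u ** 2

edges = np.linspace(0.0, 1.0, 51)
counts, _ = np.histogram(y, bins=edges)
empirical = counts / n
predicted = np.sqrt(edges[1:]) - np.sqrt(edges[:-1])

max_rel_err = np.max(np.abs(empirical - predicted) / predicted)
```

The analytic prediction removes the Monte Carlo noise of a raw ray-trace histogram, which is the accuracy advantage the spot-diagram discussion in Chap. 5 exploits.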
Hence, a semiclassical model of radiation is used in place of the full-blown quantum optical theory, a poker-chip model is used to describe film granularity, etc. However, references are given to more advanced models as well, so as to steer the interested reader in the right direction. The listing "Statistical models ... " in the index gives a useful overview of the variety of models used. The reader