AIMS Lecture Notes on Numerical Analysis
Peter J. Olver
March, 2006

Lecture notes:
Computer Arithmetic
Numerical Solution of Scalar Equations
Review of Matrix Algebra
Gaussian Elimination
Inner Products and Norms
Eigenvalues and Singular Values
Iterative Methods for Linear Systems
Numerical Computation of Eigenvalues
Numerical Solution of Algebraic Systems
Numerical Solution of Ordinary Differential Equations
Numerical Solution of the Heat and Wave Equations
Minimization
Approximation and Interpolation
The Finite Element Method
References
Exercise sets: #1 #2 #3 #4 #5 #6 #7

AIMS Lecture Notes 2006
Peter J. Olver

1. Computer Arithmetic

"The purpose of computing is insight, not numbers." (R.W. Hamming, [23])

The main goal of numerical analysis is to develop efficient algorithms for computing precise numerical values of mathematical quantities, including functions, integrals, solutions of algebraic equations, solutions of differential equations (both ordinary and partial), solutions of minimization problems, and so on. The objects of interest typically (but not exclusively) arise in applications, which seek not only their qualitative properties, but also quantitative numerical data. The goal of this course of lectures is to introduce some of the most important and basic numerical algorithms that are used in practical computations. Beyond merely learning the basic techniques, it is crucial that an informed practitioner develop a thorough understanding of how the algorithms are constructed, why they work, and what their limitations are.

In any applied numerical computation, there are four key sources of error:
(i) Inexactness of the mathematical model for the underlying physical phenomenon.
(ii) Errors in measurements of parameters entering the model.
(iii) Round-off errors in computer arithmetic.
(iv) Approximations used to solve the full mathematical system.
Of these, (i) is the domain of mathematical modeling, and will not concern us here. Neither will (ii), which is the domain of the experimentalists. (iii) arises due to the finite numerical precision imposed by the computer. (iv) is the true domain of numerical analysis, and refers to the fact that most systems of equations are too complicated to solve explicitly, or, even in cases when an analytic solution formula is known, directly obtaining the precise numerical values may be difficult.

There are two principal ways of quantifying computational errors.

Definition 1.1. Let x be a real number and let x* be an approximation. The absolute error in the approximation x* ≈ x is defined as |x* − x|. The relative error is defined as the ratio of the absolute error to the size of x, i.e., |x* − x| / |x|, which assumes x ≠ 0; otherwise the relative error is not defined.

For example, 1000001 is an approximation to 1000000 with an absolute error of 1 and a relative error of 10^(−6), while 2 is an approximation to 1 with an absolute error of 1 and a relative error of 1. Typically, relative error is more intuitive and the preferred determiner of the size of the error. The present convention is that errors are always ≥ 0, and are = 0 if and only if the approximation is exact.
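Definition 1.1 translates directly into code; the following is a minimal Python sketch (the function names are ours, not part of the notes):

```python
def absolute_error(x_star, x):
    """Absolute error |x* - x| of the approximation x* to x."""
    return abs(x_star - x)

def relative_error(x_star, x):
    """Relative error |x* - x| / |x|; not defined when x = 0."""
    if x == 0:
        raise ValueError("relative error is not defined for x = 0")
    return abs(x_star - x) / abs(x)

# The two examples from the text:
print(absolute_error(1000001, 1000000), relative_error(1000001, 1000000))  # 1 1e-06
print(absolute_error(2, 1), relative_error(2, 1))                          # 1 1.0
```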
We will say that an approximation x* has k significant decimal digits if its relative error is < 5 × 10^(−k−1). This means that the first k digits of x* following its first nonzero digit are the same as those of x.

Ultimately, numerical algorithms must be performed on a computer. In the old days, "computer" referred to a person or an analog machine, but nowadays it inevitably means a digital, electronic machine. A most unfortunate fact of life is that all digital computers, no matter how "super", can only store finitely many quantities. Thus, there is no way that a computer can represent the (discrete) infinity of all integers or all rational numbers, let alone the (continuous) infinity of all real or all complex numbers. So the decision of how to approximate more general numbers using only the finitely many that the computer can handle becomes of critical importance.

Each number in a computer is assigned a location or word, consisting of a specified number of binary digits or bits. A k bit word can store a total of N = 2^k different numbers. For example, the standard single precision computer uses 32 bit arithmetic, for a total of N = 2^32 ≈ 4.3 × 10^9 different numbers, while double precision uses 64 bits, with N = 2^64 ≈ 1.84 × 10^19 different numbers. The question is how to distribute the N exactly representable numbers over the real line for maximum efficiency and accuracy in practical computations.

One evident choice is to distribute them evenly, leading to fixed point arithmetic. For example, one can use the first bit in a word to represent a sign, and treat the remaining bits as integers, thereby representing the integers from 1 − N/2 = 1 − 2^(k−1) to N/2 = 2^(k−1) exactly. Of course, this does a poor job of approximating most non-integral numbers. Another option is to space the numbers closely together, say with a uniform gap of 2^(−n), and so distribute our N numbers uniformly over the interval −2^(−n−1) N < x ≤ 2^(−n−1) N. Real numbers lying between the gaps are represented by either rounding, meaning the closest exact representative, or chopping, meaning the exact representative immediately below (or above if negative) the number. Numbers lying beyond the range must be represented by the largest (or largest negative) representable number, which thus becomes a symbol for overflow. When processing speed is a significant bottleneck, the use of such fixed point representations is an attractive and faster alternative to the more cumbersome floating point arithmetic most commonly used in practice.

The most common non-uniform distribution of our N numbers is the floating point system, which is based on scientific notation. If x is any real number, it can be written in the form

    x = ±.d1 d2 d3 d4 ... × 2^n,

where each di = 0 or 1, and n ∈ Z is an integer. We call d1 d2 d3 d4 ... the mantissa and n the exponent. If x ≠ 0, then we can uniquely choose n so that d1 = 1. In our computer, we approximate x by a finite number of mantissa digits,

    x* = ±.d1 d2 d3 d4 ... d(k−1) d̂(k) × 2^n,

by either chopping or rounding the final digit d̂(k). The exponent is also restricted to a finite range of integers N_* ≤ n ≤ N^*. Very small numbers, lying in the gap 0 < |x| < 2^(N_*), are said to cause underflow.
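One can inspect this normalized binary form on a real machine. For instance, Python's standard library call math.frexp decomposes a float as x = m × 2^n with 0.5 ≤ |m| < 1, which is exactly the normalization d1 = 1 above:

```python
import math

# math.frexp writes x = m * 2**n with 0.5 <= |m| < 1 (for x != 0),
# i.e. the normalized form in which the leading mantissa bit d1 is 1.
for x in [0.1505, 301.0, -6.25]:
    m, n = math.frexp(x)
    print(f"{x} = {m} * 2**{n}")
# 0.1505 = 0.602 * 2**-2
# 301.0 = 0.587890625 * 2**9
# -6.25 = -0.78125 * 2**3
```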
• In single precision floating point arithmetic, the sign is 1 bit, the exponent is 7 bits, and the mantissa is 24 bits. The resulting nonzero numbers lie in the range

    2^(−127) ≈ 10^(−38) ≤ |x| ≤ 2^127 ≈ 10^38,

and allow one to represent real numbers in this range with approximately 7 significant decimal digits.

• In double precision floating point arithmetic, the sign is 1 bit, the exponent is 10 bits, and the mantissa is 53 bits, leading to floating point representations of a total of 1.84 × 10^19 different numbers which, apart from 0, lie in the range

    2^(−1023) ≈ 10^(−308) ≤ |x| ≤ 2^1023 ≈ 10^308.

In double precision, one can accurately represent approximately 15 decimal digits.

Keep in mind that floating point numbers are not uniformly spaced! Moreover, when passing from .111111... × 2^n to .100000... × 2^(n+1), the inter-number spacing suddenly jumps by a factor of 2. The non-smoothly varying spacing inherent in the floating point system can cause subtle, undesirable numerical artifacts in high precision computations.

Remark: Although they are overwhelmingly the most prevalent, fixed and floating point are not the only number systems that have been proposed. See [9] for another intriguing possibility.

In the course of a calculation, intermediate errors interact in a complex fashion, and result in a final total error that is not just the sum of the individual errors. If x* is an approximation to x, and y* is an approximation to y, then, instead of the exact arithmetic operations +, −, ×, / and so on, the computer uses "pseudo-operations" to combine them. For instance, to approximate the difference x − y of two real numbers, the computer begins by replacing each by its floating point approximation x* and y*. It then subtracts these, and replaces the exact result x* − y* by its nearest floating point representative, namely (x* − y*)*. As the following example makes clear, this is not necessarily the same as the floating point approximation to x − y, i.e., in general (x* − y*)* ≠ (x − y)*.

Example 1.2. Let's see what happens if we subtract the rational numbers

    x = 301/2000 ≈ .15050000...,    y = 301/2001 ≈ .150424787....

The exact answer is

    x − y = 301/4002000 ≈ .00007521239....

However, if we use rounding to approximate with 4 significant digits†, then

    x = 301/2000 ≈ .1505,    y = 301/2001 ≈ .1504,    and so    x − y ≈ .0001,

which has no significant digits in common with the actual answer. On the other hand, if we evaluate the difference using the alternative formula

    x − y = (301 × 2001 − 301 × 2000)/4002000 = (602301 − 602000)/4002000
          ≈ (6.023 × 10^5 − 6.020 × 10^5)/(4.002 × 10^6) = (.003 × 10^5)/(4.002 × 10^6) ≈ .00007496,

we at least reproduce the first two significant digits.

† To aid comprehension, we are using base 10 instead of base 2 arithmetic in this example.
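Example 1.2 is easy to reproduce with Python's decimal module, whose context precision counts significant digits; here is a sketch using round-half-up to mimic the 4-digit rounding above:

```python
from decimal import Decimal, Context, ROUND_HALF_UP

four = Context(prec=4, rounding=ROUND_HALF_UP)  # 4 significant decimal digits

x = four.divide(Decimal(301), Decimal(2000))    # 0.1505
y = four.divide(Decimal(301), Decimal(2001))    # 0.1504
print(four.subtract(x, y))                      # 0.0001
print(Decimal(301) / Decimal(4002000))          # 0.00007521239... (exact difference)
```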
Thus, the result of a floating point computation depends on the order of the arithmetic operations! In particular, the associative and distributive laws are not valid in floating point arithmetic! In the development of numerical analysis, one tends not to pay attention to this "minor detail", although its effects must always be kept in the back of one's mind when evaluating the results of any serious numerical computation. Just in case you are getting a little complacent, thinking "gee, a few tiny round-off errors can't really make that big a difference", let us close with two cautionary examples.

Example 1.3. Consider the pair of degree 10 polynomials

    p(x) = (x − 1)(x − 2)(x − 3)(x − 4)(x − 5)(x − 6)(x − 7)(x − 8)(x − 9)(x − 10)

and

    q(x) = p(x) + x^5.

They differ only in the value of the coefficient of their middle term, which, by a direct expansion, is −902055 x^5 in p(x) and −902054 x^5 in q(x); all other terms are the same. The relative error between the two coefficients is roughly one-thousandth of one percent.

Our task is to compute the roots of these two polynomials, i.e., the solutions to p(x) = 0 and q(x) = 0. Those of p(x) are obvious. One might expect the roots of q(x) to be rather close to the integers 1, 2, ..., 10. However, their approximate values are

    1.0000027558,  1.99921,  3.02591,  3.82275,
    5.24676 ± 0.751485 i,  7.57271 ± 1.11728 i,  9.75659 ± 0.368389 i.

Surprisingly, only the smallest two roots are relatively unchanged; the third and fourth roots differ by roughly 1% and 4%, while the final six roots have mysteriously metamorphosed into three complex conjugate pairs of roots of q(x). The two sets of roots are plotted in Figure 1.1; those of p(x) are indicated by solid disks, while those of q(x) are indicated by open circles.

[Figure 1.1. Roots of Polynomials.]

We have thus learned that an almost negligible change in a single coefficient of a real polynomial can have dramatic and unanticipated effects on its roots. Needless to say, this indicates that finding accurate numerical values for the roots of high degree polynomials is a very challenging problem.
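The root computation is easy to repeat with NumPy (a sketch; np.poly returns the coefficients of the monic polynomial with the prescribed roots, ordered from x^10 down to the constant term):

```python
import numpy as np

p = np.poly(np.arange(1, 11))   # coefficients of (x-1)(x-2)...(x-10)
q = p.copy()
q[5] += 1.0                     # x^5 coefficient: -902055 -> -902054

print(np.roots(p))              # approximately 10, 9, ..., 1
print(np.roots(q))              # three complex conjugate pairs appear
```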
Example 1.4. Consider the initial value problem

    du/dt − 2u = −e^(−t),    u(0) = 1/3.

The solution is easily found:

    u(t) = (1/3) e^(−t),

and is exponentially decaying as t → ∞.

However, in a floating point computer, we are not able to represent the initial condition 1/3 exactly, and so make some small round-off error (depending on the precision of the computer). Let ε ≠ 0 represent the error in the initial condition. The solution to the perturbed initial value problem

    dv/dt − 2v = −e^(−t),    v(0) = 1/3 + ε,

is

    v(t) = (1/3) e^(−t) + ε e^(2t),

which is exponentially growing as t increases. As sketched in Figure 1.2, the two solutions remain close only for a very short time interval, the duration of which depends on the size of the initial error, but then they eventually diverge far away from each other. As a consequence, a tiny error in the initial data can have a dramatic effect on the solution. This phenomenon is referred to as sensitive dependence on initial conditions.

[Figure 1.2. Sensitive Dependence on Initial Data.]

The numerical computation of the exponentially decaying exact solution in the face of round-off errors is an extreme challenge. Even the tiniest of errors will immediately introduce an exponentially growing mode which will eventually swamp the true solution. Furthermore, in many important applications, the physically relevant solution is the one that decays to zero at large distances, and is usually distinguished from the vast majority of solutions that grow at large distances. So the computational problem is both important and very difficult.
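The effect is easy to see numerically without even solving the differential equation on a computer: just evaluate the two closed-form solutions with ε equal to a single double precision round-off. A minimal sketch (the choice ε = 2^(−53) is ours):

```python
import numpy as np

u = lambda t: np.exp(-t) / 3                       # exact solution, u(0) = 1/3
eps = 2.0**-53                                     # one round-off error in 1/3
v = lambda t: np.exp(-t) / 3 + eps * np.exp(2*t)   # perturbed solution

for t in [0.0, 5.0, 10.0, 20.0, 30.0]:
    print(t, u(t), v(t))
# By t = 30, eps * e^{60} is about 1e10: the growing mode has long since
# swamped the decaying exact solution, which is itself about 3e-14.
```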
Examples 1.3 and 1.4 are known as ill-conditioned problems, meaning that tiny changes in the data have dramatic effects on the solutions. Simple numerical methods work as advertised on well-conditioned problems, but all have their limitations, and a sufficiently ill-conditioned problem will test the limits of the algorithm and/or computer, and, in many instances, require revamping a standard numerical solution scheme to adapt to the ill-conditioning. Some problems are so ill-conditioned as to defy even the most powerful computers and sophisticated algorithms. For this reason, numerical analysis will forever remain a vital and vibrant area of mathematical research.

So, numerical analysis cannot be viewed in isolation as a black box, and left in the hands of the mathematical experts. Every practitioner will, sooner or later, confront a problem that tests the limitations of standard algorithms and software. Without a proper understanding of the mathematical principles involved in constructing basic numerical algorithms, one is ill-equipped, if not hopelessly handicapped, when dealing with such problems. The purpose of this series of lectures is to give you the proper mathematical grounding in modern numerical analysis.

AIMS Lecture Notes 2006
Peter J. Olver

2. Numerical Solution of Scalar Equations

Most numerical solution methods are based on some form of iteration. The basic idea is that repeated application of the algorithm will produce closer and closer approximations to the desired solution. To analyze such algorithms, our first task is to understand general iterative processes.

2.1. Iteration of Functions.

Iteration, meaning repeated application of a function, can be viewed as a discrete dynamical system in which the continuous time variable has been "quantized" to assume integer values. Even iterating a very simple quadratic scalar function can lead to an amazing variety of dynamical phenomena, including multiply-periodic solutions and genuine chaos. Nonlinear iterative systems arise not just in mathematics, but also underlie the growth and decay of biological populations, predator-prey interactions, the spread of communicable diseases such as AIDS, and a host of other natural phenomena. Moreover, many numerical solution methods (for systems of algebraic equations, ordinary differential equations, partial differential equations, and so on) rely on iteration, and so the theory underlies the analysis of convergence and efficiency of such numerical approximation schemes.

In general, an iterative system has the form

    u^(k+1) = g(u^(k)),    (2.1)

where g: R^n → R^n is a real vector-valued function. (One can similarly treat iteration of complex-valued functions g: C^n → C^n, but, for simplicity, we only deal with real systems here.) A solution is a discrete collection of points u^(k) ∈ R^n, in which the index k = 0, 1, 2, 3, ... takes on non-negative integer values. (The superscripts on u^(k) refer to the iteration number, and do not denote derivatives.)

Once we specify the initial iterate,

    u^(0) = c,    (2.2)

then the resulting solution to the discrete dynamical system (2.1) is easily computed:

    u^(1) = g(u^(0)) = g(c),    u^(2) = g(u^(1)) = g(g(c)),    u^(3) = g(u^(2)) = g(g(g(c))),

and so on. Thus, unlike continuous dynamical systems, the existence and uniqueness of solutions is not an issue. As long as each successive iterate u^(k) lies in the domain of definition of g, one merely repeats the process to produce the solution

    u^(k) = (g ∘ ··· ∘ g)(c)  (k times),    k = 0, 1, 2, ...,    (2.3)

which is obtained by composing the function g with itself a total of k times. In other words, the solution to a discrete dynamical system corresponds to repeatedly pushing the g key on your calculator. For example, entering 0 and then repeatedly hitting the cos key corresponds to solving the iterative system

    u^(k+1) = cos u^(k),    u^(0) = 0.    (2.4)

The first 10 iterates are displayed in the following table:

    k      0   1   2        3        4       5       6        7       8        9
    u^(k)  0   1   .540302  .857553  .65429  .79348  .701369  .76396  .722102  .750418

For simplicity, we shall always assume that the vector-valued function g: R^n → R^n is defined on all of R^n; otherwise, we must always be careful that the successive iterates u^(k) never leave its domain of definition, thereby causing the iteration to break down. To avoid technical complications, we will also assume that g is at least continuous; later results rely on additional smoothness requirements, e.g., continuity of its first and second order partial derivatives.

While the solution to a discrete dynamical system is essentially trivial, understanding its behavior is definitely not. Sometimes the solution converges to a particular value, the key requirement for numerical solution methods. Sometimes it goes off to ∞, or, more precisely, the norms of the iterates are unbounded: ‖u^(k)‖ → ∞ as k → ∞. Sometimes the solution repeats itself after a while. And sometimes the iterates behave in a seemingly random, chaotic manner, all depending on the function g and, at times, the initial condition c. Although all of these cases may arise in real-world applications, we shall mostly concentrate upon understanding convergence.

Definition 2.1. A fixed point or equilibrium of a discrete dynamical system (2.1) is a vector u* ∈ R^n such that

    g(u*) = u*.    (2.5)

We easily see that every fixed point provides a constant solution to the discrete dynamical system, namely u^(k) = u* for all k. Moreover, it is not hard to prove that any convergent solution necessarily converges to a fixed point.

Proposition 2.2. If a solution to a discrete dynamical system converges,

    lim_{k→∞} u^(k) = u*,

then the limit u* is a fixed point.

Proof: This is a simple consequence of the continuity of g. We have

    u* = lim_{k→∞} u^(k+1) = lim_{k→∞} g(u^(k)) = g( lim_{k→∞} u^(k) ) = g(u*),

the last two equalities following from the continuity of g. Q.E.D.

For example, continuing the cosine iteration (2.4), we find that the iterates gradually converge to the value u* ≈ .739085, which is the unique solution to the fixed point equation

    cos u = u.

Later we will see how to rigorously prove this observed behavior. Of course, not every solution to a discrete dynamical system will necessarily converge, but Proposition 2.2 says that if it does, then it must converge to a fixed point. Thus, a key goal is to understand when a solution converges, and, if so, to which fixed point, if there is more than one.
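The convergence of the cosine iteration is easily checked on a machine; in this minimal sketch, 100 iterations (an arbitrary but ample cutoff) already reach the fixed point to full double precision:

```python
import math

u = 0.0                   # initial iterate u(0) = 0, as in (2.4)
for k in range(100):
    u = math.cos(u)       # u(k+1) = cos u(k)
print(u)                  # 0.739085..., the solution of cos u = u
print(math.cos(u) - u)    # ~0: u is (numerically) a fixed point
```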
Fixed points are divided into three classes:
• asymptotically stable, with the property that all nearby solutions converge to it,
• stable, with the property that all nearby solutions stay nearby, and
• unstable, almost all of whose nearby solutions diverge away from the fixed point.
Thus, from a practical standpoint, convergence of the iterates of a discrete dynamical system requires asymptotic stability of the fixed point. Examples will appear in abundance in the following sections.

Scalar Functions

As always, the first step is to thoroughly understand the scalar case, and so we begin with a discrete dynamical system

    u^(k+1) = g(u^(k)),    u^(0) = c,    (2.6)

in which g: R → R is a continuous, scalar-valued function. As noted above, we will assume, for simplicity, that g is defined everywhere, and so we do not need to worry about whether the iterates u^(0), u^(1), u^(2), ... are all well-defined.

As usual, to study systems one begins with an in-depth analysis of the scalar version. Consider the iterative equation

    u^(k+1) = a u^(k),    u^(0) = c.    (2.7)

The general solution to (2.7) is easily found:

    u^(1) = a u^(0) = a c,    u^(2) = a u^(1) = a^2 c,    u^(3) = a u^(2) = a^3 c,

and, in general,

    u^(k) = a^k c.    (2.8)

If the initial condition is c = 0, then the solution u^(k) ≡ 0 is constant. Therefore, 0 is a fixed point or equilibrium solution for the iterative system.
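A quick numerical check of formula (2.8) (the sample values of a are our choice) already hints at the dichotomy studied next: the iterates decay to the fixed point 0 when |a| < 1 and blow up when |a| > 1.

```python
c = 1.0                        # initial condition u(0) = c
for a in [0.5, -0.5, 1.0, 2.0]:
    u = c
    for _ in range(20):
        u = a * u              # u(k+1) = a u(k)
    print(a, u)                # u(20) = a**20 * c
```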
