
TO MY WIFE, who believes more in me than in mathematics

MATRIX CALCULUS

BY E. BODEWIG

SECOND REVISED AND ENLARGED EDITION

1959

NORTH-HOLLAND PUBLISHING COMPANY, AMSTERDAM

No part of this book may be reproduced in any form by print, microfilm or any other means without written permission from the publisher. Sole agency for USA: Interscience Publishers, Inc., New York. Printed in the Netherlands.

PREFACE

The aim of this book is a systematic calculation with the true building blocks of a matrix, the rows and columns, avoiding the use of the individual elements. Historically the notation of the theory of matrices developed from that of the theory of determinants, although the two subjects have little in common. This little was, however, enough to cause the complete notation of the highly developed theory of determinants, a notation which had proved very efficient and convenient for its own purposes, to be taken over by the new theory of matrices.

Thus the matrix, forced into the Procrustean bed of determinants, was broken down into its individual elements, although the elements themselves seldom play an independent part; indeed, the elements nearly always occur in fixed aggregates in which the individual elements themselves are without significance. The result of this operation with symbols foreign to the subject was often an awkward and unintelligible formula, containing, for example, several summation signs. From such a formula the method of calculation had to be derived more or less by a process of mentally grouping the various elements into rows and columns, a process which should already have been carried out by the notation itself. Thus there arose a discrepancy between thought and calculation, and lack of elegance was a sign of it.

It is, in fact, not a matter of indifference whether formulas are overloaded with auxiliary symbols. The beauty and the success of the theory of determinants is chiefly the result of a convenient notation, and the same holds for other theories, e.g. for the infinitesimal calculus. Symbols in science are the means by which we express our thoughts and describe the structure of the subject, and they are just as important as language is for culture. When they are taken from foreign fields, they will produce a discrepancy between thinking and speaking which will of itself cause disorder. A method of calculating is not, therefore, in itself a "calculus"; for this it must operate only with the proper elements. The form must be the mirror of the content. Since, for example, coordinates change under a projective transformation, it is inappropriate to use coordinates of any kind in projective geometry and thus overload the formulas with symbols that are later transformed and therefore have no significance. Only points as points and lines as lines are invariant, and only thus should they enter a calculation. Neglecting this, we are led to a poorly constructed theory which is neither aesthetically satisfying nor practical.

The situation is somewhat different when a theory is already dominated by an existing symbolism, even though this symbolism is less convenient. Then the gain produced by the new calculus in eliminating useless work is diminished by the work involved in learning the new language. Fortunately, however, the amount of work is small for our calculus, since nothing has to be abandoned and only a few lines have to be added, in virtue of which the elements disappear and are subsumed in certain "linear" aggregates: rows or columns or parallels.
Even in the exceptional cases where a particular element is needed, it is better represented in the new symbolism than in the old. On the other hand, the new symbolism considerably simplifies formulas and calculations, and only with its use does the discrepancy that we mentioned disappear and calculation with matrices become a "calculus".

Take, for instance, the simple operation of displacing row i of A so that it becomes row k. The old notation makes no provision for expressing this, while we write e_k A_i· or E_ki A.

Or take column k of the inverse of AB. In the usual notation it reads ((AB)⁻¹)_ik, i = 1, ...., n, which is useless for further calculation. For example, suppose it is to be premultiplied by B. We should have to write B((AB)⁻¹)_ik, i = 1, ...., n, a result that is completely unclear and can be simplified only by thinking, not by calculating. In our calculus, however, the former expression is written simply as (AB)⁻¹e_k or B⁻¹A⁻¹e_k, and the latter as BB⁻¹A⁻¹e_k = A⁻¹e_k, that is, column k of A⁻¹, a result which is produced automatically by the calculus itself, thus leaving thought free for other purposes.

Some theorems become so simple by the calculus that it is not worthwhile nor necessary to prove them. For instance, rank(AB) ≤ min(rank A, rank B). Or, the associative law of multiplication follows from the fact that the product of matrices is a sum of terms

E_iw E_pq ⋯ E_sr = εE_ir,

where ε = 1 or 0 according as the adjacent indices in every two subsequent E's are equal or not. But this does not depend on the manner of bracketing.

A final example: in an important paper I find, for the change in the solution vector x of the system Ax = v when A and v vary slightly,

dx_j = − Σ_h Σ_k x_k A_hj da_hk + Σ_s A_sj dv_s,

where A_hk is the element in the inverse of A corresponding to a_hk. It will require further thought to arrange the indices and summations in an order convenient for calculation and further operations, while the increments d²x_j etc., of higher order will hardly be obtainable at all, or at most by means of intensive algebra involving so many summations that it will be impossible to calculate either theoretically or numerically with them. On the other hand, the same result is given by our formula

dx = R(dv − dA·x), R = A⁻¹,

which is clear and indicates the method of computation automatically. Even the higher differentials follow immediately and comprehensibly:

d²x = −2(R dA)dx, d³x = −3(R dA)d²x, ....

The other differences in notation, e.g. D + L and D + U for the lower and upper triangular matrices respectively, are smaller, but here again computation and operation will be simpler and clearer.

The various chapters of the book are not equally difficult, the interests of an engineer, or, generally, of a reader who does not wish to read the book page by page, having been taken into account. For the same reason, and in order to avoid frequent consulting of former pages, slight repetitions have been made.

E. B.

Preface to Second Edition

Among the new results which have been added in the second edition are Lanczos's pq-algorithm, Rutishauser's LR-algorithm and Wilkinson's method. Hitherto unpublished investigations are described: the deflation of complex eigenvalues, the application to differential equations, the inversion of the general geodetic chain and its matricial generalization, and a new section devoted to orthogonal matrices.

E. B.

Due to special circumstances the author corrected the galley-proofs of this second edition only. The page-proofs were corrected by another mathematician.

THE PUBLISHERS
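The book of course predates matrix software, but the preface's formulas are easy to check numerically. The following NumPy sketch is an editorial addition, not part of Bodewig's text: the names (e_k, E_ki, R) follow the preface, while the matrices and increments are arbitrary random data.

```python
# Numerical sketch (an editorial addition, not Bodewig's) checking the
# preface's identities with NumPy on random data.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
v = rng.standard_normal(n)

def e(k, n):
    """Unity vector e_k (0-based index k) of order n."""
    vec = np.zeros(n)
    vec[k] = 1.0
    return vec

i, k = 1, 3

# Displacing row i of A so that it becomes row k: E_ki A = e_k (e_i' A).
E_ki = np.outer(e(k, n), e(i, n))
assert np.allclose((E_ki @ A)[k], A[i])

# Column k of (AB)^-1 is B^-1 A^-1 e_k, and B (AB)^-1 e_k = A^-1 e_k.
assert np.allclose(np.linalg.inv(A @ B)[:, k],
                   np.linalg.solve(B, np.linalg.solve(A, e(k, n))))
assert np.allclose(B @ np.linalg.inv(A @ B) @ e(k, n),
                   np.linalg.solve(A, e(k, n)))

# First-order change of the solution of Ax = v: dx = R(dv - dA.x).
x = np.linalg.solve(A, v)
t = 1e-7                               # size of the perturbation
dA = t * rng.standard_normal((n, n))
dv = t * rng.standard_normal(n)
x_new = np.linalg.solve(A + dA, v + dv)
dx = np.linalg.solve(A, dv - dA @ x)   # R(dv - dA.x) without forming R
assert np.allclose(x_new - x, dx, atol=1e-10)
```

The last assertion holds to first order only; the residual x_new − x − dx is of second order in t, which is exactly what the higher differentials d²x, d³x of the preface describe.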
CHAPTER I

VECTORS

A vector v of order n is an ordered set of n numbers (so-called components) v_1, v_2, ...., v_n, whose succession is significant. All v_i's are here assumed to be real, in general. The numbers may be arranged vertically, i.e. in the form of a so-called column, in which case they are called a column vector or briefly a column and are denoted by v. If the numbers are arranged horizontally, i.e. in the form of a row, they are called a row vector or briefly a row and are denoted by v'. For example, v' = [3, 4, 1, 6], w' = [0, 3, −1, 2].

The product v'w of a row v' and a column w (of the same order n) is defined as

(1.1) v'w = w'v = v_1w_1 + v_2w_2 + ⋯ + v_nw_n.

It is called the scalar product of the two. It is a number. In the above example we have:

v'w = w'v = 3·0 + 4·3 − 1·1 + 6·2 = 23.

The scalar product of a vector v with itself, that is v'v, is the square of its length:

v'v = v_1² + ⋯ + v_n².

By dividing v by its length √(v'v), a vector of length 1 is obtained: length(v/√(v'v)) = 1.

The inequality of Schwarz is, for complex vectors v, w:

(1.2) |v̄'w|² ≤ (v̄'v)(w̄'w),

where the equality holds if, and only if, either v is proportional to w or w = 0. The proof is obvious in the case of real vectors. For complex vectors, E. HELLINGER has given the following proof (Lit. 122). Decompose

(a) v = cw + u,

where the vector u and the scalar c are yet unknown. To determine them we form v̄'w = c(w̄'w) + ū'w, where w̄'w, being a sum of squares, is ≠ 0 if w ≠ 0. Then we choose c from

(b) v̄'w = c(w̄'w).

Then u follows from (a), and further ū'w = 0. That is, by (a),

v̄'v = c̄c(w̄'w) + c̄(w̄'u) + c(ū'w) + ū'u = c̄c(w̄'w) + ū'u.

Thus

(c) v̄'v ≥ c̄c(w̄'w),

where the equality only holds if u = 0, that is if, by (a),

(d) v = cw.

Now, by (b, c): |v̄'w|² = |c|²(w̄'w)² ≤ (v̄'v)(w̄'w). Q.e.d.

Another proof is AITKEN'S (Lit. 103, p. 39): Take the real vector x = λr + s, where r_i = |v_i|, s_i = |w_i|, λ = a number. Then

x'x = (λr' + s')(λr + s) = λ²(r'r) + 2λ(r's) + (s's).

But r'r = v̄'v, r's ≥ |v̄'w|, s's = w̄'w. And as x'x ≥ 0, the discriminant of the quadratic function in λ cannot be positive, so that (1.2) follows.

Since the quotient of the left side of (1.2) by the right is at most 1, its square root can be interpreted as a cosine, viz. the cosine of the angle between v and w (in two- or three-dimensional space this is really the cosine of the angle):

(1.3) cos(v, w) = v'w/(√(v'v)·√(w'w)).

Therefore:

v'w = 0 means: v and w are orthogonal;
v'w/(√(v'v)·√(w'w)) = 1 means: v and w are parallel.

In the latter case there follows from (1.2):

Σ_i Σ_k (v_i w_k − v_k w_i)² = 0,

that is, v_i w_k = v_k w_i for all i, k, or v = λw, where λ = a number.

THEOREM. v and w are parallel if, and only if, v = λw.

The component of v in the direction of w is a = v'w/(w'w), so that v − aw is orthogonal to w.

The null vector is the vector whose components are all zero. It is denoted by 0 or 0', according as it is a column or a row.

Unity Vectors. Vectors all of whose components are zero, with the exception of a single one which is 1, are also important. They are called unity vectors (of order n) and are denoted by e_1, e_2, ..., e_n if they are columns, and by e_1', ...., e_n' if they are rows. Thus we have:

e_1' = (1, 0, 0, ...., 0), e_2' = (0, 1, 0, ...., 0), ...., e_n' = (0, 0, 0, ..., 1).

For their scalar products the following relationships hold:

(1.4) e_i'e_i = 1, e_i'e_k = 0 for i ≠ k.
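These definitions are easy to exercise numerically. Here is a small NumPy sketch, an editorial addition in modern notation, checking (1.1) through (1.4) on the chapter's example vectors.

```python
# Minimal NumPy sketch (an editorial addition, not from the book)
# verifying (1.1)-(1.4) with the chapter's example vectors.
import numpy as np

v = np.array([3.0, 4.0, 1.0, 6.0])
w = np.array([0.0, 3.0, -1.0, 2.0])

# (1.1) scalar product: v'w = w'v = 23 for the example above.
assert v @ w == w @ v == 23.0

# v'v is the square of the length; v/sqrt(v'v) has length 1.
unit = v / np.sqrt(v @ v)
assert np.isclose(unit @ unit, 1.0)

# (1.2) Schwarz (real case): |v'w|^2 <= (v'v)(w'w).
assert (v @ w) ** 2 <= (v @ v) * (w @ w)

# (1.3) cosine of the angle between v and w.
cos_vw = (v @ w) / (np.sqrt(v @ v) * np.sqrt(w @ w))
assert abs(cos_vw) <= 1.0

# Component of v in the direction of w: a = v'w/(w'w);
# v - a*w is orthogonal to w.
a = (v @ w) / (w @ w)
assert np.isclose((v - a * w) @ w, 0.0)

# (1.4) unity vectors: e_i'e_i = 1, e_i'e_k = 0 for i != k.
E = np.eye(4)   # columns are e_1, ..., e_4
assert np.allclose(E.T @ E, np.eye(4))
```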
Every vector may be decomposed into a linear combination of unity vectors:

v = v_1e_1 + v_2e_2 + ⋯ + v_ne_n and v' = v_1e_1' + v_2e_2' + ⋯ + v_ne_n'.

1.1. EQUATION OF A PLANE

It is well known that the equation of the plane a through the origin is

a_1x_1 + a_2x_2 + ⋯ + a_nx_n = 0, or in vector form: a'x = 0.

That is, the plane is formed by the end points of all vectors which are orthogonal to the vector a. Thus the plane is represented vectorially by the vector a which is orthogonal to it.

From a point p (that is, the end point of the vector p) we drop the perpendicular onto the plane a. Being parallel to the vector a, this perpendicular has the form λa. Affixed at the point (vector) p, its end point y = p + λa must lie in the plane a, that is,

0 = a'y = a'p + λ(a'a),

so that λ = −(a'p)/(a'a). Thus the perpendicular is λa and its length is (a'p)√(a'a)/(a'a) = (a'p)/√(a'a).

THEOREM. Let the equation a'x = 0 of the plane a be "normalized" by dividing it by √(a'a). Then the value of the function on the left side for x = p will give the distance of the point p from the plane. It may be positive or negative.

The planes a'x = 0 and a'x = b (b ≠ 0) are parallel, since their equations are contradictory and therefore the planes have no point in common. The constant b apparently bears some relation to the distance of the second plane from the origin; but this distance is the same as that of a point y of the second plane from the first plane, that is, equal to a'y/√(a'a) = b/√(a'a). By again normalizing the equation a'x = b, that is, so that a'a = 1, −b is the distance of the origin from the plane, and a'x − b the distance of the point x from the plane; x lies on the same side of the plane as the origin if the product of its distance and b is negative.
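In modern terms the foot of the perpendicular is a one-line projection. The following NumPy sketch, an editorial addition with arbitrary example data, computes λ = −(a'p)/(a'a) and checks the signed-distance theorem.

```python
# Minimal NumPy sketch (an editorial addition) of the point-to-plane
# construction: foot of the perpendicular from p onto the plane a'x = 0,
# and the signed distance from the normalized equation.
import numpy as np

a = np.array([1.0, 2.0, -2.0])   # plane a'x = 0 through the origin
p = np.array([3.0, 0.0, 4.0])    # an arbitrary point

lam = -(a @ p) / (a @ a)         # lambda = -(a'p)/(a'a)
y = p + lam * a                  # end point of the perpendicular
assert np.isclose(a @ y, 0.0)    # y lies in the plane

# Signed distance of p: value of the normalized left side at x = p.
dist = (a @ p) / np.sqrt(a @ a)
assert np.isclose(abs(dist), np.linalg.norm(p - y))
```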
