Oxford Prelims Linear Algebra I, Michaelmas Term 2014

54 Pages·2014·0.369 MB·English

Prelims Linear Algebra I, Michaelmas Term 2014

1 Systems of linear equations and matrices

Let $m, n$ be positive integers. An $m \times n$ matrix is a rectangular array of $nm$ numbers, arranged in $m$ rows and $n$ columns. For example
\[ \begin{pmatrix} 1 & -5 & 0 \\ 3 & 0 & 2 \end{pmatrix} \]
is a $2 \times 3$ matrix. We allow the possibility of having just one column, or just one row, such as
\[ \begin{pmatrix} 3/2 \\ 9 \\ 17 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} -0.5 & 19 & 25 \end{pmatrix}. \]

In general we write an $m \times n$ matrix $X$ as
\[ X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ x_{31} & x_{32} & \cdots & x_{3n} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} \]
where $x_{ij}$ is the $(i,j)$-th entry of $X$ and appears in the $i$-th row and in the $j$-th column. We often abbreviate this and write $X = [x_{ij}]_{i=1,j=1}^{m,n}$ or $[x_{ij}]_{m \times n}$ or just $[x_{ij}]$ if it's clear (or not important) what $m$ and $n$ are.

If $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{p \times q}$ then $A$ and $B$ are equal (and we write $A = B$) if and only if
(1) $m = p$ and $n = q$; and
(2) $a_{ij} = b_{ij}$ for all $i \in \{1, \dots, m\}$ and $j \in \{1, \dots, n\}$.

1.1 Addition and scalar multiplication of matrices

Definition 1.1 Suppose $A$ and $B$ are $m \times n$ matrices whose entries are real numbers. Define the sum $A + B$ to be the $m \times n$ matrix whose $(i,j)$-th entry is $a_{ij} + b_{ij}$.

Note that $A$ and $B$ must have the same size, and then $A + B$ also has the same size with entries given by adding the corresponding entries of $A$ and $B$. For example
\[ \begin{pmatrix} 1 & -4 \\ 3 & 0 \end{pmatrix} + \begin{pmatrix} -1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 0 & -2 \\ 6 & 4 \end{pmatrix}. \]
If we write $A + B$ then we assume implicitly that $A$ and $B$ have the same size.

Definition 1.2 The $m \times n$ matrix with all entries equal to zero is called the zero matrix, and written as $0_{m \times n}$ or just as $0$.

Remark 1.3 We can define $A + B$ in exactly the same way if the entries of $A$ and $B$ are complex numbers, or indeed if the entries of $A$ and $B$ belong to any 'field' $F$ (see Remark 2.2 later).

Theorem 1.4
(1) Addition of matrices is commutative; that is, $A + B = B + A$ for all $m \times n$ matrices $A$ and $B$.
(2) Addition of matrices is associative; that is, $A + (B + C) = (A + B) + C$ for all $m \times n$ matrices $A$, $B$ and $C$.
(3) We have $A + 0 = A = 0 + A$ for every matrix $A$.
(4) For every $m \times n$ matrix $A$ there is a unique $m \times n$ matrix $B$ with $A + B = 0$.

Proof
(1) If $A$ and $B$ are of size $m \times n$ then $A + B$ and $B + A$ are also of size $m \times n$. The $ij$ entry of $A + B$ is $a_{ij} + b_{ij}$. The $ij$ entry of $B + A$ is $b_{ij} + a_{ij}$. Now, $a_{ij}$ and $b_{ij}$ are real numbers and addition of real numbers is commutative, so $a_{ij} + b_{ij} = b_{ij} + a_{ij}$.
(2) is left as an exercise.
(3) We have
\[ A + 0 = [a_{ij} + 0_{ij}] = [a_{ij} + 0] = [a_{ij}] = A \]
and $A + 0 = 0 + A$ by (1).
(4) Given $A = [a_{ij}]$, take $B$ to be the matrix $[-a_{ij}]$ whose $(i,j)$-th entry is $-a_{ij}$. Then
\[ A + B = [a_{ij} + (-a_{ij})]_{m \times n} = 0_{m \times n}. \]
If also $A + C = 0$ then for all $i, j$ we have $a_{ij} + c_{ij} = 0$ and hence $c_{ij} = -a_{ij}$, so $C = B$. □

The matrix $B$ in Theorem 1.4 (4) is called the additive inverse of $A$, and we write $B = -A$. Given matrices $A, C$ of the same size we also write $A - C$ as shorthand for $A + (-C)$.

Exercise 1.5 Prove that if $A$ and $B$ have the same size then $-(A + B) = -A - B$.

Definition 1.6 Given a matrix $A$ and a number $\lambda$, define the product of $A$ by $\lambda$ to be the matrix, denoted by $\lambda A$, obtained from $A$ by multiplying every entry of $A$ by $\lambda$. That is, if $A = [a_{ij}]_{m \times n}$ then $\lambda A = [\lambda a_{ij}]_{m \times n}$.

This is traditionally called 'scalar multiplication', where the word scalar means 'number' (real or complex, later others). The main properties are:

Theorem 1.7 Suppose $A$ and $B$ are $m \times n$ matrices. Then for any scalars $\lambda, \mu$
(1) $\lambda(A + B) = \lambda A + \lambda B$;
(2) $(\lambda + \mu)A = \lambda A + \mu A$;
(3) $(\lambda\mu)A = \lambda(\mu A)$;
(4) $1 \cdot A = A$.

The proof is to be completed.

Exercise 1.8 Prove that for an $m \times n$ matrix $A$, taking scalars $-1$ and $0$ gives: (a) $(-1) \cdot A = -A$ and (b) $0 \cdot A = 0_{m \times n}$.

1.2 Matrix multiplication

Definition 1.9 Let $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{n \times p}$ (note the sizes!). Then one defines the product $AB$ to be the $m \times p$ matrix whose $(i,j)$-th entry is
\[ [AB]_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \dots + a_{in}b_{nj}. \]
That is, we multiply the elements in the $i$-th row of $A$ with the elements of the $j$-th column of $B$ and take the sum. This is usually abbreviated as
\[ [AB]_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}. \]

For example, if
\[ A = \begin{pmatrix} 3 & 0 \\ -1 & 2 \\ 1 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 4 & -1 \\ 0 & 2 \end{pmatrix}, \]
then
\[ AB = \begin{pmatrix} 3 \times 4 + 0 \times 0 & 3 \times (-1) + 0 \times 2 \\ (-1) \times 4 + 2 \times 0 & (-1) \times (-1) + 2 \times 2 \\ 1 \times 4 + 1 \times 0 & 1 \times (-1) + 1 \times 2 \end{pmatrix} = \begin{pmatrix} 12 & -3 \\ -4 & 5 \\ 4 & 1 \end{pmatrix}. \]

A matrix is square if it is of size $n \times n$ for some $n$; that is, it has the same number of rows and columns. Consider the $n \times n$ matrix
\[ I_n = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}, \quad \text{whose $(i,j)$-th entry is} \quad [I_n]_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \]
The matrix $I_n$ is called the 'identity matrix' of size $n \times n$. This is an example of a diagonal matrix: a square matrix $D = [d_{ij}]_{n \times n}$ is said to be diagonal if $d_{ij} = 0$ whenever $i \neq j$. That is, all its entries are zero off the leading diagonal.

Theorem 1.10 Assume $A, B, C$ are matrices, and $\lambda$ is a scalar. Whenever the sums and products are defined,
(1) $A(BC) = (AB)C$;
(2) $A(B + C) = AB + AC$, $(B + C)A = BA + CA$;
(3) $\lambda(AB) = (\lambda A)B = A(\lambda B)$;
(4) $AI_n = A = I_nA$.

Proof (1) For $A(BC)$ to be defined we need the sizes to be $m \times n$, $n \times p$ and $p \times q$. These are precisely the conditions one needs for $(AB)C$ to be defined. Assume these; then we calculate the $(i,j)$-th entry of $A(BC)$ to be
\[ [A(BC)]_{ij} = \sum_{k=1}^{n} a_{ik}[BC]_{kj} = \sum_{k=1}^{n} a_{ik}\Big[\sum_{t=1}^{p} b_{kt}c_{tj}\Big] = \sum_{k=1}^{n}\sum_{t=1}^{p} a_{ik}b_{kt}c_{tj}. \]
We calculate the $(i,j)$-th entry of $(AB)C$ to be
\[ [(AB)C]_{ij} = \sum_{t=1}^{p} [AB]_{it}c_{tj} = \sum_{t=1}^{p}\Big(\sum_{k=1}^{n} a_{ik}b_{kt}\Big)c_{tj} = \sum_{t=1}^{p}\sum_{k=1}^{n} a_{ik}b_{kt}c_{tj}. \]
These are the same, for arbitrary $i, j$, so $(AB)C = A(BC)$. The other parts are to be completed. □

Property (1) in this theorem is called associativity for matrix multiplication, and property (2) is known as the distributive law for matrices.

Because of the associativity of matrix multiplication we usually write just $ABC$ instead of $A(BC)$ or $(AB)C$. In addition, for every positive integer $n$ we write $A^n$ for the product $AA \cdots A$ (with $n$ terms).

Example 1.11 The matrices
\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \]
satisfy $AB = 0$ and $BA = A$.
This shows that in general matrix multiplication is not commutative. We also see that it is possible to have matrices $A$ and $B$ with $AB = 0$ but $A$ and $B$ both non-zero.

Definition 1.12 Suppose $A$ is an $n \times n$ matrix. Then $A$ is invertible if there is some $n \times n$ matrix $X$ such that $AX = I_n$ and $XA = I_n$.

If $A$ is invertible then this matrix $X$ is unique. To prove this, suppose that $X'$ is another matrix such that $AX' = I_n = X'A$. Then we must show that $X = X'$. But
\[ X = XI_n = X(AX') = (XA)X' = I_nX' = X' \]
as required. Therefore it makes sense to call $X$ the inverse of $A$ and write $X = A^{-1}$.

Lemma 1.13 Suppose that $A$ and $B$ are invertible $n \times n$ matrices. Then $AB$ is invertible, with inverse $B^{-1}A^{-1}$.

Proof We have
\[ (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AI_nA^{-1} = AA^{-1} = I_n \]
and similarly one calculates $(B^{-1}A^{-1})(AB) = I_n$. □

This lemma says that if $A$ and $B$ are invertible of the same size then $AB$ is invertible and $(AB)^{-1} = B^{-1}A^{-1}$.

1.3 Systems of linear equations

One important application of matrix technology is to solve linear equations. In this section we will start on this, and later we will review it when we can use more advanced technology.

Definition 1.14 A system of linear equations in $n$ variables is a list of linear equations
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m \end{aligned} \]
where the $a_{ij}$ and the $b_i$ are numbers (in $\mathbb{R}$ or $\mathbb{C}$, or any other field $F$).

We write this in matrix form, as $Ax = b$ where $A$ is the $m \times n$ matrix $A = [a_{ij}]_{m \times n}$, and where $x$ and $b$ are column vectors; that is
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}. \]
The system of linear equations is called homogeneous if $b = 0$.

Example 1.15 Consider the system of linear equations
\[ \begin{aligned} 2x_2 + 2x_3 - 4x_4 &= 2 && (1) \\ x_1 + 2x_2 + 3x_3 &= 5 && (2) \\ 5x_1 + 8x_2 + 13x_3 + 4x_4 &= 23 && (3) \end{aligned} \]
This becomes $Ax = b$ where
\[ A = \begin{pmatrix} 0 & 2 & 2 & -4 \\ 1 & 2 & 3 & 0 \\ 5 & 8 & 13 & 4 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}, \quad b = \begin{pmatrix} 2 \\ 5 \\ 23 \end{pmatrix}. \]

To solve the equations, one might start by interchanging equations (1) and (2), to get the $x_1$ to the top left place. Then one can eliminate $x_1$ from equation (3), via
\[ 5x_1 + 8x_2 + 13x_3 + 4x_4 - 5(x_1 + 2x_2 + 3x_3) = 23 - 25, \]
which gives
\[ -2x_2 - 2x_3 + 4x_4 = -2. \]
Then we have two equations with fewer variables, and we can repeat the process.

We translate this process into matrix notation. Write down the matrix $B$ obtained by concatenating the matrix $A$ and the vector $b$; this gives
\[ B = [A \mid b] = \left(\begin{array}{cccc|c} 0 & 2 & 2 & -4 & 2 \\ 1 & 2 & 3 & 0 & 5 \\ 5 & 8 & 13 & 4 & 23 \end{array}\right). \]
The matrix $B$ is called an augmented matrix. Write $R_i$ for the $i$-th row of $B$. We have first interchanged $R_1$ and $R_2$; we write this as
\[ B = [A \mid b] \xrightarrow{R_1 \leftrightarrow R_2} \left(\begin{array}{cccc|c} 1 & 2 & 3 & 0 & 5 \\ 0 & 2 & 2 & -4 & 2 \\ 5 & 8 & 13 & 4 & 23 \end{array}\right). \]
Then we replaced $R_3$ by $R_3 - 5R_1$. We write this as
\[ \xrightarrow{R_3 \to R_3 - 5R_1} \left(\begin{array}{cccc|c} 1 & 2 & 3 & 0 & 5 \\ 0 & 2 & 2 & -4 & 2 \\ 0 & -2 & -2 & 4 & -2 \end{array}\right). \]
Next, we can replace the last row by a row of zeros by adding row 2 to row 3:
\[ \xrightarrow{R_3 \to R_2 + R_3} \left(\begin{array}{cccc|c} 1 & 2 & 3 & 0 & 5 \\ 0 & 2 & 2 & -4 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right). \]
Then we can replace $R_1$ by $R_1 - R_2$, and finally we divide $R_2$ by 2. This gives
\[ \xrightarrow{R_1 \to R_1 - R_2,\; R_2 \to (1/2)R_2} \left(\begin{array}{cccc|c} 1 & 0 & 1 & 4 & 3 \\ 0 & 1 & 1 & -2 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right). \qquad (1) \]
The corresponding equations
\[ \begin{aligned} x_1 + x_3 + 4x_4 &= 3 \\ x_2 + x_3 - 2x_4 &= 1 \end{aligned} \]
have exactly the same solutions as the original equations, and now it is easy to describe all the solutions. We can assign arbitrary values to $x_3$ and $x_4$, say $x_3 = \alpha$ and $x_4 = \beta$. Then the values of $x_1$ and $x_2$ are uniquely determined by the equations in terms of $\alpha$ and $\beta$; that is
\[ x_1 = -\alpha - 4\beta + 3, \quad x_2 = -\alpha + 2\beta + 1. \]
The 'general solution' to the system of linear equations can thus be written as
\[ x_1 = -\alpha - 4\beta + 3, \quad x_2 = -\alpha + 2\beta + 1, \quad x_3 = \alpha, \quad x_4 = \beta \]
or equivalently
\[ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} -\alpha - 4\beta + 3 \\ -\alpha + 2\beta + 1 \\ \alpha \\ \beta \end{pmatrix} = \alpha \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} -4 \\ 2 \\ 0 \\ 1 \end{pmatrix} + \begin{pmatrix} 3 \\ 1 \\ 0 \\ 0 \end{pmatrix} \]
for arbitrary $\alpha$ and $\beta$.
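The general solution of Example 1.15 can be checked by substituting it back into $Ax = b$. The following is a minimal sketch of such a check, using exact rational arithmetic. The helper names `mat_vec` and `general_solution` and the sample values of $\alpha$ and $\beta$ are illustrative assumptions and are not part of the notes.

```python
from fractions import Fraction

# Coefficient matrix A and right-hand side b from Example 1.15.
A = [[0, 2, 2, -4],
     [1, 2, 3, 0],
     [5, 8, 13, 4]]
b = [2, 5, 23]

def mat_vec(M, v):
    """Return the product M v, row by column as in Definition 1.9."""
    return [sum(m_ik * v_k for m_ik, v_k in zip(row, v)) for row in M]

def general_solution(alpha, beta):
    """Solution (x1, x2, x3, x4) with free parameters x3 = alpha, x4 = beta."""
    return [-alpha - 4 * beta + 3, -alpha + 2 * beta + 1, alpha, beta]

# Hypothetical sample parameters; any scalars should satisfy the system.
samples = [(Fraction(0), Fraction(0)),
           (Fraction(1), Fraction(-2)),
           (Fraction(7, 3), Fraction(1, 2))]

all_ok = all(mat_vec(A, general_solution(al, be)) == b for al, be in samples)
print(all_ok)
```

A check at a few sample points is not a proof, but it should catch sign or arithmetic slips made during the elimination.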
What we did in this example can be generalised to give a method for finding the general solution to any system of linear equations. The strategy is to transform the augmented matrix $B$ by reversible steps, without changing the solutions to the corresponding systems of linear equations, to a 'nice' form $E$ for which one can easily describe all the solutions. The transformations we will use are called elementary row operations (EROs). The 'nice' form to aim for is known as 'reduced row echelon form' (RRE form); the matrix (1) has this shape.

1.4 Elementary row operations and reduced row echelon form

Example 1.16 Examples of matrices in reduced row echelon form are
[five small example matrices; their entries are not recoverable from the extracted text].
More generally, a matrix of the following shape is in reduced row echelon form:
\[ \begin{pmatrix} 0 & \cdots & 0 & 1 & * & \cdots & * & 0 & * & \cdots & * & 0 & * & \cdots & * & 0 & * & \cdots & * \\ 0 & \cdots & \cdots & \cdots & \cdots & \cdots & 0 & 1 & * & \cdots & * & 0 & * & \cdots & * & 0 & * & \cdots & * \\ 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 & 1 & * & \cdots & * & 0 & * & \cdots & * \\ 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 & 1 & * & \cdots & * \\ 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 \\ \vdots & & & & & & & & & & & & & & & & & & \vdots \\ 0 & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & 0 \end{pmatrix} \]
Here each $*$ stands for an arbitrary entry.

We make a formal definition:

Definition 1.17 The $m \times n$ matrix $E$ is in reduced row echelon form (RRE form) if
(i) the zero rows of $E$ lie below the non-zero rows;
(ii) in each row which is not zero, the leading entry (that is, the left-most non-zero entry) is 1;
(iii) if row $i$ and row $i+1$ are non-zero then the leading entry of row $i+1$ is strictly to the right of the leading entry of row $i$;
(iv) if a column contains a leading entry of some row then all other entries in this column are zero.

In order to transform a matrix to reduced row echelon form, one uses elementary row operations:

Definition 1.18 There are three types of elementary row operations (EROs) which can be applied to a matrix $B$; they are defined as follows. Let $R_i$ be the $i$-th row of $B$.
(1) Interchange $R_i$ and $R_j$;
(2) Replace $R_i$ by $R_i' := cR_i$ where $c$ is a non-zero scalar;
(3) Replace $R_i$ by $R_i' := R_i + dR_j$ where $d$ is a scalar, and $i \neq j$.

Each of these operations can clearly be reversed by another ERO of the same type, which is the 'inverse' operation. Operation (1) is its own inverse; for the inverse of (2) one replaces $R_i'$ by $(1/c)R_i' = R_i$; and for the inverse of (3) one replaces $R_i'$ by $R_i' - dR_j$.

Applying an ERO to $B$ is the same as premultiplying $B$ by an invertible matrix:

Lemma 1.19 Applying an ERO to an $m \times n$ matrix $B$ gives us $PB$ where $P$ is the result of applying the same ERO to the $m \times m$ identity matrix.

Proof: ERO (1) is the same as replacing $B$ by $PB$ where $P$ is the $m \times m$ permutation matrix $P = [p_{rs}]$ with
\[ p_{rs} = \begin{cases} 1 & \text{if } r = s \neq i, j \\ 1 & \text{if } (r,s) = (i,j) \text{ or } (j,i) \\ 0 & \text{otherwise.} \end{cases} \]
ERO (2) is the same as replacing $B$ by $PB$ where $P$ is the $m \times m$ diagonal matrix with $i$-th diagonal entry $c$ and all other diagonal entries equal to 1.
ERO (3) is the same as replacing $B$ by $PB$ where $P = I_m + dE_{ij}$. Here $E_{ij}$ is the $m \times m$ matrix which has a 1 in position $ij$ and is 0 otherwise. □

Definition 1.20 The matrices $P$ defined by applying EROs to an identity matrix are called elementary matrices. An elementary matrix is invertible; its inverse is the elementary matrix corresponding to the inverse ERO.

Lemma 1.21 Applying an ERO to an augmented matrix $B = [A \mid b]$ does not alter the set of solutions $x$ to the system of linear equations given by $Ax = b$.

Proof: By Lemma 1.19 an ERO transforms $B = [A \mid b]$ to $PB = [PA \mid Pb]$ where $P$ is the corresponding elementary matrix. If $x$ satisfies $Ax = b$ then it also satisfies $PAx = Pb$, so every solution to the original system of linear equations is also a solution to the transformed system. Since the ERO can be reversed by another ERO (its inverse), it follows that every solution to the transformed system is also a solution to the original system of linear equations. □

The following theorem is sometimes called Gauss elimination.
Writing the proof on the board is not illuminating, so this is for private reading, but the theorem is very important.

Theorem 1.22 Suppose $B$ is some $m \times p$ matrix. Then $B$ can be transformed to a matrix $E$ in reduced row echelon form by a finite sequence of elementary row operations. Thus there is an invertible matrix $P = P_sP_{s-1} \cdots P_1$, a product of elementary matrices $P_1, P_2, \dots, P_s$, such that $PB = E$ where $E$ is a reduced row echelon matrix.

Proof We will show, by induction on $m$, that $B$ can be transformed to a reduced row echelon matrix $E$ via EROs. This will prove the theorem, since by Lemma 1.19 an ERO is the same as premultiplication by an elementary matrix.

If $B$ is the zero matrix then it is already in RRE form and we have nothing to prove. So we can assume that $B \neq 0$.

Suppose $m = 1$. Then we use an ERO of type (2), and premultiply by a non-zero scalar to make the leftmost non-zero entry equal to 1. The result is in RRE form.

Now assume $m > 1$.

(i) Reading from the left, the first non-zero column of $B$ has some non-zero entry. By interchanging rows if necessary we can move a non-zero entry of this column to the first row. This gives a matrix $B_1$ of the form
\[ B_1 = \begin{pmatrix} 0 & \cdots & 0 & b_{11} & \cdots & b_{1k} \\ 0 & \cdots & 0 & b_{21} & \cdots & b_{2k} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & b_{m1} & \cdots & b_{mk} \end{pmatrix} \]
with $b_{11} \neq 0$.

(ii) We transform $B_1$ to $B_2$ of the form
\[ B_2 = \begin{pmatrix} 0 & \cdots & 0 & b_{11} & b_{12} & \cdots & b_{1k} \\ 0 & \cdots & 0 & 0 & c_{22} & \cdots & c_{2k} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & 0 & c_{m2} & \cdots & c_{mk} \end{pmatrix} \]
using EROs of type (3) to subtract scalar multiples of the first row from the other rows.

(iii) Now consider the submatrix $C := [c_{ij}]_{2 \le i \le m,\; 2 \le j \le k}$ of $B_2$, of size $(m-1) \times (k-1)$. By the induction hypothesis, $C$ can be transformed to a matrix $E_1$ in reduced row echelon form. The same EROs applied to $B_2$, and then an ERO of type (2) applied to the first row, transform $B_2$ to $B_3$ where (writing it as a block matrix)
\[ B_3 = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 & * \\ 0 & 0 & \cdots & 0 & 0 & E_1 \end{pmatrix}. \]

(iv) By using EROs of type (3) we make each entry in the top row of $B_3$ zero if it is in a column which has a leading 1 of $E_1$. This produces the required matrix $E$ in RRE form. □

Remark 1.23 One can show that, given $B$, this $E$ is unique. A proof is given in Blyth-Robertson, Theorem 3.11.

It is easy to keep track, as one goes along, of the product of elementary matrices used in Gauss elimination, and so to calculate the matrix $P$ in Theorem 1.22. This is done as follows: take the block matrix $[B \mid I_m]$, which is the concatenation of $B$ and the identity matrix $I_m$. Then perform the EROs on $[B \mid I_m]$. This results in the matrix $[E \mid P]$ where the block $P$ is the matrix we want, since, taking products of matrices, we have
\[ P[B \mid I_m] = [PB \mid PI_m] = [E \mid P]. \]
It is strongly recommended that you keep track in this way of the product of elementary matrices, and then at the end calculate $PB$ to make sure that it is equal to $E$. If it is not, then you have made a mistake somewhere.

Example 1.24 Let $B = [A \mid b]$ be the matrix as in Example 1.15. To keep track of the elementary matrices along the way, we perform the EROs on the block matrix $[B \mid I_3]$. Then we end up with a matrix $[E \mid P]$ such that $PB = E$. Explicitly:
\[ [B \mid I_3] = \left(\begin{array}{ccccc|ccc} 0 & 2 & 2 & -4 & 2 & 1 & 0 & 0 \\ 1 & 2 & 3 & 0 & 5 & 0 & 1 & 0 \\ 5 & 8 & 13 & 4 & 23 & 0 & 0 & 1 \end{array}\right) \to \left(\begin{array}{ccccc|ccc} 1 & 2 & 3 & 0 & 5 & 0 & 1 & 0 \\ 0 & 2 & 2 & -4 & 2 & 1 & 0 & 0 \\ 5 & 8 & 13 & 4 & 23 & 0 & 0 & 1 \end{array}\right) \]
\[ \to \left(\begin{array}{ccccc|ccc} 1 & 2 & 3 & 0 & 5 & 0 & 1 & 0 \\ 0 & 2 & 2 & -4 & 2 & 1 & 0 & 0 \\ 0 & -2 & -2 & 4 & -2 & 0 & -5 & 1 \end{array}\right) \to \cdots \to \left(\begin{array}{ccccc|ccc} 1 & 0 & 1 & 4 & 3 & -1 & 1 & 0 \\ 0 & 1 & 1 & -2 & 1 & 1/2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -5 & 1 \end{array}\right) \]
which is $[E \mid P]$ where $P$ is the matrix
\[ P = \begin{pmatrix} -1 & 1 & 0 \\ 1/2 & 0 & 0 \\ 1 & -5 & 1 \end{pmatrix}. \]
This does indeed satisfy $PB = E$.

Exercise 1.25 Transform the matrix $M$ to reduced row echelon form $E$, where
\[ M = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & -1 \\ 0 & 1 & 1 & 0 \end{pmatrix}. \]
At the same time calculate the matrix $P$ so that $PM$ is the reduced row echelon matrix $E$.

1.5 Solving systems of linear equations using EROs

Consider a system of linear equations given by $Ax = b$ where $A = [a_{ij}]$ is a given $m \times n$ matrix and
\[ b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix} \]
is a given column vector of length $m$, both with entries in a field $F$ (usually $F = \mathbb{R}$ or $\mathbb{C}$ or $\mathbb{Q}$). We want to find all the solutions
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}. \]
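The reduction of Theorem 1.22, together with the $[B \mid I_m]$ bookkeeping described before Example 1.24, can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names `rref_with_transform` and `mat_mul` are hypothetical, and the sketch clears each pivot column above and below the pivot as it goes (a Gauss-Jordan variant) instead of following the inductive steps (i) to (iv) of the proof literally. It uses exact rational arithmetic, so it applies to matrices with rational entries.

```python
from fractions import Fraction

def rref_with_transform(B):
    """Return (E, P) with E in RRE form and P B = E.

    Row-reduces the block matrix [B | I_m] using only the three EROs.
    """
    m, p = len(B), len(B[0])
    # Build [B | I_m] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(B)]
    pivot_row = 0
    for col in range(p):
        if pivot_row == m:
            break
        # Look for a non-zero entry in this column, at or below pivot_row.
        r = next((i for i in range(pivot_row, m) if M[i][col] != 0), None)
        if r is None:
            continue  # no pivot in this column
        M[pivot_row], M[r] = M[r], M[pivot_row]        # ERO of type (1)
        c = M[pivot_row][col]
        M[pivot_row] = [x / c for x in M[pivot_row]]   # ERO of type (2)
        for i in range(m):                             # EROs of type (3)
            if i != pivot_row and M[i][col] != 0:
                d = M[i][col]
                M[i] = [x - d * y for x, y in zip(M[i], M[pivot_row])]
        pivot_row += 1
    E = [row[:p] for row in M]
    P = [row[p:] for row in M]
    return E, P

def mat_mul(X, Y):
    """Matrix product as in Definition 1.9."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Augmented matrix B = [A | b] from Example 1.15.
B = [[0, 2, 2, -4, 2],
     [1, 2, 3, 0, 5],
     [5, 8, 13, 4, 23]]
E, P = rref_with_transform(B)
print(mat_mul(P, B) == E)
```

For this $B$ the sketch should reproduce the matrix $E$ labelled (1) in Example 1.15, and the final line performs the recommended check that $PB = E$. In general a different sequence of EROs may lead to a different $P$ when $E$ has zero rows, so agreement on $E$ is the thing to rely on.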
