Elementary Linear Algebra
Dr J. Muscat, 2002

1 Matrices

1.1 Definition

A matrix is a rectangular array of numbers, arranged in rows and columns,
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ \vdots & & & & \vdots \\ a_{m1} & & \dots & & a_{mn} \end{pmatrix}.$$
We write the array in short as $[a_{ij}]$, with $i$ and $j$ denoting the indices for the rows and columns respectively. We say that a matrix with $m$ rows and $n$ columns is of size $m \times n$; an $m \times n$ matrix has $mn$ elements. If $m = n$ the matrix is called a square matrix. A matrix with just one column is called a vector, while one with just one row is called a covector or row vector. A $1 \times 1$ matrix is called a scalar and is simply a number.

Examples: a $2\times 3$ matrix, $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$; a $2\times 2$ square matrix, $\begin{pmatrix} -1 & 1 \\ 2 & 4 \end{pmatrix}$; a vector, $\begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix}$; a row vector, $\begin{pmatrix} 5 & 2 \end{pmatrix}$; a scalar, $9$.

1.2 Equality of Matrices

Two matrices are equal when all their respective elements are equal:
$$A = B \quad\text{when}\quad a_{ij} = b_{ij} \text{ for all } i, j.$$
Note that for this to make sense, the sizes of the matrices must be the same.

Example:
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \neq \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix}.$$

1.3 Addition

The addition of two matrices is defined by
$$[a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}].$$
Example:
$$\begin{pmatrix} 1 & 2 \\ 3 & 5 \\ 4 & 3 \end{pmatrix} + \begin{pmatrix} 3 & 4 \\ 1 & 5 \\ 2 & 2 \end{pmatrix} = \begin{pmatrix} 1+3 & 2+4 \\ 3+1 & 5+5 \\ 4+2 & 3+2 \end{pmatrix} = \begin{pmatrix} 4 & 6 \\ 4 & 10 \\ 6 & 5 \end{pmatrix}.$$
Note that it is not possible to add matrices together if they are of different sizes. It does not matter in which order the matrices are added, i.e. $A + B = B + A$.

1.4 Multiplication by Scalars

A matrix can be multiplied by a number (scalar),
$$\lambda [a_{ij}] = [\lambda a_{ij}].$$
Example:
$$2 \begin{pmatrix} 1 & 2 \\ -2 & 4 \end{pmatrix} = \begin{pmatrix} 2\times 1 & 2\times 2 \\ 2\times(-2) & 2\times 4 \end{pmatrix} = \begin{pmatrix} 2 & 4 \\ -4 & 8 \end{pmatrix}.$$

1.5 Multiplication of Matrices

Multiplication of matrices is defined by
$$AB = [a_{ij}][b_{jk}] = \Big[\sum_j a_{ij} b_{jk}\Big].$$
Example:
$$\begin{pmatrix} 1 & 2 & -1 \\ 3 & -2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix} = \begin{pmatrix} 1\times 1 + 2\times(-1) + (-1)\times 2 \\ 3\times 1 + (-2)\times(-1) + 0\times 2 \end{pmatrix} = \begin{pmatrix} -3 \\ 5 \end{pmatrix},$$
$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 1 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 8 & 10 \\ 17 & 19 \end{pmatrix}.$$
Note that matrices must be compatible for multiplication, i.e. the number of columns of the first matrix must equal the number of rows of the second matrix. In general, an $l\times m$ matrix multiplied by an $m\times n$ matrix gives an $l\times n$ matrix.

The usual dot product of vectors can also be expressed as matrix multiplication:
$$\begin{pmatrix} 1 & 0 & 2 \end{pmatrix} \begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix} = 1\times(-1) + 0\times 2 + 2\times 3 = 5.$$
The following multiplication of vectors gives a very different result:
$$\begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix} \begin{pmatrix} 1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} -1 & 0 & -2 \\ 2 & 0 & 4 \\ 3 & 0 & 6 \end{pmatrix}.$$
From this example it follows that the order in which we multiply matrices is important. This may be true even for square matrices:
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix} = \begin{pmatrix} 10 & 7 \\ 22 & 15 \end{pmatrix}, \qquad \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 5 & 8 \\ 13 & 20 \end{pmatrix}.$$
In general, $AB \neq BA$.

1.6 Zero Matrix

The zero matrix, $0$, is defined as the matrix all of whose elements are zero, $[0]$.

Example: the $2\times 3$ zero matrix is
$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
There are zero matrices of every size. When the matrices are compatible, $0 + A = A$, $0A = 0$ and $A0 = 0$. Written out, these three statements refer to operations such as
$$\begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 1 & 2 \\ -1 & 2 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ -1 & 2 \\ 0 & 3 \end{pmatrix}; \qquad \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -1 & 2 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix};$$
$$\begin{pmatrix} 1 & 2 \\ -1 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}.$$

1.7 Identity Matrix

The identity matrix, $I$, is the square matrix
$$\begin{pmatrix} 1 & 0 & 0 & \dots \\ 0 & 1 & 0 & \dots \\ 0 & 0 & 1 & \dots \\ \vdots & & & \ddots \end{pmatrix}.$$
It can be written in short as $I = [\delta_{ij}]$, where
$$\delta_{ij} = \begin{cases} 1 & \text{when } i = j \\ 0 & \text{when } i \neq j \end{cases}$$
is called the Kronecker delta. It is easy to check that, for compatible matrices,
$$IA = A; \qquad AI = A.$$

1.8 Transpose

The transpose of a matrix $A$ is the matrix, written $A^\top$, obtained by interchanging the rows and columns,
$$[a_{ij}]^\top = [a_{ji}].$$
Example:
$$\begin{pmatrix} 1 & 0 & 2 \\ 4 & 2 & -1 \end{pmatrix}^\top = \begin{pmatrix} 1 & 4 \\ 0 & 2 \\ 2 & -1 \end{pmatrix}.$$
An $m\times n$ matrix becomes an $n\times m$ matrix. In particular, the transpose of a vector is a covector and vice versa.
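The definitions so far translate directly into code. The following is a minimal sketch using plain Python lists of lists; the helper names (`mat_add`, `scalar_mul`, `mat_mul`, `transpose`) are our own, not from the notes, and the example reproduces the worked product from Section 1.5.

```python
# Matrices as lists of rows, e.g. [[1, 2, 3], [4, 5, 6]] is a 2x3 matrix.

def mat_add(A, B):
    """Entrywise sum [a_ij + b_ij]; the two sizes must be the same."""
    assert len(A) == len(B) and len(A[0]) == len(B[0])
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(lam, A):
    """Multiply every entry by the scalar lam: [lam * a_ij]."""
    return [[lam * a for a in row] for row in A]

def mat_mul(A, B):
    """Row-by-column product [sum_j a_ij * b_jk]; an (l x m) times an
       (m x n) matrix gives an (l x n) matrix."""
    assert len(A[0]) == len(B)   # compatibility: columns of A = rows of B
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Interchange rows and columns: the (i,j) entry becomes the (j,i) entry."""
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

# The worked example of Section 1.5:
A = [[1, 2, 3], [4, 5, 6]]
B = [[0, -1], [1, 1], [2, 3]]
print(mat_mul(A, B))        # [[8, 10], [17, 19]]
print(transpose(A))         # [[1, 4], [2, 5], [3, 6]]
```

Note that `mat_mul` applied in the two different orders to the $2\times 2$ matrices of Section 1.5 returns two different results, matching $AB \neq BA$.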
Some properties of the transpose:
$$(\lambda A)^\top = \lambda A^\top, \qquad (A+B)^\top = A^\top + B^\top,$$
$$(A^\top)^\top = A, \qquad (AB)^\top = B^\top A^\top.$$

1.9 Simultaneous Linear Equations

If we multiply out the following two matrices, we get
$$\begin{pmatrix} 2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 2x + 3y.$$
This means that we can write an equation such as $5x - 2y = 3$ as
$$\begin{pmatrix} 5 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 3.$$
Moreover, if we have two simultaneous equations of the same type, we can combine them in matrix notation:
$$\begin{aligned} 5x - 2y &= 3 \\ 2x + y &= -1 \end{aligned} \qquad\text{becomes}\qquad \begin{pmatrix} 5 & -2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 \\ -1 \end{pmatrix}.$$
This idea of transforming simultaneous equations into matrix equations can be generalized to any number of variables, not just two.

Example: the equations
$$\begin{aligned} 2x - 5y + 3z &= -1 \\ 3y + x - z &= 3 \\ 3x - 4y + 3z &= 1 \end{aligned}$$
become the matrix equation
$$\begin{pmatrix} 2 & -5 & 3 \\ 1 & 3 & -1 \\ 3 & -4 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix}.$$
As long as the equations are linear, that is, only terms like $x$ and $y$ occur and not terms like $x^3$, $\sin x$, etc., it will always be possible to write the equations as a single matrix equation. The number of rows of the matrix equals the number of given equations, while the number of columns equals the number of variables. We can therefore write a general system of linear equations as
$$A\mathbf{x} = \mathbf{b},$$
where $\mathbf{x}$ refers to the vector of variables $x$, $y$, etc. If at all possible, we would like to find the values of $x$, $y, \dots$, in other words the vector $\mathbf{x}$, that make the equation true.

In order to proceed, we must look at the simplest matrix equation, one in which there is only one variable $x$ and the matrix $A$ is of size $1\times 1$, that is, just a scalar.

Example: $5x = 3$. To solve this equation, we divide both sides of the equation by 5, or rather, multiply both sides by $1/5$. Why not multiply by $1/3$ or some other number? Because doing so only gives us another equation without telling us directly what $x$ is.
Only by multiplying by $1/5$ are we able to eliminate the 5 in front of the unknown $x$, because $\frac{1}{5} \times 5 = 1$ and $1x = x$, so that we get $x = 3/5$.

Repeating the same argument for the matrix $A$ (compare with the 5), we realize that what we need is another matrix $B$ (compare with the $1/5$) such that $BA = I$. We call such a matrix the inverse of $A$ and denote it by $A^{-1}$. (The notation $\frac{1}{A}$ should be avoided, because it suggests $\frac{1}{A}\frac{1}{B} = \frac{1}{AB}$, which is false.)

1.10 Inverses

The inverse of a matrix $A$ is another matrix, denoted by $A^{-1}$, with the property that
$$A^{-1}A = I, \qquad AA^{-1} = I.$$
(It is not apparent, but true for matrices, that if $A^{-1}A = I$, then automatically $AA^{-1} = I$.)

If we are able to find a method which gives us the inverse of a matrix, we will then be able to solve simultaneous equations, even if they have a thousand variables. (In fact, there are many applications, ranging from predicting the weather to modeling the economy to simulating the shape of an airplane wing, that require this many variables.)
$$\begin{aligned} A\mathbf{x} &= \mathbf{b} \\ A^{-1}A\mathbf{x} &= A^{-1}\mathbf{b} \\ I\mathbf{x} &= A^{-1}\mathbf{b} \\ \mathbf{x} &= A^{-1}\mathbf{b} \end{aligned}$$

Example: by a method that will be described later on, the inverse of the matrix $A$ shown previously is
$$\begin{pmatrix} 2 & -5 & 3 \\ 1 & 3 & -1 \\ 3 & -4 & 3 \end{pmatrix}^{-1} = \begin{pmatrix} 5 & 3 & -4 \\ -6 & -3 & 5 \\ -13 & -7 & 11 \end{pmatrix}.$$
Let us check this:
$$\begin{pmatrix} 5 & 3 & -4 \\ -6 & -3 & 5 \\ -13 & -7 & 11 \end{pmatrix} \begin{pmatrix} 2 & -5 & 3 \\ 1 & 3 & -1 \\ 3 & -4 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Therefore the solution of the simultaneous equations shown above is
$$\mathbf{x} = A^{-1}\mathbf{b} = \begin{pmatrix} 5 & 3 & -4 \\ -6 & -3 & 5 \\ -13 & -7 & 11 \end{pmatrix} \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \\ 3 \end{pmatrix}.$$

For which matrices is it possible to find an inverse? Non-square matrices do not have an inverse. How do we know this? Suppose we have an $m\times n$ matrix $A$. From the requirement that $A^{-1}A = I$ we see that the number of columns of $A$, which is $n$, is the same as that of $I$. We also have the requirement that $AA^{-1} = I$; this means that the number of rows of $A$, which is $m$, must equal that of the identity matrix. But the identity matrix is square, therefore $m = n$.

Even if a matrix is square, it does not necessarily mean that it has an inverse.
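The check just carried out by hand can be repeated in code. The sketch below (our own helper, not the notes' method) multiplies the quoted inverse against $A$ in both orders and then computes $\mathbf{x} = A^{-1}\mathbf{b}$ for the system of Section 1.9.

```python
def mat_mul(A, B):
    """Row-by-column product [sum_j a_ij * b_jk]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A = [[2, -5, 3], [1, 3, -1], [3, -4, 3]]
A_inv = [[5, 3, -4], [-6, -3, 5], [-13, -7, 11]]   # inverse quoted in the text
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(mat_mul(A_inv, A) == I)   # True
print(mat_mul(A, A_inv) == I)   # True (illustrating the parenthetical remark)

b = [[-1], [3], [1]]
print(mat_mul(A_inv, b))        # [[0], [2], [3]], i.e. x = 0, y = 2, z = 3
```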
Some matrices are analogous to the number 0, for which the reciprocal $1/0$ does not make sense. In the next section we will find ways of determining which matrices are invertible and, if they are, how to find the inverse.

The inverses of matrices obey certain laws:

• $(A^{-1})^{-1} = A$. How do we know this? Let $B = A^{-1}$. We want to show that $B^{-1} = A$, that is, that the inverse of the matrix $B$ is nothing else but the original matrix $A$. To show this, all we need is to show that $AB = I$ and $BA = I$, which are true since $AA^{-1} = I$ and $A^{-1}A = I$.

• $(AB)^{-1} = B^{-1}A^{-1}$. We can show this by multiplying out
$$B^{-1}A^{-1}AB = B^{-1}IB = B^{-1}B = I.$$
Hence the inverse of $AB$ is $B^{-1}A^{-1}$.

1.11 Exercises

1. Let
$$A = \begin{pmatrix} 1 & -2 & 3 \\ 0 & -2 & 5 \end{pmatrix}; \qquad B = \begin{pmatrix} -2 & 0 & -1 \\ 4 & -6 & -3 \end{pmatrix}.$$
If possible, find the following: $3A$, $A+B$, $3A-2B$, $AB$.

2. If $A$ is a matrix of size $3\times 5$, which of the following may be true?
(a) $AB$ has 3 rows and $B$ has 5 columns;
(b) $B$ has 5 rows and $AB$ has 3 rows;
(c) $BA$ has 5 columns and $B$ has 3 rows;
(d) $B$ has 3 columns and $BA$ has 5 columns.

3. Find a matrix $C$ such that the following sum is the zero matrix:
$$\begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \\ 3 & -2 & 3 \end{pmatrix} + \begin{pmatrix} 4 & -4 & 0 \\ 1 & -1 & 6 \\ -2 & 1 & 3 \end{pmatrix} + 2C.$$

4. Perform the following matrix multiplications, if possible:
$$\begin{pmatrix} -5 & 6 \\ 7 & 1 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ -3 & 5 \end{pmatrix}, \qquad \begin{pmatrix} 3 & 1 & 0 \\ 3 & 2 & 1 \end{pmatrix} \begin{pmatrix} 5 & 1 \\ 9 & 1 \\ 1 & 3 \end{pmatrix}.$$

5. Without doing any calculations, find the inverse of the matrix
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 4 \end{pmatrix}.$$

6. If we want to define $A^2$ by $AA$, on what type of matrices does such a definition make sense? Let
$$A = \begin{pmatrix} 3 & 2 \\ -1 & 0 \end{pmatrix}.$$
Find $A^2$ and $A^3$.

7. Suppose we take some functions of $x$ and apply their Maclaurin series to matrices:
$$e^A = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \dots,$$
$$(I+A)^{-1} = I - A + A^2 - A^3 + \dots.$$
By considering the matrix $A = \begin{pmatrix} 0.1 & 0.2 \\ -0.1 & 0.3 \end{pmatrix}$, find approximations for $e^A$, $e^{-A}$ and $(I+A)^{-1}$ by calculating the first few terms in the series. Check whether
(a) $(I+A)^{-1}$ is indeed the inverse of $I+A$;
(b) $e^{-A}$ is the inverse of $e^A$.
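The law $(AB)^{-1} = B^{-1}A^{-1}$ can be tested numerically. The sketch below is not from the notes: it uses the standard $2\times 2$ inverse formula $\frac{1}{ad-bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ (valid when $ad - bc \neq 0$; the notes derive inverses by a different, general method later) together with exact rational arithmetic, on the two square matrices of Section 1.5.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Row-by-column product [sum_j a_ij * b_jk]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via the standard formula
       (1/(ad - bc)) [[d, -b], [-c, a]]; assumes ad - bc is nonzero."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [3, 4]]
B = [[2, 1], [4, 3]]

lhs = inv2(mat_mul(A, B))            # (AB)^{-1}
rhs = mat_mul(inv2(B), inv2(A))      # B^{-1} A^{-1}
print(lhs == rhs)                    # True
print(mat_mul(inv2(A), inv2(B)) == lhs)   # False: A^{-1} B^{-1} is the wrong order
```

The last line illustrates why the order of the factors matters in the law: reversing them gives a genuinely different matrix.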
2 Gaussian Elimination

Our aim in this chapter is to describe a method for solving simultaneous equations of the type $A\mathbf{x} = \mathbf{b}$ for any number of variables, and also to find a method for calculating the inverse of a matrix $A$ when possible.

Before we learn how to run, let us consider how we normally solve simultaneous equations with two or three variables.

2.0.1 Example

Consider the simultaneous equations
$$\begin{aligned} 2x + 3y &= 4 \\ 5x - y &= -7. \end{aligned}$$
Our usual method is to multiply both equations by different multiples in such a way that one of the variables can be eliminated by subtraction. In other words, multiplying the first equation by 5 and the second by 2, we get
$$\begin{aligned} 10x + 15y &= 20 \\ 10x - 2y &= -14. \end{aligned}$$
Subtracting the first equation from the second, we get
$$\begin{aligned} 10x + 15y &= 20 \\ -17y &= -34. \end{aligned}$$
Once the variable $x$ has been eliminated from the second equation, we can solve it directly to get $y = 2$. Substituting into the first equation, we then get $10x + 30 = 20$, which implies that $x = -1$.

Let us rewrite this example using matrix notation so that the procedure becomes clearer.
$$\begin{pmatrix} 2 & 3 \\ 5 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 4 \\ -7 \end{pmatrix}$$
$$\begin{pmatrix} 10 & 15 \\ 10 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 20 \\ -14 \end{pmatrix}$$
$$\begin{pmatrix} 10 & 15 \\ 0 & -17 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 20 \\ -34 \end{pmatrix}$$
The steps that we have taken in solving the equations become simple operations on the rows of the matrix. In particular, we have multiplied whole rows by scalars, and we have subtracted a whole row from a whole row, including the vector $\mathbf{b}$.

Let us take another example, this time involving three variables. We will write the equations in matrix form on the right-hand side for comparison.
$$\begin{aligned} 2x + 3y - 4z &= -4 \\ x - y + 3z &= 8 \\ 3x + y - 2z &= -1, \end{aligned} \qquad \begin{pmatrix} 2 & 3 & -4 \\ 1 & -1 & 3 \\ 3 & 1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -4 \\ 8 \\ -1 \end{pmatrix}.$$
To simplify things slightly, we will switch round the first and second rows. Obviously this does not make the slightest difference to the solution.
$$\begin{aligned} x - y + 3z &= 8 \\ 2x + 3y - 4z &= -4 \\ 3x + y - 2z &= -1, \end{aligned} \qquad \begin{pmatrix} 1 & -1 & 3 \\ 2 & 3 & -4 \\ 3 & 1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 8 \\ -4 \\ -1 \end{pmatrix}.$$
We can eliminate the $x$ variable from the second and third equations by subtracting 2 times the first row and 3 times the first row respectively.
$$\begin{aligned} x - y + 3z &= 8 \\ 5y - 10z &= -20 \\ 4y - 11z &= -25, \end{aligned} \qquad \begin{pmatrix} 1 & -1 & 3 \\ 0 & 5 & -10 \\ 0 & 4 & -11 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 8 \\ -20 \\ -25 \end{pmatrix}.$$
The second equation simplifies (dividing it by 5) to
$$\begin{aligned} x - y + 3z &= 8 \\ y - 2z &= -4 \\ 4y - 11z &= -25, \end{aligned} \qquad \begin{pmatrix} 1 & -1 & 3 \\ 0 & 1 & -2 \\ 0 & 4 & -11 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 8 \\ -4 \\ -25 \end{pmatrix},$$
and now we can eliminate the $y$ variable from the third equation by subtracting 4 times the second equation,
$$\begin{aligned} x - y + 3z &= 8 \\ y - 2z &= -4 \\ -3z &= -9, \end{aligned} \qquad \begin{pmatrix} 1 & -1 & 3 \\ 0 & 1 & -2 \\ 0 & 0 & -3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 8 \\ -4 \\ -9 \end{pmatrix}.$$
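The hand computation above can be sketched in code. The helper below is our own naive version, not the notes' procedure verbatim: it does not swap rows (the swap above was only for hand convenience) and assumes no zero pivot is ever encountered; it reduces the augmented matrix $[A \mid \mathbf{b}]$ to upper-triangular form by subtracting multiples of rows, then back-substitutes. Exact rational arithmetic avoids rounding error.

```python
from fractions import Fraction

def gaussian_solve(A, b):
    """Reduce the augmented matrix [A | b] to upper-triangular form by
       subtracting multiples of rows, then back-substitute.
       Naive sketch: assumes every pivot encountered is nonzero."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):                      # forward elimination
        pivot = M[col][col]
        for row in range(col + 1, n):
            factor = M[row][col] / pivot      # multiple of the pivot row to subtract
            M[row] = [r - factor * p for r, p in zip(M[row], M[col])]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):            # back-substitution, bottom row first
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return [float(v) for v in x]

# The three-variable system worked through above:
A = [[2, 3, -4], [1, -1, 3], [3, 1, -2]]
b = [-4, 8, -1]
print(gaussian_solve(A, b))   # [1.0, 2.0, 3.0]
```

Back-substituting by hand from the triangular system above gives the same answer: $-3z = -9$ gives $z = 3$, then $y = -4 + 2z = 2$, then $x = 8 + y - 3z = 1$.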