Revised version of 22 March 2017

Oxford University Mathematical Institute

Linear Algebra I

Notes by Peter M. Neumann (Queen's College)

Preface

These notes are intended as a rough guide to the fourteen-lecture course Linear Algebra I, which is a part of the Oxford 1st year undergraduate course in mathematics (for Prelims). Please do not expect a polished account. They are lecture notes, not a carefully checked textbook. Nevertheless, I hope they may be of some help.

The synopsis for the course is as follows.

• Systems of linear equations. Expression as an augmented matrix (understood simply as an array at this point). Elementary Row Operations (EROs). Solutions by row reduction.

• Abstract vector spaces: definition of a vector space over a field (expected examples R, Q, C). Examples of vector spaces: solution space of homogeneous systems of equations and differential equations; function spaces; polynomials; C as an R-vector space; sequence spaces. Subspaces, spanning sets and spans. (Emphasis on concrete examples, with deduction of properties from axioms set as problems.)

• Linear independence, definition of a basis, examples. Steinitz exchange lemma, and definition of dimension. Coordinates associated with a basis. Algorithms involving EROs to find a basis of a subspace.

• Sums, intersections and direct sums of subspaces. Dimension formula.

• Linear transformations: definition and examples including projections. Kernel and image, rank–nullity formula.

• Algebra of linear transformations. Inverses. Matrix of a linear transformation with respect to a basis. Algebra of matrices. Transformation of a matrix under change of basis. Determining an inverse with EROs. Column space, column rank.

• Bilinear forms. Positive definite symmetric bilinear forms. Inner product spaces. Examples: R^n with dot product; function spaces. Comment on (positive definite) Hermitian forms. Cauchy–Schwarz inequality. Distance and angle. Transpose of a matrix. Orthogonal matrices.
The Faculty Teaching Committee has approved the following lists to support the teaching and learning of this material.

Reading List:
(1) T. S. Blyth and E. F. Robertson, Basic Linear Algebra (Springer, London, 1998).
(2) R. Kaye and R. Wilson, Linear Algebra (OUP, 1998), Chapters 1–5 and 8. [More advanced, but useful on bilinear forms and inner product spaces.]

Further Reading:
(1) C. W. Curtis, Linear Algebra – An Introductory Approach (Springer, London, Fourth edition, reprinted 1994).
(2) R. B. J. T. Allenby, Linear Algebra (Arnold, London, 1995).
(3) D. A. Towers, A Guide to Linear Algebra (Macmillan, Basingstoke, 1988).
(4) D. T. Finkbeiner, Elements of Linear Algebra (Freeman, London, 1972). [Out of print, but available in many libraries.]
(5) Seymour Lipschutz and Marc Lipson, Linear Algebra (McGraw-Hill, London, Third edition, 2001).

The fact that some of these texts were first published many years ago does not mean that they are out of date. Unlike some of the laboratory sciences, mathematics and mathematical pedagogy at first-year undergraduate level were already sufficiently well developed many years ago that they do not change much now. Nevertheless, we are always looking out for good modern expositions. Please let me have suggestions—author(s), title, publisher and date.

A set of seven exercise sheets goes with this lecture course. The questions they contain will be found embedded in these notes, along with a number of supplementary exercises.

Acknowledgements: I am very grateful to my wife Sylvia, to George Cooper (Balliol), to Jake Lee (St Catz), to Alexander Ober (Hertford), and to Yuyang Shi (Oriel) for drawing my attention to some misprints (now corrected) in earlier versions of these notes. I would much welcome further feedback. Please let me know of any errors, infelicities and obscurities (or reading-list suggestions—see above). Please email me at [email protected] or write a note to me at The Queen's College or The Andrew Wiles Building of the Mathematical Institute.
CONTENTS

1. Linear equations and matrices
   Linear equations · Matrices · The beginnings of matrix algebra · More on systems of linear equations · Elementary Row Operations (EROs)
2. Vector spaces
   Vectors as we know them · Vector spaces · Subspaces · Examples of vector spaces
3. Bases of vector spaces
   Spanning sets · Linear independence · Bases · The Steinitz Exchange Lemma
4. Subspaces of vector spaces
   Bases of subspaces · Finding bases of subspaces · An algebra of subspaces · Direct sums of subspaces
5. An introduction to linear transformations
   What is a linear transformation? · Some examples of linear transformations · Some algebra of linear transformations · Rank and nullity
6. Linear transformations and matrices
   Matrices of linear transformations with respect to given bases · Change of basis · More about matrices: rank · More on EROs: row reduced echelon (RRE) form · Using EROs to invert matrices
7. Inner product spaces
   Bilinear forms · Inner product spaces · Orthogonal matrices · The Cauchy–Schwarz Inequality · Complex inner product spaces

1  Linear equations; matrices

1.1  Linear equations

The solution of sets of simultaneous linear equations is one of the most widely used techniques of algebra. It is essential in most of pure mathematics, applied mathematics and statistics, and is heavily used elsewhere, in economics, management, and other areas of practical or not-so-practical life. Although meteorological prediction, modern cryptography, and other such areas require the solution of systems of many thousands, even millions, of simultaneous equations in a similar number of variables, we'll be less ambitious to begin with.
Systems of equations like

    x + 2y = 3          x + 2y + 3z = 6                y + 2z + 3w = 0
    2x + 3y = 5,        2x + 3y + 4z = 9,    and       x + 2y + 3z + 4w = 2        (⋆)
                        3x + 4y + 5z = 12              2x + 3y + 4z + 5w = 0

may be solved (or shown to be insoluble) by a systematic elimination process that you should have come across before arriving at Oxford.

Exercise 1.1. Which (if any) of the following systems of linear equations with real coefficients have no solutions, which have a unique solution, and which have infinitely many solutions?

        2x + 4y − 3z = 0           x + 2y + 3z = 0           x + 2y + 3z = 0
    (a) x − 4y + 3z = 0 ;      (b) 2x + 3y + 4z = 1 ;    (c) 2x + 3y + 4z = 2 .
        3x − 5y + 2z = 1           3x + 4y + 5z = 2           3x + 4y + 5z = 2

1.2  Matrices

It is immediately apparent that it is only the coefficients that matter. Stripping away the variables from the systems (⋆) of equations we are left with the arrays

    ( 1 2 )      ( 1 2 3 )            ( 0 1 2 3 )
    ( 2 3 ),     ( 2 3 4 ),    and    ( 1 2 3 4 )
                 ( 3 4 5 )            ( 2 3 4 5 )

on the left sides of the equations and

    ( 3 )      ( 6 )           ( 0 )
    ( 5 ),     ( 9 ),    and   ( 2 )
               ( 12 )          ( 0 )

on their right sides. Such arrays are known as matrices.

In general, an m×n matrix is a rectangular array with m rows and n columns. Conventionally the rows are numbered 1, ..., m from top to bottom and the columns are numbered 1, ..., n from left to right. In the Prelim context, the entries in the array will be real or complex numbers. The entry in row i and column j is usually denoted a_{ij} or x_{ij}, or something similar. If the m×n matrix A has entries a_{ij} it is often written (a_{ij}) (or [a_{ij}], or (a_{ij})_{i=1,...,m; j=1,...,n}, or something similar)*. A 1×n matrix is often called a row vector; an m×1 matrix is called a column vector.

* Please forgive my vagueness here and at a few other points in these notes. Mathematicians are inventive people, and it is not possible to list all the notational variations that they create. Clarification must often be sought from context.
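Since such arrays will be manipulated constantly in what follows, it may help to see one in code. A minimal sketch in Python (not part of the original notes): a matrix stored as a list of rows. Note that Python counts from 0, so the mathematical (i,j) entry lives at A[i-1][j-1].

```python
# One of the coefficient arrays above, stored as a list of m rows.
A = [[0, 1, 2, 3],
     [1, 2, 3, 4],
     [2, 3, 4, 5]]

m, n = len(A), len(A[0])   # number of rows, number of columns
print(m, n)                # 3 4

def entry(A, i, j):
    """Entry in row i, column j, using the 1-based mathematical convention."""
    return A[i - 1][j - 1]

print(entry(A, 2, 3))      # a_{23} = 3
```

The helper `entry` is purely illustrative; in practice one quickly gets used to the 0-based indexing.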
If A = (a_{ij}) and a_{ij} = 0 for all relevant i, j (that is, for 1 ≤ i ≤ m, 1 ≤ j ≤ n) then we write A = 0, or sometimes A = 0_{m×n}, and refer to it as the zero matrix.

Let F be the set from which the entries of our matrices come. Its members are known as scalars. In the Prelim context, usually F = R, the set of real numbers, or F = C, the set of complex numbers (sometimes F = Q, the set of rational numbers). We define

    M_{m×n}(F) := {A | A is an m×n matrix with entries from F}.

Many authors write F^n for M_{1×n}(F) or F^m for M_{m×1}(F). The context will usually make clear what is intended.

1.3  The beginnings of matrix algebra

There is a natural notion of addition for m×n matrices. If A = (a_{ij}) and B = (b_{ij}) then A + B = (c_{ij}), where c_{ij} = a_{ij} + b_{ij}. Thus addition of matrices—which makes sense when they are of the same size, not otherwise—is "coordinatewise". Likewise, there is a natural notion of scalar multiplication. If A = (a_{ij}) ∈ M_{m×n}(F) and λ is a scalar then λA is the member of M_{m×n}(F) in which the (i,j) entry is λa_{ij}. The following is a collection of simple facts that are used all the time, unnoticed.

Theorem 1.1. For A, B, C ∈ M_{m×n}(F) and λ, µ ∈ F:
(1) A + 0_{m×n} = 0_{m×n} + A = A;
(2) A + B = B + A;  [addition is commutative]
(3) A + (B + C) = (A + B) + C;  [addition is associative]
(4) λ(µA) = (λµ)A;
(5) (λ + µ)A = λA + µA;
(6) λ(A + B) = λA + λB.

I do not propose to give formal proofs of these six assertions. If you are unfamiliar with this kind of reasoning, though, I recommend that you ensure that you can see exactly what a proof should involve:

Exercise 1.2. Write out formal proofs of a few of the assertions in Theorem 1.1.

Multiplication is more complicated. The product AB is defined only if the number of columns of A is the same as the number of rows of B. Thus if A is an m×n matrix then B must be an n×p matrix for some p.
Then the product AB will be an m×p matrix, and if A = (a_{ij}), B = (b_{jk}) then AB = (c_{ik}), where

    c_{ik} = a_{i1}b_{1k} + a_{i2}b_{2k} + ··· + a_{in}b_{nk},

or, in summation notation, c_{ik} = Σ_{j=1}^{n} a_{ij}b_{jk}.

What this means visually is that to multiply A and B we run along the ith row of A (which contains n numbers a_{ij}) and down the kth column of B (in which there are n numbers b_{jk}), we multiply corresponding numbers and add. Thus (AB)_{ik} comes from the ith row of A and the kth column of B:

    ( a_{11} a_{12} ... a_{1n} )   ( b_{11} ... b_{1k} ... b_{1p} )
    (  ...    ...   ...   ...  )   ( b_{21} ... b_{2k} ... b_{2p} )
    ( a_{i1} a_{i2} ... a_{in} )   (  ...   ...   ...   ...  ...  )
    (  ...    ...   ...   ...  )   ( b_{n1} ... b_{nk} ... b_{np} )
    ( a_{m1} a_{m2} ... a_{mn} )

Exercise 1.3. Calculate the following matrix products:

    ( 2 1 )^2        ( 0 1 ) ( x )        ( 1 2 ) ( 1 2 3 )
    ( 2 3 )    ;     ( 2 3 ) ( y ) ;      ( 2 3 ) ( 2 3 4 ) .
                                          ( 4 6 ) ( 3 4 5 )

Note A. Using the definition, to multiply an m×n matrix and an n×p matrix the computation requires n numerical multiplications and n−1 numerical additions for each of the mp entries of the product matrix, hence mnp numerical multiplications and m(n−1)p numerical additions in total. Does it really require so many? It has been known since 1969 that one can do rather better. That is, when m, n, p are all at least 2, the product of two matrices can be calculated using fewer numerical operations than are needed for the naïve method. For large values of m, n, p it is not yet known how much better. This is a lively area of modern research on the boundary between pure mathematics and computer science.

Theorem 1.2. Let A, B, C be matrices and λ a scalar.
(1) If A ∈ M_{m×n}(F), B ∈ M_{n×p}(F) and C ∈ M_{p×q}(F), then A(BC) = (AB)C [multiplication is associative].
(2) If A, B ∈ M_{m×n}(F) and C ∈ M_{n×p}(F), then (A + B)C = AC + BC; if A ∈ M_{m×n}(F) and B, C ∈ M_{n×p}(F), then A(B + C) = AB + AC [distributive laws].
(3) If A ∈ M_{m×n}(F) and B ∈ M_{n×p}(F) then (λA)B = A(λB) = λ(AB).

The proofs are not hard, though the computation to prove part (1) may look a little daunting. I suggest that you write out a proof in order to familiarise yourself with what is involved. Later in these lectures we will see a computation-free proof and a natural reason for the associativity of matrix multiplication.

Exercise 1.4. Write out formal proofs of the assertions in Theorem 1.2.

Note B. If A ∈ M_{m×n}(F) and B ∈ M_{n×p}(F) then both AB and BA are defined if and only if m = p. Then AB ∈ M_{m×m}(F) and BA ∈ M_{n×n}(F). Thus if m ≠ n then AB ≠ BA. Square matrices A and B of the same size are said to commute if AB = BA. For n ≥ 2 there are plenty of pairs A, B of n×n matrices that do not commute.

Exercise 1.5. For each n ≥ 2 find matrices A, B ∈ M_{n×n}(F) such that AB ≠ BA.

Exercise 1.6. Let A be the 2×2 matrix
    ( a b )
    ( c d ).
(a) Show that A commutes with
    ( 1 0 )
    ( 0 0 )
if and only if A is diagonal (that is, b = c = 0).
(b) Which 2×2 matrices A commute with
    ( 0 1 )
    ( 0 0 ) ?
(c) Use the results of (a) and (b) to find the matrices A that commute with every 2×2 matrix.

Note C. The n×n matrix I_n in which the (i,j) entry is 1 if i = j and 0 if i ≠ j:

    (I_n)_{ij} = δ_{ij} = 1 if i = j,  0 if i ≠ j,

is known as the identity matrix. An n×n matrix A is said to be invertible if there exists an n×n matrix B such that AB = BA = I_n. When this is the case, there is only one such matrix B, and one writes A^{−1} for B.

Theorem 1.3. Let A, B be invertible n×n matrices. Then AB is invertible and (AB)^{−1} = B^{−1}A^{−1}.

I'll not prove this here—but you'll find it a valuable exercise. It is a special case of a very general phenomenon. Note the reversal of the factors. Hermann Weyl (in his book Symmetry, if I remember correctly) points out how it is familiar in everyday life.
To undo the process of putting on socks and then shoes, we take off the shoes first, then the socks.

Exercise 1.7. Prove Theorem 1.3.

Exercise 1.8. Show that if A is an m×n matrix and B is an n×p matrix then AI_n = A and I_n B = B.

Exercise 1.9. Let A be an n×n matrix. Show that if AB = BA = I_n and AC = CA = I_n then B = C.

Exercise 1.10. The transpose A^{tr} of an m×n matrix A = (a_{ij}) is the n×m matrix in which the (i,j) entry is a_{ji}. Let A and B be m×n matrices, and let C be an n×p matrix.
(a) Show that (A + B)^{tr} = A^{tr} + B^{tr} and that (λA)^{tr} = λA^{tr} for scalars λ.
(b) Show that (AC)^{tr} = C^{tr}A^{tr}.
(c) Suppose that n = m and that A is invertible. Show that A^{tr} is invertible and that (A^{tr})^{−1} = (A^{−1})^{tr}.

Exercise 1.11. Let A and B denote n×n matrices with real entries. For each of the following assertions, find either a proof or a counterexample.
(a) A² − B² = (A − B)(A + B).
(b) If AB = 0 then A = 0 or B = 0.
(c) If AB = 0 then A and B cannot both be invertible.
(d) If A and B are invertible then A + B is invertible.
(e) If ABA = 0 and B is invertible then A² = 0.
[Hint: where the assertions are false there usually are counterexamples of size 2×2.]

Exercise 1.12. Let J_n be the n×n matrix with all entries equal to 1. Let α, β ∈ R with α ≠ 0 and α + nβ ≠ 0. Show that the matrix αI_n + βJ_n is invertible.
[Hint: note that J_n² = nJ_n; seek an inverse of αI_n + βJ_n of the form λI_n + µJ_n where λ, µ ∈ R.]
Find the inverse of

    ( 3 2 2 2 )
    ( 2 3 2 2 )
    ( 2 2 3 2 )
    ( 2 2 2 3 ).

1.4  More on systems of linear equations

The system Σ_{j=1}^{n} a_{ij}x_j = b_i (for 1 ≤ i ≤ m) of m linear equations in n unknowns x_1, ..., x_n may be expressed as the single matrix equation

    Ax = b,

where A is the m×n matrix (a_{ij}) of coefficients, x is the n×1 column vector with entries x_1, ..., x_n, and b is the m×1 column vector with entries b_1, ..., b_m.
In this notation the systems (⋆) of equations in §1.1 are

    ( 1 2 )( x )   ( 3 )        ( 1 2 3 )( x )   ( 6 )
    ( 2 3 )( y ) = ( 5 ),       ( 2 3 4 )( y ) = ( 9 ),
                                ( 3 4 5 )( z )   ( 12 )

and

    ( 0 1 2 3 )( x )   ( 0 )
    ( 1 2 3 4 )( y ) = ( 2 ).
    ( 2 3 4 5 )( z )   ( 0 )
               ( w )

There is a systematic way to solve such systems. We divide every term of the first equation

    a_{11}x_1 + a_{12}x_2 + ··· + a_{1n}x_n = b_1

by a_{11} to get an equivalent equation in which the coefficient of x_1 is 1. That is not possible if a_{11} = 0, but let's set this difficulty aside for a moment and assume that a_{11} ≠ 0. Next, for 2 ≤ i ≤ m we subtract a_{i1} × Equation (1) from Equation (i).† The output from this first set of moves is a new system of linear equations which has a special form. Its first equation may be construed as giving an expression for x_1 in terms of x_2, ..., x_n, b_1, and the various coefficients a_{1j}. Its other m−1 equations do not involve x_1: they form a system of m−1 equations in n−1 variables. Clearly this is progress. The moves we have made (dividing the first equation through by a_{11} and then subtracting multiples of the new first equation from each of the others) do not change the solution set; moreover, it is fair to assume that the smaller problem of solving the system of m−1 equations in the n−1 variables x_2, ..., x_n can be solved, and then we'll have the value of x_1 also.

But what if a_{11} = 0? If a_{i1} = 0 for all i in the range 1 ≤ i ≤ m then x_1 did not occur in any of our equations, and from the start we had the simpler problem of a system of m linear equations in n−1 variables. Therefore we may suppose that there exists r such that 2 ≤ r ≤ m and a_{r1} ≠ 0. Then we simply interchange Equation (1) and Equation (r). The system of equations is unchanged. Only the order in which they are listed has changed. The solution set is unchanged.
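The matrix form Ax = b can be checked mechanically: applying A to a candidate vector should reproduce b. A minimal sketch in plain Python (the helper mat_vec is ours, not part of the notes), using the middle system of §1.1:

```python
def mat_vec(A, x):
    """Apply an m×n matrix to a length-n vector: (Ax)_i = sum_j a_ij * x_j."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Coefficients and right-hand side of the middle system of (*).
A = [[1, 2, 3],
     [2, 3, 4],
     [3, 4, 5]]
b = [6, 9, 12]

# x = y = z = 1 is one solution, so Ax should reproduce b exactly.
print(mat_vec(A, [1, 1, 1]))   # [6, 9, 12]
```

Each entry of the result is exactly the "run along a row, down the column" computation described in §1.3.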
This systematic process and variants of it are known as Gaussian elimination, referring to work of C. F. Gauss in the early 1800s. It was used several centuries before that in China and (later) in Europe.

1.5  Elementary Row Operations (EROs)

Now let's return to matrices. The original system Ax = b is completely determined by the so-called augmented coefficient matrix A|b that is obtained from the m×n matrix A by adjoining b as an (n+1)th column. The operations on our systems of equations are elementary row operations (EROs) on the augmented matrix A|b:

    P(r,s) for 1 ≤ r < s ≤ m: interchange row r and row s.
    M(r,λ) for 1 ≤ r ≤ m and λ ≠ 0: multiply (every entry of) row r by λ.
    S(r,s,λ) for 1 ≤ r ≤ m, 1 ≤ s ≤ m and r ≠ s: add λ times row r to row s.

Note. In my specification of M(r,λ) I used the word 'multiply', whereas in Gaussian elimination we actually divided by a_{11}. But of course, division by a_{11} is multiplication by 1/a_{11}. Similarly, in practice we subtract λ times one equation from another, but this is addition of −λ times the one to the other. It is both traditional and convenient to use multiplication and addition instead of division and subtraction in the definition of EROs.

† Can we multiply an equation by a number and subtract one equation from another? Technically not, I suppose. But it should be clear what is meant and this is no place for pedantry.
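The three EROs translate directly into code. A sketch in Python (our own function names, chosen to mirror the notation above; rows are numbered from 1 as in the text):

```python
def P(A, r, s):
    """P(r,s): interchange row r and row s."""
    B = [row[:] for row in A]          # copy, so the input is left unchanged
    B[r - 1], B[s - 1] = B[s - 1], B[r - 1]
    return B

def M(A, r, lam):
    """M(r,lam): multiply every entry of row r by lam (lam != 0)."""
    assert lam != 0
    B = [row[:] for row in A]
    B[r - 1] = [lam * a for a in B[r - 1]]
    return B

def S(A, r, s, lam):
    """S(r,s,lam): add lam times row r to row s (r != s)."""
    assert r != s
    B = [row[:] for row in A]
    B[s - 1] = [a + lam * b for a, b in zip(B[s - 1], B[r - 1])]
    return B

A = [[1, 2], [3, 4]]
print(P(A, 1, 2))        # [[3, 4], [1, 2]]
print(S(A, 1, 2, -3))    # [[1, 2], [0, -2]]
```

Each function returns a new matrix rather than modifying its argument, so a sequence of EROs can be replayed and inspected step by step.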
From the procedure described above it should be clear that the augmented matrix A|b can be changed using EROs to an m×(n+1) matrix E|d which has the following form, known as echelon form:

• if row r of E has any non-zero entries then the first of these is 1;
• if 1 ≤ r < s ≤ m and rows r, s of E both contain non-zero entries, the first of which are e_{rj} and e_{sk} respectively, then j < k (the leading entries of lower rows occur to the right of those in higher rows);
• if row r of E contains non-zero entries and row s does not (that is, e_{sj} = 0 for 1 ≤ j ≤ n) then r < s—that is, zero rows (if any) appear below all the non-zero rows.

Example 1.1. Here is a simple example of reduction to echelon form using EROs.

    ( 0 1 2 3 0 )    P(1,2)     ( 1 2 3 4 2 )
    ( 1 2 3 4 2 )    ----->     ( 0 1 2 3 0 )
    ( 2 3 4 5 0 )               ( 2 3 4 5 0 )

                    S(1,3,−2)   ( 1 2 3 4 2 )
                     ----->     ( 0 1 2 3 0 )
                                ( 0 −1 −2 −3 −4 )

                    S(2,3,1)    ( 1 2 3 4 2 )
                     ----->     ( 0 1 2 3 0 )
                                ( 0 0 0 0 −4 )

                    M(3,−1/4)   ( 1 2 3 4 2 )
                     ----->     ( 0 1 2 3 0 )
                                ( 0 0 0 0 1 )

Returning to equations, in this example we have manipulated the augmented matrix of the system

    y + 2z + 3w = 0
    x + 2y + 3z + 4w = 2
    2x + 3y + 4z + 5w = 0,

the third of those proposed at (⋆). The significance of the EROs is the following sequence of operations on equations:
(1) interchange the first and second equations;
(2) subtract 2 times the (new) first equation from the third equation;
(3) add the (current) second equation to the (current) third equation;
(4) multiply both sides of the (current) third equation by −1/4.

It is not hard to see that such manipulations do not change the set of solutions. But now the third equation has become 0x + 0y + 0z + 0w = 1, which clearly has no solutions. Therefore the original set of three simultaneous equations had no solutions—it is inconsistent.

The point of all this is that it is completely general and sufficiently routine that it can easily be mechanised. A machine can work with the augmented matrix of a system of simultaneous linear equations.
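Indeed, the reduction in Example 1.1 is routine enough to script directly. A minimal sketch in Python (exact arithmetic via the standard-library Fraction type; the four steps mirror the EROs of the example):

```python
from fractions import Fraction

# Augmented matrix of the system in Example 1.1, with exact arithmetic.
aug = [[Fraction(v) for v in row] for row in
       [[0, 1, 2, 3, 0],
        [1, 2, 3, 4, 2],
        [2, 3, 4, 5, 0]]]

aug[0], aug[1] = aug[1], aug[0]                        # P(1,2)
aug[2] = [a - 2 * b for a, b in zip(aug[2], aug[0])]   # S(1,3,-2)
aug[2] = [a + b for a, b in zip(aug[2], aug[1])]       # S(2,3,1)
aug[2] = [-a / 4 for a in aug[2]]                      # M(3,-1/4)

print(aug[2])   # last row is [0, 0, 0, 0, 1]: it reads 0 = 1, so the system is inconsistent
```

Using Fraction rather than floating-point numbers avoids rounding error, which matters here: deciding consistency hinges on entries being exactly zero.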
If the final few rows of the echelon form are zero they can be deleted (they say that 0 = 0, which is certainly true, but is not helpful). Then all rows are non-zero. If the final row has its leading 1 in the (n+1)st position (the final position) then the equations are inconsistent. Otherwise we can choose arbitrary values for those variables x_j for which the jth column of E contains no leading 1, and the remaining variables are then determined.
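To illustrate the free-variable situation: the middle system of §1.1 (x + 2y + 3z = 6, 2x + 3y + 4z = 9, 3x + 4y + 5z = 12) row-reduces to x + 2y + 3z = 6, y + 2z = 3, 0 = 0, so z may be chosen freely. A sketch of the resulting one-parameter family (the parametrisation below is ours, obtained by back-substitution and worth checking by hand):

```python
def solution(t):
    """With the free variable z = t, back-substitution gives y = 3 - 2t, x = t."""
    return (t, 3 - 2 * t, t)

def satisfies(t):
    """Check (x, y, z) = solution(t) against all three original equations."""
    x, y, z = solution(t)
    return (x + 2*y + 3*z == 6 and
            2*x + 3*y + 4*z == 9 and
            3*x + 4*y + 5*z == 12)

print(all(satisfies(t) for t in range(-5, 6)))   # True
```

Every choice of t gives a solution, so the system has infinitely many solutions, one for each value of the free variable.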
