Linear Algebra II
Course No. 100222
Spring 2007
Michael Stoll
With some additions by Ronald van Luijk, 2016

Contents
1. Review of Eigenvalues, Eigenvectors and Characteristic Polynomial
2. Direct Sums of Subspaces
3. The Cayley-Hamilton Theorem and the Minimal Polynomial
4. The Structure of Nilpotent Endomorphisms
5. The Jordan Normal Form Theorem
6. The Dual Vector Space
7. Norms on Real Vector Spaces
8. Bilinear Forms
9. Inner Product Spaces
10. Orthogonal Diagonalization
11. External Direct Sums
12. The Tensor Product
13. Symmetric and Alternating Products
References
Index of notation
Index

1. Review of Eigenvalues, Eigenvectors and Characteristic Polynomial

Recall the topics we finished Linear Algebra I with. We were discussing eigenvalues and eigenvectors of endomorphisms and square matrices, and the question when they are diagonalizable. For your convenience, I will repeat here the most relevant definitions and results.

Let $V$ be a finite-dimensional $F$-vector space, $\dim V = n$, and let $f : V \to V$ be an endomorphism. Then for $\lambda \in F$, the $\lambda$-eigenspace of $f$ was defined to be
\[ E_\lambda(f) = \{v \in V : f(v) = \lambda v\} = \ker(f - \lambda\,\mathrm{id}_V). \]
$\lambda$ is an eigenvalue of $f$ if $E_\lambda(f) \neq \{0\}$, i.e., if there is $0 \neq v \in V$ such that $f(v) = \lambda v$. Such a vector $v$ is called an eigenvector of $f$ for the eigenvalue $\lambda$.

The eigenvalues are exactly the roots (in $F$) of the characteristic polynomial of $f$,
\[ P_f(x) = \det(x\,\mathrm{id}_V - f), \]
which is a monic polynomial of degree $n$ with coefficients in $F$.

The geometric multiplicity of $\lambda$ as an eigenvalue of $f$ is defined to be the dimension of the $\lambda$-eigenspace, whereas the algebraic multiplicity of $\lambda$ as an eigenvalue of $f$ is defined to be its multiplicity as a root of the characteristic polynomial.

The endomorphism $f$ is said to be diagonalizable if there exists a basis of $V$ consisting of eigenvectors of $f$. The matrix representing $f$ relative to this basis is then a diagonal matrix, with the various eigenvalues appearing on the diagonal.

Since $n \times n$ matrices can be identified with endomorphisms $F^n \to F^n$, all notions and results make sense for square matrices, too. A matrix $A \in \mathrm{Mat}(n, F)$ is diagonalizable if and only if it is similar to a diagonal matrix, i.e., if there is an invertible matrix $P \in \mathrm{Mat}(n, F)$ such that $P^{-1}AP$ is diagonal.

It is an important fact that the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity. An endomorphism or square matrix is diagonalizable if and only if the sum of the geometric multiplicities of all eigenvalues equals the dimension of the space. This in turn is equivalent to the two conditions (a) the characteristic polynomial is a product of linear factors, and (b) for each eigenvalue, algebraic and geometric multiplicities agree. For example, both conditions are satisfied if $P_f$ is the product of $n$ distinct monic linear factors.
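For experimentation, here is a minimal computational sketch of these notions, assuming Python with numpy is available; the matrices below are illustrative choices, not taken from the notes.

```python
import numpy as np

# Example matrix with two distinct eigenvalues (1 and 3), hence diagonalizable.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
D = np.diag(eigvals)

# P^{-1} A P is diagonal, with the eigenvalues on the diagonal.
assert np.allclose(np.linalg.inv(P) @ A @ P, D)

# By contrast, [[1, 1], [0, 1]] has eigenvalue 1 with algebraic multiplicity 2
# but geometric multiplicity 1, so no basis of eigenvectors exists.
```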
2. Direct Sums of Subspaces

The proof of the Jordan Normal Form Theorem, which is one of our goals, uses the idea to split the vector space $V$ into subspaces on which the endomorphism can be more easily described. In order to make this precise, we introduce the notion of the direct sum of linear subspaces of $V$.

2.1. Definition. Suppose $I$ is an index set and $U_i \subset V$ (for $i \in I$) are linear subspaces of a vector space $V$ satisfying
\[ (1)\qquad U_j \cap \sum_{i \in I \setminus \{j\}} U_i = \{0\} \]
for all $j \in I$. Then we write $\bigoplus_{i \in I} U_i$ for the subspace $\sum_{i \in I} U_i$ of $V$, and we call this sum the direct sum of the subspaces $U_i$. Whenever we use this notation, the hypothesis (1) is implied. If $I = \{1, 2, \ldots, n\}$, then we also write $U_1 \oplus U_2 \oplus \cdots \oplus U_n$.

2.2. Lemma. Let $V$ be a vector space, and $U_i \subset V$ (for $i \in I$) linear subspaces. Then the following statements are equivalent.
(1) Every $v \in V$ can be written uniquely as $v = \sum_{i \in I} u_i$ with $u_i \in U_i$ for all $i \in I$ (and only finitely many $u_i \neq 0$).
(2) $\sum_{i \in I} U_i = V$, and for all $j \in I$, we have $U_j \cap \sum_{i \in I \setminus \{j\}} U_i = \{0\}$.
(3) If we have any basis $B_i$ of $U_i$ for each $i \in I$, then these bases $B_i$ are pairwise disjoint, and the union $\bigcup_{i \in I} B_i$ forms a basis of $V$.
(4) There exists a basis $B_i$ of $U_i$ for each $i \in I$ such that these bases $B_i$ are pairwise disjoint, and the union $\bigcup_{i \in I} B_i$ forms a basis of $V$.

By statement (2) of this lemma, if these conditions are satisfied, then $V$ is the direct sum of the subspaces $U_i$, that is, we have $V = \bigoplus_{i \in I} U_i$.

Proof. "(1) ⇒ (2)": Since every $v \in V$ can be written as a sum of elements of the $U_i$, we have $V = \sum_{i \in I} U_i$. Now assume that $v \in U_j \cap \sum_{i \neq j} U_i$. This gives two representations of $v$ as $v = u_j = \sum_{i \neq j} u_i$. Since there is only one way of writing $v$ as a sum of $u_i$'s, this is only possible when $v = 0$.

"(2) ⇒ (3)": Since the elements of any basis are nonzero, and $B_i$ is contained in $U_i$ for all $i$, it follows from $U_j \cap \sum_{i \in I \setminus \{j\}} U_i = \{0\}$ that $B_i \cap B_j = \emptyset$ for all $i \neq j$. Let $B = \bigcup_{i \in I} B_i$. Since $B_i$ generates $U_i$ and $\sum_i U_i = V$, we find that $B$ generates $V$. To show that $B$ is linearly independent, consider a linear combination
\[ \sum_{i \in I} \sum_{b \in B_i} \lambda_{i,b}\, b = 0. \]
For any fixed $j \in I$, we can write this as
\[ U_j \ni u_j = \sum_{b \in B_j} \lambda_{j,b}\, b = -\sum_{i \neq j} \sum_{b \in B_i} \lambda_{i,b}\, b \in \sum_{i \neq j} U_i. \]
By (2), this implies that $u_j = 0$. Since $B_j$ is a basis of $U_j$, this is only possible when $\lambda_{j,b} = 0$ for all $b \in B_j$. Since $j \in I$ was arbitrary, this shows that all coefficients vanish.

"(3) ⇒ (4)": This follows by choosing any basis $B_i$ for $U_i$ (see Remark 2.3).

"(4) ⇒ (1)": Take a basis $B_i$ for $U_i$ for each $i \in I$. Write $v \in V$ as a linear combination of the basis elements in $\bigcup_i B_i$. Since $B_i$ is a basis of $U_i$, we may write the part of the linear combination coming from $B_i$ as $u_i$, which yields $v = \sum_i u_i$ with $u_i \in U_i$. To see that the $u_i$ are unique, we note that the $u_i$ can be written as linear combinations of elements in $B_i$; the sum $v = \sum_i u_i$ is then a linear combination of elements in $\bigcup_i B_i$, which has to be the same as the original linear combination, because $\bigcup_i B_i$ is a basis for $V$. It follows that indeed all the $u_i$ are uniquely determined. □

2.3. Remark. The proof of the implication (3) ⇒ (4) implicitly assumes the existence of a basis $B_i$ for each $U_i$. The existence of a basis $B_i$ for $U_i$ is clear when $U_i$ is finite-dimensional, but for infinite-dimensional vector spaces this is more subtle. Using Zorn's Lemma, which is equivalent to the Axiom of Choice of Set Theory, one can prove that all vector spaces do indeed have a basis. See Appendix D of Linear Algebra I, 2015 edition. We will use this more often.

2.4. Remark. If $U_1$ and $U_2$ are linear subspaces of the vector space $V$, then the statement $V = U_1 \oplus U_2$ is equivalent to $U_1$ and $U_2$ being complementary subspaces.
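To make conditions (1) and (4) of Lemma 2.2 concrete, the following sketch (again Python/numpy, with an ad-hoc example decomposition of $\mathbb{R}^3$) checks that the union of bases is a basis and recovers the unique decomposition $v = u_1 + u_2$ by solving a linear system.

```python
import numpy as np

# R^3 = U1 (+) U2 with U1 = span{(1,0,0)} and U2 = span{(1,1,0), (0,1,1)}.
B1 = [np.array([1.0, 0.0, 0.0])]
B2 = [np.array([1.0, 1.0, 0.0]), np.array([0.0, 1.0, 1.0])]

# Condition (4): the (disjoint) union of the bases is a basis of R^3.
B = np.column_stack(B1 + B2)
assert np.linalg.matrix_rank(B) == 3

# Condition (1): every v decomposes uniquely as v = u1 + u2.
v = np.array([2.0, -1.0, 5.0])
c = np.linalg.solve(B, v)           # coordinates of v relative to the union
u1 = c[0] * B1[0]
u2 = c[1] * B2[0] + c[2] * B2[1]
assert np.allclose(u1 + u2, v)
```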
2.5. Lemma. Suppose $V$ is a vector space with subspaces $U$ and $U'$ such that $V = U \oplus U'$. If $U_1, \ldots, U_r$ are subspaces of $U$ with $U = U_1 \oplus \cdots \oplus U_r$ and $U'_1, \ldots, U'_s$ are subspaces of $U'$ with $U' = U'_1 \oplus \cdots \oplus U'_s$, then we have
\[ V = U_1 \oplus \cdots \oplus U_r \oplus U'_1 \oplus \cdots \oplus U'_s. \]

Proof. This follows most easily from part (1) of Lemma 2.2. □

The converse of this lemma is trivial in the sense that if we have $V = U_1 \oplus \cdots \oplus U_r \oplus U'_1 \oplus \cdots \oplus U'_s$, then apparently the $r + s$ subspaces $U_1, \ldots, U_r, U'_1, \ldots, U'_s$ satisfy the hypothesis (1), which implies that also the $r$ subspaces $U_1, \ldots, U_r$ satisfy this hypothesis, as well as the subspaces $U'_1, \ldots, U'_s$; then also the two subspaces $U = U_1 \oplus \cdots \oplus U_r$ and $U' = U'_1 \oplus \cdots \oplus U'_s$ together satisfy the hypothesis, and we have $V = U \oplus U'$. In other words, we may write
\[ (U_1 \oplus \cdots \oplus U_r) \oplus (U'_1 \oplus \cdots \oplus U'_s) = U_1 \oplus \cdots \oplus U_r \oplus U'_1 \oplus \cdots \oplus U'_s \]
in the sense that if all the implied conditions of the form (1) are satisfied for one side of the equality, then the same holds for the other side, and the (direct) sums are then equal. In particular, we have $U_1 \oplus (U_2 \oplus \cdots \oplus U_r) = U_1 \oplus \cdots \oplus U_r$.

The following lemma states that if two subspaces intersect each other trivially, then one can be extended to a complementary space of the other. Its proof also suggests how we can do the extension explicitly.

2.6. Lemma. Let $U$ and $U'$ be subspaces of a finite-dimensional vector space $V$ satisfying $U \cap U' = \{0\}$. Then there exists a subspace $W \subset V$ with $U' \subset W$ that is a complementary subspace of $U$ in $V$.

Proof. Let $(u_1, \ldots, u_r)$ be a basis for $U$ and $(v_1, \ldots, v_s)$ a basis for $U'$. Then by Lemma 2.2 we have a basis $(u_1, \ldots, u_r, v_1, \ldots, v_s)$ for $U + U' = U \oplus U'$. By the Basis Extension Theorem of Linear Algebra I, we may extend this to a basis $(u_1, \ldots, u_r, v_1, \ldots, v_s, w_1, \ldots, w_t)$ for $V$. We now let $W$ be the subspace generated by $v_1, \ldots, v_s, w_1, \ldots, w_t$. Then $(v_1, \ldots, v_s, w_1, \ldots, w_t)$ is a basis for $W$, and clearly $W$ contains $U'$. By Lemma 2.2 we conclude that $U$ and $W$ are complementary subspaces. □

Next, we discuss the relation between endomorphisms of $V$ and endomorphisms of the $U_i$.

2.7. Lemma and Definition. Let $V$ be a vector space with linear subspaces $U_i$ ($i \in I$) such that $V = \bigoplus_{i \in I} U_i$. For each $i \in I$, let $f_i : U_i \to U_i$ be an endomorphism. Then there is a unique endomorphism $f : V \to V$ such that $f|_{U_i} = f_i$ for all $i \in I$.

We call $f$ the direct sum of the $f_i$ and write $f = \bigoplus_{i \in I} f_i$.

Proof. Let $v \in V$. Then we have $v = \sum_i u_i$ as above, therefore the only way to define $f$ is by $f(v) = \sum_i f_i(u_i)$. This proves uniqueness. Since the $u_i$ in the representation of $v$ above are unique, $f$ is a well-defined map, and it is clear that $f$ is linear, so $f$ is an endomorphism of $V$. □

2.8. Remark. If in the situation just described $V$ is finite-dimensional and we choose a basis of $V$ that is the union of bases of the $U_i$, then the matrix representing $f$ relative to that basis will be a block diagonal matrix, where the diagonal blocks are the matrices representing the $f_i$ relative to the bases of the $U_i$.
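A quick numerical sketch of Remark 2.8 (Python/numpy; the blocks $F_1$, $F_2$ are arbitrary illustrative choices): relative to the union of the bases, $f = f_1 \oplus f_2$ is represented by a block diagonal matrix and acts on each summand separately.

```python
import numpy as np

# f1 on U1 (dim 2) and f2 on U2 (dim 1), relative to chosen bases.
F1 = np.array([[0.0, 1.0],
               [1.0, 0.0]])
F2 = np.array([[5.0]])

# Matrix of f = f1 (+) f2 relative to the union of the bases.
F = np.block([[F1, np.zeros((2, 1))],
              [np.zeros((1, 2)), F2]])

# f acts blockwise: the U1- and U2-components never mix.
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(F @ v, np.concatenate([F1 @ v[:2], F2 @ v[2:]]))
```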
2.9. Lemma. Let $V$ be a vector space with linear subspaces $U_i$ ($i \in I$) such that $V = \bigoplus_{i \in I} U_i$. Let $f : V \to V$ be an endomorphism. Then there are endomorphisms $f_i : U_i \to U_i$ for $i \in I$ such that $f = \bigoplus_{i \in I} f_i$ if and only if each $U_i$ is invariant under $f$ (or $f$-invariant), i.e., $f(U_i) \subset U_i$.

Proof. If $f = \bigoplus_i f_i$, then $f_i = f|_{U_i}$, hence $f(U_i) = f|_{U_i}(U_i) = f_i(U_i) \subset U_i$. Conversely, suppose that $f(U_i) \subset U_i$. Then we can define $f_i : U_i \to U_i$ to be the restriction of $f$ to $U_i$; it is then clear that $f_i$ is an endomorphism of $U_i$ and that $f$ equals $\bigoplus_i f_i$, as the two coincide on all the subspaces $U_i$, which together generate $V$. □

We now come to a relation between splittings of $f$ as a direct sum and the characteristic or minimal polynomial of $f$. We call two polynomials $p_1(x)$ and $p_2(x)$ coprime if there are polynomials $a_1(x)$ and $a_2(x)$ such that $a_1(x)p_1(x) + a_2(x)p_2(x) = 1$.

2.10. Lemma. Let $V$ be a vector space and $f : V \to V$ an endomorphism. Let $p(x) = p_1(x)p_2(x)$ be a polynomial such that $p(f) = 0$ and such that $p_1(x)$ and $p_2(x)$ are coprime. Let $U_i = \ker\bigl(p_i(f)\bigr)$ for $i = 1, 2$. Then $V = U_1 \oplus U_2$ and the $U_i$ are $f$-invariant. In particular, $f = f_1 \oplus f_2$, where $f_i = f|_{U_i}$. Moreover, we have $U_1 = \mathrm{im}\bigl(p_2(f)\bigr)$ and $U_2 = \mathrm{im}\bigl(p_1(f)\bigr)$.

Proof. Set $K_1 = \mathrm{im}\bigl(p_2(f)\bigr)$ and $K_2 = \mathrm{im}\bigl(p_1(f)\bigr)$. We first show that $K_i \subset U_i$ for $i = 1, 2$. Let $v \in K_1 = \mathrm{im}\bigl(p_2(f)\bigr)$, so $v = \bigl(p_2(f)\bigr)(u)$ for some $u \in V$. Then
\[ \bigl(p_1(f)\bigr)(v) = \bigl(p_1(f)\bigr)\Bigl(\bigl(p_2(f)\bigr)(u)\Bigr) = \bigl(p_1(f)p_2(f)\bigr)(u) = \bigl(p(f)\bigr)(u) = 0, \]
so $K_1 = \mathrm{im}\bigl(p_2(f)\bigr) \subset \ker\bigl(p_1(f)\bigr) = U_1$. The statement for $i = 2$ follows by symmetry.

Now we show that $U_1 \cap U_2 = \{0\}$. So let $v \in U_1 \cap U_2$. Then $\bigl(p_1(f)\bigr)(v) = \bigl(p_2(f)\bigr)(v) = 0$. Using
\[ \mathrm{id}_V = 1(f) = \bigl(a_1(x)p_1(x) + a_2(x)p_2(x)\bigr)(f) = a_1(f) \circ p_1(f) + a_2(f) \circ p_2(f), \]
we see that
\[ v = \bigl(a_1(f)\bigr)\Bigl(\bigl(p_1(f)\bigr)(v)\Bigr) + \bigl(a_2(f)\bigr)\Bigl(\bigl(p_2(f)\bigr)(v)\Bigr) = \bigl(a_1(f)\bigr)(0) + \bigl(a_2(f)\bigr)(0) = 0. \]

Next, we show that $K_1 + K_2 = V$. Using the same relation above, and the fact that $p_i(f)$ and $a_i(f)$ commute, we find for $v \in V$ arbitrary that
\[ v = \bigl(p_1(f)\bigr)\Bigl(\bigl(a_1(f)\bigr)(v)\Bigr) + \bigl(p_2(f)\bigr)\Bigl(\bigl(a_2(f)\bigr)(v)\Bigr) \in \mathrm{im}\bigl(p_1(f)\bigr) + \mathrm{im}\bigl(p_2(f)\bigr). \]

These statements together imply that $K_i = U_i$ for $i = 1, 2$, and $V = U_1 \oplus U_2$. Indeed, let $v \in U_1$. We can write $v = v_1 + v_2$ with $v_i \in K_i$. Then $U_1 \ni v - v_1 = v_2 \in U_2$, but $U_1 \cap U_2 = \{0\}$, so $v = v_1 \in K_1$.

Finally, we have to show that $U_1$ and $U_2$ are $f$-invariant. So let (e.g.) $v \in U_1$. Since $f$ commutes with $p_1(f)$, we have
\[ \bigl(p_1(f)\bigr)\bigl(f(v)\bigr) = \bigl(p_1(f) \circ f\bigr)(v) = \bigl(f \circ p_1(f)\bigr)(v) = f\Bigl(\bigl(p_1(f)\bigr)(v)\Bigr) = f(0) = 0 \]
(since $v \in U_1 = \ker\bigl(p_1(f)\bigr)$), hence $f(v) \in U_1$ as well. □
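Lemma 2.10 can be checked on a small example (a sketch assuming Python with sympy; the matrix $A$ is an illustrative choice annihilated by $p(x) = (x-1)(x-2)$).

```python
import sympy as sp

A = sp.Matrix([[1, 1],
               [0, 2]])
I = sp.eye(2)

# p(x) = p1(x) p2(x) with p1 = x - 1, p2 = x - 2 satisfies p(A) = 0,
# and p1, p2 are coprime.
assert (A - I) * (A - 2 * I) == sp.zeros(2, 2)

U1 = (A - I).nullspace()            # ker p1(A)
U2 = (A - 2 * I).nullspace()        # ker p2(A)
K1 = (A - 2 * I).columnspace()      # im p2(A)
K2 = (A - I).columnspace()          # im p1(A)

# U1 = im p2(A) and U2 = im p1(A): the spanning vectors are proportional.
assert sp.Matrix.hstack(U1[0], K1[0]).rank() == 1
assert sp.Matrix.hstack(U2[0], K2[0]).rank() == 1

# V = U1 (+) U2: together the two kernels span F^2.
assert sp.Matrix.hstack(U1[0], U2[0]).rank() == 2
```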
2.11. Proposition. Let $V$ be a vector space and $f : V \to V$ an endomorphism. Let $p(x) = p_1(x)p_2(x)\cdots p_k(x)$ be a polynomial such that $p(f) = 0$ and such that the factors $p_i(x)$ are coprime in pairs. Let $U_i = \ker\bigl(p_i(f)\bigr)$. Then $V = U_1 \oplus \cdots \oplus U_k$ and the $U_i$ are $f$-invariant. In particular, $f = f_1 \oplus \cdots \oplus f_k$, where $f_i = f|_{U_i}$.

Proof. We proceed by induction on $k$. The case $k = 1$ is trivial. So let $k \geq 2$, and denote $q(x) = p_2(x)\cdots p_k(x)$. Then I claim that $p_1(x)$ and $q(x)$ are coprime. To see this, note that by assumption, we can write, for $i = 2, \ldots, k$,
\[ a_i(x)p_1(x) + b_i(x)p_i(x) = 1. \]
Multiplying these equations, we obtain
\[ A(x)p_1(x) + b_2(x)\cdots b_k(x)q(x) = 1; \]
note that all the terms except $b_2(x)\cdots b_k(x)q(x)$ that we get when expanding the product of the left hand sides contain a factor $p_1(x)$.

We can then apply Lemma 2.10 to $p(x) = p_1(x)q(x)$ and find that $V = U_1 \oplus U'$ and $f = f_1 \oplus f'$ with $U_1 = \ker\bigl(p_1(f)\bigr)$, $f_1 = f|_{U_1}$, and $U' = \ker\bigl(q(f)\bigr)$, $f' = f|_{U'}$. In particular, $q(f') = 0$. By induction, we then know that $U' = U_2 \oplus \cdots \oplus U_k$ with $U_j = \ker\bigl(p_j(f')\bigr)$ and $f' = f_2 \oplus \cdots \oplus f_k$, where $f_j = f'|_{U_j}$, for $j = 2, \ldots, k$. Finally, $\ker\bigl(p_j(f')\bigr) = \ker\bigl(p_j(f)\bigr)$ (since the latter is contained in $U'$) and $f_j = f'|_{U_j} = f|_{U_j}$, so that we obtain the desired conclusion from Lemma 2.5. □

The following little lemma about polynomials is convenient if we want to apply Lemma 2.10.

2.12. Lemma. If $p(x)$ is a polynomial (over $F$) and $\lambda \in F$ such that $p(\lambda) \neq 0$, then $(x - \lambda)^m$ and $p(x)$ are coprime for all $m \geq 1$.

Proof. First, consider $m = 1$. Let
\[ q(x) = \frac{p(x)}{p(\lambda)} - 1; \]
this is a polynomial such that $q(\lambda) = 0$. Therefore, we can write $q(x) = (x - \lambda)r(x)$ with some polynomial $r(x)$. This gives us
\[ -r(x)(x - \lambda) + \frac{1}{p(\lambda)}\,p(x) = 1. \]
Now, taking the $m$th power on both sides, we obtain an equation
\[ \bigl(-r(x)\bigr)^m (x - \lambda)^m + a(x)p(x) = 1. \]
□
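The coprimality certificate used in Lemmas 2.10 and 2.12 can be produced explicitly with an extended gcd computation (a sketch assuming Python with sympy; the polynomials are arbitrary examples with $p(\lambda) \neq 0$).

```python
import sympy as sp

x = sp.symbols('x')

# p(1) = 2 != 0, so (x - 1)^m and p(x) are coprime for all m >= 1.
p = x**2 + 1
g = (x - 1)**3

# Extended gcd: a*g + b*p = gcd(g, p) = 1, exhibiting the coprimality.
a, b, h = sp.gcdex(g, p, x)
assert h == 1
assert sp.expand(a * g + b * p) == 1
```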
3. The Cayley-Hamilton Theorem and the Minimal Polynomial

Let $A \in \mathrm{Mat}(n, F)$. We know that $\mathrm{Mat}(n, F)$ is an $F$-vector space of dimension $n^2$. Therefore, the elements $I, A, A^2, \ldots, A^{n^2}$ cannot be linearly independent (because their number exceeds the dimension). If we define $p(A)$ in the obvious way for $p$ a polynomial with coefficients in $F$ (as we already did in the previous chapter), then we can deduce that there is a (non-zero) polynomial $p$ of degree at most $n^2$ such that $p(A) = 0$ (0 here is the zero matrix).

In fact, much more is true. Consider a diagonal matrix $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. (This notation is supposed to mean that $\lambda_j$ is the $(j,j)$ entry of $D$; the off-diagonal entries are zero, of course.) Its characteristic polynomial is
\[ P_D(x) = (x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n). \]
Since the diagonal entries are roots of $P_D$, we also have $P_D(D) = 0$. More generally, consider a diagonalizable matrix $A$. Then there is an invertible matrix $Q$ such that $D = Q^{-1}AQ$ is diagonal. Since (Exercise!) $p(Q^{-1}AQ) = Q^{-1}p(A)Q$ for $p$ a polynomial, we find
\[ 0 = P_D(D) = Q^{-1}P_D(A)Q = Q^{-1}P_A(A)Q \;\Longrightarrow\; P_A(A) = 0. \]
(Recall that $P_A = P_D$ — similar matrices have the same characteristic polynomial.)

The following theorem states that this is true for all square matrices (or endomorphisms of finite-dimensional vector spaces).

3.1. Theorem (Cayley-Hamilton). Let $A \in \mathrm{Mat}(n, F)$. Then $P_A(A) = 0$.

Proof. Here is a simple, but wrong "proof". By definition, $P_A(x) = \det(xI - A)$, so, plugging in $A$ for $x$, we have $P_A(A) = \det(AI - A) = \det(A - A) = \det(0) = 0$. (Exercise: find the mistake!)

For the correct proof, we need to consider matrices whose entries are polynomials. Since polynomials satisfy the field axioms except for the existence of inverses, we can perform all operations that do not require divisions. This includes addition, multiplication and determinants; in particular, we can use the adjugate matrix.

Let $B = xI - A$; then $\det(B) = P_A(x)$. Let $\tilde{B}$ be the adjugate matrix; then we still have $\tilde{B}B = \det(B)I$. The entries of $\tilde{B}$ come from determinants of $(n-1) \times (n-1)$ submatrices of $B$, therefore they are polynomials of degree at most $n - 1$. We can then write
\[ \tilde{B} = x^{n-1}B_{n-1} + x^{n-2}B_{n-2} + \cdots + xB_1 + B_0, \]
and we have the equality (of matrices with polynomial entries)
\[ (x^{n-1}B_{n-1} + x^{n-2}B_{n-2} + \cdots + B_0)(xI - A) = P_A(x)I = (x^n + b_{n-1}x^{n-1} + \cdots + b_0)I, \]
where we have set $P_A(x) = x^n + b_{n-1}x^{n-1} + \cdots + b_0$. Expanding the left hand side and comparing coefficients of like powers of $x$, we find the relations
\[ B_{n-1} = I, \quad B_{n-2} - B_{n-1}A = b_{n-1}I, \quad \ldots, \quad B_0 - B_1A = b_1I, \quad -B_0A = b_0I. \]
We multiply these from the right by $A^n$, $A^{n-1}$, ..., $A$, $I$, respectively, and add:
\[
\begin{aligned}
B_{n-1}A^n &= A^n \\
B_{n-2}A^{n-1} - B_{n-1}A^n &= b_{n-1}A^{n-1} \\
&\;\;\vdots \\
B_0A - B_1A^2 &= b_1A \\
-B_0A &= b_0I
\end{aligned}
\]
On adding, the left hand sides telescope to $0$, while the right hand sides sum to $P_A(A)$; hence $0 = P_A(A)$. □

3.2. Remarks.
(1) The reason why we cannot simply plug in $A$ for $x$ in the identity $\tilde{B} \cdot (xI - A) = P_A(x)I$ is that whereas $x$ (as a scalar) commutes with the matrices occurring as coefficients of powers of $x$, it is not a priori clear that $A$ does so, too. We will discuss this in more detail in the Introductory Algebra course, where polynomial rings will be studied in some detail.
(2) Another idea of proof (and maybe easier to grasp) is to say that a 'generic' matrix is diagonalizable (if we assume $F$ to be algebraically closed...), hence the statement holds for 'most' matrices. Since it is just a bunch of polynomial relations between the matrix entries, it then must hold for all matrices. This can indeed be turned into a proof, but unfortunately, this requires rather advanced tools from algebra.
(3) Of course, the statement of the theorem remains true for endomorphisms. Let $f : V \to V$ be an endomorphism of the finite-dimensional $F$-vector space $V$; then $P_f(f) = 0$ (which is the zero endomorphism in this case). For evaluating the polynomial at $f$, we have to interpret $f^n$ as the $n$-fold composition $f \circ f \circ \cdots \circ f$, and $f^0 = \mathrm{id}_V$.

Our next goal is to define the minimal polynomial of a matrix or endomorphism, as the monic polynomial of smallest degree that has the matrix or endomorphism as a "root". However, we need to know a few more facts about polynomials in order to see that this definition makes sense.

3.3. Lemma (Polynomial Division). Let $f$ and $g$ be polynomials, with $g$ monic. Then there are unique polynomials $q$ and $r$ such that $r = 0$ or $\deg(r) < \deg(g)$ and such that $f = qg + r$.

Proof. We first prove existence, by induction on the degree of $f$. If $\deg(f) < \deg(g)$, then we take $q = 0$ and $r = f$. So we now assume that $m = \deg(f) \geq \deg(g) = n$, $f = a_m x^m + \cdots + a_0$. Let $f' = f - a_m x^{m-n} g$; then (since $g = x^n + \ldots$) $\deg(f') < \deg(f)$. By the induction hypothesis, there are $q'$ and $r$ such that $\deg(r) < \deg(g)$ or $r = 0$ and such that $f' = q'g + r$. Then $f = (q' + a_m x^{m-n})g + r$. (This proof leads to the well-known algorithm for polynomial long division.)

As to uniqueness, suppose we have $f = qg + r = q'g + r'$, with $r$ and $r'$ both of degree less than $\deg(g)$ or zero. Then $(q - q')g = r' - r$. If $q \neq q'$, then the degree of the left hand side is at least $\deg(g)$, but the degree of the right hand side is smaller, hence this is not possible. So $q = q'$, and therefore $r = r'$, too. □

Taking $g = x - \alpha$, this provides a different proof for case $k = 1$ of Example 8.4 of Linear Algebra I, 2015 edition.
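Both the Cayley-Hamilton Theorem and Lemma 3.3 are easy to verify on examples (a sketch assuming Python with sympy; the matrix and the polynomials are illustrative choices).

```python
import sympy as sp

x = sp.symbols('x')

# Cayley-Hamilton: P_A(A) = 0.
A = sp.Matrix([[2, 1],
               [0, 3]])
coeffs = A.charpoly(x).all_coeffs()          # [1, -5, 6] for x^2 - 5x + 6
n = len(coeffs) - 1
# Evaluate P_A at A, reading A^0 as the identity matrix.
P_of_A = sum((c * A**(n - k) for k, c in enumerate(coeffs)), sp.zeros(2, 2))
assert P_of_A == sp.zeros(2, 2)

# Lemma 3.3: division with remainder by a monic polynomial.
f = x**4 + x + 1
g = x**2 - 2
q, r = sp.div(f, g, x)                        # q = x^2 + 2, r = x + 5
assert sp.expand(q * g + r - f) == 0
assert r == 0 or sp.degree(r, x) < sp.degree(g, x)
```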
