Matrix Differential Calculus
with Applications in Statistics
and Econometrics
WILEY SERIES IN PROBABILITY AND STATISTICS
Established by WALTER E. SHEWHART AND SAMUEL S. WILKS
Editors: Vic Barnett, Noel A. C. Cressie, Nicholas I. Fisher,
Iain M. Johnstone, J. B. Kadane, David G. Kendall, David W. Scott,
Bernard W. Silverman, Adrian F. M. Smith, Jozef L. Teugels
Editors Emeritus: Ralph A. Bradley, J. Stuart Hunter
A complete list of the titles in this series appears at the end of this volume
Matrix
Differential
Calculus
with Applications
in Statistics
and Econometrics
Third Edition
JAN R. MAGNUS
CentER, Tilburg University
and
HEINZ NEUDECKER
Cesaro, Schagen
JOHN WILEY & SONS
Chichester • New York • Weinheim • Brisbane • Singapore • Toronto
Copyright © 1988, 1999 John Wiley & Sons Ltd,
Baffins Lane, Chichester,
West Sussex PO19 1UD, England
National 01243 779777
International (+44) 1243 779777
Copyright © 1999 of the English and Russian LaTeX file CentER, Tilburg University,
P.O. Box 90153, 5000 LE Tilburg, The Netherlands
Copyright © 2007 of the Third Edition Jan Magnus and Heinz Neudecker. All rights reserved.
Publication data for the second (revised) edition
Library of Congress Cataloging in Publication Data
Magnus, Jan R.
Matrix differential calculus with applications in statistics and
econometrics / J. R. Magnus and H. Neudecker — Rev. ed. p. cm.
Includes bibliographical references and index.
ISBN 0-471-98632-1 (alk. paper); ISBN 0-471-98633-X (pbk: alk. paper)
1. Matrices. 2. Differential Calculus. 3. Statistics.
4. Econometrics. I. Neudecker, Heinz. II. Title.
QA188.M345 1999
512.9′434—dc21 98-53556
CIP
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0-471-98632-1; 0-471-98633-X (pbk)
Publication data for the third edition
This is version 07/01.
Last update: 16 January 2007.
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Part One — Matrices
1 Basic properties of vectors and matrices 3
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
3 Matrices: addition and multiplication . . . . . . . . . . . . . . . 4
4 The transpose of a matrix . . . . . . . . . . . . . . . . . . . . . 6
5 Square matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
6 Linear forms and quadratic forms . . . . . . . . . . . . . . . . . 7
7 The rank of a matrix . . . . . . . . . . . . . . . . . . . . . . . . 8
8 The inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
9 The determinant . . . . . . . . . . . . . . . . . . . . . . . . . . 10
10 The trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
11 Partitioned matrices . . . . . . . . . . . . . . . . . . . . . . . . 11
12 Complex matrices . . . . . . . . . . . . . . . . . . . . . . . . . 13
13 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . 14
14 Schur’s decomposition theorem . . . . . . . . . . . . . . . . . . 17
15 The Jordan decomposition . . . . . . . . . . . . . . . . . . . . . 18
16 The singular-value decomposition . . . . . . . . . . . . . . . . . 19
17 Further results concerning eigenvalues . . . . . . . . . . . . . . 20
18 Positive (semi)definite matrices . . . . . . . . . . . . . . . . . . 23
19 Three further results for positive definite matrices . . . . . . . 25
20 A useful result . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 27
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2 Kronecker products, the vec operator and the Moore-Penrose inverse 31
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2 The Kronecker product . . . . . . . . . . . . . . . . . . . . . . 31
3 Eigenvalues of a Kronecker product . . . . . . . . . . . . . . . . 33
4 The vec operator . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5 The Moore-Penrose (MP) inverse . . . . . . . . . . . . . . . . 36
6 Existence and uniqueness of the MP inverse . . . . . . . . . . . 37
7 Some properties of the MP inverse . . . . . . . . . . . . . . . . 38
8 Further properties . . . . . . . . . . . . . . . . . . . . . . . . . 39
9 The solution of linear equation systems . . . . . . . . . . . . . 41
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 43
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3 Miscellaneous matrix results 47
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2 The adjoint matrix . . . . . . . . . . . . . . . . . . . . . . . . . 47
3 Proof of Theorem 1. . . . . . . . . . . . . . . . . . . . . . . . . 49
4 Bordered determinants . . . . . . . . . . . . . . . . . . . . . . . 51
5 The matrix equation AX = 0 . . . . . . . . . . . . . . . . . . 51
6 The Hadamard product . . . . . . . . . . . . . . . . . . . . . . 53
7 The commutation matrix K_{mn} . . . . . . . . . . . . . . . . 54
8 The duplication matrix D_n . . . . . . . . . . . . . . . . . . . 56
9 Relationship between D_{n+1} and D_n, I . . . . . . . . . . . . 58
10 Relationship between D_{n+1} and D_n, II . . . . . . . . . . . 60
11 Conditions for a quadratic form to be positive (negative) subject to linear constraints . . . 61
12 Necessary and sufficient conditions for r(A : B) = r(A) + r(B) . . 64
13 The bordered Gramian matrix . . . . . . . . . . . . . . . . . . 66
14 The equations X_1 A + X_2 B′ = G_1, X_1 B = G_2 . . . . . . . 68
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 71
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Part Two — Differentials: the theory
4 Mathematical preliminaries 75
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2 Interior points and accumulation points . . . . . . . . . . . . . 75
3 Open and closed sets . . . . . . . . . . . . . . . . . . . . . . . . 76
4 The Bolzano-Weierstrass theorem . . . . . . . . . . . . . . . . 79
5 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6 The limit of a function . . . . . . . . . . . . . . . . . . . . . . . 81
7 Continuous functions and compactness . . . . . . . . . . . . . . 82
8 Convex sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
9 Convex and concave functions . . . . . . . . . . . . . . . . . . . 85
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5 Differentials and differentiability 89
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
2 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3 Differentiability and linear approximation . . . . . . . . . . . . 91
4 The differential of a vector function. . . . . . . . . . . . . . . . 93
5 Uniqueness of the differential . . . . . . . . . . . . . . . . . . . 95
6 Continuity of differentiable functions . . . . . . . . . . . . . . . 96
7 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . 97
8 The first identification theorem . . . . . . . . . . . . . . . . . . 98
9 Existence of the differential, I . . . . . . . . . . . . . . . . . . . 99
10 Existence of the differential, II . . . . . . . . . . . . . . . . . . 101
11 Continuous differentiability . . . . . . . . . . . . . . . . . . . . 103
12 The chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
13 Cauchy invariance . . . . . . . . . . . . . . . . . . . . . . . . . 105
14 The mean-value theorem for real-valued functions . . . . . . . . 106
15 Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . 107
16 Some remarks on notation . . . . . . . . . . . . . . . . . . . . . 109
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 110
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6 The second differential 113
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
2 Second-order partial derivatives . . . . . . . . . . . . . . . . . . 113
3 The Hessian matrix. . . . . . . . . . . . . . . . . . . . . . . . . 114
4 Twice differentiability and second-order approximation, I . . . 115
5 Definition of twice differentiability . . . . . . . . . . . . . . . . 116
6 The second differential . . . . . . . . . . . . . . . . . . . . . . . 118
7 (Column) symmetry of the Hessian matrix . . . . . . . . . . . . 120
8 The second identification theorem . . . . . . . . . . . . . . . . 122
9 Twice differentiability and second-order approximation, II . . 123
10 Chain rule for Hessian matrices . . . . . . . . . . . . . . . . . . 125
11 The analogue for second differentials . . . . . . . . . . . . . . . 126
12 Taylor’s theorem for real-valued functions . . . . . . . . . . . . 128
13 Higher-order differentials. . . . . . . . . . . . . . . . . . . . . . 129
14 Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7 Static optimization 133
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
2 Unconstrained optimization . . . . . . . . . . . . . . . . . . . . 134
3 The existence of absolute extrema . . . . . . . . . . . . . . . . 135
4 Necessary conditions for a local minimum . . . . . . . . . . . . 137
5 Sufficient conditions for a local minimum: first-derivative test . 138
6 Sufficient conditions for a local minimum: second-derivative test 140
7 Characterization of differentiable convex functions . . . . . . 142
8 Characterization of twice differentiable convex functions . . . 145
9 Sufficient conditions for an absolute minimum . . . . . . . . . . 147
10 Monotonic transformations . . . . . . . . . . . . . . . . . . . . 147
11 Optimization subject to constraints . . . . . . . . . . . . . . . . 148
12 Necessary conditions for a local minimum under constraints . . 149
13 Sufficient conditions for a local minimum under constraints . . 154
14 Sufficient conditions for an absolute minimum under constraints 158
15 A note on constraints in matrix form . . . . . . . . . . . . . . . 159
16 Economic interpretation of Lagrange multipliers. . . . . . . . . 160
Appendix: the implicit function theorem . . . . . . . . . . . . . . . . 162
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Part Three — Differentials: the practice
8 Some important differentials 167
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
2 Fundamental rules of differential calculus . . . . . . . . . . . . 167
3 The differential of a determinant . . . . . . . . . . . . . . . . . 169
4 The differential of an inverse . . . . . . . . . . . . . . . . . . . 171
5 Differential of the Moore-Penrose inverse . . . . . . . . . . . . 172
6 The differential of the adjoint matrix . . . . . . . . . . . . . . . 175
7 On differentiating eigenvalues and eigenvectors . . . . . . . . . 177
8 The differential of eigenvalues and eigenvectors: symmetric case 179
9 The differential of eigenvalues and eigenvectors: complex case . 182
10 Two alternative expressions for dλ . . . . . . . . . . . . . . . . 185
11 Second differential of the eigenvalue function . . . . . . . . . . 188
12 Multiple eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . 189
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 189
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
9 First-order differentials and Jacobian matrices 193
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
3 Bad notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
4 Good notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
5 Identification of Jacobian matrices . . . . . . . . . . . . . . . . 198
6 The first identification table . . . . . . . . . . . . . . . . . . . . 198
7 Partitioning of the derivative . . . . . . . . . . . . . . . . . . . 199
8 Scalar functions of a vector . . . . . . . . . . . . . . . . . . . . 200
9 Scalar functions of a matrix, I: trace . . . . . . . . . . . . . . . 200
10 Scalar functions of a matrix, II: determinant. . . . . . . . . . . 202
11 Scalar functions of a matrix, III: eigenvalue . . . . . . . . . . . 204
12 Two examples of vector functions . . . . . . . . . . . . . . . . . 204
13 Matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . 205
14 Kronecker products . . . . . . . . . . . . . . . . . . . . . . . . . 208
15 Some other problems . . . . . . . . . . . . . . . . . . . . . . . . 210
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
10 Second-order differentials and Hessian matrices 213
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
2 The Hessian matrix of a matrix function . . . . . . . . . . . . . 213
3 Identification of Hessian matrices . . . . . . . . . . . . . . . . . 214
4 The second identification table . . . . . . . . . . . . . . . . . . 215
5 An explicit formula for the Hessian matrix . . . . . . . . . . . . 217
6 Scalar functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
7 Vector functions . . . . . . . . . . . . . . . . . . . . . . . . . . 219
8 Matrix functions, I . . . . . . . . . . . . . . . . . . . . . . . . . 220
9 Matrix functions, II . . . . . . . . . . . . . . . . . . . . . . . . 221
Part Four — Inequalities
11 Inequalities 225
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
2 The Cauchy-Schwarz inequality . . . . . . . . . . . . . . . . . 225
3 Matrix analogues of the Cauchy-Schwarz inequality . . . . . . . 227
4 The theorem of the arithmetic and geometric means . . . . . . 228
5 The Rayleigh quotient . . . . . . . . . . . . . . . . . . . . . . . 230
6 Concavity of λ_1, convexity of λ_n . . . . . . . . . . . . . . . 231
7 Variational description of eigenvalues . . . . . . . . . . . . . . . 232
8 Fischer’s min-max theorem . . . . . . . . . . . . . . . . . . . . 233
9 Monotonicity of the eigenvalues . . . . . . . . . . . . . . . . . . 235
10 The Poincaré separation theorem . . . . . . . . . . . . . . . . 236
11 Two corollaries of Poincaré’s theorem . . . . . . . . . . . . . 237
12 Further consequences of the Poincaré theorem . . . . . . . . . 238
13 Multiplicative version . . . . . . . . . . . . . . . . . . . . . . . 239
14 The maximum of a bilinear form . . . . . . . . . . . . . . . . . 241
15 Hadamard’s inequality . . . . . . . . . . . . . . . . . . . . . . . 242
16 An interlude: Karamata’s inequality . . . . . . . . . . . . . . . 243
17 Karamata’s inequality applied to eigenvalues . . . . . . . . . . 245
18 An inequality concerning positive semidefinite matrices. . . . . 245
19 A representation theorem for (∑ a_i^p)^{1/p} . . . . . . . . . 246
20 A representation theorem for (tr A^p)^{1/p} . . . . . . . . . . 248
21 Hölder’s inequality . . . . . . . . . . . . . . . . . . . . . . . 249
22 Concavity of log|A| . . . . . . . . . . . . . . . . . . . . . . . 250
23 Minkowski’s inequality . . . . . . . . . . . . . . . . . . . . . . . 252
24 Quasilinear representation of |A|^{1/n} . . . . . . . . . . . . 254
25 Minkowski’s determinant theorem. . . . . . . . . . . . . . . . . 256
26 Weighted means of order p. . . . . . . . . . . . . . . . . . . . . 256
27 Schlömilch’s inequality . . . . . . . . . . . . . . . . . . . . . 259
28 Curvature properties of M_p(x, a) . . . . . . . . . . . . . . . 260
29 Least squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
30 Generalized least squares . . . . . . . . . . . . . . . . . . . . . 263
31 Restricted least squares . . . . . . . . . . . . . . . . . . . . . . 263
32 Restricted least squares: matrix version . . . . . . . . . . . . . 265
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 266
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
Part Five — The linear model
12 Statistical preliminaries 275
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
2 The cumulative distribution function . . . . . . . . . . . . . . . 275
3 The joint density function . . . . . . . . . . . . . . . . . . . . . 276
4 Expectations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
5 Variance and covariance . . . . . . . . . . . . . . . . . . . . . . 277
6 Independence of two random variables . . . . . . . . . . . . . . 279
7 Independence of n random variables . . . . . . . . . . . . . . . 281
8 Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
9 The one-dimensional normal distribution . . . . . . . . . . . . . 281
10 The multivariate normal distribution . . . . . . . . . . . . . . . 282
11 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 285
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
13 The linear regression model 287
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
2 Affine minimum-trace unbiased estimation . . . . . . . . . . . . 288
3 The Gauss-Markov theorem . . . . . . . . . . . . . . . . . . . 289
4 The method of least squares . . . . . . . . . . . . . . . . . . . . 292
5 Aitken’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 293
6 Multicollinearity . . . . . . . . . . . . . . . . . . . . . . . . . . 295
7 Estimable functions . . . . . . . . . . . . . . . . . . . . . . . . 297
8 Linear constraints: the case M(R′) ⊂ M(X′) . . . . . . . . . . 299
9 Linear constraints: the general case . . . . . . . . . . . . . . . . 302
10 Linear constraints: the case M(R′) ∩ M(X′) = {0} . . . . . . . 305
11 A singular variance matrix: the case M(X) ⊂ M(V) . . . . . . 306
12 A singular variance matrix: the case r(X′V^+X) = r(X) . . . . 308
13 A singular variance matrix: the general case, I . . . . . . . . . . 309
14 Explicit and implicit linear constraints . . . . . . . . . . . . . . 310
15 The general linear model, I . . . . . . . . . . . . . . . . . . . . 313
16 A singular variance matrix: the general case, II . . . . . . . . . 314
17 The general linear model, II . . . . . . . . . . . . . . . . . . . . 317
18 Generalized least squares . . . . . . . . . . . . . . . . . . . . . 318
19 Restricted least squares . . . . . . . . . . . . . . . . . . . . . . 319
Miscellaneous exercises. . . . . . . . . . . . . . . . . . . . . . . . . . 321
Bibliographical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
14 Further topics in the linear model 323
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
2 Best quadratic unbiased estimation of σ^2 . . . . . . . . . . . 323
3 The best quadratic and positive unbiased estimator of σ^2 . . . 324
4 The best quadratic unbiased estimator of σ^2 . . . . . . . . . . 326
5 Best quadratic invariant estimation of σ^2 . . . . . . . . . . . 329
6 The best quadratic and positive invariant estimator of σ^2 . . 330
7 The best quadratic invariant estimator of σ^2 . . . . . . . . . . 331
8 Best quadratic unbiased estimation: multivariate normal case . 332
9 Bounds for the bias of the least squares estimator of σ^2, I . . 335
10 Bounds for the bias of the least squares estimator of σ^2, II . 336
11 The prediction of disturbances . . . . . . . . . . . . . . . . . . 338
12 Best linear unbiased predictors with scalar variance matrix . . 339
13 Best linear unbiased predictors with fixed variance matrix, I . . 341
Let A (m × n), B (n × p) and C (n × p) be matrices and let x (n × 1) be a vector.