Mathematical Statistics: Exercises and Solutions

Jun Shao
Department of Statistics
University of Wisconsin
Madison, WI 53706
USA
[email protected]

Library of Congress Control Number: 2005923578

ISBN-10: 0-387-24970-2
ISBN-13: 978-0387-24970-4

Printed on acid-free paper.

© 2005 Springer Science+Business Media, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America.

springeronline.com

To My Parents

Preface

Since the publication of my book Mathematical Statistics (Shao, 2003), I have been asked many times for a solution manual to the exercises in my book. Without doubt, exercises form an important part of a textbook on mathematical statistics, not only in training students to do research in mathematical statistics but also in presenting many additional results as complementary material to the main text. Written solutions to these exercises are important for students who initially do not have the skills to solve these exercises completely, and they are very helpful for instructors of a mathematical statistics course (whether or not my book Mathematical Statistics is used as the textbook) in providing answers to students as well as in finding additional examples for the main text. Motivated by this and encouraged by some of my colleagues and Springer-Verlag editor John Kimmel, I have completed this book, Mathematical Statistics: Exercises and Solutions.

This book consists of solutions to 400 exercises, over 95% of which are in my book Mathematical Statistics. Many of them are standard exercises that also appear in other textbooks listed in the references. It is only a partial solution manual to Mathematical Statistics (which contains over 900 exercises). However, the types of exercise in Mathematical Statistics not selected for the current book are (1) exercises that are routine (each exercise selected for this book has a certain degree of difficulty), (2) exercises similar to one or several exercises selected for the current book, and (3) exercises on advanced material that is often not included in a mathematical statistics course for first-year Ph.D. students in statistics (e.g., Edgeworth expansions and second-order accuracy of confidence sets, empirical likelihoods, statistical functionals, generalized linear models, nonparametric tests, and theory for the bootstrap and jackknife). On the other hand, this is a stand-alone book, since the exercises and solutions are comprehensible independently of their source for likely readers. To help readers who do not use this book together with Mathematical Statistics, lists of notation, terminology, and some probability distributions are given in the front of the book.

All notational conventions are the same as or very similar to those in Mathematical Statistics, and so is the mathematical level of this book. Readers are assumed to have a good knowledge of advanced calculus. A course in real analysis or measure theory is highly recommended.
If this book is used with a statistics textbook that does not include probability theory, then knowledge of measure-theoretic probability theory is required.

The exercises are grouped into seven chapters with titles matching those in Mathematical Statistics. A few errors in the exercises from Mathematical Statistics were detected during the preparation of their solutions, and the corrected versions are given in this book. Although exercises are numbered independently of their source, the corresponding number in Mathematical Statistics accompanies each exercise number for the convenience of instructors and readers who also use Mathematical Statistics as the main text. For example, Exercise 8 (#2.19) means that Exercise 8 in the current book is also Exercise 19 in Chapter 2 of Mathematical Statistics.

A note to students/readers who need exercises accompanied by solutions: do not be completely driven by the solutions. Students/readers are encouraged to try each exercise first without reading its solution. If an exercise is solved with the help of a solution, they are encouraged to provide solutions to similar exercises as well as to think about whether there is an alternative to the solution given in this book. A few exercises in this book are accompanied by two solutions and/or notes with brief discussions.

I would like to thank my teaching assistants, Dr. Hansheng Wang, Dr. Bin Cheng, and Mr. Fang Fang, who provided valuable help in preparing some of the solutions. Any errors are my own responsibility, and corrections can be found on my web page http://www.stat.wisc.edu/~shao.

Madison, Wisconsin
April 2005
Jun Shao

Contents

Preface  vii
Notation  xi
Terminology  xv
Some Distributions  xxiii
Chapter 1. Probability Theory  1
Chapter 2. Fundamentals of Statistics  51
Chapter 3. Unbiased Estimation  95
Chapter 4. Estimation in Parametric Models  141
Chapter 5. Estimation in Nonparametric Models  209
Chapter 6. Hypothesis Tests  251
Chapter 7. Confidence Sets  309
References  351
Index  353

Notation

R: The real line.
R^k: The k-dimensional Euclidean space.
c = (c_1, ..., c_k): A vector (element) in R^k with jth component c_j ∈ R; c is considered as a k×1 matrix (column vector) when matrix algebra is involved.
c^τ: The transpose of a vector c ∈ R^k, considered as a 1×k matrix (row vector) when matrix algebra is involved.
‖c‖: The Euclidean norm of a vector c ∈ R^k, ‖c‖^2 = c^τ c.
|c|: The absolute value of c ∈ R.
A^τ: The transpose of a matrix A.
Det(A) or |A|: The determinant of a matrix A.
tr(A): The trace of a matrix A.
‖A‖: The norm of a matrix A, defined by ‖A‖^2 = tr(A^τ A).
A^{-1}: The inverse of a matrix A.
A^-: The generalized inverse of a matrix A.
A^{1/2}: The square root of a nonnegative definite matrix A, defined by A^{1/2} A^{1/2} = A.
A^{-1/2}: The inverse of A^{1/2}.
R(A): The linear space generated by the rows of a matrix A.
I_k: The k×k identity matrix.
J_k: The k-dimensional vector of 1's.
∅: The empty set.
(a,b): The open interval from a to b.
[a,b]: The closed interval from a to b.
(a,b]: The interval from a to b, including b but not a.
[a,b): The interval from a to b, including a but not b.
{a,b,c}: The set consisting of the elements a, b, and c.
A_1 × ··· × A_k: The Cartesian product of sets A_1, ..., A_k, A_1 × ··· × A_k = {(a_1, ..., a_k) : a_1 ∈ A_1, ..., a_k ∈ A_k}.
σ(C): The smallest σ-field that contains C.
σ(X): The smallest σ-field with respect to which X is measurable.
ν_1 × ··· × ν_k: The product measure of ν_1, ..., ν_k on σ(F_1 × ··· × F_k), where ν_i is a measure on F_i, i = 1, ..., k.
B: The Borel σ-field on R.
B^k: The Borel σ-field on R^k.
A^c: The complement of a set A.
A ∪ B: The union of sets A and B.
∪A_i: The union of sets A_1, A_2, ....
A ∩ B: The intersection of sets A and B.
∩A_i: The intersection of sets A_1, A_2, ....
I_A: The indicator function of a set A.
P(A): The probability of a set A.
∫ f dν: The integral of a Borel function f with respect to a measure ν.
∫_A f dν: The integral of f on the set A.
∫ f(x) dF(x): The integral of f with respect to the probability measure corresponding to the cumulative distribution function F.
λ ≪ ν: The measure λ is dominated by the measure ν, i.e., ν(A) = 0 always implies λ(A) = 0.
dλ/dν: The Radon-Nikodym derivative of λ with respect to ν.
P: A collection of populations (distributions).
a.e.: Almost everywhere.
a.s.: Almost surely.
a.s. P: A statement that holds except on an event A with P(A) = 0 for all P ∈ P.
δ_x: The point mass at x ∈ R^k or the distribution degenerate at x ∈ R^k.
{a_n}: A sequence of elements a_1, a_2, ....
a_n → a or lim_n a_n = a: {a_n} converges to a as n increases to ∞.
limsup_n a_n: The largest limit point of {a_n}, limsup_n a_n = inf_n sup_{k≥n} a_k.
liminf_n a_n: The smallest limit point of {a_n}, liminf_n a_n = sup_n inf_{k≥n} a_k.
→_p: Convergence in probability.
→_d: Convergence in distribution.
g′: The derivative of a function g on R.
g″: The second-order derivative of a function g on R.
g^(k): The kth-order derivative of a function g on R.
g(x+): The right limit of a function g at x ∈ R.
g(x−): The left limit of a function g at x ∈ R.
g_+(x): The positive part of a function g, g_+(x) = max{g(x), 0}.
g_−(x): The negative part of a function g, g_−(x) = max{−g(x), 0}.
∂g/∂x: The partial derivative of a function g on R^k.
∂^2 g/∂x∂x^τ: The second-order partial derivative of a function g on R^k.
exp{x}: The exponential function e^x.
log x or log(x): The inverse of e^x, log(e^x) = x.
Γ(t): The gamma function, defined by Γ(t) = ∫_0^∞ x^{t−1} e^{−x} dx, t > 0.
F^{-1}(p): The pth quantile of a cumulative distribution function F on R, F^{-1}(p) = inf{x : F(x) ≥ p}.
E(X) or EX: The expectation of a random variable (vector or matrix) X.
Var(X): The variance of a random variable X or the covariance matrix of a random vector X.
Cov(X,Y): The covariance between random variables X and Y.
E(X|A): The conditional expectation of X given a σ-field A.
E(X|Y): The conditional expectation of X given Y.
P(A|A): The conditional probability of A given a σ-field A.
P(A|Y): The conditional probability of A given Y.
X_(i): The ith order statistic of X_1, ..., X_n.
X̄ or X̄_·: The sample mean of X_1, ..., X_n, X̄ = n^{-1} Σ_{i=1}^n X_i.
X̄_·j: The average of the X_ij's over the index i, X̄_·j = n^{-1} Σ_{i=1}^n X_ij.
S^2: The sample variance of X_1, ..., X_n, S^2 = (n−1)^{-1} Σ_{i=1}^n (X_i − X̄)^2.
F_n: The empirical distribution of X_1, ..., X_n, F_n(t) = n^{-1} Σ_{i=1}^n δ_{X_i}(t).
ℓ(θ): The likelihood function.
H_0: The null hypothesis in a testing problem.
H_1: The alternative hypothesis in a testing problem.
L(P,a) or L(θ,a): The loss function in a decision problem.
R_T(P) or R_T(θ): The risk function of a decision rule T.
r_T: The Bayes risk of a decision rule T.
N(μ,σ^2): The one-dimensional normal distribution with mean μ and variance σ^2.
N_k(μ,Σ): The k-dimensional normal distribution with mean vector μ and covariance matrix Σ.
Φ(x): The cumulative distribution function of N(0,1).
z_α: The (1−α)th quantile of N(0,1).
χ^2_r: The chi-square distribution with degrees of freedom r.
χ^2_{r,α}: The (1−α)th quantile of the chi-square distribution χ^2_r.
χ^2_r(δ): The noncentral chi-square distribution with degrees of freedom r and noncentrality parameter δ.
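As a brief worked illustration of the (1−α)th-quantile convention used in the entries above (an editorial example; the numerical values are standard normal and chi-square table values): z_α satisfies Φ(z_α) = 1 − α, so z_{0.05} ≈ 1.645 and z_{0.025} ≈ 1.96; similarly, χ^2_{r,α} satisfies P(χ^2_r ≤ χ^2_{r,α}) = 1 − α, so χ^2_{1,0.05} ≈ 3.84 and χ^2_{2,0.05} ≈ 5.99.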