Graph presentations for moments of noncentral Wishart distributions and their applications

Satoshi Kuriki and Yasuhide Numata

arXiv:0912.0577v2 [math.ST] 2 Jan 2010

Abstract We provide formulas for the moments of the real and complex noncentral Wishart distributions of general degrees. The obtained formulas for the real and complex cases are described in terms of undirected and directed graphs, respectively. By considering degenerate cases, we give explicit formulas for the moments of bivariate chi-square distributions and $2\times 2$ Wishart distributions by enumerating the graphs. Noting that the Laguerre polynomials can formally be considered to be moments of a noncentral chi-square distribution, we demonstrate a combinatorial interpretation of the coefficients of the Laguerre polynomials.

Keywords Kibble's bivariate gamma distribution · Laguerre polynomial · Noncentral Stirling number of the first kind

S. Kuriki
Institute of Statistical Mathematics, ROIS, 10-3 Midoricho, Tachikawa, Tokyo 190-8562, Japan. E-mail: [email protected]

Y. Numata
Department of Mathematical Informatics, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan; Japan Science and Technology Agency (JST), CREST. E-mail: [email protected]

1 Introduction

For $t=1,\dots,\nu$, let $X_t=(x_{ti})_{1\le i\le p}$ be a $p$-dimensional random column vector distributed independently according to the normal distribution $N_p(\mu_t,\Sigma)$ with mean vector $\mu_t=(\mu_{ti})_{1\le i\le p}$ and covariance matrix $\Sigma=(\sigma_{ij})_{1\le i,j\le p}$. We define the (real) noncentral Wishart distribution $W_p(\nu,\Sigma,\Delta)$ as the distribution of a $p\times p$ symmetric random matrix
\[
W=(w_{ij})_{1\le i,j\le p}, \qquad w_{ij}=\sum_{t=1}^{\nu} x_{ti}x_{tj}, \tag{1}
\]
where
\[
\Delta=(\delta_{ij})_{1\le i,j\le p}, \qquad \delta_{ij}=\sum_{t=1}^{\nu}\mu_{ti}\mu_{tj}
\]
is the mean square matrix. The distribution of $W$ depends on the $\mu_t$'s only through $\Delta$ because its moment generating function is
\[
E[\mathrm{etr}(\Theta W)]=\det(I-2\Theta\Sigma)^{-\nu/2}\,\mathrm{etr}\{(I-2\Theta\Sigma)^{-1}\Theta\Delta\}, \tag{2}
\]
where $\Theta$ is a $p\times p$ symmetric parameter matrix (Muirhead (1982)).

Note that $\Delta=0$ if and only if $\mu_t=0$ for all $t$.
The Wishart distribution with $\Delta=0$ is referred to as the central Wishart distribution $W_p(\nu,\Sigma)$. Conventionally, the triplet $(\nu,\Sigma,\Omega)$ with $\Omega=\Sigma^{-1}\Delta$ is used to describe the noncentral Wishart distribution rather than $(\nu,\Sigma,\Delta)$. The matrix $\Omega$ is called the noncentrality matrix. In this paper, we adopt the triplet $(\nu,\Sigma,\Delta)$ because it simplifies the statements of the theorems.

For $t=1,\dots,\nu$, let $\begin{pmatrix} X_t \\ Y_t \end{pmatrix}$ be a $2p$-dimensional random column vector distributed independently according to the normal distribution with mean vector $\begin{pmatrix} \xi_t \\ \eta_t \end{pmatrix}$ and covariance matrix $\begin{pmatrix} A & -B \\ B & A \end{pmatrix}$, where $A$ and $B$ are $p\times p$ symmetric and skew-symmetric matrices, respectively. The distribution of the complex-valued random vector $Z_t=(z_{ti})_{1\le i\le p}=X_t+\sqrt{-1}\,Y_t$ is referred to as the complex normal distribution $CN_p(\mu_t,\Sigma)$ with mean $\mu_t=(\mu_{ti})_{1\le i\le p}=\xi_t+\sqrt{-1}\,\eta_t$ and covariance matrix $\Sigma=(\sigma_{ij})_{1\le i,j\le p}=2(A+\sqrt{-1}\,B)$. Indeed, $\Sigma$ is a "covariance" in the sense that
\[
\sigma_{ij}=E\bigl[(z_{ti}-\mu_{ti})\overline{(z_{tj}-\mu_{tj})}\bigr].
\]
Here, the overline denotes the complex conjugate. From the complex random vectors $Z_t$, we define the complex noncentral Wishart distribution $CW_p(\nu,\Sigma,\Delta)$ as the distribution of a $p\times p$ Hermitian random matrix
\[
W=(w_{ij})_{1\le i,j\le p}, \qquad w_{ij}=\sum_{t=1}^{\nu} z_{ti}\overline{z_{tj}}, \tag{3}
\]
where
\[
\Delta=(\delta_{ij})_{1\le i,j\le p}, \qquad \delta_{ij}=\sum_{t=1}^{\nu}\mu_{ti}\overline{\mu_{tj}}
\]
is the mean square parameter matrix. As in the real case, the distribution of $W$ depends on the $\mu_t$'s only through $\Delta$ since its moment generating function is
\[
E[\mathrm{etr}(\Theta W)]=\det(I-\Theta\Sigma)^{-\nu}\,\mathrm{etr}\{(I-\Theta\Sigma)^{-1}\Theta\Delta\}, \tag{4}
\]
where $\Theta$ is a $p\times p$ Hermitian parameter matrix. See Goodman (1963) for the central case.

The primary purpose of this paper is to obtain expressions for the moments $E[w_{ab}w_{cd}\cdots w_{ef}]$ of the real and complex noncentral Wishart distributions in terms of graphs, where $a,b,c,d,\dots,e,f\in\{1,\dots,p\}$ are arbitrary indices.
Considering the cases where the mean vectors $\mu_t$ and the covariance matrix $\Sigma$ take particular values, we will obtain several identities for moments of distributions associated with the Wishart distributions. We shall see that the derivations reduce to enumerating graphs of various types. This is the secondary purpose of our paper.

The Wishart distribution originates with a paper by Wishart (1928) around 80 years ago. Since then, it has been considered a fundamental distribution not only in mathematical statistics but also in other fields such as random matrix theory and signal processing (e.g., Bai (1999), Maiwald and Kraus (2000)). Despite this, the structure of the moments of the Wishart distributions is still an active research topic. In the central case, Lu and Richards (2001), Graczyk et al. (2003), and Graczyk et al. (2005) provided formulas for moments of the real and complex Wishart distributions. These studies are based on expansions of the moment generating functions of the central Wishart distributions, $\det(I-2\Theta\Sigma)^{-\nu/2}$. The graph presentations of moments have also been discussed in these studies. Prior to these studies, it was known that the terms in the expansion of $\det(I-Y)^{-\alpha}$ around $Y=0$ have a combinatorial structure (e.g., Vere-Jones (1988)), which is closely related to the problem of Wishart moments. More recently, Letac and Massam (2008) provided a method to calculate moments of the noncentral Wishart distribution by combining expansions of the moment generating function $\det(I-2\Theta\Sigma)^{-\nu/2}$ and the "noncentral part" $\mathrm{etr}\{(I-2\Theta\Sigma)^{-1}\Theta\Delta\}$ in (2).

The outline of this paper is as follows. In Section 2, we treat the real noncentral Wishart matrices. Formulas for the general terms of moments of the Wishart distributions are given in terms of undirected graphs.
This is an extension of Takemura (1991), who treated the central case. Then, by letting the mean vectors $\mu_t$ and the covariance matrix $\Sigma$ have particular structures, we obtain explicit formulas for moments of the noncentral chi-square distribution, Kibble (1941)'s bivariate chi-square (gamma) distribution, and the $2\times 2$ central Wishart distribution with $\Sigma=I$. Noting a formal correspondence between the moments of the noncentral chi-square distributions and the Laguerre polynomials, we will show that the coefficients of the Laguerre polynomials have a combinatorial interpretation.

In Section 3, we treat the case of the noncentral complex Wishart matrices. Most of the discussion parallels the real case. One remarkable difference is that moments in the complex case are described in terms of directed graphs rather than undirected graphs.

2 Moments of the real noncentral Wishart distribution

2.1 A graph presentation

In this subsection, we provide a graph presentation formula for moments of the real noncentral Wishart distributions of general degrees. Our results generalize Theorem 4.3 of Takemura (1991), where the central real Wishart matrices are treated. Our basic tool is the following formula for moments of Gaussian random vectors. This is just the moment-cumulant relation in the Gaussian case. For the proof, see McCullagh (1987). In the central case $\mu=0$, this is sometimes referred to as the Wick formula.

Lemma 1 (Moment of the real normal distribution) Let $X=(x_i)$ be a Gaussian random vector with mean $\mu=(\mu_i)$ and covariance matrix $\Sigma=(\sigma_{ij})$. Then,
\[
E[x_1x_2\cdots x_n]=\sum \sigma_{i_1i_2}\cdots\sigma_{i_{2m-1}i_{2m}}\,\mu_{i_{2m+1}}\cdots\mu_{i_n},
\]
where the summation is taken over all partitions of the $n$ indices $\{1,2,\dots,n\}$ into $m$ unordered pairs and $n-2m$ singletons
\[
(i_1,i_2),\dots,(i_{2m-1},i_{2m}),(i_{2m+1}),\dots,(i_n).
\]

Remark 1 Although Lemma 1 only states an expression for $E[x_1x_2\cdots x_n]$, it in fact gives the general form of the moments $E[x_ax_bx_c\cdots]$ by considering a degenerate case.
For example, we have $E[x_1x_2^2]=E[\tilde x_1\tilde x_2\tilde x_3]$, where
\[
\begin{pmatrix}\tilde x_1\\ \tilde x_2\\ \tilde x_3\end{pmatrix}
\sim N_3\left(
\begin{pmatrix}\mu_1\\ \mu_2\\ \mu_2\end{pmatrix},
\begin{pmatrix}\sigma_{11}&\sigma_{12}&\sigma_{12}\\ \sigma_{21}&\sigma_{22}&\sigma_{22}\\ \sigma_{21}&\sigma_{22}&\sigma_{22}\end{pmatrix}
\right).
\]
Throughout the paper, we use this degeneracy argument repeatedly.

Let $X_t=(x_{ti})$ ($t=1,\dots,\nu$) be independent Gaussian random vectors with mean $\mu_t$ and covariance matrix $\Sigma$. Let $W=(w_{ij})$ be the Wishart matrix formed from the $X_t$'s as in (1). In the following, we give a formula for the moment $E[w_{ab}w_{cd}\cdots w_{ef}]$ with $a,b,c,d,\dots,e,f$ arbitrary indices. By applying the degeneracy argument again, we can restrict our attention to the moment $E[w_{12}w_{34}\cdots w_{2n-1,2n}]$ without loss of generality. For example, $E[w_{11}w_{12}^2]=E[\tilde w_{12}\tilde w_{34}\tilde w_{56}]$, where $(\tilde w_{ij})\sim W_6(\nu,(\tilde\sigma_{ij}),(\tilde\delta_{ij}))$ with
\[
(\tilde\sigma_{ij},\tilde\delta_{ij})=
\begin{cases}
(\sigma_{11},\delta_{11}), & i,j\in\{1,2,3,5\},\\
(\sigma_{12},\delta_{12}), & i\in\{1,2,3,5\},\ j\in\{4,6\},\\
(\sigma_{21},\delta_{21}), & i\in\{4,6\},\ j\in\{1,2,3,5\},\\
(\sigma_{22},\delta_{22}), & i,j\in\{4,6\}.
\end{cases}
\]

Let $V=\{1,2,\dots,2n-1,2n\}$ be the set of indices appearing in the argument of the expectation $E[w_{12}\cdots w_{2n-1,2n}]$. In the following, we consider undirected graphs whose vertices are the elements of $V$. First consider the undirected graph $G_0=(V,E_0)$ with the edges
\[
E_0=\{(1,2),\dots,(2n-1,2n)\}.
\]
For each partition of $\{1,2,\dots,2n-1,2n\}$ into $m$ pairs and $2n-2m$ singletons,
\[
(i_1,i_2),\dots,(i_{2m-1},i_{2m}),(i_{2m+1}),\dots,(i_{2n}), \tag{5}
\]
we define a set of undirected edges
\[
E=\{(i_1,i_2),\dots,(i_{2m-1},i_{2m})\}.
\]
By adding the edges of $E$ to $G_0$, we have a graph
\[
G=(V,E_0\cup E). \tag{6}
\]
Each connected component of $G$ is classified as either a "cycle" (a path without terminals) or a "chain" (a path with two terminals). For the partition (5), the number of chains is $n-m$. The number of cycles of $G$ is denoted by $\mathrm{len}(G)$. Note that $\mathrm{len}(G)\le m$. Let $(j_1,j_2),\dots,(j_{2n-2m-1},j_{2n-2m})$ be the pairs of terminal vertices of the $n-m$ chains of $G$, and let
\[
\check E=\{(j_1,j_2),\dots,(j_{2n-2m-1},j_{2n-2m})\}.
\]
Using this notation, the general form of the moments is given below.
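The construction of $G$ from a partition of the form (5) can be sketched in code. The following Python sketch is an illustration only, not part of the original derivation: the function names `partial_matchings` and `analyse` are hypothetical, and the graph is stored as two partner dictionaries (one for $E_0$, one for $E$), which is an assumption made for brevity. It enumerates the partitions, counts the cycles $\mathrm{len}(G)$, and lists the terminal pairs $\check E$ of the chains.

```python
from collections import Counter


def partial_matchings(vertices):
    """Yield each partition of `vertices` into pairs and singletons as its list of pairs E."""
    vertices = list(vertices)
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    # Case 1: `first` is a singleton.
    yield from partial_matchings(rest)
    # Case 2: `first` is paired with one of the later vertices.
    for k, other in enumerate(rest):
        for pairs in partial_matchings(rest[:k] + rest[k + 1:]):
            yield [(first, other)] + pairs


def analyse(n, pairs):
    """Return (len(G), chain terminal pairs) for G = (V, E0 u E) with E = pairs."""
    solid = {}  # partner along the E0 edge (2i-1, 2i)
    for i in range(1, n + 1):
        solid[2 * i - 1], solid[2 * i] = 2 * i, 2 * i - 1
    dashed = {}  # partner along the E edge, if any
    for a, b in pairs:
        dashed[a], dashed[b] = b, a

    seen = set()
    terminals = []
    # Chains start at a vertex with no E edge and alternate E0 / E edges.
    for v in range(1, 2 * n + 1):
        if v in dashed or v in seen:
            continue
        cur = solid[v]
        seen.update((v, cur))
        while cur in dashed:
            cur = dashed[cur]
            seen.add(cur)
            cur = solid[cur]
            seen.add(cur)
        terminals.append((v, cur))

    # Every vertex not on a chain lies on a cycle.
    cycles = 0
    for v in range(1, 2 * n + 1):
        if v in seen:
            continue
        cycles += 1
        cur = v
        while cur not in seen:
            seen.add(cur)
            cur = solid[cur]
            seen.add(cur)
            cur = dashed[cur]
    return cycles, terminals


# Tally the graphs for n = 3 by (number of cycles, number of E edges).
tally = Counter((analyse(3, E)[0], len(E)) for E in partial_matchings(range(1, 7)))
```

For $n=3$ the generator produces 76 partitions, and `analyse(3, [(1, 6), (2, 5)])` gives one cycle with the single terminal pair `(3, 4)`, in line with the term $\nu\sigma_{16}\sigma_{25}\delta_{34}$ shown in Fig. 1.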
Theorem 1 (Moment of the real noncentral Wishart distribution) Let $(w_{ij})\sim W(\nu,(\sigma_{ij}),(\delta_{ij}))$. Then,
\[
E[w_{12}\cdots w_{2n-1,2n}]=\sum_{E}\nu^{\mathrm{len}(G)}\sigma^{E}\delta^{\check E}, \tag{7}
\]
where
\[
\sigma^{E}=\prod_{(i,i')\in E}\sigma_{ii'}=\sigma_{i_1i_2}\cdots\sigma_{i_{2m-1}i_{2m}},
\qquad
\delta^{\check E}=\prod_{(j,j')\in\check E}\delta_{jj'}=\delta_{j_1j_2}\cdots\delta_{j_{2n-2m-1}j_{2n-2m}}.
\]
The summation $\sum_E$ is taken over all partitions of $\{1,2,\dots,2n\}$ of the form (5).

Example 1 Consider the evaluation of the moment $E[w_{12}w_{34}w_{56}]$. Then, $V=\{1,2,3,4,5,6\}$ and $E_0=\{(1,2),(3,4),(5,6)\}$. There are 76 partitions of $V$ into pairs and singletons. Figure 1 shows the graph $G=(V,E_0\cup E)$ for $E=\{(1,6),(2,5)\}$ ($\check E=\{(3,4)\}$).

Fig. 1 Graph $G=(V,E_0\cup E)$ ($E_0$: solid line, $E$: dotted line) presenting the term $\nu^1\sigma_{16}\sigma_{25}\delta_{34}$ ($n=3$, $m=2$, $\mathrm{len}(G)=1$).

Summing over the 76 possibilities, we have the following:
\[
\begin{aligned}
E[w_{12}w_{34}w_{56}]
={}&\nu^3\sigma_{12}\sigma_{34}\sigma_{56}+\nu^2\sigma_{23}\sigma_{14}\sigma_{56}[6]+\nu\sigma_{23}\sigma_{45}\sigma_{16}[8]\\
&+\nu^2\sigma_{12}\sigma_{34}\delta_{56}[3]+\nu\sigma_{23}\sigma_{14}\delta_{56}[6]+\nu\sigma_{12}\sigma_{45}\delta_{36}[12]\\
&+\sigma_{23}\sigma_{45}\delta_{16}[24]\\
&+\nu\sigma_{12}\delta_{34}\delta_{56}[3]+\sigma_{23}\delta_{14}\delta_{56}[12]\\
&+\delta_{12}\delta_{34}\delta_{56}.
\end{aligned}
\]
Here, $[k]$ means that there are $k$ terms of similar form.

Proof For $i=1,\dots,2n$, let $e(i)=[(i+1)/2]$ (the integer part of $(i+1)/2$). Noting that $w_{ij}=\sum_{t=1}^{\nu}x_{ti}x_{tj}$, and from Lemma 1, we have
\[
\begin{aligned}
E[w_{12}\cdots w_{2n-1,2n}]
&=\sum_{t_1=1}^{\nu}\cdots\sum_{t_n=1}^{\nu}E[x_{t_1,1}x_{t_1,2}\cdots x_{t_n,2n-1}x_{t_n,2n}]\\
&=\sum_{t_1}\cdots\sum_{t_n}E[x_{t_{e(1)},1}x_{t_{e(2)},2}\cdots x_{t_{e(2n-1)},2n-1}x_{t_{e(2n)},2n}]\\
&=\sum_{E}\sum_{t_1}\cdots\sum_{t_n}\mathrm{Cov}(x_{t_{e(i_1)},i_1},x_{t_{e(i_2)},i_2})\cdots\mathrm{Cov}(x_{t_{e(i_{2m-1})},i_{2m-1}},x_{t_{e(i_{2m})},i_{2m}})\\
&\qquad\times E[x_{t_{e(i_{2m+1})},i_{2m+1}}]\cdots E[x_{t_{e(i_{2n})},i_{2n}}].
\end{aligned}
\tag{8}
\]
Since $\{i_1,\dots,i_{2n}\}=V$, the indices $i_1,\dots,i_{2n}$ can be divided into the connected components of the graph $G$. Let $\{j_1,\dots,j_{2k}\}$ be the set of vertices of a connected component. Then, as already pointed out, it forms either a chain
\[
(j_1,j_2),(j_2,j_3),\dots,(j_{2k-2},j_{2k-1}),(j_{2k-1},j_{2k}),
\]
or a cycle
\[
(j_1,j_2),(j_2,j_3),\dots,(j_{2k-1},j_{2k}),(j_{2k},j_1),
\]
and in both cases
\[
(j_1,j_2),(j_3,j_4),\dots,(j_{2k-1},j_{2k})\in E_0.
\]
Since the running indices $t_1,\dots,t_n$ correspond to the $n$ edges of $E_0$, and $e(j_1)=e(j_2)$, $e(j_3)=e(j_4)$, $\dots$, $e(j_{2k-1})=e(j_{2k})$, the argument of the summation $\sum_E$ in (8) is written as a product of terms of the form
\[
\sum_{t_1}\cdots\sum_{t_k}E[x_{t_1,j_1}]\,\mathrm{Cov}(x_{t_1,j_2},x_{t_2,j_3})\cdots\mathrm{Cov}(x_{t_{k-1},j_{2k-2}},x_{t_k,j_{2k-1}})\,E[x_{t_k,j_{2k}}] \tag{9}
\]
in the chain case, or
\[
\sum_{t_1}\cdots\sum_{t_k}\mathrm{Cov}(x_{t_1,j_2},x_{t_2,j_3})\cdots\mathrm{Cov}(x_{t_{k-1},j_{2k-2}},x_{t_k,j_{2k-1}})\,\mathrm{Cov}(x_{t_k,j_{2k}},x_{t_1,j_1}) \tag{10}
\]
in the cycle case. Here, we used the reindexing
\[
t_1:=t_{e(j_1)}=t_{e(j_2)},\ \dots,\ t_k:=t_{e(j_{2k-1})}=t_{e(j_{2k})}.
\]
Noting that $\mathrm{Cov}(x_{si},x_{tj})=1_{\{s=t\}}\sigma_{ij}$ and $\sum_{t=1}^{\nu}E[x_{ti}]E[x_{tj}]=\delta_{ij}$, we see that (9) $=\sigma_{j_2j_3}\cdots\sigma_{j_{2k-2}j_{2k-1}}\delta_{j_{2k}j_1}$ and (10) $=\nu\,\sigma_{j_2j_3}\cdots\sigma_{j_{2k-2}j_{2k-1}}\sigma_{j_{2k}j_1}$. This completes the proof. □

2.2 Enumeration of undirected graphs

In this subsection, we enumerate the graphs $G=(V,E_0\cup E)$ defined in (6) under the condition that the number $l=\mathrm{len}(G)$ of cycles and the number $m$ of edges of $E$ are given. Let $f_{l,m,n}$ be the number of such graphs.

Consider a degenerate noncentral Wishart matrix $W=(w_{ij})$ such that $\sigma_{ij}\equiv 1$ and $\delta_{ij}\equiv\delta$. This happens when every component of $X_t$ making up $W$ takes the same value with probability one. Accordingly, all elements of $W$ take the same value, say $w$, with probability one. In this setting, (7) in Theorem 1 reduces to a moment formula for the distribution of $w$, the noncentral chi-square distribution $\chi^2_\nu(\delta)$ with $\nu$ degrees of freedom and noncentrality parameter $\delta$. Using the coefficients $f_{l,m,n}$, the $n$th moment of $w$ is given as follows:
\[
E[w^n]=\sum_{m=0}^{n}\sum_{l\ge 0}\nu^{l}f_{l,m,n}\,\delta^{n-m}. \tag{11}
\]
The coefficient $f_{l,m,n}$ satisfies the following recurrence formula.

Lemma 2
\[
f_{l,m,n}=2(2n-m-1)f_{l,m-1,n-1}+f_{l-1,m-1,n-1}+f_{l,m,n-1}, \tag{12}
\]
with boundary conditions
\[
f_{l,0,n}=\begin{cases}1 & (l=0),\\ 0 & (l\ge 1)\end{cases}\quad\text{for } n\ge 1, \tag{13}
\]
and
\[
f_{l,1,1}=\begin{cases}0 & (l=0),\\ 1 & (l=1).\end{cases} \tag{14}
\]

Proof In the following, we sometimes refer to an edge from $E_0$ as a "solid line" edge and an edge from $E$ as a "dashed line" edge, as shown in Figure 1.
The connected components of $G$ are classified as cycles and chains. In each cycle, the number of solid line edges equals the number of dashed line edges. In each chain, the number of solid line edges is one more than the number of dashed line edges. Thus, the number of chains is $n-m$, the difference between the number of solid line edges and the number of dashed line edges.

Consider the graph $G'$ made by removing the two vertices $2n-1$ and $2n$, together with all (solid line and dashed line) edges incident to these two vertices. Note that the (solid line) edge $(2n-1,2n)$ is deleted.

Exactly one of the following holds: the edge $(2n-1,2n)$ is contained in (i) a cycle with 4 or more edges, or a chain with 3 or more edges (the edges $(a,b)$, $(c,d)$, $(i,j)$, or $(k,l)$ in Figure 2); (ii) a cycle with 2 edges ($(e,f)$ in Figure 2); (iii) a chain consisting of a single edge ($(g,h)$ in Figure 2).

Fig. 2 A figure for the proof of Lemma 2 (vertices labeled $a$ to $l$).

Case (i). In the graph $G'$ made by removing the two vertices $2n-1$ and $2n$ and all edges incident to them, there are $n-1$ solid line edges. Since one dashed line edge is removed as well, there are $m-1$ dashed line edges. The number of chains remains $(n-1)-(m-1)=n-m$. Now consider adding the edge $(2n-1,2n)$ to the graph $G'$ again. There are $n-1+(n-m)=2n-m-1$ places where $(2n-1,2n)$ can be inserted, and, taking the direction of the inserted edge into account, there are $2(2n-m-1)$ ways in which the insertion can be done. By this operation, the number of dashed lines increases by 1, whereas the number of cycles is invariant. The contribution of the number of graphs made by this operation to $f_{l,m,n}$ is
\[
2(2n-m-1)f_{l,m-1,n-1}.
\]

Case (ii). Consider adding the edge $(2n-1,2n)$ to the graph $G'$ again to make a cycle consisting of the edge $(2n-1,2n)$ and a dashed line edge. By this operation, both the number of dashed lines and the number of cycles increase by 1. The contribution of the number of graphs made by this operation to $f_{l,m,n}$ is
\[
f_{l-1,m-1,n-1}.
\]

Case (iii).
Consider adding the edge $(2n-1,2n)$ to the graph $G'$ again to make a chain consisting of the single edge $(2n-1,2n)$. By this operation, both the number of dashed lines and the number of cycles are invariant. The contribution of the number of graphs made by this operation to $f_{l,m,n}$ is
\[
f_{l,m,n-1}.
\]
Summing up the three cases (i), (ii), and (iii), we obtain the recurrence formula (12). □

Theorem 2 The generating function of $f_{l,m,n}$ with respect to the number $l$ of cycles is given by
\[
\Phi_{m,n}(\nu)=\sum_{l\ge 0}\nu^{l}f_{l,m,n}=\binom{n}{m}\prod_{i=1}^{m}(\nu+2(n-i))\qquad(0\le m\le n,\ n\ge 1). \tag{15}
\]
Here, we use the convention $\prod_{i=1}^{0}=1$.

Proof Noting that $f_{-1,m,n}=0$, the generating function $\Phi_{m,n}(\nu)=\sum_{l\ge 0}\nu^{l}f_{l,m,n}$ has to satisfy
\[
\Phi_{m,n}=2(2n-m-1)\Phi_{m-1,n-1}+\nu\Phi_{m-1,n-1}+\Phi_{m,n-1}. \tag{16}
\]
To solve the recurrence formula (16), consider the boundary conditions. From (13), we have
\[
\Phi_{0,n}(\nu)=1\qquad(n\ge 1). \tag{17}
\]
Moreover, from (14), we have
\[
\Phi_{1,1}(\nu)=\nu.
\]
Furthermore, since $\Phi_{n,n-1}=0$,
\[
\Phi_{n,n}=2(n-1)\Phi_{n-1,n-1}+\nu\Phi_{n-1,n-1}=(\nu+2n-2)\Phi_{n-1,n-1}=\cdots=\prod_{i=1}^{n}(\nu+2(n-i))\qquad(n\ge 1). \tag{18}
\]
The recurrence formula (16) combined with the boundary conditions (17) and (18) determines $\Phi_{m,n}$ for all $m$ and $n$.

In the following, we verify that $\Phi_{m,n}(\nu)$ in (15) is indeed the solution of the recurrence formula (16). The boundary conditions (17) and (18) are satisfied, so we only have to check that (15) satisfies (16). Indeed,
\[
\begin{aligned}
\Phi_{m,n}-\Phi_{m,n-1}
&=\binom{n}{m}\prod_{i=1}^{m}(\nu+2n-2i)-\binom{n-1}{m}\prod_{i=1}^{m}(\nu+2n-2-2i)\\
&=\binom{n-1}{m-1}\frac{1}{m}\prod_{i=1}^{m-1}(\nu+2n-2-2i)\times\{n(\nu+2n-2)-(n-m)(\nu+2n-2-2m)\}\\
&=2(2n-m-1)\Phi_{m-1,n-1}+\nu\Phi_{m-1,n-1}.
\end{aligned}
\]
□

Corollary 1
\[
\Phi_{m,n}(1)=\binom{n}{m}(2n-1)(2n-3)\cdots(2n-2m+1)=\binom{2n}{2m}(2m-1)!!
\]
is the number of undirected graphs $G$ with $m$ edges in $E$, and
\[
\Phi_{m,n}(0)=\binom{n}{m}(2n-2)(2n-4)\cdots(2n-2m)=\frac{2^{m}\,n!\,(n-1)!}{m!\,(n-m)!\,(n-m-1)!}
\]
is the number of such graphs $G$ without cycles (the last expression applies for $m<n$; for $m=n$ the product contains the factor $2n-2n=0$, so $\Phi_{n,n}(0)=0$).
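The recurrence (12) and the closed form (15) can be cross-checked numerically. The Python sketch below is an illustration rather than part of the proof. The function names are hypothetical, and the base case $f_{l,m,0}=1$ for $l=m=0$ (the empty graph) and $0$ otherwise is an assumption chosen so that the recurrence reproduces the boundary conditions (13) and (14).

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def f(l, m, n):
    """Number of graphs with l cycles and m edges in E, computed via recurrence (12)."""
    if l < 0 or m < 0 or m > n:
        return 0
    if n == 0:
        # Assumed base case: only the empty graph, with no cycles and no edges.
        return 1 if (l == 0 and m == 0) else 0
    return (2 * (2 * n - m - 1) * f(l, m - 1, n - 1)
            + f(l - 1, m - 1, n - 1)
            + f(l, m, n - 1))


def phi_from_recurrence(m, n, nu):
    """Generating function sum_l nu^l f(l, m, n); at most m cycles can occur."""
    return sum(nu ** l * f(l, m, n) for l in range(m + 1))


def phi_closed_form(m, n, nu):
    """Right-hand side of (15): C(n, m) * prod_{i=1}^{m} (nu + 2(n - i))."""
    prod = 1
    for i in range(1, m + 1):
        prod *= nu + 2 * (n - i)
    return comb(n, m) * prod


def double_factorial(k):
    """k!! for odd k >= -1, with (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def check(max_n=6, nus=range(0, 5)):
    """Compare recurrence, closed form, and the graph count of Corollary 1."""
    for n in range(1, max_n + 1):
        for m in range(n + 1):
            for nu in nus:
                assert phi_from_recurrence(m, n, nu) == phi_closed_form(m, n, nu)
            assert phi_closed_form(m, n, 1) == comb(2 * n, 2 * m) * double_factorial(2 * m - 1)
    return True
```

For instance, `[f(l, 2, 3) for l in range(3)]` evaluates to `[24, 18, 3]`, matching $\Phi_{2,3}(\nu)=3(\nu+4)(\nu+2)=3\nu^2+18\nu+24$ and the coefficients of the $\sigma\sigma\delta$ terms in Example 1.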
Remark 2 Nonnegative integers $s_n(m,l)$ defined by the generating function
\[
\sum_{l=0}^{m}\nu^{l}s_n(m,l)=\prod_{i=1}^{m}(\nu+n-i) \tag{19}
\]
are called the noncentral Stirling numbers of the first kind (Koutras (1982)). Since
\[
\sum_{l\ge 0}\nu^{l}f_{l,m,n}=\binom{n}{m}2^{m}\prod_{i=1}^{m}(\nu/2+n-i)=\binom{n}{m}2^{m}\sum_{l\ge 0}(\nu/2)^{l}s_n(m,l),
\]