A Random Matrix Approach to Dynamic Factors in macroeconomic data

Małgorzata Snarska 1,2,∗

1 Marian Smoluchowski Institute of Physics and Mark Kac Complex Systems Research Centre, Jagiellonian University, Reymonta 4, 30-059 Kraków, Poland
2 Cracow University of Economics, Department of Econometrics and Operations Research, Rakowicka 27, 31-510 Kraków, Poland
∗ Electronic address: [email protected]

arXiv:1201.6544v1 [q-fin.ST] 31 Jan 2012 (Dated: February 1, 2012)

We show how random matrix theory can be applied to develop new algorithms for extracting dynamic factors from macroeconomic time series. In particular, we consider a limit where the number of random variables N and the number of consecutive time measurements T are large, but the ratio N/T is fixed. In this regime the underlying random matrices are asymptotically equivalent to Free Random Variables (FRV). An application of these methods to macroeconomic indicators for the Polish economy is also presented.

PACS numbers: 89.65.Gh (Economics; econophysics, financial markets, business and management), 02.50.Sk (Multivariate analysis), 02.70.-c (Computational techniques; simulations), 02.70.Uu (Applications of Monte Carlo methods)

Keywords: VARMA, random matrix theory, free random variables, dynamic factor models, historical estimation

I. INTRODUCTION

Macroeconomic time series are characterized by two main features: comovements and cyclic phases of expansion and depression. Behind the econometric literature, especially on business cycle and VAR modeling, lies the idea that the essential characteristics of macroeconomic time series are adequately captured by a small number of nearly independent factors. On the other hand, many empirical studies suggest analyzing a large number of variables in order to understand the microeconomic mechanisms behind the fluctuations; in such studies both N, the number of variables, and T, the length of the time series, are typically large, and typically of the same order. Surprisingly, most of the models developed in this field can precisely describe only the static properties of the correlation structure. There is a strong need for models and algorithms that accurately capture the features of macroeconomic time series described above, are dynamic, and allow for a richer spatio-temporal structure. In this paper we present new algorithms, based on Random Matrix Theory, for extracting potentially useful factors from macroeconomic time series. In the first part we uncover the external temporal correlation structure, using the assumption that the dynamics of each individual time series is adequately described by a VARMA(q1, q2) structure with the same parameters. The second method encompasses the exposition of the internal correlation structure between two different sets of variables, after the external correlations have been properly reduced.

II. UNRAVELING LAGGED EXTERNAL TEMPORAL CORRELATIONS FROM VARMA(q1, q2)

Finite order vector autoregressive moving average (VARMA) models are motivated by the Wold decomposition theorem [1] as an appropriate multivariate setting for studying the dynamics of stationary time series. Vector autoregressive (VAR) models are cornerstones of contemporary macroeconomics, being part of an approach called "dynamic stochastic general equilibrium" (DSGE), which is superseding traditional large-scale macroeconometric forecasting methodologies [2]. The motivation behind them is based on the assertion that more recent values of a variable are more likely to contain useful information about its future movements than the older ones.
On the other hand, a standard tool in multivariate time series analysis is the vector moving average (VMA) model, which is essentially a linear regression of the present value of the time series on the past values of a white noise. A broader class of stochastic processes used in macroeconomics combines these two kinds in the form of vector autoregressive moving average (VARMA) models. These methodologies can capture certain spatial and temporal structures of multidimensional variables which are often neglected in practice; including them not only results in more accurate estimation, but also leads to models which are more interpretable.

A. Correlated Gaussian Random Variables

We consider N time-dependent random variables measured at T consecutive time moments (separated by some time interval δt); let Y_ia be the value of the i-th (i = 1,...,N) random variable at the a-th time moment (a = 1,...,T); together, they make up a rectangular N × T matrix Y. First, we assume that each Y_ia is drawn from a Gaussian probability distribution with mean zero, ⟨Y_ia⟩ = 0. A set of correlated zero-mean Gaussian numbers is fully characterized by the two-point covariance function, C_ia,jb ≡ ⟨Y_ia Y_jb⟩, if the underlying stochastic process generating these numbers is stationary. The stationarity condition for VARMA(q1, q2) stochastic processes implies certain restrictions on their parameters; for details, we refer to [3]. We will restrict our attention to an even narrower class, where the cross-correlations between different variables and the auto-correlations between different time moments factorize, i.e.,

    ⟨Y_ia Y_jb⟩ = C_ij A_ab.   (1)

1. Estimating External Cross-Covariances

The realized cross-covariance between degrees of freedom i and j at the same time a is Y_ia Y_ja; the simplest method to estimate today's cross-covariance c_ij is to compute the time average (the "Pearson estimator"),

    c_ij ≡ (1/T) Σ_{a=1}^{T} Y_ia Y_ja,   i.e.,   c = (1/T) Y Yᵀ = (1/T) √C Ỹ A Ỹᵀ √C,   (2)

where Ỹ denotes the uncorrelated counterpart of Y (i.e., Y = √C Ỹ √A). The random matrix c is called a "doubly correlated Wishart ensemble" [4]. Obviously, our estimator reflects the true covariances only to a certain degree, with a superimposed broadening due to the finiteness of the time series; the estimation accuracy depends on the "rectangularity ratio,"

    r ≡ N/T;   (3)

the closer r is to zero, the more faithful the estimate. In what follows we consider the situation where

    N → ∞,   T → ∞,   such that r = fixed.   (4)

B. Free Random Variables Calculus Crash Course

1. The M-Transform and the Spectral Density

Consider a (real symmetric K × K) random matrix H. The average of its eigenvalues λ_1,...,λ_K is concisely encoded in the "mean spectral density,"

    ρ_H(λ) ≡ (1/K) Σ_{i=1}^{K} ⟨δ(λ − λ_i)⟩ = (1/K) ⟨Tr δ(λ 1_K − H)⟩.   (5)

On the practical side, it is more convenient to work with either of two equivalent objects,

    G_H(z) ≡ (1/K) ⟨Tr 1/(z 1_K − H)⟩,   or   M_H(z) ≡ z G_H(z) − 1,   (6)

referred to as the "Green's function" (or the "resolvent") and the "M-transform" of H (the "moments' generating function"). The latter serves better in the case of multiplication of random matrices.
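As a concrete illustration of (2)-(6), the following minimal numerical sketch (ours, not part of the original text) builds the Pearson estimator for uncorrelated Gaussian data (C = 1_N, A = 1_T) and compares the histogram of its eigenvalues with the Marčenko-Pastur density, which is precisely the r-dependent broadening described above:

    import numpy as np
    import matplotlib.pyplot as plt

    N, T = 200, 800                      # rectangularity ratio r = N/T = 0.25
    r = N / T

    Y = np.random.randn(N, T)            # uncorrelated data: C = 1_N, A = 1_T
    c = Y @ Y.T / T                      # Pearson estimator, eq. (2)
    eigs = np.linalg.eigvalsh(c)

    # Marcenko-Pastur density for unit-variance data with ratio r
    lo, hi = (1 - np.sqrt(r))**2, (1 + np.sqrt(r))**2
    lam = np.linspace(lo, hi, 400)
    rho = np.sqrt((hi - lam) * (lam - lo)) / (2 * np.pi * r * lam)

    plt.hist(eigs, bins=40, density=True, alpha=0.5, label="eigenvalues of c")
    plt.plot(lam, rho, label="Marcenko-Pastur, r = %.2f" % r)
    plt.xlabel("λ"); plt.ylabel("ρ_c(λ)"); plt.legend(); plt.show()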
2. The N-Transform and Free Random Variables

The doubly correlated Wishart ensemble c (2) may be viewed as a product of several random and non-random matrices. The general problem of multiplying random variables in classical probability theory can be effectively handled in the special situation when the random terms are independent: the exponential map then reduces it to the addition problem of independent random numbers, which is solved by considering the logarithms of the characteristic functions of the respective PDFs, which prove to be additive. In matrix probability theory, due to D. Voiculescu and coworkers and R. Speicher [5, 6], we have a parallel construction in the noncommutative world (cf. Table I). It starts with the notion of "freeness," which basically comprises probabilistic independence together with a lack of any directional correlation between two random matrices. This nontrivial new property happens to be the right extension of classical independence, as it allows for an efficient algorithm of multiplying free random variables (FRV), which we state below:

Step 1: Suppose we have two random matrices, H1 and H2, mutually free. Their spectral properties are best wrought into the M-transforms (6), M_H1(z) and M_H2(z).

Step 2: The critical maneuver is to turn attention to the functional inverses of these M-transforms, the so-called "N-transforms,"

    M_H(N_H(z)) = N_H(M_H(z)) = z.   (7)

Step 3: The N-transforms submit to a very straightforward rule upon multiplying free random matrices (the "FRV multiplication law"),

    N_{H1 H2}(z) = z/(1+z) · N_H1(z) N_H2(z),   for free H1, H2.   (8)

Step 4: Finally, it remains to functionally invert the resulting N-transform N_{H1 H2}(z) to gain the M-transform of the product, M_{H1 H2}(z).

TABLE I: Parallel between classical and noncommutative probability.

    Classical probability                    | Noncommutative probability (FRV)
    -----------------------------------------+------------------------------------------------
    x: random variable with pdf p(x)         | H: random matrix with spectral density ρ_H(λ)
    characteristic function                  | Green's function
      g_x(z) ≡ ⟨e^{izx}⟩                     |   G_H(z) ≡ (1/N) ⟨Tr 1/(z·1 − H)⟩,
                                             |   or M-transform, M_H(z) = z G_H(z) − 1
    independence                             | freeness
    addition of independent r.v.:            | addition of free r.v.:
      the logarithm of the characteristic    |   the Blue's function,
      function, r_x(z) ≡ log g_x(z),         |   G_H(B_H(z)) = B_H(G_H(z)) = z,
      is additive:                           |   is additive:
      r_{x1+x2}(z) = r_{x1}(z) + r_{x2}(z)   |   B_{H1+H2}(z) = B_{H1}(z) + B_{H2}(z) − 1/z
    multiplication of independent r.v.:      | multiplication of free r.v.:
      reduced to the addition problem        |   the N-transform,
      via the exponential map,               |   M_H(N_H(z)) = N_H(M_H(z)) = z,
      e^{x1} e^{x2} = e^{x1+x2}              |   is multiplicative:
                                             |   N_{H1 H2}(z) = z/(1+z) · N_{H1}(z) N_{H2}(z)
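To make Steps 1-4 concrete, here is a small numerical check of our own, using two independent (hence asymptotically free) uncorrelated Wishart matrices. It assumes the standard FRV result, consistent with the derivation of (9) below, that the N-transform of such a Wishart ensemble with ratio r is N(z) = (1+z)(1+rz)/z; the multiplication law (8) then gives N_{W1 W2}(z) = (1+z)(1+r1 z)(1+r2 z)/z, so the M-transform of the product solves the cubic z M = (1+M)(1+r1 M)(1+r2 M), which the script inverts numerically (Step 4):

    import numpy as np
    import matplotlib.pyplot as plt

    N, T1, T2 = 300, 900, 600
    r1, r2 = N / T1, N / T2

    X1, X2 = np.random.randn(N, T1), np.random.randn(N, T2)
    W1, W2 = X1 @ X1.T / T1, X2 @ X2.T / T2      # independent => asymptotically free
    eigs = np.linalg.eigvals(W1 @ W2).real       # same spectrum as sqrt(W1) W2 sqrt(W1)

    def rho(lam, eps=1e-9):
        # Step 4: invert N_{W1 W2} by solving z M = (1+M)(1+r1 M)(1+r2 M) for M
        z = lam + 1j * eps
        roots = np.roots([r1 * r2, r1 * r2 + r1 + r2, 1 + r1 + r2 - z, 1.0])
        M = roots[np.argmin(roots.imag)]         # physical branch: Im M < 0
        G = (M + 1) / z                          # Green's function from eq. (6)
        return -G.imag / np.pi                   # mean spectral density

    lam = np.linspace(1e-3, eigs.max() * 1.05, 500)
    plt.hist(eigs, bins=60, density=True, alpha=0.5, label="eigenvalues of W1 W2")
    plt.plot(lam, [rho(l) for l in lam], label="FRV prediction")
    plt.legend(); plt.show()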
3. Extracting External Correlations

The innate potential of the FRV multiplication algorithm (8) is fully revealed when inspecting the doubly correlated Wishart random matrix c = (1/T) √C Ỹ A Ỹᵀ √C (2). This has been done in detail in [7, 8], so we only accentuate the main results here, referring the reader to the original papers for an exact explanation. The idea is to use twice the cyclic property of the trace (which permits cyclic shifts in the order of the terms), and twice the FRV multiplication law (8) (to break the N-transforms of products of matrices down to their constituents), in order to reduce the problem to the uncorrelated Wishart ensemble (1/T) Ỹᵀ Ỹ. This last model is further simplified, again by the cyclic property and the FRV multiplication rule applied once, to the standard GOE random matrix squared (plus the projector P ≡ diag(1_N, 0_{T−N}), designed to chip the rectangle Ỹ off the square GOE), whose properties are firmly established. Let us sketch the derivation:

    N_c(z)  =(cyclic)=  N_{(1/T) Ỹ A Ỹᵀ C}(z)
            =(FRV)=     z/(1+z) · N_{(1/T) Ỹ A Ỹᵀ}(z) N_C(z)
            =(cyclic)=  z/(1+z) · N_{(1/T) Ỹᵀ Ỹ A}(rz) N_C(z)
            =(FRV)=     z/(1+z) · rz/(1+rz) · N_{(1/T) Ỹᵀ Ỹ}(rz) N_A(rz) N_C(z)
            =           r z N_A(rz) N_C(z),   (9)

where in the last step the known N-transform of the uncorrelated Wishart ensemble, N_{(1/T) Ỹᵀ Ỹ}(w) = (1+w)(w+r)/w, has been used. This is the basic formula. Since the spectral properties of c are given by its M-transform, M ≡ M_c(z), it is more pedagogical to recast (9) as an equation for the unknown M,

    z = r M N_A(rM) N_C(M).   (10)

It provides a means for computing the mean spectral density of a doubly correlated Wishart random matrix once the "true" covariance matrices C and A are given. In this communication, only a particular instance of this fundamental formula is applied, namely with an arbitrary auto-covariance matrix A, but with trivial cross-covariances, C = 1_N. Using the fact that N_{1_K}(z) = 1 + 1/z, equation (10) thins out to

    r M = M_A( z / (r (1 + M)) ),   (11)

which will be strongly exploited below. For all this, we must in particular take both N and T large from the start, with their ratio r ≡ N/T fixed (4). More precisely, we stretch the range of the a-index from minus to plus infinity. This means that all the finite-size effects (appearing at the ends of the time series) are readily disregarded. In particular, there is no need to care about initial conditions for the processes, and all the recurrence relations are assumed to continue to the infinite past.

C. The VARMA(q1, q2) Process

1. The Definition of VARMA(q1, q2)

One can define the stochastic process called VARMA(q1, q2) as a convolution of VAR(q1) and VMA(q2) processes:

    Y_ia − Σ_{β=1}^{q1} b_β Y_{i,a−β} = Σ_{α=0}^{q2} a_α ε_{i,a−α}.   (12)

To be more specific, we consider a situation when N stochastic variables evolve according to identical, independent VARMA(q1, q2) (vector autoregressive moving average) processes, which we sample over a time span of T moments. The VMA(q) (vector moving average) process is a simple generalization of the standard univariate weak-stationary moving average MA(q). In such a setting, the value Y_ia of the i-th (i = 1,...,N) random variable at time moment a (a = 1,...,T) can be expressed as

    Y_ia = Σ_{α=0}^{q} a_α ε_{i,a−α}.   (13)

Here all the ε_ia's are IID standard (mean zero, variance one) Gaussian random numbers (white noise), ⟨ε_ia ε_jb⟩ = δ_ij δ_ab. The a_α's are some (q+1) real constants; importantly, they do not depend on the index i, which reflects the fact that the processes are identical and independent (no "spatial" covariances among the variables). The rank q of the process is a positive integer.

The VAR(q) (vector autoregressive) process is somewhat akin to (13), i.e., we consider N decoupled copies of a standard univariate AR(q) process,

    Y_ia − Σ_{β=1}^{q} b_β Y_{i,a−β} = a_0 ε_ia.   (14)

It is again driven by the demeaned and standardized Gaussian white noise ε_ia (which triggers the stochastic evolution), as well as (q+1) real constants a_0, b_β, with β = 1,...,q. As announced before, the time stretches to the past infinity, so no initial condition is necessary. Although at first sight (14) may appear to be a more involved recurrence relation for the Y_ia's, it is actually easily reduced to the VMA(q) case: it remains to remark that if one exchanges the Y_ia's with the ε_ia's, one precisely arrives at the VMA(q) process with the constants a_0^(2) ≡ 1/a_0, a_β^(2) ≡ −b_β/a_0, β = 1,...,q. In other words, the auto-covariance matrix A^(3) of the VAR(q) process (14) is simply the inverse of the auto-covariance matrix A^(2) of the corresponding VMA(q) process with the described modification of the parameters,

    A^(3) = ( A^(2) )^{−1}.   (15)

This inverse exists thanks to the weak stationarity supposition.
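The definition (12) is straightforward to simulate; the sketch below (our illustration, with purely hypothetical parameter values) generates N independent VARMA(1,1) series, using a long burn-in as a stand-in for the infinite past assumed in the derivations:

    import numpy as np

    def simulate_varma11(N, T, a0, a1, b1, burn=500, seed=0):
        """N independent VARMA(1,1) rows, eq. (12) with q1 = q2 = 1:
        Y[i,a] = b1 Y[i,a-1] + a0 eps[i,a] + a1 eps[i,a-1]."""
        assert abs(b1) < 1, "weak stationarity requires |b1| < 1"
        rng = np.random.default_rng(seed)
        eps = rng.standard_normal((N, T + burn))     # IID Gaussian white noise
        Y = np.zeros((N, T + burn))
        for a in range(1, T + burn):
            Y[:, a] = b1 * Y[:, a - 1] + a0 * eps[:, a] + a1 * eps[:, a - 1]
        return Y[:, burn:]                           # drop burn-in: proxy for infinite past

    Y = simulate_varma11(N=52, T=118, a0=0.7, a1=0.2, b1=0.4)
    c = Y @ Y.T / Y.shape[1]                         # Pearson estimator, eq. (2)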
The auto-covariance matrix A^(5) of the full VARMA(q1, q2) process (12) is simply the product (in any order) of the auto-covariance matrices of the VAR and VMA pieces; more precisely,

    A^(5) = ( A^(4) )^{−1} A^(1),   (16)

where A^(1) corresponds to the generic VMA(q2) model, while A^(4) denotes the auto-covariance matrix of a VMA(q1) process with a slightly different modification of the parameters compared to the one used previously, namely a_0^(4) ≡ 1, a_β^(4) ≡ −b_β, for β = 1,...,q1.

The M-transform of A^(5) can consequently be derived from the general formulas of [8]. We evaluate the pertinent integral here only for the simplest VARMA(1,1) process, even though an arbitrary case may be handled by the technique of residues:

    M_{A^(5)}(z) = 1/(a_0 a_1 + b_1 z) · ( −a_0 a_1 + z (a_0 a_1 + (a_0² + a_1²) b_1 + a_0 a_1 b_1²) / √( (z (1 − b_1)² − (a_0 + a_1)²) (z (1 + b_1)² − (a_0 − a_1)²) ) ).   (17)
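Equations (11) and (17) together determine the mean spectral density of the Pearson estimator for VARMA(1,1) data. The following sketch is ours, not the paper's: it solves (11) for the complex M at z = λ + iε and reads off ρ_c(λ) = −(1/π) Im G_c with G_c = (M+1)/z, cf. (6). It assumes the mpmath library for complex root finding, its branch handling of the complex square root is deliberately naive (and may need care near the spectrum edges), and the parameter values are illustrative, since the fitted values reported in Fig. 1 below are not legible in this copy:

    import numpy as np
    import mpmath as mp

    def M_A(w, a0, a1, b1):
        # M-transform of the VARMA(1,1) auto-covariance matrix, eq. (17)
        num = w * (a0 * a1 + (a0**2 + a1**2) * b1 + a0 * a1 * b1**2)
        den = mp.sqrt((w * (1 - b1)**2 - (a0 + a1)**2)
                      * (w * (1 + b1)**2 - (a0 - a1)**2))
        return (-a0 * a1 + num / den) / (a0 * a1 + b1 * w)

    def rho_c(lam, r, a0, a1, b1, eps=1e-6):
        # solve eq. (11), r M = M_A(z / (r (1 + M))), for M at z = lam + i*eps
        z = mp.mpc(lam, eps)
        try:
            M = mp.findroot(lambda M: r * M - M_A(z / (r * (1 + M)), a0, a1, b1),
                            mp.mpc(-0.5, -0.5))
        except Exception:        # no convergence: typically outside the support
            return float("nan")
        return float(-mp.im((M + 1) / z) / mp.pi)    # rho = -(1/pi) Im G

    r = 52.0 / 118.0             # rectangularity ratio of the data set studied below
    lams = np.linspace(0.05, 6.0, 120)
    density = [rho_c(l, r, a0=0.7, a1=0.2, b1=0.4) for l in lams]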
D. Example from Macroeconomic Data

We apply the method described above to Polish macroeconomic data. The motivation behind this is twofold. First of all, economic theory rarely has any sharp implications about the short-run dynamics of economic variables (the so-called scarcity of economic time series). Secondly, in those very rare situations where theoretical models include a dynamic adjustment equation, one has to work hard to exclude the moving average terms from appearing in the implied dynamics of the variables of interest.

1. An Application to Macroeconomic Data

Let us pursue further the above analysis of the VARMA(1,1) model on a concrete example of real data. Economists naturally think of co-movement in economic time series as arising largely from relatively few key economic factors like productivity, monetary policy, and so forth. A classical way of representing this notion is in terms of a statistical factor model, for which we allow the limited spatio-temporal dependence expressed via our VARMA(1,1) model. Two empirical questions are addressed in this section:

1. How do the empirical data behave? Do the eigenvalues exhibit a structure similar to financial market data? In other words, do the macroeconomic data represent a collective response to the shock expressed in terms of the VARMA(1,1) model, or are there any apparent outliers?

2. Should the forecasts then be constructed using small factor models, or are there any non-zero, but perhaps small, coefficients? If so, then the large-scale model framework is appropriate.

We investigate N = 52 various macroeconomic time series for Poland of length T = 118. They have been selected on a monthly basis in such a manner as to cover most of the main sectors of the Polish economy, i.e., the money market, domestic and foreign trade, the labor market, the balance of payments, inflation in different sectors, etc. The time series were taken directly from the Reuters 3000Xtra database. Although longer time series for Poland are accessible, we restrict ourselves to the last ten years in order to avoid the effects of structural change. We assume that each economic variable is affected by the same shock (i.e., the "global market shock") of an ARMA(1,1) type with unknown parameters, which we are to estimate; the AR part implies that the shock dies out quite quickly, while the MA part is responsible for the persistency of the shock. To preserve the proper VARMA representation, the original time series were transformed using the following methods:

• First, many of the series are seasonally adjusted by the reporting agency.

• Second, the data were transformed to eliminate trends and obvious nonstationarities. For real variables, this typically involved transformation to growth rates (the first difference of logarithms), and for prices this involved transformation to changes in growth rates (the second difference of logarithms).

• Interest rates were transformed to first differences.

• Finally, some of the series contained a few large outliers associated with events like labor disputes, other extreme events, or data problems of various sorts. These outliers were identified as observations that differed from the sample median by more than 6 times the sample interquartile range, and these observations were dropped from the analysis.

In Fig. 1 we plot these time series (LEFT), and we make (RIGHT) a histogram of the mean spectral density ρ_c(λ), which we compare to the theoretical prediction obtained by solving the FRV equation for (17) with the estimated values of the parameters a_0, a_1, b_1. We have also plotted the standard Marčenko-Pastur (Bai-Silverstein) [9, 10] fit.

[FIG. 1: LEFT: the original N = 52 time series of length T = 118; they are non-stationary, with seasonal components. RIGHT: the histogram of the mean spectral density ρ_c(λ) (solid black line), compared to the theoretical result obtained by numerically solving the sixth-order equation resulting from (17) (solid orange line) and the "Wishart fit" (purple line).]

2. Discussion

The result is important for forecast design, but, more importantly, it provides information about the way macroeconomic variables interact. The empirical data, as compared to the spectral density equation, suggest that a lot of eigenvalues, similarly to stock market data, express marginal predictive content. One can suppose that each of the economic time series contains important information about the collective movements that cannot be gleaned from the other time series. Alternatively, if we suppose that macro variables interact in the simplest low-dimensional way suggested by the VARMA(1,1) model, the conformity is nearly ideal (modulo the finite-size effects at the right edge of the spectrum). The economic time series express a common response to the "global shock" process, i.e., each eigenvalue now contains useful information about the values of the factors that affect co-movement, and hence useful information about the future behavior of the economy. Thus, while many more eigenvalues appear to be useful, the predictive component is apparently common to many series in a way suggested by our simplified VARMA(1,1) model. The above analysis of a real complex system, the economy of Poland, for which we have assumed that each of the time series under study is generated by the same type of univariate VARMA(q1, q2) process, reveals a stunning fact: once again a flawless correspondence between the theoretical spectral density and the empirical data is found. We are in the position where we cannot reject the hypothesis that there are indeed no autocorrelations among macroeconomic time series.
One may also argue that all these time series are closely bound with a process which we identify as the "global shock process", i.e., all the time series represent the global response of the complex system under study to a distortion, and its adjustment to the equilibrium state. This process is of univariate VARMA(q1, q2), i.e., ARMA(1,1), type, with a hidden structure that has to be revealed from historical time series data. This crude empirical study potentially allows for a variety of extensions; at least two are possible. First, the approach used here identifies the underlying factors only up to a linear transformation, making economic interpretation of the factors themselves difficult. It would be interesting to be able to relate the factors more directly to fundamental economic forces in the spirit of DSGE models. Secondly, our theoretical result covers only stationary models, and says nothing about integrated, cointegrated and co-trending variables. We know that common long-run factors are important for describing macroeconomic data, and theory needs to be developed to handle these features in a large model framework.

III. UNRAVELING INTERNAL TEMPORAL CORRELATIONS VIA THE SVD TECHNIQUE

In order to investigate the temporal properties of the internal correlations between two data sets, one is often interested not only in the analysis of their static properties, given by the Pearson estimator (2), C_X = (1/T) X Xᵀ, but rather in how these features behave over a certain period of time. Again, the primary way to describe cross-correlations in a Gaussian framework is through the two-point covariance function (cf. Sec. II A),

    C_ia,jb ≡ ⟨X_ia X_jb⟩,   (18)

where X_ia ≡ x_ia − ⟨x_ia⟩ are mean-adjusted data, which can further be collected into a rectangular N × T matrix R. The average ⟨...⟩ is understood as taken according to some probability distribution whose functional shape is stable over time, but whose parameters may be time-dependent. In the previous sections we have used a very simplified form of the two-point covariance function, namely with cross-covariances and auto-covariances factorized and non-random (1),

    C_ia,jb = C_ij A_ab   (19)

(we have assembled the coefficients into an N × N cross-covariance matrix C and a T × T auto-covariance matrix A; both are taken symmetric and positive-definite). The matrix of "temporal structure" A is a way to model two temporal effects: the (weak, short-memory) lagged correlations between the returns, as well as the (stronger, long-memory) lagged correlations between the volatilities (weighting schemes, e.g., EWMA [11]). On the other hand, the matrix of spatial correlations C models the hidden factors affecting the variables, thereby reflecting the structure of mutual dependencies of the complex system. The salient feature assumed so far is that these two matrices are decoupled, and the assumption of Gaussianity of the random variables provides the crude approximation that the variances of all random variables always exist. This was sufficient to fully characterize the dependencies of the X_ia's. However, in more realistic circumstances (i.e., when building efficient multivariate models which help in understanding the relation between a large number of possible causes and resulting effects), one is more interested in situations where the spatio-temporal structure does not factorize. The cross-correlations technique (sometimes referred to as the "time-lagged correlations technique") is most likely to meet these critical requirements; the simplest time-lagged estimator reads

    C_ij(Δ) = (1/T) Σ_{a=1}^{T} X_ia X_{j,a+Δ}.   (20)
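In code, the estimator (20) is essentially a one-liner; the sketch below (ours) computes the full N × N lagged cross-covariance matrix for a given lag Δ, truncating at the series boundary instead of using the infinite-past convention of the derivations:

    import numpy as np

    def lagged_cov(X, delta):
        """Time-lagged cross-covariance, eq. (20): C_ij(delta) = <X_i(a) X_j(a+delta)>.
        X has shape (N, T); the 1/T normalization follows eq. (20), although only
        T - delta products are available at the series boundary."""
        N, T = X.shape
        if delta == 0:
            return X @ X.T / T
        return X[:, :T - delta] @ X[:, delta:].T / T

    X = np.random.randn(52, 118)       # stand-in for the standardized data
    C1 = lagged_cov(X, delta=1)        # N x N matrix of one-step lagged covariances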
The precise answer boils down to how to separate the spectrum of such a covariance matrix in the large N, large T limit (i.e., the thermodynamical limit), when one can make use of the power of the FRV calculus (see [12] for a solution based on the circular symmetry of the problem and a Gaussian approximation). In this chapter we closely follow the method presented in [13], where the authors suggested comparing the singular value spectrum of the empirical rectangular M × N correlation matrix with a benchmark obtained using Random Matrix Theory results (cf. [14]), assuming there are no correlations between the variables. For T → ∞ at N, M fixed, all singular values should be zero, but this will not be true if T is finite. The singular value spectrum of this benchmark problem can in fact be computed exactly in the limit where N, M, T → ∞, with the ratios m = M/T and n = N/T fixed. Since the original derivation is rather veiled, for pedagogical purposes we rederive all these results in the language of FRV presented in Table I. Furthermore, we extend the results obtained in [15] to encompass time-lagged correlations.

A. Mathematical Formulation of the Problem

Due to the works [2, 16, 17] it is believed that the system itself should determine the number of relevant input and output factors. In the simplest approach, one would take all the possible input and output factors and systematically correlate them, hoping to unravel the hidden structure. This procedure swiftly blows up with just a few variables. The cross-equation correlation matrix contains all the information about contemporaneous correlation in a vector model, and may be its greatest strength and its greatest asset. Since no questionable a priori assumptions are imposed, fitting a vector model allows the data set to speak for itself, i.e., to find the relevant number of factors. Still, without imposing any restrictions on the structure of the correlation matrix, one cannot make a causal interpretation of the results. The theoretical study of high dimensional factor models is indeed actively pursued in the literature [18-25]. The main aim of this chapter is to present a method which helps extract highly non-trivial spatio-temporal correlations between two samples of non-equal size (i.e., input and output variables of large dimensionality), as these can then be treated as "natural" restrictions for the structure of the correlation matrix.

1. Basic framework and notation

We divide all the variables into two subsets, i.e., we focus on N input factors X_a (a = 1,...,N) and M output factors Y_α (α = 1,...,M), with the total number of observations being T. All time series are standardized to have zero mean and unit variance. The data can be completely different, or be the same variables but observed at different times. First, one has to remove potential correlations inside each subset, otherwise they may interfere with the out-of-sample signal. To remove the correlations inside each sample, we form two correlation matrices, which contain the information about the in-the-sample correlations,

    C_X = (1/T) X Xᵀ,   C_Y = (1/T) Y Yᵀ.   (21)

The matrices are then diagonalized, provided T > N, M, and the empirical spectrum is compared to the theoretical Marčenko-Pastur spectrum [10, 26-28] in order to unravel the statistically significant factors. The eigenvalues which lie much below the lower edge of the Marčenko-Pastur spectrum represent the redundant factors rejected by the system, so one can exclude them from further study and in this manner somewhat reduce the dimensionality of the problem, by removing possibly spurious correlations.
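A sketch of this in-sample cleaning step (our illustration; the text's "much below the lower edge" criterion is a judgment call, replaced here by a hard threshold at the Marčenko-Pastur edge):

    import numpy as np

    def mp_filter(X):
        """Diagonalize C = X X^T / T, eq. (21), and flag eigenvalues below the
        Marcenko-Pastur lower edge (1 - sqrt(N/T))^2 as redundant factors."""
        N, T = X.shape
        vals, vecs = np.linalg.eigh(X @ X.T / T)
        keep = vals > (1 - np.sqrt(N / T))**2    # hard cutoff at the lower edge
        return vals, vecs, keep

    X = np.random.randn(52, 118)                 # standardized input factors
    vals, vecs, keep = mp_filter(X)
    print(f"{(~keep).sum()} redundant eigenvalues out of {len(vals)}")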
Having found all the eigenvectors and eigenvalues, one can then construct a set of uncorrelated, unit-variance input variables X̂ and output variables Ŷ,

    X̂_at = (1/√(T λ_a)) (Vᵀ X)_at,   Ŷ_αt = (1/√(T λ_α)) (Uᵀ Y)_αt,   (22)

where V, U and λ_a, λ_α are the corresponding eigenvectors and eigenvalues of C_X and C_Y, respectively. It is obvious that C_X̂ = X̂ X̂ᵀ and C_Ŷ = Ŷ Ŷᵀ are identity matrices, of dimension N and M respectively. By a general property of diagonalization, this means that the T × T matrices D_X̂ = X̂ᵀ X̂ and D_Ŷ = Ŷᵀ Ŷ have exactly N (resp. M) eigenvalues equal to 1, and T − N (resp. T − M) equal to zero. These non-zero eigenvalues are randomly arranged on the diagonal. Finally, we can construct the asymmetric M × N cross-correlation matrix G between Ŷ and X̂,

    G = Ŷ X̂ᵀ,   (23)

which includes only the correlations between the input and output factors. In general, the spectrum of such a matrix is complex, but we will use the singular value decomposition (SVD) technique (cf. [29]) to find its empirical singular value spectrum.

2. The Singular Value Decomposition

The singular value spectrum represents the strength of the cross-correlations between the input and output factors. Suppose G is an M × N matrix whose entries are either real or complex numbers. Then there exists a factorization of the form

    G = U Σ V†,   (24)

where U is an M × M unitary matrix; the columns of U form a set of orthonormal "output" basis vector directions for G, and they are the eigenvectors of G G†. Σ is an M × N diagonal matrix with nonnegative real numbers on the diagonal, which can be thought of as scalar "gain controls" by which each corresponding input is multiplied to give a corresponding output; they are the square roots of the (common) eigenvalues of G G† and G† G, matched with the corresponding columns of U and V. Finally, V† denotes the conjugate transpose of V, an N × N unitary matrix whose columns form a set of orthonormal "input" basis vector directions for G; they are the eigenvectors of G† G. A common convention is to order the diagonal entries Σ_ii in descending order; in this case, the diagonal matrix Σ is uniquely determined by G (though the matrices U and V are not). The diagonal entries of Σ are known as the singular values of G.

B. Singular values from free random matrix theory

In order to evaluate these singular values, assume without loss of generality that M < N. The trick is to consider the M × M matrix G Gᵀ (or the N × N matrix Gᵀ G if M > N), which is symmetric and has M positive eigenvalues, each of which is equal to the square of a singular value of G itself, and furthermore to use the cyclic properties of the trace. The non-zero eigenvalues of G Gᵀ = Ŷ X̂ᵀ X̂ Ŷᵀ are then the same (up to the zero modes) as those of the T × T matrix

    D = D_X̂ D_Ŷ = X̂ᵀ X̂ Ŷᵀ Ŷ,

obtained by swapping the position of Ŷ from first to last. In the limit N, M, T → ∞, where the X̂'s and the Ŷ's are independent from each other, the two matrices D_X̂ and D_Ŷ are mutually free [30], and we can use the results from FRV, where, given the spectral density of each individual matrix, one is able to construct the spectrum of their product or sum.
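Before turning to the FRV computation of the benchmark, here is the construction (22)-(23) in code, a sketch of our own: the in-sample Marčenko-Pastur cleaning of the previous subsection is skipped, and the singular values are pooled over independent realizations to smooth the histogram used below:

    import numpy as np

    def whiten(Z):
        """Eq. (22): rotate Z into the eigenbasis of C_Z = Z Z^T / T and rescale,
        so that the rows of the output have exactly unit sample covariance."""
        T = Z.shape[1]
        vals, vecs = np.linalg.eigh(Z @ Z.T / T)
        return (vecs.T @ Z) / np.sqrt(T * vals)[:, None]

    T, N, M = 1000, 60, 30
    n, m = N / T, M / T
    rng = np.random.default_rng(1)
    s = np.concatenate([                 # singular values of G = Yhat Xhat^T, eq. (23)
        np.linalg.svd(whiten(rng.standard_normal((M, T)))
                      @ whiten(rng.standard_normal((N, T))).T, compute_uv=False)
        for _ in range(100)              # pool realizations to smooth the histogram
    ])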
1. FRV Algorithm for the Cross-correlation Matrix

As usual, we start by constructing the Green's functions of the matrices D_X̂ and D_Ŷ. Each of these matrices has off-diagonal elements equal to zero, while on the diagonal there is a set of M (resp. N) randomly distributed eigenvalues equal to 1:

    G_D_X̂(z) = (1/T) ( M/(z−1) + (T−M)/z ) = m/(z−1) + (1−m)/z,   m ≡ M/T;
    G_D_Ŷ(z) = (1/T) ( N/(z−1) + (T−N)/z ) = n/(z−1) + (1−n)/z,   n ≡ N/T.   (25)

Then

    S_{D_X̂ · D_Ŷ}(z) = S_D_X̂(z) · S_D_Ŷ(z),   (26)

or, equivalently,

    (1+z)/z · N_{D_X̂ · D_Ŷ}(z) = N_D_X̂(z) · N_D_Ŷ(z),   (27)

where the S-transform and the N-transform are related by

    S_x(z) = (1+z)/z · χ_x(z),   N_x(z) = 1/χ_x(z),   N_x(z) G_x(N_x(z)) − 1 = z.

From this one easily obtains

    N_D_X̂(z) ( m/(N_D_X̂(z) − 1) + (1−m)/N_D_X̂(z) ) − 1 = z   ⟺   m N_D_X̂(z)/(N_D_X̂(z) − 1) − m = z,   (28)

hence

    N_D_X̂(z) = (m+z)/z,   N_D_Ŷ(z) = (n+z)/z,   (29)

    N_D_X̂(z) · N_D_Ŷ(z) = (m+z)(n+z)/z²,   (30)

and one readily gets the N-transform of the matrix D_X̂ · D_Ŷ,

    N_{D_X̂ · D_Ŷ}(z) = (m+z)(n+z)/(z(1+z)).   (31)

Functionally inverting (31), i.e., using

    N_{D_X̂ · D_Ŷ}(z) G_D( N_{D_X̂ · D_Ŷ}(z) ) = z + 1   (32)

and solving the resulting second-order equation in z, one is able to find the Green's function of the product D_X̂ · D_Ŷ:

    0 = z² (1 − N(z)) + (n + m − N(z)) z + mn,
    G(N(z)) = ( 2 − (n + m + N(z)) − √( (n + m − N(z))² − 4 (1 − N(z)) mn ) ) / ( 2 N(z) (1 − N(z)) ),   (33)

where we have omitted the subscripts for brevity. Subsequently, the mean spectral density is obtained from the standard relation

    lim_{ε→0⁺} 1/(λ + iε) = PV (1/λ) − iπ δ(λ)   ⟹   ρ_D(λ) = −(1/π) Im G_D(λ + iε).   (34)

The final result, i.e., the benchmark case where all the (standardized) variables X and Y are uncorrelated, meaning that the ensemble averages E(C_X) = E(X Xᵀ) and E(C_Y) = E(Y Yᵀ) are equal to the unit matrix, whereas the ensemble average cross-correlation E(G) = E(Y Xᵀ) is identically zero, reads as in the original paper [13]:

    ρ_D(λ) = max(1−n, 1−m) δ(λ) + max(m+n−1, 0) δ(λ−1) + √( (λ² − γ₋)(γ₊ − λ²) ) / ( π λ (1 − λ²) ),

where the continuous part is supported on γ₋ ≤ λ² ≤ γ₊, with

    γ± = n + m − 2mn ± 2 √( mn (1−n)(1−m) ).
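A short continuation of the previous sketch (ours) evaluating this benchmark. Note that for m < n and m + n < 1 the continuous part carries total mass m of ρ_D (the remaining 1 − m sits in the zero modes), so it is rescaled by 1/m when compared with the normalized histogram of the M nonzero singular values:

    import numpy as np
    import matplotlib.pyplot as plt

    # continuing the previous sketch: `s`, `n`, `m` are defined there
    g = 2 * np.sqrt(m * n * (1 - n) * (1 - m))
    gm, gp = n + m - 2 * m * n - g, n + m - 2 * m * n + g      # gamma_-, gamma_+
    lam = np.linspace(np.sqrt(gm), np.sqrt(gp), 402)[1:-1]     # open support interval
    rho = np.sqrt((lam**2 - gm) * (gp - lam**2)) / (np.pi * lam * (1 - lam**2))

    plt.hist(s, bins=40, density=True, alpha=0.5, label="singular values of G")
    plt.plot(lam, rho / m, label="benchmark density (ref. [13]), rescaled by 1/m")
    plt.xlabel("λ"); plt.legend(); plt.show()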
