The 20-60-20 Rule

Piotr Jaworski*, Marcin Pitera†

August 21, 2015

Abstract

In this paper we discuss an empirical phenomenon known as the 20-60-20 rule. It says that if we split a population into three groups, according to some arbitrary benchmark criterion, then this particular ratio implies some sort of balance. From a practical point of view, this feature often leads to efficient management or control. We provide a mathematical illustration justifying the occurrence of this rule in many real-world situations. We show that for any population which can be described by a multivariate normal vector, this fixed ratio leads to a global equilibrium state, when dispersion and linear dependence measurement is considered.

Keywords: 20-60-20 rule, 60/20/20 principle, 20:60:20, Pareto principle, law of the vital few, the principle of factor sparsity, truncated normal distribution, conditional elliptic distribution.

MSC2010: 60K30, 91B10, 91B14, 91B52, 60A86, 62A86.

Introduction

The 20-60-20 rule is an empirical statement. It says that if we want to split a population into three groups, using some arbitrary benchmark criterion, then the ratio of 20%, 60% and 20% proves to give an efficient partition. The division is usually made according to the performance of each element of the population, and the groups are referred to as negative, neutral and positive, respectively. The first group relates to the elements of the population which contribute positively to the considered subject (e.g. effective workers, top sales managers, productive members), while the last one denotes the opposite. The middle set corresponds to the middle part of the population, having average performance. Put another way, we cluster the population based on a notion of effectiveness.

The importance of this rule comes from the fact that this particular partition seems to be the most effective one for many empirical problems. Let us present in detail two common illustrations of this phenomenon and then comment on the efficiency, to make this idea more transparent.

The first example considers sales departments. In almost any big company, the employees of the sales department can be split into three groups maintaining the 20-60-20 ratio. The first group are top performers, who make big profits even without supervision. The middle group are people who need to be managed to make average but stable profits. The last group are people who are heading towards termination or resignation; they produce no good income, even when supervised.

The second example relates to change capability. If you are willing to make substantial changes in any big institution, then on average 20% of the people are ready, willing and able to change, while 20% of the people would not accept the change, whatever the cost. The middle 60% will wait to see how the situation turns out.

Corporations use the 20-60-20 rule widely in management and sales departments [15, 13]. One of the practical aspects of this phenomenon relates to the fact that different procedures and methods are created to handle the efficiency in the positive, negative and neutral groups, and the 20-60-20 ratio proves to be the most efficient partition. For example, in many problems related to human resource management, one should identify and focus one's attention on the middle 60%, as this group could and should be managed efficiently.

Of course there are countless illustrations of this phenomenon. One could consider the overall condition of a financial market, fraud and theft capability among a group of people, the structure of an electorate, sport performance among athletes, the potential of students, patient handling, medical treatments, etc. Please see e.g. [14, 3, 1, 5, 7, 8, 11, 2, 4], where the 20-60-20 ratio is used and detailed procedures are proposed to handle many practical problems.

* Institute of Mathematics, University of Warsaw, Warsaw, Poland.
† Institute of Mathematics, Jagiellonian University, Cracow, Poland.
The natural question is why this specific 20/60/20 ratio is valid in so many situations. Why not 10/80/10 or 30/40/30? Is this a coincidence, or does it follow from some underlying and fundamental structure of the population? While very popular among practitioners, no scientific evidence of the 20-60-20 principle has been presented yet, to the best of the authors' knowledge. Consequently, this noteworthy rule has become more of a slogan than a scientific fact.

A possible mathematical illustration of this phenomenon, based on dispersion and linear dependence measurement, will be the main topic of this paper. We will show that if a (multivariate) random vector is distributed normally and we condition on the (quantile function of the) first coordinate, then a ratio close to 20/60/20 implies a global equilibrium state, when dispersion and linear dependence measurement is considered. In particular, we prove that this particular partition implies the equality of the covariance matrices of all conditional vectors, implying some sort of global balance in the population. We will also discuss the case of monotone dependence using conditional Kendall τ and Spearman ρ matrices.

The material is organized as follows. The introduction is followed by short preliminaries, where we establish the basic notation used throughout this paper. Next, in Section 2 we introduce a mathematical model for the 20-60-20 rule and define the equilibrium state using conditional covariance matrices. The 20-60-20 rule for multivariate normal vectors is discussed in Section 3; Theorem 1 might be considered the main result of this paper. Section 4 is devoted to the study of different equilibrium states, obtained using correlation matrices, Kendall τ matrices and Spearman ρ matrices. In particular, we present here some theoretical results for the case when Spearman ρ matrices are considered, and a numerical example illustrating the 20-60-20 rule for sample data. In Section 5 we discuss shortly what happens if we drop the assumption about normality.
The general elliptic case is considered there.

1 Preliminaries

Let (Ω, Σ, P) be a probability space and let n ∈ N. Let us fix an n-dimensional continuous random vector X = (X_1, ..., X_n). We will use
\[ H(x_1, \dots, x_n) := P[X_1 \le x_1, \dots, X_n \le x_n] \]
to denote the corresponding joint distribution function and
\[ H_i(x) = P[X_i \le x], \qquad i = 1, 2, \dots, n, \]
to denote the marginal distribution functions. Given a Borel set B in R̄^n such that
\[ P[\{\omega \in \Omega : (X_1(\omega), \dots, X_n(\omega)) \in B\}] > 0, \]
we can define the conditional distribution H_B for all (x_1, ..., x_n) ∈ B by
\[ H_B(x_1, \dots, x_n) = P[X_1 \le x_1, \dots, X_n \le x_n \mid X \in B]. \tag{1} \]
Putting it in other words, we truncate the random vector X to the Borel set B. If necessary, we assume the existence of regular conditional probabilities. In this paper we will assume that B is a non-degenerate rectangle, i.e. B ∈ R, where
\[ \mathcal{R} := \{A \subseteq \bar{\mathbb{R}}^n : A = [a_1, b_1] \times [a_2, b_2] \times \dots \times [a_n, b_n], \text{ where } a_k, b_k \in \bar{\mathbb{R}} \text{ and } a_k < b_k\}. \]
As we will be mainly interested in quantile-based conditioning on the first coordinate, for q_1, q_2 ∈ [0, 1] such that q_1 < q_2, we shall use the notation
\[ H_{[q_1, q_2]}(x_1, \dots, x_n) := H_{B(q_1, q_2)}(x_1, \dots, x_n), \tag{2} \]
where the conditioning set is given by
\[ B(q_1, q_2) := [H_1^{-1}(q_1), H_1^{-1}(q_2)] \times \bar{\mathbb{R}} \times \dots \times \bar{\mathbb{R}}. \]
We shall also refer to H_{[q_1,q_2]} as the truncated distribution, while B(q_1, q_2) will be called the truncation interval (see [10]). Moreover, we will denote by μ = (μ_1, ..., μ_n) and Σ = {σ²_ij}_{i,j=1,...,n} the mean vector and covariance matrix of X. Similarly as in formula (1), given B, we will use μ_B and Σ_B to denote the conditional mean vector and the conditional covariance matrix, i.e. the mean vector and covariance matrix of a random vector with distribution H_B. Consequently, as in (2), we shall write
\[ \mu_{[q_1, q_2]} := \mu_{B(q_1, q_2)} \quad \text{and} \quad \Sigma_{[q_1, q_2]} := \Sigma_{B(q_1, q_2)}. \]
We will also use Φ and φ to denote the distribution and density function of a standard univariate normal distribution, respectively.
2 The global balance

To split the whole population into three separate groups based on a notion of effectiveness, we need to make an assumption about the probability distribution of the whole population and about the given benchmark, which measures the effectiveness of each element of the population. We will assume that X ~ N(μ, Σ), i.e. the population can be described by an n-dimensional random vector X = (X_1, ..., X_n) which is normally distributed with mean vector μ and covariance matrix Σ. Furthermore, we will assume that the benchmark level is determined by the first coordinate, i.e. X_1. Please note that for a multivariate normal this may be a linear combination of all the other coordinates. One can look at the other coordinates as various factors which may influence the main benchmark. Note that if we talk about people's measures or abilities, then Gaussian distributions, often described as bell curves, are a natural choice.

We will seek two real numbers q_1, q_2 ∈ [0, 1] and the corresponding partition
\[ B(0, q_1), \quad B(q_1, 1 - q_2), \quad B(1 - q_2, 1), \]
which will admit some sort of equilibrium. In other words, we want to divide the whole population into three subgroups, corresponding to the lower 100q_1%, the middle 100(1 − q_1 − q_2)% and the upper 100q_2% of the population, where effectiveness is measured by the benchmark. To do so, let us give a definition of the equilibrium state, or global balance.

Definition 1. We will say that a global balance (or equilibrium state) is achieved in X if
\[ \Sigma_{[0, q_1]} = \Sigma_{[q_1, 1-q_2]} = \Sigma_{[1-q_2, 1]}, \tag{3} \]
for some q_1, q_2 ∈ [0, 1] such that q_1 < 1 − q_2.

Definition 1 seems to be very intuitive. Indeed, the equality of the conditional covariance matrices says that:

1. The dispersion measured by variance is the same in each subgroup for any coordinate X_i (for i = 1, 2, ..., n). In particular, the dispersion of the benchmark is the same everywhere.

2. The linear dependence structure, measured by the conditional correlation matrices, is the same in all three subgroups.
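Definition 1 can be illustrated numerically. The following sketch is an editorial Monte Carlo check, not part of the original paper; the correlation r = 0.6 and the sample size are arbitrary illustrative choices, and the split level q ≈ 0.198 is the value derived in Theorem 1 below. It samples a bivariate standard normal, splits it by the benchmark coordinate, and prints the three conditional covariance matrices entry by entry:

```python
import math, random

random.seed(1)
r = 0.6            # illustrative correlation between the benchmark X1 and X2
q = 0.198089616    # equilibrium level from Theorem 1
n = 300000

# Sample a standard bivariate normal with correlation r and sort by X1.
data = []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    data.append((z1, r * z1 + math.sqrt(1 - r * r) * z2))
data.sort()

def cov_matrix(points):
    # entries (var(X1), cov(X1, X2), var(X2)) of the empirical covariance matrix
    m = len(points)
    mx = sum(p[0] for p in points) / m
    my = sum(p[1] for p in points) / m
    v11 = sum((p[0] - mx) ** 2 for p in points) / m
    v12 = sum((p[0] - mx) * (p[1] - my) for p in points) / m
    v22 = sum((p[1] - my) ** 2 for p in points) / m
    return v11, v12, v22

k = int(q * n)
lower, middle, upper = data[:k], data[k:n - k], data[n - k:]
for group in (lower, middle, upper):
    print([round(v, 3) for v in cov_matrix(group)])
```

Up to Monte Carlo error, the three printed matrices coincide; for any other split ratio the tail and central matrices drift apart.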
The first property creates a natural equilibrium state, as any perturbation leads to irregularity when the squared distance from the average member of each group is considered. The choice of this measure of dispersion seems natural, because people's awareness of any differences should be high, as variance (or standard deviation) seems to be the simplest measure of variability. The second property relates to the linear dependence structure. The equality of the correlation matrices implies a natural equilibrium between groups, as people tend to notice the simplest (linear) dependencies first. Any shift between groups will cause dependence instability between them. In general (i.e. when we drop the assumption about normality) the global balance might not exist, or might strongly depend on the initial Σ, when we consider some family parametrised by covariance matrices.

3 The 20/60/20 principle

If X is multivariate normal, it is reasonable to set q_1 = q_2, due to the symmetry of the Gaussian density. For simplicity we will use q = q_1 = q_2 for the symmetric case. Thus, we will in fact seek q ∈ (0, 0.5) such that the conditional covariance matrix for the lower 100q% of the population coincides with the conditional covariance matrices of the middle 100(1 − 2q)% and the upper 100q%.

We are now ready to present the main result of this paper. We will show that if X ~ N(μ, Σ), then the equilibrium state is achieved for a unique q ∈ (0, 0.5). This is the statement of Theorem 1.

Theorem 1. Let X ~ N(μ, Σ). Then there exists a unique q ∈ (0, 0.5) such that the global balance in X is achieved, i.e. the equality (3) is true for q_1 = q_2 = q. Moreover, the value of q is independent of μ and Σ, and the approximate value of q is 0.198089616...

The proof of Theorem 1 is surprisingly simple. It is a direct consequence of Lemma 1 and Lemma 2, which we will now present and prove. Before we do this, let us give a comment on Theorem 1.
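The value of q in Theorem 1 can also be recovered by brute force, without any closed-form moment formulas: compute the truncated variances by numerical integration and bisect for the level at which the tail and central variances match. A minimal editorial sketch (the integration grid, the bracketing interval and the lower cutoff at -10 are arbitrary numerical choices):

```python
import math

def phi(t):  # standard normal density
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(t):  # standard normal distribution function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def Phi_inv(p):
    # quantile function by bisection (accurate enough for this sketch)
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def trunc_var(a, b, n=5000):
    # variance of N(0,1) truncated to (a, b), by midpoint-rule integration
    mass, h = Phi(b) - Phi(a), (b - a) / n
    ts = [a + (k + 0.5) * h for k in range(n)]
    m1 = sum(t * phi(t) for t in ts) * h / mass
    m2 = sum(t * t * phi(t) for t in ts) * h / mass
    return m2 - m1 * m1

def gap(qq):
    x = Phi_inv(qq)
    # tail variance minus central variance; negative for small qq, positive near 0.5
    return trunc_var(-10.0, x) - trunc_var(x, -x)

lo, hi = 0.05, 0.45
for _ in range(50):
    q = 0.5 * (lo + hi)
    if gap(q) < 0:
        lo = q
    else:
        hi = q
print(round(q, 6))
```

The bisection converges to q ≈ 0.198090, matching the value stated in Theorem 1.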
It says that if we split the whole population into three separate groups, then the ratio close to 20-60-20 (and in fact only this ratio) will imply the equality of the conditional covariance matrices of all groups, creating a natural equilibrium. To prove Theorem 1 we need an analytic formula for the conditional covariance structure, given any conditioning Borel set B of positive measure. This is the statement of Lemma 1.

Lemma 1. Let X ~ N(μ, Σ). Then for any Borel subset B of R with positive measure,
\[ \Sigma_B = \Sigma + \left(D^2[X_1 \mid X_1 \in B] - D^2[X_1]\right)\beta\beta^T, \]
where
\[ \beta^T = (\beta_1, \dots, \beta_n), \qquad \beta_i = \frac{\operatorname{Cov}[X_1, X_i]}{D^2[X_1]}. \]

Proof of Lemma 1. Being in the Gaussian world, we can describe each random variable X_i as a combination of the random variable X_1 and a random variable Y_i independent of X_1. Indeed, we put for i = 1, ..., n
\[ Y_i = X_i - \beta_i X_1, \qquad \text{where } \beta_i = \frac{\operatorname{Cov}[X_1, X_i]}{D^2[X_1]}. \tag{4} \]
Obviously β_1 = 1 and Y_1 = 0. Since for i = 2, ..., n the newly defined variable Y_i is uncorrelated with X_1, they are independent.

Next, we calculate the conditional covariance matrix. Using (4), we get for i, j = 1, ..., n
\[ \operatorname{Cov}[X_i, X_j \mid X_1 \in B] = \operatorname{Cov}[\beta_i X_1 + Y_i,\, \beta_j X_1 + Y_j \mid X_1 \in B]. \]
Since Y_i and Y_j do not depend on X_1, we get
\[ \operatorname{Cov}[Y_i, X_1 \mid X_1 \in B] = 0 = \operatorname{Cov}[Y_j, X_1 \mid X_1 \in B], \]
and
\[ \operatorname{Cov}[Y_i, Y_j \mid X_1 \in B] = \operatorname{Cov}[Y_i, Y_j] = \operatorname{Cov}[X_i, X_j] - \beta_i\beta_j D^2[X_1]. \]
Therefore, we obtain
\[ \operatorname{Cov}[X_i, X_j \mid X_1 \in B] = \operatorname{Cov}[X_i, X_j] + \beta_i\beta_j\left(D^2[X_1 \mid X_1 \in B] - D^2[X_1]\right). \]
Since β_iβ_j is the (i, j)-th entry of the n × n matrix ββ^T, this finishes the proof of the lemma.

From Lemma 1 we see that we can parametrise Σ_B in such a way that it only depends on the conditional variance of X_1. Thus, we only need to show that there exists q ∈ (0, 0.5) such that the (conditional) dispersion of X_1 in all three groups, determined by the sets B(0, q), B(q, 1−q) and B(1−q, 1), will coincide. This is the statement of Lemma 2.

Lemma 2. Let X_1 ~ N(μ_1, σ²_11). Then there exists a unique q ∈ (0, 0.5) such that
\[ D^2[X_1 \mid X_1 \in B(0, q)] = D^2[X_1 \mid X_1 \in B(q, 1-q)] = D^2[X_1 \mid X_1 \in B(1-q, 1)]. \]
Moreover, q = Φ(x), where x < 0 is the unique negative solution of the equation
\[ x\Phi(x) = -\phi(x)(1 - 2\Phi(x)), \tag{5} \]
where φ and Φ denote the density and distribution function of the standard normal, respectively. The approximate value of q is 0.198089616...

Proof of Lemma 2. Without any loss of generality we may assume that X_1 has the standard normal distribution N(0, 1). Indeed, for X_1^{st} = (X_1 − μ_1)/σ_11 and q_1, q_2 ∈ [0, 1] such that q_1 < q_2, we get
\[ D^2\!\left[X_1 \mid H_1(X_1) \in [q_1, q_2]\right] = D^2\!\left[\sigma_{11} X_1^{st} + \mu_1 \mid \Phi(X_1^{st}) \in [q_1, q_2]\right] = \sigma_{11}^2\, D^2\!\left[X_1^{st} \mid \Phi(X_1^{st}) \in [q_1, q_2]\right]. \]
To proceed, we need to compute the first two moments of the truncated normal distribution of X_1. For transparency, we will show full proofs (compare [10, Section 13.10.1]).

Let us calculate the conditional expectations E[X_1 | X_1 < x] and E[X_1 | x < X_1 < −x] for any fixed x ∈ (−∞, 0). Since φ′(x) = −xφ(x), we get
\[ E[X_1 \mid X_1 < x] = \frac{1}{\Phi(x)}\int_{-\infty}^{x} \xi\,\phi(\xi)\,d\xi = \frac{1}{\Phi(x)}\left(-\phi(\xi)\right)\Big|_{-\infty}^{x} = -\frac{\phi(x)}{\Phi(x)}, \]
\[ E[X_1 \mid x < X_1 < -x] = 0. \]
To get the corresponding second moments we integrate by parts:
\[ E[X_1^2 \mid X_1 < x] = \frac{1}{\Phi(x)}\int_{-\infty}^{x} \xi^2\phi(\xi)\,d\xi = \frac{1}{\Phi(x)}\left(-\xi\phi(\xi)\Big|_{-\infty}^{x} + \int_{-\infty}^{x}\phi(\xi)\,d\xi\right) = \frac{1}{\Phi(x)}\left(-x\phi(x) + \Phi(x)\right) = 1 - \frac{x\phi(x)}{\Phi(x)}, \]
\[ E[X_1^2 \mid x < X_1 < -x] = \frac{1}{1 - 2\Phi(x)}\int_{x}^{-x} \xi^2\phi(\xi)\,d\xi = \frac{1}{1 - 2\Phi(x)}\left(-\xi\phi(\xi)\Big|_{x}^{-x} + \int_{x}^{-x}\phi(\xi)\,d\xi\right) = \frac{1}{1 - 2\Phi(x)}\left(2x\phi(x) + 1 - 2\Phi(x)\right) = 1 + \frac{2x\phi(x)}{1 - 2\Phi(x)}. \]
Therefore,
\[ D^2[X_1 \mid X_1 < x] = 1 - \frac{x\phi(x)}{\Phi(x)} - \frac{\phi(x)^2}{\Phi(x)^2}, \]
\[ D^2[X_1 \mid x < X_1 < -x] = 1 + \frac{2x\phi(x)}{1 - 2\Phi(x)}. \]
Since the conditional expected value behaves like a weighted arithmetic mean, we get that E[X_1 | X_1 < x] is strictly increasing in x, while E[X_1² | x < X_1 < −x] and E[X_1² | X_1 < x] are strictly decreasing with respect to x. Consequently, the central conditional variance D²[X_1 | x < X_1 < −x] is strictly decreasing. Next, we will show that the tail conditional variance D²[X_1 | X_1 < x] is strictly increasing. Indeed,
\[ \frac{d}{dx} D^2[X_1 \mid X_1 < x] = -\frac{\phi(x)}{\Phi(x)} + x^2\frac{\phi(x)}{\Phi(x)} + 3x\frac{\phi(x)^2}{\Phi(x)^2} + 2\frac{\phi(x)^3}{\Phi(x)^3} = \frac{\phi(x)}{\Phi(x)}\left(\left(x + \frac{\phi(x)}{\Phi(x)}\right)\left(x + 2\frac{\phi(x)}{\Phi(x)}\right) - 1\right) > 0. \]
The last inequality states that the product of the two factors exceeds 1 for x < 0. Both factors are positive, since the Mills ratio bound Φ(x) < φ(x)/(−x) for x < 0 gives x + φ(x)/Φ(x) > 0; the lower bound on the product follows from sharper Mills ratio estimates, using also that φ(x)/Φ(x) = −E[X_1 | X_1 < x] is decreasing and positive, so that φ(x)²/Φ(x)² ≥ φ(0)²/Φ(0)² = 2/π.

Next, note that (compare [9, Lemma 8.1])
\[ \lim_{x\to-\infty} D^2[X_1 \mid X_1 < x] = 0 \quad\text{and}\quad D^2[X_1 \mid X_1 < 0] = 1 - \frac{2}{\pi}, \]
while
\[ \lim_{x\to-\infty} D^2[X_1 \mid x < X_1 < -x] = 1 \quad\text{and}\quad \lim_{x\to 0} D^2[X_1 \mid x < X_1 < -x] = 0. \]
Hence there exists a unique x < 0 such that
\[ D^2[X_1 \mid X_1 < x] = D^2[X_1 \mid x < X_1 < -x]. \]
See Figure 1 for a visualization.

Figure 1: The graph of the conditional tail variance D²[X_1 | X_1 ∈ B(0, q)] and the conditional central variance D²[X_1 | X_1 ∈ B(q, 1−q)] as functions of q ∈ (0, 0.5), under the assumption X_1 ~ N(0, 1).

Moreover,
\[ D^2[X_1 \mid X_1 < x] - D^2[X_1 \mid x < X_1 < -x] = -\frac{x\phi(x)}{\Phi(x)} - \frac{\phi(x)^2}{\Phi(x)^2} - \frac{2x\phi(x)}{1 - 2\Phi(x)} = \frac{\phi(x)}{\Phi(x)^2(1 - 2\Phi(x))}\left(-x\Phi(x) - \phi(x)(1 - 2\Phi(x))\right), \]
which shows that x is a (negative) solution of equation (5). Using basic numerical tools we checked that (5) is satisfied for x ≈ −0.8484646848, for which Φ(x) ≈ 0.198089615.

Theorem 1 provides an illustration of the empirical 20-60-20 rule. In particular, we have shown that for any multivariate normal vector this fixed ratio leads to a global equilibrium state, when dispersion and linear dependence measurement is considered. Nevertheless, please note that the equality of the conditional variances does not imply the equality of the conditional distributions, as can be seen in Figure 2. Also, while the linear dependence structure will be the same, the overall dependence in each subgroup, measured e.g. by the copula function [12], will be different.
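Equation (5) can be solved with a few lines of standard-library code; the sketch below (an editorial addition, with the bracketing interval chosen by inspecting the sign change of the rearranged equation) recovers the constants quoted in the proof:

```python
import math

def Phi(x):  # standard normal distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):  # standard normal density
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def f(x):
    # equation (5) rewritten as a root problem: x*Phi(x) + phi(x)*(1 - 2*Phi(x)) = 0
    return x * Phi(x) + phi(x) * (1.0 - 2.0 * Phi(x))

lo, hi = -3.0, -0.1   # f(lo) > 0 > f(hi), so the negative root is bracketed
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0.0:
        hi = mid
    else:
        lo = mid

x_star = 0.5 * (lo + hi)
print(round(x_star, 10), round(Phi(x_star), 9))
```

The printed pair agrees with x ≈ −0.8484646848 and Φ(x) ≈ 0.198089615 reported in the proof.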
Indeed, for example, it seems unwise to require the dependence structure in the best group to coincide with the dependence structure in the average group. See Figure 3 for an illustrative example.

Remark 1. The equilibrium level q calculated in Lemma 2 depends neither on μ nor on Σ. Therefore, if we consider correlation matrices instead of covariance matrices in (3), then the optimal value of q from Theorem 1 will also imply the corresponding equilibrium state for correlation matrices.¹

¹ Please note that we need the additional assumption that X_1 is not independent of (X_2, ..., X_n), as otherwise any q ∈ (0, 0.5) will satisfy (3) for correlation matrices instead of covariance matrices.

Figure 2: The conditional density functions of the lower 20%, middle 60% and upper 20% of the standard normal distribution. The conditional variances in all three cases coincide.

Figure 3: The conditional samples (upper row) and their conditional copula functions (lower row) from the bivariate normal with μ = (0, 0) and Σ = {σ_ij}, where σ_11 = σ_22 = 1 and σ_12 = σ_21 = 0.8. The conditioning is based on the first coordinate and relates to the lower 20%, middle 60% and upper 20% of the whole population.

Remark 2. The value ‖Σ_{[0,q]} − Σ_{[q,1−q]}‖, for q ≈ 0.198 and some arbitrary matrix norm (e.g. the Frobenius norm), might be used to test how far X is from a multivariate normal distribution. This test is particularly important, as it shows the impact of the tails on the central part of the distribution: for empirical data the dependence (correlation) structure in the tails usually increases significantly, revealing non-normality.
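Remark 2 suggests a simple normality diagnostic. The following editorial sketch (the bivariate setting, the correlation 0.5 and the Student t(3) alternative are illustrative assumptions, not taken from the paper) computes the Frobenius distance between the lower-20% and middle-60% covariance matrices for a normal sample, where it should vanish up to Monte Carlo error, and for a heavy-tailed elliptical t(3) sample, where it should not:

```python
import math, random

random.seed(5)
r, n, q = 0.5, 100000, 0.198

def cov2(points):
    m = len(points)
    mx = sum(p[0] for p in points) / m
    my = sum(p[1] for p in points) / m
    c11 = sum((p[0] - mx) ** 2 for p in points) / m
    c12 = sum((p[0] - mx) * (p[1] - my) for p in points) / m
    c22 = sum((p[1] - my) ** 2 for p in points) / m
    return c11, c12, c22

def stat(points):
    # Frobenius distance between the lower-20% and middle-60% covariance matrices
    points = sorted(points)
    k = int(q * len(points))
    a, b = cov2(points[:k]), cov2(points[k:len(points) - k])
    return math.sqrt((a[0] - b[0]) ** 2 + 2 * (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2)

def normal_pair():
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    return z1, r * z1 + math.sqrt(1 - r * r) * z2

def t3_pair():
    # elliptical bivariate Student t with 3 degrees of freedom (heavier tails)
    z1, z2 = normal_pair()
    w = math.sqrt((random.gauss(0, 1) ** 2 + random.gauss(0, 1) ** 2
                   + random.gauss(0, 1) ** 2) / 3.0)
    return z1 / w, z2 / w

s_norm = stat([normal_pair() for _ in range(n)])
s_t = stat([t3_pair() for _ in range(n)])
print(round(s_norm, 3), round(s_t, 3))
```

The statistic is close to zero in the Gaussian case and markedly larger for the heavy-tailed alternative, in line with the remark.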
Remark 3. We can also consider more than three states when clustering the population (e.g. having 5 states we might refer to them as critical, bad, normal, good and outstanding performance, based on the selected benchmark). The ratios which imply an equilibrium state (similar to the one from Definition 1) for 5 and 7 different states are close to 0.027/0.243/0.460/0.243/0.027 and 0.004/0.058/0.246/0.384/0.246/0.058/0.004, respectively. Those values can easily be computed using the results of Lemma 1 and Lemma 2.

4 Equilibrium for monotonic dependence

In the definition of the equilibrium state (Definition 1) we have in fact measured the distance between conditional covariance matrices to compare the variability and the linear dependence structure between the groups. As explained in Remark 1, one could use conditional correlation matrices instead of covariance matrices and focus on the comparison of the linear dependence structure. Of course there are also other measures of dependence which could be used to reformulate Definition 1. Among the most popular ones are the so-called measures of concordance, where Kendall τ and Spearman ρ are the usually picked representatives in the two-dimensional case (see [12, Section 5] for more details). Instead of measuring the linear dependence, they focus on the monotone dependence, being invariant to any strictly monotone transform of a random variable (note that correlation is only invariant w.r.t. positive linear transformations).

Thus, instead of the covariance matrices Σ_{[0,q]}, Σ_{[q,1−q]} and Σ_{[1−q,1]} in (3) we can consider the corresponding matrices of conditional Kendall τ and conditional Spearman ρ, denoted by Σ^τ_{[0,q]}, Σ^τ_{[q,1−q]}, Σ^τ_{[1−q,1]} and Σ^ρ_{[0,q]}, Σ^ρ_{[q,1−q]}, Σ^ρ_{[1−q,1]}, respectively. For comparison, we will also consider conditional correlation matrices, for which we shall use the notation Σ^r_{[0,q]}, Σ^r_{[q,1−q]} and Σ^r_{[1−q,1]}.

Unfortunately, the analogue of Theorem 1 is not true if we substitute the covariance matrices with the Spearman ρ or Kendall τ matrices in Definition 1. Because of that we need a different kind of notation for the equilibrium state, as stated in Definition 2.

Definition 2. Let us assume that X is symmetric² and let κ ∈ {r, ρ, τ}³. We will say that a quasi-global balance (or quasi-equilibrium state) is achieved in X for κ and q̂ ∈ [0, 1] if
\[ \|\Sigma^{\kappa}_{[0,\hat q]} - \Sigma^{\kappa}_{[\hat q,1-\hat q]}\|_F = \inf_{q\in(0,0.5)} \|\Sigma^{\kappa}_{[0,q]} - \Sigma^{\kappa}_{[q,1-q]}\|_F, \tag{6} \]
where ‖·‖_F is the standard Frobenius matrix norm given by
\[ \|A\|_F := \sqrt{\operatorname{tr} A A^T} = \sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n} |a_{ij}|^2}, \]
for any n-dimensional matrix A = {a_ij}_{i,j=1,...,n}. Similarly as in Definition 1, we will say that a global balance (or equilibrium state) is achieved in X for κ and q̂ ∈ [0, 1] if the value in (6) is equal to 0.

For transparency, we will write
\[ \hat q^{\,r} = \operatorname*{arg\,min}_{q\in(0,0.5)} \|\Sigma^{r}_{[0,q]} - \Sigma^{r}_{[q,1-q]}\|_F, \tag{7} \]
\[ \hat q^{\,\tau} = \operatorname*{arg\,min}_{q\in(0,0.5)} \|\Sigma^{\tau}_{[0,q]} - \Sigma^{\tau}_{[q,1-q]}\|_F, \tag{8} \]
\[ \hat q^{\,\rho} = \operatorname*{arg\,min}_{q\in(0,0.5)} \|\Sigma^{\rho}_{[0,q]} - \Sigma^{\rho}_{[q,1-q]}\|_F, \tag{9} \]
to denote the ratios which imply the quasi-equilibrium states given in (6).⁴

As expected, for X ~ N(μ, Σ), the values q̂^τ and q̂^ρ also seem to be very close to 0.2 for almost any value of μ and Σ. To illustrate this property, we have picked 1000 random covariance matrices {Σ^i}_{i=1}^{1000} for n = 4⁵ and computed the values of the functions
\[ f^i_r(q) = \|(\Sigma^i)^{r}_{[0,q]} - (\Sigma^i)^{r}_{[q,1-q]}\|_F, \tag{10} \]
\[ f^i_\tau(q) = \|(\Sigma^i)^{\tau}_{[0,q]} - (\Sigma^i)^{\tau}_{[q,1-q]}\|_F, \tag{11} \]
\[ f^i_\rho(q) = \|(\Sigma^i)^{\rho}_{[0,q]} - (\Sigma^i)^{\rho}_{[q,1-q]}\|_F. \tag{12} \]
To do so, for each i ∈ {1, 2, ..., 1000} we have taken a 1,000,000 Monte Carlo sample from X ~ N(0, Σ^i) and computed the values of (10), (11) and (12) using MC estimates of the corresponding conditional matrices. The graphs of f^i_r, f^i_τ and f^i_ρ for i = 1, 2, ..., 50 are presented in Figure 4. In Figure 5 we also present the smoothed histogram function of the points {q̂^r_i}_{i=1}^{1000}, {q̂^τ_i}_{i=1}^{1000} and {q̂^ρ_i}_{i=1}^{1000}, at which the minimum is attained in (10), (11) and (12) for i = 1, 2, ..., 1000.

² i.e. X is symmetric w.r.t. E[X] = (E[X_1], ..., E[X_n]); note that this implies Σ_{[0,q]} = Σ_{[1−q,1]} for any q ∈ (0, 0.5).
³ This will relate to the conditional correlation matrices, Spearman ρ matrices or Kendall τ matrices, respectively.
⁴ For simplicity, we use argmin and assume that the (quasi-)equilibrium state exists and is unique.
⁵ With the additional assumption that the correlation coefficients are bigger than 0.2 and smaller than 0.8, to avoid computational problems resulting from independence or comonotonicity, respectively (see also Remark 1). Note also that the sign of a correlation coefficient is irrelevant, due to the symmetry of X, so without loss of generality we can assume that the correlation matrix is positive. Moreover, the values of q̂^τ and q̂^ρ are invariant w.r.t. μ, so we can set μ = 0 without loss of generality.

Figure 4: The graphs of the functions f^i_r, f^i_τ and f^i_ρ for i = 1, 2, ..., 50, computed using a 1,000,000 sample from N(0, Σ^i) and the corresponding estimates of the conditional matrices.

Figure 5: Monte Carlo density functions constructed using the points {q̂^r_i}_{i=1}^{1000}, {q̂^τ_i}_{i=1}^{1000} and {q̂^ρ_i}_{i=1}^{1000}. For each i = 1, 2, ..., 1000 a 1,000,000 sample from N(0, Σ^i) was simulated and the corresponding estimates of the conditional matrices were used for the computations.

Unfortunately, in general the values q̂^τ and q̂^ρ defined in (8) and (9) are not constant and independent of Σ. In particular, if the dependence inside X is very strong, e.g. the vector (X_1, X_2, ..., X_n) is almost comonotone, then the values of q̂^τ and q̂^ρ might increase substantially.⁶ To illustrate this property, let us present some theoretical results involving conditional Spearman ρ and Kendall τ. For simplicity, till the end of this subsection we will assume that n = 2. Then, given X ~ N(μ, Σ), we know that σ²_12 = σ²_21 = rσ_11σ_22, where r ∈ [−1, 1] is the correlation between X_1 and X_2. It is easy to show (see [9]) that both the unconditional and the conditional values of Spearman ρ as well as Kendall τ depend only on the copula of X⁷, which is parametrised by the correlation coefficient.

⁶ Note that in our numerical example we have assumed that the correlation for any pair is between 0.2 and 0.8, excluding the extremal cases.
⁷ Note that the (conditional) Spearman ρ and Kendall τ are invariant to any monotone transform of X_1 or X_2, and so is the copula function.
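The Monte Carlo experiment above can be reproduced in miniature with standard-library code. The sketch below is an editorial illustration for a single bivariate case (the correlation r = 0.5, the grid and the sample size are arbitrary choices): it scans split levels q and reports the one at which the conditional Spearman ρ of the lower group and of the central group are closest, a scalar analogue of (9):

```python
import math, random

random.seed(3)
r, n = 0.5, 120000

# Standard bivariate normal sample, sorted by the first (benchmark) coordinate.
pts = []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    pts.append((z1, r * z1 + math.sqrt(1 - r * r) * z2))
pts.sort()

def spearman(pairs):
    # Spearman rho = Pearson correlation of the ranks (no ties in continuous data)
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        rk = [0.0] * len(v)
        for pos, i in enumerate(order):
            rk[i] = pos + 1.0
        return rk
    a = ranks([p[0] for p in pairs])
    b = ranks([p[1] for p in pairs])
    m = len(a)
    ma = (m + 1) / 2.0
    num = sum((u - ma) * (v - ma) for u, v in zip(a, b))
    den = sum((u - ma) ** 2 for u in a)  # rank variances of a and b are equal
    return num / den

best_q, best_gap = None, float("inf")
for k in range(14, 31, 2):              # q grid: 0.14, 0.16, ..., 0.30
    qq = k / 100.0
    cut = int(qq * n)
    gap = abs(spearman(pts[:cut]) - spearman(pts[cut:n - cut]))
    if gap < best_gap:
        best_q, best_gap = qq, gap
print(best_q)
```

For moderate correlations the reported minimiser lands close to 0.2, consistent with Figure 5; for nearly comonotone vectors it would drift upwards, as discussed above.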
Thus, without loss of generality, instead of considering all μ and Σ, we might assume that
\[ X = (X_1, X_2) \sim \mathcal{N}(\mu, \Sigma), \qquad \text{where } \mu = (0, 0) \text{ and } \Sigma = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \]
for a fixed r ∈ [−1, 1]. Let ρ_{[p,q]}(r) and τ_{[p,q]}(r) denote the corresponding conditional Spearman ρ and Kendall τ, given the truncation interval B(p, q). Note that ρ_{[p,q]}(r) and τ_{[p,q]}(r) are odd functions of r.

Lemma 3. For all 0 ≤ p < q ≤ 1 and r ∈ (−1, 1),
\[ \rho_{[p,q]}(-r) = -\rho_{[p,q]}(r) \quad\text{and}\quad \tau_{[p,q]}(-r) = -\tau_{[p,q]}(r). \]

Proof. Before we begin the proof, let us recall some basic facts from copula theory (cf. [12] and references therein). We will use C^r to denote the Gaussian copula with parameter r ∈ (−1, 1), which coincides with the correlation coefficient. Noting that a copula can be seen as a distribution function (with uniform margins), let us assume that (U, V) is a random vector with distribution C^r. We will denote by C^r_{[p,q]} the copula of the conditional distribution of (U, V) under the condition U ∈ [p, q], where 0 ≤ p < q ≤ 1. Due to Sklar's Theorem we get the following description of C^r_{[p,q]}:
\[ C^r_{[p,q]}\!\left(u, \frac{C^r(q, v) - C^r(p, v)}{q - p}\right) = \frac{C^r((q - p)u + p, v) - C^r(p, v)}{q - p}, \qquad u, v \in [0, 1]. \tag{13} \]
Next, it is easy to notice that the distribution function of (U, 1 − V) is equal to C^{−r}. Hence the Gaussian copulas commute with flipping, i.e.
\[ C^{-r}(u, v) = u - C^r(u, 1 - v) \qquad \text{for } u, v \in [0, 1]. \]
On the other hand, the flipping transforms the conditional distribution (U, V)|_{U∈[p,q]} to (U, 1 − V)|_{U∈[p,q]}. Hence we get
\[ C^{-r}_{[p,q]}(u, v) = u - C^r_{[p,q]}(u, 1 - v). \]
Thus, basing on [12, Theorem 5.1.9], we conclude that
\[ \rho_{[p,q]}(-r) = -\rho_{[p,q]}(r) \quad\text{and}\quad \tau_{[p,q]}(-r) = -\tau_{[p,q]}(r). \]

We recall that the Spearman ρ and Kendall τ of the conditional copula C^r_{[p,q]} are given by the formulas:
\[ \rho_{[p,q]}(r) = \rho(C^r_{[p,q]}) = -3 + 12\int_0^1\!\!\int_0^1 C^r_{[p,q]}(u, v)\,du\,dv, \tag{14} \]
\[ \tau_{[p,q]}(r) = \tau(C^r_{[p,q]}) = -1 + 4\iint_{[0,1]^2} C^r_{[p,q]}(u, v)\,dC^r_{[p,q]}(u, v). \tag{15} \]
To describe their behaviour for small r we will need their Taylor expansions with respect to r.

Proposition 1. For fixed p, q ∈ (0, 1) (p < q) and r ∈ (−1, 1) such that r is close to 0, we get
\[ \rho_{[p,q]}(r) = \frac{3r}{(q - p)^2\pi}\left(\Phi(\sqrt{2}\,x_2) - \Phi(\sqrt{2}\,x_1) - (q - p)\sqrt{\pi}\,(\phi(x_1) + \phi(x_2))\right) + O(r^3), \tag{16} \]
\[ \tau_{[p,q]}(r) = \tfrac{2}{3}\,\rho_{[p,q]}(r) + O(r^3), \tag{17} \]
where x_1 = Φ^{-1}(p) and x_2 = Φ^{-1}(q).

Proof. We will use notation similar to the one introduced in Lemma 3. The proof is based on two facts. First, for r = 0 both C^r and C^r_{[p,q]} are equal to the product copula Π(u, v) := uv, i.e.
\[ C^0(u, v) = uv = C^0_{[p,q]}(u, v). \]
Second, the derivative of the distribution function of a bivariate Gaussian distribution having standardised margins with respect to the parameter r is equal to its density, which implies
\[ \frac{\partial C^r(u, v)}{\partial r} = \frac{1}{2\pi\sqrt{1 - r^2}}\exp\left(-\frac{\Phi^{-1}(u)^2 + \Phi^{-1}(v)^2 - 2r\,\Phi^{-1}(u)\Phi^{-1}(v)}{2(1 - r^2)}\right). \]
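The first-order term in (16) can be checked against simulation. The following editorial sketch (the choices r = 0.1, [p, q] = [0.2, 0.8] and the sample size are arbitrary) estimates the conditional Spearman ρ of the middle 60% by Monte Carlo and compares it with the Taylor approximation:

```python
import math, random

random.seed(7)
r, n = 0.1, 200000
p, q = 0.2, 0.8
x1, x2 = -0.8416212336, 0.8416212336    # Phi^{-1}(0.2) and Phi^{-1}(0.8)

# Conditional sample from the bivariate Gaussian copula model, X1 in the middle 60%.
xs, ys = [], []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    if x1 < z1 < x2:
        xs.append(z1)
        ys.append(r * z1 + math.sqrt(1 - r * r) * z2)

def ranks(v):
    order = sorted(range(len(v)), key=lambda i: v[i])
    rk = [0.0] * len(v)
    for pos, i in enumerate(order):
        rk[i] = pos + 1.0
    return rk

def pearson(a, b):
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    ca = [t - ma for t in a]
    cb = [t - mb for t in b]
    num = sum(u * v for u, v in zip(ca, cb))
    den = math.sqrt(sum(u * u for u in ca) * sum(v * v for v in cb))
    return num / den

rho_mc = pearson(ranks(xs), ranks(ys))   # conditional Spearman rho estimate

def Phi(t): return 0.5 * (1 + math.erf(t / math.sqrt(2)))
def phi(t): return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

# First-order term of equation (16)
bracket = Phi(math.sqrt(2) * x2) - Phi(math.sqrt(2) * x1) \
          - (q - p) * math.sqrt(math.pi) * (phi(x1) + phi(x2))
rho_taylor = 3 * r / ((q - p) ** 2 * math.pi) * bracket
print(round(rho_mc, 3), round(rho_taylor, 3))
```

As a sanity check of the formula, letting p → 0 and q → 1 in (16) gives ρ ≈ 3r/π, which matches the small-r expansion of the classical unconditional identity ρ = (6/π) arcsin(r/2) for the Gaussian copula.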
