Random matrix approach to the dynamics of stock inventory variations Wei-Xing Zhou,1,2,3,∗ Guo-Hua Mu,1,2,3 and J´anos Kert´esz4,5,† 1School of Business, East China University of Science and Technology, Shanghai 200237, China 2School of Science, East China University of Science and Technology, Shanghai 200237, China 3Research Center for Econophysics, East China University of Science and Technology, Shanghai 200237, China 4Department of Theoretical Physics, Budapest University of Technology and Economics, Budapest, Hungary 5Laboratory of Computational Engineering, Helsinki University of Technology, Espoo, Finland (Dated: January 4, 2012) Investors trade stocks based on diverse strategies trying to beat the market and gain access returns, whose stock inventories change accordingly. The dynamics of inventory variations thus containrichinformationaboutthetradingbehaviorsofinvestorsandhavecrucialinfluenceonprice fluctuations. We study the cross-correlation matrix Cij of inventory variations of the most active 2 individualandinstitutionalinvestorsinanemergingmarkettounderstandthedynamicsofinventory 1 variations. Wefindthatthedistributionofcross-correlation coefficient Cij hasapower-lawform in 0 thebulkfollowedbyexponentialtailsandtherearemorepositivecoefficientsthannegativeones. In 2 addition,itismorepossiblethattwoindividualsortwoinstitutionshavestrongerinventoryvariation n correlation thanoneindividualandoneinstitution. Wefindthatthelargest andthesecondlargest a eigenvalues(λ andλ )ofthecorrelation matrixcannotbeexplainedbytherandommatrixtheory 1 2 J andtheprojection ofinventoryvariationsonthefirsteigenvectoru(λ )arelinearly correlated with 1 2 stock returns, where individual investors play a dominating role. The investors are classified into three categories based on the cross-correlation coefficients CVR between inventory variations and ] stock returns. Half individuals are reversing investors who exhibit evident buy and sell herding T behaviors,while6%individualsaretrendinginvestors. Forinstitutions,only10% and8%investors S are trending and reversing investors. A strong Granger causality is unveiled from stock returns to . inventoryvariations, which means that a large proportion of individuals hold the reversing trading n strategy and a small part of individuals hold the trending strategy. Comparing with the case of i f Spanish market, Chinese investors exhibit common and market-specific behaviors. Our empirical - q findings have scientific significance in the understanding of investors’ trading behaviors and in the [ construction of agent-based models for stock markets. 1 PACSnumbers: 89.65.Gh,89.75.Da,02.10.Yn,05.45.Tp v 3 3 I. INTRODUCTION Exchangeandfoundthatthereweremorereversingfirms 4 thantrending firms [5]. They also found that the largest 0 . Stock markets are complex systems, whose elements eigenvalue of the correlation matrix of inventory varia- 1 are heterogenous individual and institutional investors tions cannot be explained by the random matrix theory 0 interacting with each other by stock exchanges [1–4]. and its eigenvector contains information of stock price 2 1 Stock price fluctuates due to investors’ trading activi- fluctuations. Both buying and selling herding behaviors : ties and the cross-sectional relation between investors’ have been observed for trending and reversing firms. v stock inventory variations and stock returns have at- Inthiswork,weperformasimilaranalysisasinRef.[5] i X tractedmuchattention [5]. The huge literaturefalls into based on the trading records of Chinese investors in the r three groups to study the relation between past returns Shenzhen Stock Exchange. Different from the Spanish a and inventory variations, to investigate the contempo- case, our data set contains both individual and institu- raneous relation between inventory variations and stock tional investors, which allows us to observe interesting returns,andtoanalysisreturnpredictabilityofinventory investorbehaviors. Ouranalysisstartsfromtheperspec- variations[6]. Themainfindingsarethatinstitutionsare tiveofrandommatrixtheory,whichhasbeenextensively trendinginvestorsadoptingthemomentumtradingstrat- used to investigate the cross-correlations of financial re- egy [6–8], while individuals are reversing investors who turns in different stock markets [12–14]. However, very buy previous losers and sell previous winners [6, 8–11], few studies have been conducted on the Chinese stocks and stock returns lead inventory variations but not vice [15] and, to our knowledge,there is no researchreported versa [5–8]. on the dynamics of inventory variations of Chinese in- However, there is evidence showing different trading vestors. Alternatively,therearestudies onChinese equi- patterns. Lilloetalinvestigatedthetradingbehaviorsof ties atthe transactionandtraderlevelfromthe complex about 80 firms that were members of the Spanish Stock network perspective [16–19]. This paper is organized as follows. Section II de- scribes the data and the method to construct the time ∗ [email protected] seriesofinvestors’inventoryvariations. SectionIII stud- † [email protected] ies the statistical properties of the elements, eigenvalues 2 and eigenvectors of the correlation matrix of inventory stocks into one sample. We find that the mean value variations. Section IV investigates the contemporaneous is C = 0.02 for the real data and 0 for the shuf- ij h i andlaggedcross-correlationbetweeninventors’inventory fled data. When the types of investors are taken into variationsandstockreturnstodivideinvestorsintothree account, the mean value of the cross-correlation coeffi- categories and their herding behaviors. Section V sum- cients is C = 0.048 (shuffled: 0.001) for both i and ij h i − marizes our findings. j being individuals, C = 0.014 (shuffled: 0) for both ij h i i and j being institutions, and C = 0.008 (shuffled: ij h i − 0.001)for i being individual and j being institution. − II. DATA Figure 1 plots the daily returns of stock 000001 and the sliding average values of the correlation coefficients We analyze 39 stocks actively traded on the Shenzhen Cij for comparison. We observethatlargevalues Cij h i h i Stock Exchange in 2003. The data base contains all the appear during periods of large price fluctuations by and information needed for the analysis in this work. For large,which is reminiscentofthe similar resultfor cross- each transaction i of a given stock, the data record the correlations of financial returns [14]. However, the short identities of the buyer and seller, the types (individual timeperiodofourdatasampledoesnotallowustoreach or institution) of the two traders, the price p and the a decisive conclusion. There are also less volatile time i size qi of the trade, and the time stamp. Therefore, the periods with large hCiji. The situation is quite similar tradinghistoryofeachinvestorisknown. Foreachstock, for other stocks. we identify active traders who had more than 150 trans- actions, amount to about three transactions per week. 0.3 <C> If the number of active traders of a stock is less than ij R 120,we exclude it fromanalysis. Inthis way,we have15 0.2 stocks for analysis. Following Ref. [5], we investigate the dynamics of the i inventory variation of the most active investors who ex- ij C 0.1 ecuted more than 120 transactions for each stock. Al- h , though the trading period of each day consists of call R auction and continuous auction, their behaviors are dif- 0 ferentinmanyaspectsandareusuallystudiedseparately [20, 21]. We stress that all the transactions in both call auction and continuous auction are included in our in- −0.1 vestigation. The daily inventory variation of an investor 0 50 100 150 200 250 t i trading a given stock on day t is defined as follows + − FIG. 1. (Color online) Evolution of the 5-day average cross- vi(t)= pi(t)qi(t) pi(t)qi(t), (1) correlation coefficient hCiji and thedaily return R. − X X where +p (t)q (t) is the total buy quantity on trad- Figure 2(a) shows the empirical probability distribu- i i ing dayPt and −pi(t)qi(t) is the total sell quantity in tionsofCij whichiscalculatedusingdailyinventoryvari- the same day. PThe basic statistics of the 80 most active ation. Thefourcurveswithdifferentmarkerscorrespond traders and the resultant inventory variations are given to Cij, Cind,ind, Cind,ins, and Cins,ins, respectively. It is in Table I. found that most coefficients are small and the tails are exponentials: III. STATISTICS OF CORRELATION MATRIX e−λ−C, 0.6<C 0.1 P(C) − ≤− (3) BETWEEN TWO TIME SERIES OF INVENTORY ∝(cid:26) e−λ+C, 0.1<C 0.6 ≤ VARIATIONS where λ =8.8 0.2 and λ =11.1 0.3 for C , λ = + − ij + ± ± A. Distributions of cross-correlation coefficients 8.9 0.2andλ− =12.5 0.4forCind,ind,λ+ =10.8 0.4 ± ± ± and λ =10.5 0.4 for C , and λ =6.6 0.5 and − ind,ins + ± ± λ =8.7 0.5forC ,respectively. Wefindthatthere TheempiricalcorrelationmatrixC isconstructedfrom − ± ins,ins aremorepositivecross-correlationcoefficients(λ <λ ) the time series of inventory variationv (t) of the investi- + − i when both investors are individuals or institutions. In gated stock, defined as contrast, the distribution is symmetric (λ λ ) when + − ≈ (v v )(v v ) one investor is an individual while the other is an insti- i i j j C = h −h i −h i i. (2) ij tution. This finding implies that herding behaviors are σ σ i j morelike to occur amongthe same type of investorsand Since the results for individual stocks are quantitatively individuals have larger probability to herd than institu- similar,weputthecross-correlationcoefficientsofthe15 tions. Weshuffletheoriginaltimeseriesandperformthe 3 TABLE I. Basic statistics of the investigated stocks. The first column is the stock code, which is the unique identity of each stock. Thesecondandthirdcolumnspresentsinvestor-averagedtotalinventoryvariationhPtviiandaverageabsolutevariation hh|v|itii. The fourth to eighth columns gives the number of investors N, the number of trending investors Ntr, the number of reversing investors Nre, the number of uncategorized investors Nun, and the slope of the factor versus stock return k. The variables in the ninth to thirteenth columns are the same as in the five “all investors” columns but for individual investors and the last five columns are for institutional investors. Each value in the last row gives the sum of the numbers in the same column. All investors Individuals Institutions Code hPtvii hh|v|itii N Ntr Nre Nun k Nind Nitnrd Nirned Niunnd kind Nins Nitnrs Nirnes Niunns kins 000001 −2.99×106 1.12×105 80 7 41 32 0.83 61 4 39 18 0.83 19 3 2 14 0.04 000002 1.01×106 1.46×105 80 6 29 45 0.49 42 2 26 14 0.52 38 4 3 31 0.06 000012 −1.64×106 7.88×104 81 5 20 56 0.19 78 5 20 53 0.18 3 0 0 3 0.06 000021 5.09×105 4.35×104 81 2 43 36 0.78 64 2 43 20 0.78 17 2 0 16 0.04 000063 1.46×107 2.46×105 80 9 20 51 0.13 20 2 13 5 0.71 60 7 7 46 0.05 000488 1.35×106 8.92×104 80 5 5 70 0.08 69 2 5 63 0.06 11 4 0 7 0.06 000550 4.20×106 7.53×104 81 2 37 42 0.18 45 0 35 10 0.18 36 2 2 32 0.06 000625 1.87×106 1.22×105 80 10 26 44 0.71 62 8 24 30 0.69 18 2 2 14 0.05 000800 2.53×106 2.56×105 80 6 19 55 0.30 31 2 17 13 0.31 49 5 2 42 0.06 000825 6.33×106 9.38×104 80 5 38 37 0.69 50 3 37 10 0.70 30 2 2 27 0.05 000839 8.13×105 6.67×104 80 2 40 38 0.84 60 2 38 20 0.84 20 0 2 18 0.04 000858 1.75×106 1.66×105 82 4 21 57 0.33 31 0 19 12 0.67 51 4 2 45 0.05 000898 6.26×106 1.20×105 83 6 32 45 0.79 46 0 31 15 0.80 37 6 2 30 0.04 200488 3.59×106 5.05×104 80 9 30 41 0.65 53 7 25 21 0.66 27 2 5 20 0.05 200625 6.27×106 1.26×105 83 9 14 60 0.50 46 5 9 32 0.58 37 4 5 28 0.05 P - - 1211 87 415 709 - 758 44 381 336 - 453 47 36 373 - same analysis. The resulting distributions collapse onto It is natural that we are more interested in large a single curve, which has an exponential form cross-correlations. The preceding discussions focus on the cross-correlations not larger than 0.6. As shown in P(C)=λshufe−λshufC (4) Fig.2(a),therearepairsofinventoryvariationtimeseries thathaveverylargecross-correlationsthatlooklikeout- where the parameter λshuf = 23.3 is determined us- liers. To have a better visibility, we plot in Fig. 2(c) the ing robust regression [22, 23]. It is not surprising that numbers of occurrences of positive and negative cross- realdata havehighercross-correlationsthanthe shuffled correlations in 10 nonoverlapping intervals for the four data,whichisconfirmedbyλ± <λshuf. Thisexponential types of pairs. It is shown that N(C > 0) > N(C < 0) distribution is different from the Gaussian distribution in all intervals for C = C , C and C . In ij ind,ind ind,ins for the shuffled data of financial returns [14]. contrast, N(Cind,ins > 0) < N(Cind,ins < 0) when Figure 2(b) plots the distributions in double logarith- Cind,ins<0.6andN(Cind,ins>0)>N(Cind,ins<0) mic coordinateswherethe negativepartsarereflectedto whenCind,ins>0.6. Hence,forlargercross-correlations the right with respect to Cij = 0. Nice power laws are (C > 0.6), there are much more occurrences of positive observed spanning over three orders of magnitude: cross-correlationsthannegativeonesforallthefourtypes of pairs. This striking feature can be attributed to two ( C)−γ−, 0.01<C 10−5 reasons. The first is that a large proportion of investors P(C)∝(cid:26) −C−γ+, −10−5 <C≤−0.01 (5) react to the same external news in the same direction ≤ [5]: they buy following good news and sell following bad where γ+ = 0.69 0.01 and γ− = 0.69 0.01 for Cij, news. The second is that investors imitate the trading ± ± γ+ = 0.67 0.02 and γ− = 0.62 0.02 for Cind,ind, behaviors of others of the same type and rarely imitate ± ± γ+ = 0.67 0.02 and γ− = 0.70 0.02 for Cind,ins, and otherinvestorsofdifferenttype. Thesecondreasonisra- ± ± γ+ =0.72 0.02andγ− =0.73 0.02forCins,ins,respec- tional because the friends of individual (or institutional) ± ± tively. It is found that γ− γ+ and all the power-law investors are more likely individual (or institutional) in- ≈ exponents are close to each other. An intriguing feature vestors. is that the distributions of C , C and C ex- ij ind,ind ind,ins hibit an evident bimodal behavior, which is reminiscent ofthedistributionsofwaitingtimesandintereventtimes B. Eigenvalue spectrum of human short message communication [24]. Certainly, the underlying mechanisms are different and the factors causingthebimodaldistributionofthecross-correlations For the correlation matrix C of each stock, we can are unclear. calculate its eigenvalues, whose density f (λ) is defined c 4 102 [12, 14, 25], all (a) ind−ind 101 Q (λ λ)(λ λ ) ind−ins max min f (λ)= − − , (7) ins−ins c 2πσ2p λ ) 100 Cij with λ [λ ,λ ], where λmax is given by P( 10−1 ∈ min max min λmax =σ2(1+1/Q 2 1/Q) , (8) 10−2 min ± p and σ2 is equal to the variance of the elements of M 10−−31 −0.5 0 0.5 1 [12, 25]. The variance σ2 is equal to 1 in our normalized C ij data. 104 (b) 1.2 102 16 1 14 C)ij100 12 P( 10−2 0.8 10 all ) λ 8 λ 10−4 ind−ind (0.6 6 ind−ins c ins−ins f 4 101−60 −6 10−5 10−4 10−3 10−2 10−1 100 0.4 2 C 0 ij 0.2 5 10 15 5 Dataset (c) all: + all: − 0 4 ind−ind: + 0 2 4 6 8 10 ] ind−ind: − C)ij iinndd−−iinnss:: +− λ N( 3 ins−ins: + + ins−ins: − 1 FIG.3. (Coloronline)Eigenvaluespectrumofthecorrelation [102 matrixofinventoryvariationofinvestorstradingstock000001 g o within 1daytimehorizon in2003. Thesolid lineisthespec- l 1 tral density obtained by shuffling independently the buyers andthesellersinsuchawaytomaintainthesamenumberof 00 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 purchasesandsales foreachinvestorasintherealdata. The |C | dashed blue line shows the spectral density predicted by the ij randommatrixtheoryusingEq.(7)withQ=237/80=2.96. The inset shows the largest eigenvalue λ ((cid:13)) and the sec- FIG. 2. (Color online) Empirical distributions of the cross- 1 ond largest eigenvalue λ ((cid:3)) of all 15 investigated data sets correlationcoefficientsforallthe15stocks. (a)Log-linearplot 2 from 15 stocks. The solid line indicates the upperthresholds of P(Cij) for cross-correlations between any two investors, by shuffling experiments, and the dashed line presents the any two individuals, any one individual and one institution, threshold predicted bythe random matrix theory. andanytwoinstitutions,whichshowsexponentialformswhen 0.1<|C|≤0.6. The dashed line corresponds to the result of shuffleddata. (b)Log-logplotofP(Cij),whichshowspower- Figure3illustratestheprobabilitydistributionfc(λ)of law forms when 10−5 < |C| ≤ 0.01. (c) Comparison of oc- the correlationmatrix of inventory variation of investors currence numbersof positive and negative cross-correlations. trading stock 000001. The solid line is the spectral den- Theordinategiveslog10[1+N(Cij)]ratherthanlog10[N(Cij)] sity obtained by shuffling independently the buyers and for better presentation. thesellersinsuchawaytomaintainthe samenumberof purchases and sales for each investor as in the real data, whilethedashedbluelineshowsthespectraldensitypre- as follows [12], dicted by the random matrix theory using Eq. (7) with 1 dn(λ) Q=237/80=2.96. Wefindthelargesteigenvalueiswell fc(λ)= , (6) outsideofthebulkandthesecondlargesteigenvaluealso N dλ escapesthebulk. Theresultsforother14stocksarequite where n(λ) is the number of eigenvalues of C less than similar. In the inset of Fig. 3, we plot the largest eigen- λ. If M is a T N random matrix with zero mean valuesλ andthesecondlargesteigenvaluesλ forallthe 1 2 × and unit variance, f (λ) is self-averaging. Particularly, 15stocks. Wefindthatallthelargesteigenvaluesarewell c in the limit N , T and Q = T/N 1 fixed, abovetheupperthresholdsdeterminedfromshufflingex- → ∞ → ∞ ≥ the probabilitydensity function f (λ) of eigenvaluesλof periments and the thresholds λ in Eq. (8) predicted c max the random correlation matrix M can be described as by the random matrix theory. Moreover, all the second 5 largesteigenvaluesare abovethe twothresholdlines and inFig.4showthe empiricaldistributions ofthe eigenvec- some of them are well above the thresholds. These find- tor components u with the eigenvalues in the bulk for ingsindicatesthatboththelargestandthesecondlargest two typical stocks. Rather than analyzing the vector for eigenvalues carryinformationabout the investors,which one eigenvector, we normalized the components of each is different from different from the results of the Span- eigenvectorand put all the eigenvectorstogether to gain ish stock market, where only the largest eigenvalue is better statistics, since eacheigenvectorhas only 80 com- larger than the up thresholds while the second largest ponents. We find that the distributions of 10 stocks are eigenvalue is within the bulk [5]. This discrepancy can well consistent with the Gaussian, while other 5 stocks be attributed to the difference of the two markets and exhibit high peaks in the center. the fact that our analysis contains both individuals and For deviating eigenvalues λ and λ , the distribution 1 2 institutions while Lillo et al studies only firms. for each stock is very noisy and deviates from Gaussian. Wetreatthecomponentsofthe15eigenvectorsasasam- ple to have better statistics. The two distributions ob- C. Distribution of eigenvector components tained are illustrated in Fig. 4(c) and (d). It is evident thatbothdeviatefromtheGaussiandistributionandthe distribution for λ is more skewed. If there is no information contained in an eigenvalue, 1 the normalized components of its associated eigenvector should conform to a Gaussian distribution [12–14]: D. Information in eigenvectors for deviating 1 u2 eigenvalues f(u)= exp . (9) √2π (cid:18)− 2 (cid:19) Wehaveshownthatthe largestandthe secondlargest Sincetheempiricaleigenvaluedistributionf (λ)deviates eigenvaluesdeviatefromtheRMTpredictionandthedis- c from the theoretic expression (6) from the random ma- tributionsoftheir eigenvectorcomponentsarenotGaus- trixtheorywithtwolargeeigenvaluesoutsidethebulkof sian. It implies that these eigenvectorscarry some infor- the distribution,it is expectedthat the associatedeigen- mation. For u(λ2), it is not clear what kind of informa- vectors also contain certain information. tion they carry. We find no evident dependence of the magnitude of u (λ ) on the averageabsolutioninventory i 2 0.8 variation vi , the total variation vi, or the maxi- h| |i (a) (b) malabsolutevariationmax vi . SamPeconclusionisob- )0.6 {| |} u tained for ui(λ1), which differs from the conclusion that ( f 0.4 theeigenvectorcomponentsofthereturncorrelationma- , y trixdependonthemarketcapitalizationinalogarithmic sit0.2 form[14]. Inaddition,aswewillshowinthenextsection n e 0 that the investors can be categorized into three trading d 1 types. Wealsofindnorelationbetweenthetradingstrat- y it0.8 (c) (d) egycategoryandthemagnitudeofthevectorcomponent l bi0.6 u(λ2). We thusfocusonextractingthe informationfrom a b0.4 u(λ1). o r Forthe correlationmatrixwhoseelements arethe cor- P0.2 relation coefficients of price fluctuations of two stocks, 0 −4 −2 0 2 4 −4 −2 0 2 4 the eigenvector of the largest eigenvalue contains mar- Eigenvector component s , u ket information [12, 13]. The market information indi- cates the collective behavior of stock price movements [26], which can be unveiled by the projection of the FIG. 4. Distribution of eigenvector components u: (a) all the eigenvectors associated with the eigenvalues in the bulk time series on the eigenvector [14]. We follow this ap- λ <λ<λ after normalization for each eigenvector for proach and calculate the projection G(t) of the time se- min max stock000001, (b)sameas(a)butforstock200625,(c)allthe ries V (t) = [v (t) v (t) ]/σ on the eigenvector u(λ ) i i i i 1 −h i eigenvectors associated with the largest eigenvalues λ1 after corresponding to the first eigenvalue [5]: normalization for all the stocks, and (d) all the eigenvectors associated with the second largest eigenvalues λ after nor- 2 G(t)= V (t) u (λ )(t), (10) malization for all the stocks. The solid lines show the Gaus- i × i 1 sian distribution predicted by therandom matrix theory. Xi The projection G can be called the factor associated For correlation matrices of financial returns, the com- with the largest eigenvalue [5]. We plot the factor G(t) ponents of an eigenvector with the eigenvalue λ in the against the normalized return R(t) for stock 000001 in bulk of its distribution (λ < λ < λ ) are dis- Fig. 5(a). There is a nice linear dependence between min max tributedaccordingto Eq.(9) [12–14]. Panels(a)and(b) the two variables and a linear regression gives the slope 6 k = 0.83 0.04. It indicates that these most active in- comparing the experimental results with the results of ± vestors have dominating influence on the price fluctua- a null hypothesis based on a block bootstrap of both R tions. andV. Inthisregard,1000blockbootstrapreplicaswith a block length of 20 are performed. For each investor, 5 we have checked whether the estimated correlation with G (a) (b) (c) return exceeds the 0.97725 quantile or is smaller than , or 0 the 0.02275 quantile of the correlation distribution ob- ct tained frombootstrapreplicas. The results are shownin a F Fig. 6(a-c). There are 1211 investors in the whole sam- −5 −4−2 0 2 4 −4−2 0 2 4 −4−2 0 2 4 ple in Fig. 6(a), including 453 institutional investors in Return, R Fig. 6(b) and 758 individual investors in Fig. 6(c). 1 AsshowninthelastrowofTableI,thenumbersofthe (d) s three kinds of investors (trending, reversing and uncate- n ki gorized)are46,34and373forinstitutionalinvestorsand ,0.5 41,381and336forindividualinvestors,respectively. We d n find that most institutional investors are uncategorized ki and there are more trending investors than reversing in- 0 0 0.2 0.4 0.6 0.8 1 vestors. TheseresultsaredifferentfromtheSpanishcase, k whereonlyone-thirdinvestorsareuncategorizedandthe number of reversing firm investors is about three times FIG. 5. (Color online) Influence of the most active investors the number of trending firm investors [5]. In contrast, tradingstock000001onthepricefluctuationsforalltheinves- about half individuals are reversing investors and only tigated investors (a), for individual investors (b), and for in- 6% individuals are trending investors. The observation stitutionalinvestors(c),wheretheslopesarek=0.83±0.04, that most investorsare uncategorizedis probably due to k =0.83±0.04, andk =0.41±0.06, respectively. Panel ind ins thefactthattheChinesemarketwasemerginganditsin- (d) plots k and k against k, where each symbol corre- ind ins vestors are not experienced. Comparing individuals and sponds to a stock. institutions, we find a larger proportion of individuals exhibiting a reversing behavior. It indicates that these Panels (b) and (c) of Fig. 5 illustrate the relation be- individuals buy when the price drops and sell when the tweenthe factorandthe returnforindividualsandinsti- price rises in the same day. This finding is very inter- tutions. Linear regression gives kind = 0.83 0.04 and esting since it explains the worse performance in stock ± kins = 0.41 0.06. Comparing (b) and (c) with (a), we markets [10, 11]. ± find that the influence ofindividuals matches excellently The empirical evidence for the significant cross- with the whole sample, which can be quantified by the correlationbetween inventoryvariation V (t) of trending i facts that kind =k and kind <k. The results are similar and reversing investors and stock return R(t) leads us for other stocks. The resulting kind and kins are plotted to adopt a linear model for the dynamics of inventory in Fgi. 5 against k for all the 15 stocks. For individual variation as a first approximation[5]: investors,wefindthatk =kfor13stocksandk >k ind ind for 2stocks. Incontrast,wefind thatkins <k exceptfor Vi(t)=γiR(t)+ǫi, (12) one stock. where γ is proportional to the cross-correlation coef- i ficient C . It follows immediately that the cross- Vi,R IV. INVENTORY VARIATION AND STOCK correlation coefficient between the inventory variations RETURN of two investors are A. Categorization of investors Cij =CVi,Vi =γiγj. (13) Iftwoinvestorsbelongtothesamecategory,eithertrend- Following Ref. [5], we divide the investors into three ing (γ > 2σ and γ > 2σ significantly) or reversing categories according to the cross-correlation coefficient i j (γ < 2σ and γ < 2σ significantly), the value of C CViR between the inventory variation Vi and the stock isiexpe−cted to bejsign−ificantly positive. On the contrariyj, return R. The investor i belongs to the trending or re- if two investors belong respectively to the trending and versingcategoryif its inventoryvariationis positively or reversing categories, the value of C is expected to be negatively correlated with the return. We use a wieldy ij significantly negative. To show the performance of the significant threshold to categorize the investors: model, we plot the contours of the correlation matrix of 2σ = 2/ N , (11) inventory variation for all investors, for individuals and T ± ± for institutions, where the investorsare sortedaccording p where N is the number of time records for each time to their cross-correlation coefficients C of the inven- T VR series [5]. We also verify the robustness of Eq. (11) by tory variation with the price return. Figure 6(d) shows 7 0.6 0.6 0.6 (a) (b) (c) 0.4 0.4 0.4 0.2 0.2 0.2 0 0 0 R R R V V V C −0.2 C −0.2 C −0.2 −0.4 −0.4 −0.4 −0.6 −0.6 −0.6 −0.8 −0.8 −0.8 −4 −3 −2 −1 0 1 2 −4 −3 −2 −1 0 1 2 −4 −3 −2 −1 0 1 2 log (size) log (size) log (size) 10 10 10 1 0.8 (d) (e) (f) 0.8 0.6 0.6 0.6 0.4 0.4 0.4 0.2 0.2 0.2 0 −0.2 0 0 −0.4 −0.2 −0.2 −0.6 0.4 −0.8 0.4 −0.4 0.4 −0.4 R 0 R 0 R 0 V V V C −0.4 C −0.4 C −0.4 −0.8 −0.8 −0.8 FIG. 6. (Color online) Panels (a-c) show the scatter plots of CVR versus a proxy of the size of the investor. For each stock, the proxy is the ratio of the value exchanged by the investor (scaled by a factor 104) to the capitalization of the stock. Each markerreferstoaninvestortradingaspecificstock. Thethreekindsofmarkersrefertoinvestorswhoseinventoryvariationsare positivelycorrelated((cid:13)),negativelycorrelated((cid:3)),oruncorrelated(△)withreturnsaccordingtotheblockbootstrapanalysis. The two dashed lines indicate the 2σ threshold calculated using Eq. (11). Panels (d-f) are contour plots of the correlation matrix of daily inventoryvariation of investors trading the stock 000001. We havesorted the investors into rows and columns according to their cross-correlation coefficients of inventory variation with its price return CVR. The evolution of CVR in the same order as in the matrix is shown in thebottom panel, where the dashed lines boundthe ±2σ significance intervals. thattheleft-topcornergiveslargepositiveC valuesand We first investigate the autocorrelation function ij the left-bottom and right-top corners gives large nega- C oftheinventoryvariationtimeseriessampled V(t)V(t+τ) tive C values, as expected. Figure 6(e) give better re- in15-mintimeintervals. Figure7(a)showsthethreeau- ij sults for individual investors,validating the linearmodel tocorrelationfunctionsforallthetrending,reversingand (12). The results in Fig. 6(f) are worse for institutional uncategorized investors. Each autocorrelation function investors, which is due to the fact that most C val- is obtainedby averagingthe autocorrelationfunctions of VR ues are small for institutions, as illustrated in Fig. 6(c). the investors in the same category to have better statis- However,Fig. 6(f) doses not invalidate the linear model, tics. Itis found thatthe inventoryvariationis long-term since there are only three trending institutions and one correlatedandthecorrelationissignificantoverdozensof reversing institution. Indeed, the situation is quite simi- minutes,whichcanbepartlyexplainedbytheordersplit- lar for other individual stocks with very few investors of tingbehavioroflargeinvestors[27–29]. Wealsofindthat the same category, as shown in Table I. thecorrelationisstrongeramongtrendinginvestorsthan reversing investors. Figure 7(b) and Fig. 7(c) illustrate the results for individuals and institutions. We observe that institutions have stronger long memory than indi- B. Causality viduals. It implies that institutions are more specialized to their trading strategies than individuals [5]. In Sec. IVA, we have shown that the inventory varia- tionV (t)andthestockreturnR(t)havesignificantposi- Panels (d-f) of Fig. 7 illustrate the averaged lagged i tiveornegativecorrelationforpartofthe investors. Itis cross-correlationfunctionsC betweeninventory V(t)R(t+τ) interesting to investigate the lead-lag structure between variations and returns. The results in the three panels these two variables. For the largest majority of revers- are qualitatively the same. For uncategorized investors, ing and trending firms in the Spanish stock market, it nosignificantcross-correlationsarefoundbetweeninven- is found that returns Granger cause inventory variation tory variations and returns, which is trivial due to the but not vice versa at the day or intraday level, and the “definition”ofthiscategory,asshowninFig.6(a-c). For Granger causality disappears over longer time intervals trending and reversing investors, it is evident that the [5]. Here, we aim to study the same topic for both indi- returns lead the inventory variations by dozens of min- vidual and institutional investors. utes (τ < 0), where the cross-correlation C is V(t)R(t+τ) 8 100 100 100 (a) Trending (b) Trending (c) Trending Reversing Reversing Reversing 10−1 Uncategotized 10−1 Uncategotized Uncategotized τ+)10−2 τ+) τ+)10−1 Vt( Vt(10−2 Vt( Vt()10−3 Vt() Vt()10−2 C10−4 C10−3 C 10−5 10−4 10−3 0 150 300 450 600 750 900 1050 0 150 300 450 600 750 900 1050 0 150 300 450 600 750 900 1050 τ(min) τ(min) τ(min) 0.06 0.08 0.06 (d) Trending (e) Trending (f) Trending 0.03 RUenvceartseignogtized 0.04 RUenvceartseignogtized 0.03 RUenvceartseignogtized ) ) ) τ+ 0 τ+ τ+ t( t( 0 t( 0 R−0.03 R R ) ) ) t( t(−0.04 t(−0.03 V−0.06 V V C C C −0.09 −0.08 −0.06 −0.12 −0.12 −0.09 −450−300−150 0 150 300 450 −450−300−150 0 150 300 450 −450−300−150 0 150 300 450 τ(min) τ(min) τ(min) 0.8 0.8 0.6 (g) R → V (h) R → V (i) R → V V → R V → R V → R )]0.6 )]0.6 )] y y y0.4 → → → 0.4 0.4 x x x ( ( ( I I I0.2 [ [ [ E0.2 E0.2 E 0 0 0 5 min 0.5 hour 1 hour 1 day 1 week 5 min 0.5 hour 1 hour 1 day 1 week 5 min 0.5 hour 1 hour 1 day 1 week ∆T ∆T ∆T 1 1 1 (j) (k) (l) 0.8 0.8 0.8 )] )] )] y y y →0.6 R → V →0.6 R → V →0.6 R → V V → R V → R V → R x x x (0.4 shuffled data (0.4 shuffled data (0.4 shuffled data I I I [ [ [ E E E 0.2 0.2 0.2 0 0 0 −0.3 −0.2 −0.1 0 0.1 0.2 −0.3 −0.2 −0.1 0 0.1 0.2 −0.3 −0.2 −0.1 0 0.1 0.2 C C C VR VR VR FIG. 7. (Color online) The first column (a,d,g,j) shows the results for all investors. The second column (b,e,h,k) shows the results for all individual investors. The third column (c,f,i,l) shows the results for all institutional investors. (a-c) Averaged autocorrelation functions C of the 15-min inventory variation V for trending, reversing and uncategorized investors. V(t)V(t+τ) The dashed lines give the 5% significance level. (d-f) Averaged lagged cross-correlation functions C of the 15-min V(t)R(t+τ) inventory variation for trending, reversing and uncategorized investors. The dashed lines bound the ±2σ significance interval. (g-i) Conditional expected value of the indicator I(x → y) of the rejection of the null hypothesis of non-Granger causality betweenxandy with95% confidenceasafunctionoftimehorizons∆T. Thedashedlinesshowthe5%significancelevel. (j-l) Conditional expected value of the indicator I(x → y) as a function of the simultaneous cross-correlation C[Vi(t),R(t)]. The black symbols refer to theGranger test on shuffled data and the dashed lines bound ±2σ significance interval. significantlynonzero. Whenthepricedrops,trendingin- to reduce their inventory. In the meanwhile, we also ob- vestors will sell stock shares to reduce their inventory in servenonzerocross-correlationsforτ >0inshortertime a few minutes, while reversing investors will buy shares periods, which means that the inventory variations lead to increase their inventory. When the price rises, trend- returns. ing investors will buy shares to increase their inventory inafewminutes,whilereversinginvestorswillsellshares To further explore the lead-lag structure between in- ventory variations and returns, we perform Granger 9 causality analysis. We define an indicator I(X Y), → TABLE II. Number of herding days for different groups of whose value is 1 if X Granger causes Y and 0 otherwise investors. The total number of trading days is 237. The su- [5]. In our analysis, the time resolution of the two time perscripts“+”and“−”indicatebuyherdingandsellherding, series is 15-min. The values of I(V R) and I(R V) respectively. The subscripts “d” and “s” indicate individuals → → for all investors are determined at different time scales and institutions, respectively. The time horizon is one day. ∆T. The average indicator values E[I(V R)] and → Reversing Trending Uncategorized E[I(R V)] are plotted in Fig. 7(g-i) with respect to ∆T →for all investors, individual investors and insti- Code n+d n−d n+s n−s n+d n−d n+s n−s n+d n−d n+s n−s 000001 60 63 0 0 0 0 0 0 6 8 0 0 tutional investors. Both I(V R) and I(R V) are → → 000002 23 19 0 0 0 0 0 0 3 6 0 0 decreasing functions of ∆T. We note that E[I(X Y)] 000012 26 37 0 0 0 0 0 0 2 1 0 0 → isthepercentageofinvestorswithI(X Y)=1. Figure 000021 54 62 0 0 0 0 0 0 4 3 1 0 → 7showsthattherearemoreinvestorswithI(R V)=1 000063 34 34 2 1 0 0 0 0 0 0 1 1 → than investors with I(V R)=1. On average,bidirec- 000488 0 0 0 0 0 0 0 0 2 1 0 0 → tionalGrangercausalityis observedatthe intradaytime 000550 17 34 0 0 0 0 0 0 0 0 2 1 scalesandtheGrangercausalitydisappearsattheweekly 000625 21 15 0 0 3 7 0 0 5 2 0 0 level or longer. Moreover, individual investors are more 000800 29 25 0 0 0 0 0 0 7 7 0 0 000825 34 31 0 0 0 0 0 0 0 0 0 0 probable to be influenced by the intraday price fluctua- 000839 61 58 0 0 0 0 0 0 7 7 0 0 tions than institutions, because the I(R V) values of → 000858 38 36 0 0 0 0 0 0 0 1 0 0 individuals are greater than those of institutions at the 000898 55 40 0 0 0 0 0 0 3 4 0 0 same time scale level. 200488 31 24 0 0 0 0 0 0 2 7 0 3 We then investigate the impact of investor category 200625 6 6 0 0 0 0 0 0 3 6 0 3 on the causality indicator. The results for ∆T = 4 (i.e., one hour) are depicted in Fig. 7(j-l). The mid- dle parts bounded by two vertical lines at C = σ VR giventime horizon. When the herding index h is smaller ± correspond to uncategorized investors. The left parts than5%underabinomialnullhypothesis,weassessthat (C < σ) correspond to reversing investors and the VR herding is present. In our analysis, we fix the time hori- − right parts C > σ correspond to trending investors. VR zoninto one day and determine the number of days that ± Itisfoundthatainvestoradjustshis inventoryfollowing herdingwaspresentfordifferentgroupsofinvestors. The price fluctuations with very large probabilities when his results are depicted in Table II. C value is large. This conclusion holds for both in- VR According to Table II, there are no buying and sell- | | dividual and institutional investors. The strong Granger ing herding days observed for trending institutions. For causality from inventory variations to returns and the trending individuals and reversing institutions, herding weak but significant causality from returns to inventory is observedinonly one stock onveryfew days. For cate- variations cannot be attributed to the non-Gaussianity gorized investors, we see slightly more herding days in a in the distributions of the variables, as verified by boot- few stocks. For reversing investors, the number of herd- strapping analysis. Qualitatively similar results are ob- ing days is greater than for other investors and we ob- tained for other ∆T values. serve comparable buying and selling herding days. Our findings are consistent with those for the Spanish stock market, especially in the sense that reversing investors C. Herding behavior are more likely to herd [5]. Our analysis also allows us toconclude thatindividuals aremorelikelyto herdthan Herding and positive feedbacks are essential for the institutions in 2003. boom of bubbles [30, 31]. These topics have been stud- iesextensivelytounderstandthepriceformationprocess [32–35]. Herdingisaphenomenathatagroupofinvestors V. SUMMARY tradinginthesamedirectionoveraperiodoftime. Here, wetrytoinvestigatepossibleherdingbehaviorsindiffer- In summary, we have studied the dynamics of in- ent groups of investors. vestors’ inventory variations. Our data set contains 15 We study possible buy and sell herding behaviors stocks actively traded on the Shenzhen Stock Exchange among the same group of investors. Investors are classi- in2003andthe investorscanbe identifiedaseither indi- fiedintodifferentgroupsbasedontheirtypes(individual viduals or institutions. orinstitution)andtheircategories(reversing,trendingor We studied the cross-correlation matrix C of inven- uncategorized). We define a herding index as follows [5]: ij tory variations of the most active individual and insti- N+ tutional investors. It is found that the distribution of h= , (14) cross-correlationcoefficient C is asymmetric and has a N++N− ij power-lawforminthebulkandexponentialtails. Thein- where N+ is the number of buying investors and N− is ventoryvariationsexhibitstrongercorrelationwhenboth the number of selling investors in the same group over a investors are either individuals or institutions, which in- 10 dicatesthatthetradingbehaviorsaremoresimilarwithin strategies. Moreover, there are far more reversing indi- investors of the same type. The eigenvalue spectrum viduals (50%) than trending individuals (6%). In con- showsthatthelargestandthesecondlargesteigenvalues trast,there areslightlymoretrending institutions (10%) ofthecorrelationmatrixcannotbeexplainedbytheran- than reversing institutions (8%). Hence, Chinese indi- dommatrixtheoryandthecomponentsofthefirsteigen- vidual investors are prone to selling winning stocks and vector u(λ ) carry information about stock price fluctu- buyinglosingstocks,whichprovidessupportingevidence 1 ation. In this respect, the behaviors differ for individual that trading is hazardous to the wealth of individuals and institutional investors. [9, 10]. Based on the contemporaneous cross-correlation coef- ficients C between inventory variations and stock re- ACKNOWLEDGMENTS VR turns,weclassifiedinvestorsintothreecategories: trend- inginvestorswhobuy(sell)whenstockpricerises(falls), WearegratefultoFabrizioLilloforfruitfuldiscussion. reversing investors who sell (buy) when stock price rises This work was partially supported by the National Nat- (falls), and uncategorized investors. We also observed uralScience FoundationofChina under grants11075054 that stock returns predict inventory variations. It is in- and 71131007, the Shanghai Rising Star Program un- teresting to find that about 56% individuals hold trend- der grant 11QH1400800,and the Fundamental Research ingorreversingstrategiesandonly18%institutionshold Funds for the Central Universities. [1] J.-P. Bouchaud, Nature455, 1181 (2008). collusive cliques in futures markets based on trading be- [2] T. Lux and F. Westerhoff, Nat. Phys.5, 2 (2009). haviors from real data,” (2011), arXiv:1110.1522. [3] J. D. Farmer and D. Foley,Nature460, 685 (2009). [20] G.-F. Gu, W. Chen, and W.-X. Zhou, [4] F. Schweitzer, G. Fagiolo, D. Sornette, F. Vega- Physica A 387, 3173 (2008). Redondo, A. Vespignani, and D. R. White, [21] G.-F.Gu,F.Ren,X.-H.Ni,W.Chen, andW.-X.Zhou, Science 325, 422 (2009). Physica A 389, 278 (2010). [5] F. Lillo, E. Moro, G. Vaglica, and R. N. Mantegna, [22] P.W.HollandandR.E.Welsch,Commun.Statist.The- NewJ. Phys.10, 043019 (2008). ory Methods 9, 813 (1977). [6] J. Griffin, J. H. Harris, and S. Topaloglu, [23] J.O.Street,R.J.Carroll, andD.Ruppert,Ann.Statist. J. Financ. 58, 2285 (2003). 42, 152 (1988). [7] H. Choe, B. C. Kho, and R. M. Stulz, [24] Y. Wu, C.-S. Zhou, J.-H. Xiao, J. Financ. Econ. 54, 227 (1999). J. Kurths, and H. J. Schellnhuber, [8] M. Grinblatt and M. Keloharju, Proc. Natl. Acad. Sci. U.S.A. 107, 18803 (2010). J. Financ. Econ. 55, 43 (2000). [25] A. M. Sengupta and P. P. Mitra, [9] T. Odean, J. Financ. 53, 1775 (1998). Phys. Rev.E 60, 3389 (1999). [10] B. M. Barber and T. Odean, J. Financ. 55, 773 (2000). [26] V. Plerou, P.Gopikrishnan, B. Rosenow, L.A.N. Ama- [11] B. M. Barber, Y. Lee, Y. Liu, and T. Odean, ral, and H. E. Stanley,Physica A 299, 175 (2001). Rev.Financ. Stud.22, 609 (2009). [27] G. Vaglica, F. Lillo, E. Moro, and R. N. Mantegna, [12] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Phys. Rev.E 77, 036110 (2008). Phys.Rev.Lett. 83, 1467 (1999). [28] E. Moro, J. Vicente, L. G. Moyano, A. Gerig, J. D. [13] V.Plerou, P. Gopikrishnan, B. Rosenow, L.A. N.Ama- Farmer, G. Vaglica, F. Lillo, and R. N. Mantegna, ral, andH.E.Stanley,Phys.Rev.Lett. 83, 1471 (1999). Phys. Rev.E 80, 066102 (2009). [14] V. Plerou, P. Gopikrishnan, B. Rosenow, [29] G. Vaglica, F. Lillo, and R. N. Mantegna, L. A. N. Amaral, T. Guhr, and H. E. Stanley, New J. Phys.12, 075031 (2010). Phys.Rev.E 65, 066126 (2002). [30] D. Sornette, Why Stock Markets Crash: Critical Events [15] J. Shen and B. Zheng, in Complex Financial Systems (Princeton University EPL (Europhys.Lett.) 86, 48005 (2009). Press, Princeton, 2003). [16] Z.-Q. Jiang and W.-X. Zhou, [31] D. Sornette, Phys.Rep. 378, 1 (2003). Physica A 389, 4929 (2010). [32] J. Lakonishok, A. Shleifer, and R. W. Vishny, [17] J.-J. Wang and S.-G. Zhou, Physica A 390, 398 (2011). J. Financ. Econ. 32, 23 (1992). [18] X.-Q.Sun, X.-Q. Cheng, H.-W. Shen, and Z.-Y. Wang, [33] R. Wermers, J. Financ. 54, 581 (1999). Physica A 390, 3427 (2011). [34] J. R. Nofsinger and R. W. Sias, [19] J.-J. Wang, S.-G. Zhou, and J.-H. Guan, “Detecting J. Financ. 54, 2263 (1999). [35] R. W. Sias, Rev. Financ. Stud.17, 165 (2004).