Combining Alphas via Bounded Regression

Zura Kakushadze§† (footnote 1)

§ Quantigic® Solutions LLC, 1127 High Ridge Road #135, Stamford, CT 06905 (footnote 2)
† Free University of Tbilisi, Business School & School of Physics, 240, David Agmashenebeli Alley, Tbilisi, 0159, Georgia

(January 7, 2015; revised October 22, 2015)

arXiv:1501.05381v2 [q-fin.PM] 22 Oct 2015

Abstract

We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications typically there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.

Keywords: hedge fund, alpha stream, alpha weights, portfolio turnover, investment allocation, weighted regression, diversification, bounds, optimization, factor models

Footnote 1: Zura Kakushadze, Ph.D., is the President of Quantigic® Solutions LLC, and a Full Professor at Free University of Tbilisi. Email: [email protected]

Footnote 2: DISCLAIMER: This address is used by the corresponding author for no purpose other than to indicate his professional affiliation as is customary in publications. In particular, the contents of this paper are not intended as an investment, legal, tax or any other such advice, and in no way represent views of Quantigic Solutions LLC, the website www.quantigic.com or any of their other affiliates.

1 Introduction

With technological advances there is an ever increasing number of alpha streams (footnote 3). Many of these alphas are ephemeral, with relatively short lifespans.
As a result, in practical applications typically there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alpha streams: SCM is singular. Therefore, directly using SCM in, say, alpha portfolio optimization is not an option (footnote 4).

One approach to circumvent this difficulty is to build a factor model for alpha streams [56]. Because of the utmost secrecy in the alpha business, such factor models must be built in-house; there are no commercial providers of "standardized" factor models for alpha streams. As with factor models for equities, such model building for alphas requires certain nontrivial expertise and time expenditure. Therefore, in practice, one often takes a simpler path.

As was discussed in more detail in [56], one can deform SCM such that it is nonsingular, and then use the so-deformed SCM in, say, Sharpe ratio optimization for a portfolio of alphas. For small deformations this then reduces to a cross-sectional weighted regression of the alpha stream expected returns [56]. The regression weights are the inverse sample variances of the alphas. The columns of the loadings matrix, over which the expected returns are regressed, are nothing but the first K principal components of SCM corresponding to its positive (i.e., non-vanishing) eigenvalues [56].

Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. Thus, if some expected returns are skewed, then, despite nonunit regression weights (which suppress more volatile alphas), the corresponding alpha weights can be larger than desired by diversification considerations. Also, the principal components know nothing about quantities such as turnover (footnote 5). A simple way of obtaining a more "well-rounded" portfolio composition is to set bounds on alpha weights. This is the approach we discuss here.

When individual alpha streams are traded on separate execution platforms, the alpha weights are non-negative.
By combining and trading multiple alpha streams on the same execution platform (the framework we adopt here) one saves on transaction costs by internally crossing trades between different alpha streams, as opposed to going to the market (footnote 6). Then the alpha weights can be negative.

When alpha weights can take both positive and negative values, the bounded regression problem simplifies. It boils down to an iterative algorithm we discuss in Section 2. This algorithm can actually be derived from an optimization algorithm with bounds (in a factor model context) discussed in [58] by taking the regression limit of optimization. We also give R source code for the bounded regression algorithm in Appendix A. Appendix B contains some legalese. We conclude in Section 3, where we also discuss bounded regression with transaction costs following [59].

Footnote 3: Here "alpha", following the common trader lingo, generally means any reasonable "expected return" that one may wish to trade on and is not necessarily the same as the "academic" alpha. In practice, often the detailed information about how alphas are constructed may not be available, e.g., the only data available could be the position data, so "alpha" then is a set of instructions to achieve certain stock holdings by some times $t_1, t_2, \dots$

Footnote 4: For a partial list of hedge fund literature, see, e.g., [1]-[20] and references therein. For a partial list of portfolio optimization and related literature, see, e.g., [21]-[55] and references therein.

Footnote 5: One approach to rectify this is to add a turnover-based factor to the loadings matrix [56].

Footnote 6: For a recent discussion, see [57].

2 Bounded Regression

2.1 Notations

We have $N$ alphas $\alpha_i$, $i = 1,\dots,N$. Each alpha is actually a time series $\alpha_i(t_s)$, $s = 0,1,\dots,M$, where $t_0$ is the most recent time. Below $\alpha_i$ refers to $\alpha_i(t_0)$.

Let $C_{ij}$ be the sample covariance matrix (SCM) of the $N$ time series $\alpha_i(t_s)$.
If $M < N$, then only $M$ eigenvalues of $C_{ij}$ are non-zero, while the remainder have "small" values, which are zeros distorted by computational rounding (footnote 7).

Footnote 7: Actually, this assumes that there are no N/As in any of the alpha time series. If some or all alpha time series contain N/As in a non-uniform manner and the correlation matrix is computed by omitting such pair-wise N/As, then the resulting correlation matrix may have negative eigenvalues that are not zeros distorted by computational rounding.

Alphas $\alpha_i$ are combined with weights $w_i$. Any leverage is included in the definition of $\alpha_i$, i.e., if a given alpha labeled by $j \in \{1,\dots,N\}$ before leverage is $\alpha'_j$ (this is a raw, unlevered alpha) and the corresponding leverage is $L_j : 1$, then we define $\alpha_j \equiv L_j \alpha'_j$. With this definition, the weights satisfy the condition

$$\sum_{i=1}^N |w_i| = 1 \tag{1}$$

Here we allow the weights to be negative as we are interested in the case where the alphas are traded on the same execution platform and trades between alphas are crossed, so one is actually trading the combined alpha $\alpha \equiv \sum_{i=1}^N \alpha_i w_i$.

2.2 Weighted Regression

When SCM $C_{ij}$ is singular and no other matrix (e.g., a factor model) to replace it is available, one can deform SCM such that it is nonsingular, and then use the so-deformed SCM in, say, Sharpe ratio optimization for a portfolio of alphas [56]. For small deformations this reduces to a cross-sectional weighted regression of the alpha stream expected returns [56]. The regression weights $z_i$ (not to be confused with the alpha weights $w_i$) are the inverse sample variances of the alphas: $z_i \equiv 1/C_{ii}$. The columns of the loadings matrix $\Lambda_{iA}$, $A = 1,\dots,K$, over which the expected returns are regressed, are nothing but the first $K$ principal components of SCM corresponding to its positive (i.e., non-vanishing) eigenvalues. However, for now we will keep $\Lambda_{iA}$ general (e.g., one may wish to include other risk factors in $\Lambda_{iA}$ [56]).
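As a small illustration of the ingredients introduced so far, the sketch below builds an SCM from a short history with $M < N$, counts the eigenvalues that are not rounding noise, and forms the regression weights $z_i = 1/C_{ii}$ and a principal-component loadings matrix. It is written in Python with NumPy rather than the R of Appendix A; the function name `scm_ingredients`, the relative tolerance and the random data are assumptions made purely for illustration.

```python
import numpy as np

def scm_ingredients(alpha_ts, tol=1e-10):
    """alpha_ts has shape (M + 1, N): one row per time t_s, one column per alpha.

    Hypothetical helper: returns the SCM, the regression weights z_i = 1/C_ii,
    and a loadings matrix made of the principal components whose eigenvalues
    exceed tol times the largest eigenvalue.
    """
    C = np.cov(alpha_ts, rowvar=False)            # N x N sample covariance matrix
    evals, evecs = np.linalg.eigh(C)              # eigenvalues in ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]    # reorder to descending
    K = int(np.sum(evals > tol * evals[0]))       # eigenvalues that are not rounding noise
    z = 1.0 / np.diag(C)                          # inverse sample variances
    Lambda = evecs[:, :K]                         # first K principal components
    return C, z, Lambda, K

# Made-up data: M + 1 = 6 observations of N = 12 alphas, so M < N.
rng = np.random.default_rng(0)
M, N = 5, 12
alpha_ts = rng.normal(size=(M + 1, N))
C, z, Lambda, K = scm_ingredients(alpha_ts)
print("non-vanishing eigenvalues:", K, "out of", N)
```

The tolerance plays the role of the "zeros distorted by computational rounding" cut-off; in practice it would have to be chosen with the data scale in mind.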
The weights $w_i$ are given by:

$$w_i = \gamma\, z_i\, \varepsilon_i \tag{2}$$

where $\varepsilon_i$ are the residuals of the cross-sectional regression of $\alpha_i$ over $\Lambda_{iA}$ (without the intercept, unless the intercept is subsumed in $\Lambda_{iA}$; see below) with the regression weights $z_i$:

$$\varepsilon_i = \alpha_i - \sum_{j=1}^N z_j\, \alpha_j \sum_{A,B=1}^K \Lambda_{iA}\, \Lambda_{jB}\, Q^{-1}_{AB} \tag{3}$$

where $Q^{-1}_{AB}$ is the inverse of

$$Q_{AB} \equiv \sum_{i=1}^N z_i\, \Lambda_{iA}\, \Lambda_{iB} \tag{4}$$

and the overall factor $\gamma$ in (2) is fixed via (1). Note that we have

$$\forall A \in \{1,\dots,K\}: \quad \sum_{i=1}^N w_i\, \Lambda_{iA} = 0 \tag{5}$$

So, the weights $w_i$ are neutral w.r.t. the risk factors defined by the columns of the loadings matrix $\Lambda_{iA}$.

2.3 Bounds

Since the weights $w_i$ can have either sign, we will assume that the lower and upper bounds on the weights

$$w_i^- \leq w_i \leq w_i^+ \tag{6}$$

satisfy the conditions

$$w_i^- \leq 0 \tag{7}$$
$$w_i^+ \geq 0 \tag{8}$$
$$w_i^- < w_i^+ \tag{9}$$

The last condition is not restrictive: if for some alpha labeled by $i$ we have $w_i^- = w_i^+$, then we can simply set $w_i = w_i^-$ and altogether exclude this alpha from the bounded regression procedure below. Also, if, for whatever reason, we wish to have no upper/lower bound for a given $w_i$, we can simply set $w_i^\pm = \pm 1$.

The bounds can be imposed for diversification purposes: e.g., one may wish to require that no alpha has a weight greater than some fixed (small) percentile $\xi$, i.e., $|w_i| \leq \xi$, so $w_i^\pm = \pm\xi$. One may also wish to suppress the contributions of high turnover alphas, e.g., by requiring that $|w_i| \leq \widetilde{\xi}$ if $\tau_i \geq \tau_*$, where $\tau_i$ is the turnover (footnote 8), $\tau_*$ is some cut-off turnover, and $\widetilde{\xi}$ is some (small) percentile. Bounds can also be used to limit the weights of low capacity (footnote 9) alphas. Etc. (footnote 10)

2.4 Running a Bounded Regression

So, how do we impose the bounds in the context of a regression? There are two subtleties here. First, we wish to preserve the factor neutrality property (5), which is invariant under the simultaneous rescalings $w_i \to \zeta w_i$ (where $\zeta$ is a constant).
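The unbounded regression weights (2)-(5) and the rescaling invariance just mentioned can be checked numerically. The following is a minimal sketch in Python with NumPy, not the R code of Appendix A; the function name `regression_weights`, the two-column loadings matrix and the random inputs are assumptions made for illustration.

```python
import numpy as np

def regression_weights(alpha, Lambda, z):
    """Unbounded weights w_i = gamma * z_i * eps_i with sum_i |w_i| = 1."""
    Q = Lambda.T @ (z[:, None] * Lambda)                    # eq. (4)
    coef = np.linalg.solve(Q, Lambda.T @ (z * alpha))       # Q^{-1} Lambda^T (z * alpha)
    eps = alpha - Lambda @ coef                             # residuals, eq. (3)
    w = z * eps                                             # eq. (2) up to the factor gamma
    return w / np.sum(np.abs(w))                            # gamma fixed via eq. (1)

# Made-up inputs: N = 8 alphas, K = 2 columns (intercept plus one random factor).
rng = np.random.default_rng(1)
N = 8
alpha = rng.normal(size=N)
z = 1.0 / rng.uniform(0.5, 2.0, size=N)                     # stand-in inverse variances
Lambda = np.column_stack([np.ones(N), rng.normal(size=N)])

w = regression_weights(alpha, Lambda, z)
print(np.abs(w).sum())                   # normalization (1), equal to 1 up to rounding
print(np.abs(Lambda.T @ w).max())        # neutrality (5), zero up to rounding
print(np.abs(Lambda.T @ (3.0 * w)).max())  # still zero after rescaling w -> 3 w
```

Neutrality holds for any overall factor, which is why the normalization (1), and not the regression itself, is what gives the bounds a scale.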
If we simply set some $w_i$ to their upper or lower bounds, this generally will ruin the rescaling invariance, so the property (5) will be lost. Second, we must preserve the normalization condition (1). In fact, it is precisely this normalization condition that allows one to meaningfully set the bounds $w_i^\pm$, as the regression itself does not fix the overall normalization coefficient $\gamma$ in (2), owing to the rescaling invariance $w_i \to \zeta w_i$.

Here we discuss the bounded regression algorithm. To save space, we skip the detailed derivation as it follows straightforwardly by taking the regression limit of optimization with bounds in the context of a factor model, both of which are discussed in detail in [58] (footnote 11).

Footnote 8: Here the turnover (over a given period, e.g., daily turnover) is defined as the ratio $\tau_i \equiv D_i/I_i$ of total dollars $D_i$ (long plus short) traded by the alpha labeled by $i$ over the corresponding total dollar holdings $I_i$ (long plus short).

Footnote 9: By capacity $I_i^*$ for a given alpha we mean the value of the investment level $I_i$ for which the P&L $P_i(I_i)$ is maximized (considering nonlinear effects of impact).

Footnote 10: Since the regression we consider here is weighted with the regression weights $z_i = 1/C_{ii}$, this already controls exposure to alpha volatility, so imposing bounds based on volatility would make a difference only if one wishes to further suppress volatile alphas.

Footnote 11: The regression limit of optimization essentially amounts to the limit $\xi_i^2 \equiv \eta\, \widetilde{\xi}_i^2$, $\eta \to 0$, $\widetilde{\xi}_i^2 = \text{fixed}$, where $\xi_i$ is the specific (idiosyncratic) risk in the factor model with the factor loadings matrix identified with the regression loadings matrix $\Lambda_{iA}$ (and the $K \times K$ factor covariance matrix becomes immaterial in the regression limit); see [58] for details.

Let us define the following subsets of the index $i \in J \equiv \{1,\dots,N\}$:

$$w_i = w_i^+, \quad i \in J^+ \tag{10}$$
$$w_i = w_i^-, \quad i \in J^- \tag{11}$$
$$\overline{J} \equiv J^+ \cup J^- \tag{12}$$
$$\widetilde{J} \equiv J \setminus \overline{J} \tag{13}$$

Further, let

$$\widetilde{\alpha}_i \equiv \gamma\, \alpha_i \tag{14}$$
$$y_A \equiv \sum_{i \in \widetilde{J}} z_i\, \widetilde{\alpha}_i\, \Lambda_{iA} + \sum_{i \in J^+} w_i^+\, \Lambda_{iA} + \sum_{i \in J^-} w_i^-\, \Lambda_{iA} \tag{15}$$
where $\gamma$ is to be determined (see below). Then we have

$$w_i = z_i \left( \widetilde{\alpha}_i - \sum_{A,B=1}^K \Lambda_{iA}\, \widetilde{Q}^{-1}_{AB}\, y_B \right), \quad i \in \widetilde{J} \tag{16}$$
$$\forall i \in J^+: \quad z_i \left( \widetilde{\alpha}_i - \sum_{A,B=1}^K \Lambda_{iA}\, \widetilde{Q}^{-1}_{AB}\, y_B \right) \geq w_i^+ \tag{17}$$
$$\forall i \in J^-: \quad z_i \left( \widetilde{\alpha}_i - \sum_{A,B=1}^K \Lambda_{iA}\, \widetilde{Q}^{-1}_{AB}\, y_B \right) \leq w_i^- \tag{18}$$

where $\widetilde{Q}^{-1}$ is the inverse of the $K \times K$ matrix $\widetilde{Q}$:

$$\widetilde{Q}_{AB} \equiv \sum_{i \in \widetilde{J}} z_i\, \Lambda_{iA}\, \Lambda_{iB} \tag{19}$$

Here the loadings matrix $\Lambda_{iA}$ must be such that $\widetilde{Q}$ is invertible (footnote 12). Also, note that $w_i$, $i \in \widetilde{J}$ given by (16) together with $w_i = w_i^+$, $i \in J^+$ and $w_i = w_i^-$, $i \in J^-$ satisfy (5), as they should.

Footnote 12: This is the case if the columns of $\Lambda_{iA}$ are comprised of the first $K$ principal components of SCM $C_{ij}$ corresponding to its positive eigenvalues. However, as mentioned above, here we keep the loadings matrix general.

Note that, for a given value of $\gamma$, (15) solves for $y_A$ given $J^+$ and $J^-$. On the other hand, (17) and (18) determine $J^+$ and $J^-$ in terms of $y_A$. The entire system is then solved iteratively, where at the initial iteration one takes $\widetilde{J}^{(0)} = J$, so that $J^{+(0)}$ and $J^{-(0)}$ are empty. However, we still need to fix $\gamma$. This is done via a separate iterative procedure, which we describe below.

Because we have two iterations, to guarantee (rapid) convergence, the $J^\pm$ iteration (that is, for a given value of $\gamma$) can be done as follows. Let $\widehat{w}_i^{(s)}$ be such that

$$\forall i \in J: \quad w_i^- \leq \widehat{w}_i^{(s)} \leq w_i^+ \tag{20}$$
$$\forall A \in \{1,\dots,K\}: \quad \sum_{i=1}^N \widehat{w}_i^{(s)}\, \Lambda_{iA} = 0 \tag{21}$$

At the $(s+1)$-th iteration, let $w_i^{(s+1)}$ be given by (16) for $i \in \widetilde{J}^{(s)}$, with $w_i^{(s+1)} = w_i^\pm$ for $i \in J^{\pm(s)}$. This solution satisfies (5), but may not satisfy the bounds. Let

$$q_i \equiv w_i^{(s+1)} - \widehat{w}_i^{(s)} \tag{22}$$
$$h_i(t) \equiv \widehat{w}_i^{(s)} + t\, q_i, \quad t \in [0,1] \tag{23}$$

Then

$$\widehat{w}_i^{(s+1)} \equiv h_i(t_*) = \widehat{w}_i^{(s)} + t_*\, q_i \tag{24}$$

where $t_*$ is the maximal value of $t$ such that $h_i(t)$ satisfies the bounds. We have:

$$q_i > 0: \quad p_i \equiv \min\left( w_i^{(s+1)},\, w_i^+ \right) \tag{25}$$
$$q_i < 0: \quad p_i \equiv \max\left( w_i^{(s+1)},\, w_i^- \right) \tag{26}$$
$$t_* = \min\left( \left. \frac{p_i - \widehat{w}_i^{(s)}}{q_i} \;\right|\; q_i \neq 0,\ i \in J \right) \tag{27}$$
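The step-length rule (22)-(27) amounts to a one-dimensional search along the segment between the current bounded weights and the new regression solution. A minimal sketch in Python with NumPy follows; the function name `damped_step` and the three-element example are hypothetical and are not taken from the R code of Appendix A.

```python
import numpy as np

def damped_step(w_hat, w_new, lower, upper):
    """Move from w_hat toward w_new as far as the bounds allow.

    Returns the updated weights and the step length t_star in [0, 1].
    """
    q = w_new - w_hat                                      # eq. (22)
    p = np.where(q > 0,
                 np.minimum(w_new, upper),                 # eq. (25)
                 np.maximum(w_new, lower))                 # eq. (26)
    moving = q != 0
    if not np.any(moving):
        return w_hat.copy(), 1.0                           # nothing changes
    t_star = np.min((p[moving] - w_hat[moving]) / q[moving])   # eq. (27)
    return w_hat + t_star * q, t_star                      # eq. (24)

# Made-up example: start from zero weights with bounds of +/- 0.2.
w_hat = np.zeros(3)
w_new = np.array([0.4, -0.1, 0.2])
lower = np.full(3, -0.2)
upper = np.full(3, 0.2)
w_next, t_star = damped_step(w_hat, w_new, lower, upper)
print(t_star, w_next)
```

In this example only the first weight would overshoot its bound, so it alone determines the step length and lands exactly on its upper bound.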
Now, at each step, instead of (17) and (18), we can define $J^{\pm(s+1)}$ via

$$\forall i \in J^{+(s+1)}: \quad \widehat{w}_i^{(s+1)} = w_i^+ \tag{28}$$
$$\forall i \in J^{-(s+1)}: \quad \widehat{w}_i^{(s+1)} = w_i^- \tag{29}$$

where $\widehat{w}_i^{(s+1)}$ is computed iteratively as above and we can take $\widehat{w}_i^{(0)} \equiv 0$ at the initial iteration. Unlike (17) and (18), (28) and (29) add new elements to $J^\pm$ one (or a few) element(s) at each iteration.

The convergence criteria are given by

$$J^{+(s+1)} = J^{+(s)} \tag{30}$$
$$J^{-(s+1)} = J^{-(s)} \tag{31}$$

These criteria are based on discrete quantities and are unaffected by computational (machine) precision effects. However, in practice the equalities in (28) and (29) are understood within some tolerance (or machine precision); see the R code in Appendix A. We will denote the value of $\widehat{w}_i^{(s+1)}$ at the final iteration (for a given value of $\gamma$, that is) via $\widetilde{w}_i$.

Finally, $\gamma$ is determined via another iterative procedure as follows (we use superscript $a$ for the $\gamma$ iterations to distinguish it from the superscript $s$ for the $J^\pm$ iterations):

$$\gamma^{(a+1)} = \frac{\gamma^{(a)}}{\sum_{i=1}^N \left| \widetilde{w}_i^{(a)} \right|} \tag{32}$$

where $\widetilde{w}_i^{(a)}$ is computed as above for $\gamma = \gamma^{(a)}$. To achieve rapid convergence, the initial value $\gamma^{(0)}$ can be set as follows:

$$\gamma^{(0)} = \frac{1}{\sum_{i=1}^N z_i\, |\varepsilon_i|} \tag{33}$$

where $\varepsilon_i$ are the residuals of the weighted regression (without bounds) given by (3). The convergence criterion for the $\gamma$ iteration is given by

$$\gamma^{(a+1)} = \gamma^{(a)} \tag{34}$$

understood within some preset computational tolerance (or machine precision).

The R code for the above algorithm with some additional explanatory documentation is given in Appendix A. Note that this code is not written to be "fancy" or optimized for speed or in any other way. Instead, its sole purpose is to illustrate the bounded regression algorithm as it is described above in a simple-to-understand fashion. Some legalese relating to this code is given in Appendix B.
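To make the two nested iterations concrete, here is a compact sketch of the algorithm of this section in Python with NumPy. It is an independent illustration and not the R code of Appendix A: the function names, tolerances and iteration caps are assumptions, and the sketch further assumes strictly nonzero bounds ($w_i^- < 0 < w_i^+$), an invertible $\widetilde{Q}$ at every step, and bounds loose enough that the normalization (1) can be met.

```python
import numpy as np

def weights_for_gamma(alpha, Lambda, z, lower, upper, gamma, tol=1e-12, max_iter=1000):
    """Inner J+/J- iteration for a fixed gamma, following eqs. (14)-(31)."""
    a = gamma * alpha                                        # eq. (14)
    w_hat = np.zeros(len(alpha))                             # initial bounded weights
    at_up = np.zeros(len(alpha), dtype=bool)                 # J+
    at_lo = np.zeros(len(alpha), dtype=bool)                 # J-
    for _ in range(max_iter):
        free = ~(at_up | at_lo)                              # J-tilde, eq. (13)
        L = Lambda[free]
        Q = L.T @ (z[free][:, None] * L)                     # eq. (19)
        y = (L.T @ (z[free] * a[free])
             + Lambda[at_up].T @ upper[at_up]
             + Lambda[at_lo].T @ lower[at_lo])               # eq. (15)
        w_new = np.where(at_up, upper, np.where(at_lo, lower, 0.0))
        w_new[free] = z[free] * (a[free] - L @ np.linalg.solve(Q, y))   # eq. (16)
        q = w_new - w_hat                                    # eq. (22)
        p = np.where(q > 0, np.minimum(w_new, upper), np.maximum(w_new, lower))
        moving = q != 0
        t = np.min((p[moving] - w_hat[moving]) / q[moving]) if moving.any() else 1.0
        w_hat = w_hat + t * q                                # eq. (24)
        new_up = np.abs(w_hat - upper) <= tol                # eq. (28), within tolerance
        new_lo = np.abs(w_hat - lower) <= tol                # eq. (29), within tolerance
        w_hat = np.where(new_up, upper, np.where(new_lo, lower, w_hat))
        if np.array_equal(new_up, at_up) and np.array_equal(new_lo, at_lo):
            return w_hat                                     # eqs. (30)-(31)
        at_up, at_lo = new_up, new_lo
    raise RuntimeError("J+/J- iteration did not converge")

def bounded_regression(alpha, Lambda, z, lower, upper, tol=1e-10, max_iter=1000):
    """Outer gamma iteration, eqs. (32)-(34). Returns (weights, gamma)."""
    Q = Lambda.T @ (z[:, None] * Lambda)
    eps = alpha - Lambda @ np.linalg.solve(Q, Lambda.T @ (z * alpha))   # eq. (3)
    gamma = 1.0 / np.sum(z * np.abs(eps))                    # eq. (33)
    for _ in range(max_iter):
        w = weights_for_gamma(alpha, Lambda, z, lower, upper, gamma)
        total = np.sum(np.abs(w))
        if abs(total - 1.0) <= tol:                          # eq. (34) in the form sum |w| = 1
            return w, gamma
        gamma = gamma / total                                # eq. (32)
    raise RuntimeError("gamma iteration did not converge")

# Made-up example: intercept-only loadings, unit regression weights, bounds of +/- 0.3.
alpha = np.array([3.0, 1.0, -1.0, -3.0])
Lambda = np.ones((4, 1))
z = np.ones(4)
lower, upper = np.full(4, -0.3), np.full(4, 0.3)
w, gamma = bounded_regression(alpha, Lambda, z, lower, upper)
print(w, gamma)
```

In this toy case the unbounded weights would be proportional to (3, 1, -1, -3); the two outer alphas are capped at ±0.3 and the γ iteration rescales the two inner ones until the absolute weights sum to one, while the intercept column keeps the portfolio neutral.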
2.5 Application to Stock Portfolios

Above we discussed the bounded regression algorithm in the context of computing weights for portfolios of alpha streams. However, the algorithm is quite general and, with appropriate notational identifications, can be applied to portfolios of stocks or other suitable instruments. In fact, it can also be applied outside of finance. Here, for the sake of definiteness, we will focus on stock portfolios; in fact, we will assume that they are dollar neutral, so both long and short positions are allowed (footnote 13).

Footnote 13: Various generalizations are possible, some more straightforward than others.

2.5.1 Establishing Trades

Let us first discuss establishing trades, i.e., we start from nil positions and establish a portfolio of $N$ stocks. Instead of alpha streams, our index $i \in \{1,\dots,N\} \equiv J$ now labels the stocks. We will denote the desired dollar (not share) holdings via $H_i$, and the total dollar investment (long plus short) via $I$:

$$I \equiv \sum_{i=1}^N |H_i| \tag{35}$$

Let $w_i \equiv H_i/I$. These are now our stock weights (analogous to the alpha weights). Then we have the familiar normalization condition

$$\sum_{i=1}^N |w_i| = 1 \tag{36}$$

However, normally, one imposes bounds on $H_i$, not on $w_i$. For example, in the case of establishing trades one may wish to cap the positions such that: i) not more than a small percentile $\xi$ of the total dollar investment $I$ is allocated to any given stock (this is a diversification constraint); and ii) only a small percentile $\widetilde{\xi}$ of ADDV (average daily dollar volume) $V_i$ is traded (this is a liquidity constraint; see below). In this case we have the following bounds on the dollar holdings $H_i$:

$$H_i^- \leq H_i \leq H_i^+ \tag{37}$$
$$H_i^\pm = \pm \min\left( \xi\, I,\ \widetilde{\xi}\, V_i \right) \tag{38}$$

In this case the upper and lower bounds are symmetrical. In some cases, such as for hard-to-borrow stocks, we may have some $H_i^- = 0$. In other cases one may not wish to have a long position in some stocks. Etc.
We will only assume that $H_i^- \leq 0$ and $H_i^+ \geq 0$, in line with our discussion above for the bounds on the weights, which are then given by

$$w_i^\pm \equiv H_i^\pm / I \tag{39}$$

The final touch then is that instead of $\alpha_i$ one uses some expected returns $E_i$ in the case of stocks. The rest goes through exactly as above for a suitably chosen $\Lambda_{iA}$.

2.5.2 Rebalancing Trades

With rebalancing trades, we have the current dollar holdings $H_i^*$ and the desired dollar holdings $H_i$. In this case, one may wish to cap the positions such that: i) not more than a small percentile $\xi$ of the total dollar investment $I$ is allocated to any given stock (this is the same diversification constraint as above); ii) only a small percentile $\widetilde{\xi}$ of ADDV $V_i$ is traded (this is the same liquidity constraint as above); and iii) not more than a small percentile $\xi'$ of ADDV $V_i$ is allocated to any given stock. The latter is another liquidity constraint stemming from the consideration that, if the portfolio must be liquidated swiftly (e.g., due to an unforeseen event), to mitigate liquidation costs, the positions are capped based on liquidity. Here $\xi'$ typically can be several times larger than $\widetilde{\xi}$: the portfolio can be built up in stages as long as at each stage the bounds are satisfied. The bounds on $H_i$ now read:

$$|H_i| \leq \min\left( \xi\, I,\ \xi'\, V_i \right) \tag{40}$$
$$|H_i - H_i^*| \leq \widetilde{\xi}\, V_i \tag{41}$$

It is more convenient to rewrite these bounds in terms of the traded dollar amounts $D_i \equiv H_i - H_i^*$:

$$D_i^- \leq D_i \leq D_i^+ \tag{42}$$
$$D_i^+ = \min\left( \min\left( \xi\, I,\ \xi'\, V_i \right) - H_i^*,\ \widetilde{\xi}\, V_i \right) \geq 0 \tag{43}$$
$$D_i^- = \max\left( -\min\left( \xi\, I,\ \xi'\, V_i \right) - H_i^*,\ -\widetilde{\xi}\, V_i \right) \leq 0 \tag{44}$$

and we are assuming that $|H_i^*| \leq \min(\xi\, I,\ \xi'\, V_i)$. Furthermore, we will assume that $H_i^*$ itself satisfies (5):

$$\forall A \in \{1,\dots,K\}: \quad \sum_{i=1}^N H_i^*\, \Lambda_{iA} = 0 \tag{45}$$

Then the bounded regression algorithm can be straightforwardly applied to the weights $w_i$ and $x_i$ defined as follows:

$$w_i \equiv H_i / I \tag{46}$$
$$x_i \equiv D_i / I \tag{47}$$

In the $J^\pm$ iteration we now use $x_i$ instead of $w_i$, while in the $\gamma$ iteration we still use $w_i$.
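The trade bounds (43)-(44) are element-wise operations on the current holdings and ADDV, which the following sketch spells out in Python with NumPy. The function name `rebalancing_bounds`, the argument names and all dollar figures are hypothetical and chosen only to illustrate the formulas.

```python
import numpy as np

def rebalancing_bounds(H_star, V, I, xi, xi_prime, xi_tilde):
    """Lower and upper bounds on the traded dollar amounts D_i = H_i - H*_i."""
    cap = np.minimum(xi * I, xi_prime * V)            # position cap, eq. (40)
    trade_cap = xi_tilde * V                          # trading cap, eq. (41)
    D_plus = np.minimum(cap - H_star, trade_cap)      # eq. (43)
    D_minus = np.maximum(-cap - H_star, -trade_cap)   # eq. (44)
    return D_minus, D_plus

# Made-up inputs: three stocks, $1M total investment.
I = 1_000_000.0
V = np.array([1_000_000.0, 500_000.0, 200_000.0])     # ADDV per stock
H_star = np.array([15_000.0, -20_000.0, 0.0])         # current holdings
D_minus, D_plus = rebalancing_bounds(H_star, V, I, xi=0.02, xi_prime=0.05, xi_tilde=0.01)

x_minus, x_plus = D_minus / I, D_plus / I             # bounds on x_i, eq. (47)
print(D_minus, D_plus)
```

The second stock is already at its short position cap, so its lower trade bound comes out as zero: it can only be bought back, not shorted further.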
Then the rest of the algorithm goes through unchanged. Let us note, however, that the source code given in Appendix A is written with alpha weights in mind, so while it can be adapted to the case of stock portfolios in the case of establishing trades, straightforward modifications are required to accommodate rebalancing trades.

2.5.3 Examples: Intraday Mean-Reversion Alphas

To illustrate the use of the algorithm, we have employed it to construct portfolios for intraday mean-reversion alphas with the loadings matrix $\Lambda_{iA}$ in the following 5 incarnations: i) intercept only (so $K = 1$); ii) BICS (Bloomberg Industry Classification System) sectors; iii) BICS industries; iv) BICS sub-industries; and v) the 4 style factors prc, mom, hlv and vol of [60] plus BICS sub-industries. The regression weights are the inverse sample variances: $z_i = 1/C_{ii}$ (see below). In the cases ii)-v) above the intercept is subsumed in the loadings matrix $\Lambda_{iA}$. Indeed, we have $\sum_{A \in G} \Lambda_{iA} \equiv 1$, where $G$ is the set of columns of $\Lambda_{iA}$ corresponding to sectors in the case ii), industries in the case iii), and sub-industries in the cases iv) and v). Consequently, the resultant portfolios are automatically dollar neutral.

The portfolio construction and backtesting are identical to those in [61], where more detailed discussion can be found, so to save space, here we will only give a brief summary. The portfolios are assumed to be established at the open and liquidated at the close on the same day, so they are purely intraday and the algorithm of Section 2.5.1 for the establishing trades applies. The expected returns $E_i$ for each date are taken to be $E_i = -R_i$, where $R_i \equiv \ln\left( P_i^{\text{open}} / P_i^{\text{close}} \right)$, and for each date $P_i^{\text{open}}$ is today's open, while $P_i^{\text{close}}$ is yesterday's close adjusted for splits and dividends if the ex-date is today. So, these are intraday mean-reversion alphas.

The universe is top 2000 by ADDV $V_i$, where ADDV is computed based on 21-trading-day rolling periods.
However, the universe is not rebalanced daily, but every 21 trading days (see [61] for details). The sample variances $C_{ii}$ are computed based on the same 21-trading-day rolling periods, and are not updated daily, but every 21 trading days, same as the universe rebalancing (see [61] for details).

We run our simulations over a period of 5 years (more precisely, $252 \times 5$ trading days going back from 9/5/2014, inclusive). The annualized return-on-capital (ROC) is computed as average daily P&L divided by the total (long plus short) intraday investment level $I$ (with no leverage) and multiplied by 252. The annualized Sharpe Ratio (SR) is computed as the daily Sharpe ratio multiplied by $\sqrt{252}$. Cents-per-share (CPS) is computed as the total P&L divided by the total shares traded. On each day the total (establishing plus liquidating) shares traded for each stock are given by $Q_i = 2\, |H_i| / P_i^{\text{open}}$ (see [61] for details).

For comparison purposes, the results for regressions without bounds are given in Table 1. The results for the bounded regressions, with the bounds on the desired holdings set as

$$|H_i| \leq 0.01\, V_i \tag{48}$$
