Statistical Arbitrage Using Limit Order Book Imbalance Anton D. Rubisov University of Toronto Institute for Aerospace Studies Faculty of Applied Science and Engineering University of Toronto A thesis submitted in conformity with the requirements for the degree of Master of Applied Science c 2015AntonD.Rubisov (cid:13) Statistical Arbitrage Using Limit Order Book Imbalance Anton D. Rubisov UniversityofTorontoInstituteforAerospaceStudies FacultyofAppliedScienceandEngineering UniversityofToronto 2015 Abstract This dissertation demonstrates that there is high revenue potential in us- ing limit order book imbalance as a state variable in an algorithmic trading strategy. Beginning with the hypothesis that imbalance of bid/ask order volumes is an indicator for future price changes, exploratory data analysis suggests that modelling the joint distribution of imbalance and observed price changes as a continuous-time Markov chain presents a monetizable opportunity. The arbitrage problem is then formalized mathematically as a stochastic optimal control problem using limit orders and market orders withtheaimofmaximizingterminalwealth. Theproblemissolvedinboth continuous and discrete time using the dynamic programming principle, whichproducesbothconditionsformarketorderexecution,aswellaslimit order posting depths, as functions of time, inventory, and imbalance. The optimalcontrolsarecalibratedandbacktestedonhistoricalNASDAQITCH data,whichproducesconsistentandsubstantialrevenue. ii Acknowledgements I couldn’t begin to enumerate the times and ways this dissertation could have failed to be. I am deeply grateful to several individuals for making it a reality, and I’d like to acknowledgethemhere. Dr. Gabriele D’Eleuterio, my supervisor at UTIAS, is the epitomical supernatural mentoroftheherojourney,withoutwhomIwouldhaveabandonedmystudies. Gabe was the inspirational upholder of academic values, who relentlessly stressed the sub- tleties of scientific presentation and communication, and guided me through a pretty toughtime. Thankyouforyoursupport,guidance,andforeverycoffee. Dr. SebastianJaimungal,professorofstatisticsattheuniversity,tookmeonasasurro- gate student half-way through my studies, and guided every step of this dissertation. Despitehavingabsolutelynoobligationtodoso,hehadfaithinmeandpatiencewith me in navigating an entirely unfamiliar subject. Thank you for repeatedly making yourselfavailabletohelpwithallthemath. Dr. Jack Carlyle, then a PhD student in solar physics at UCL, inspired me to quit my job and go back to school for a graduate degree. It wasn’t an easy decision, and the juryisoutonwhetheritwasagoodone. Butthanksforalltheteaandcrumpets. Ted Herman, a friend that redefines friendship, has read more of my research than anyoneeverwill. Atacrucialtime,westrucktheFran’sAccordthatinvolvedprewrit- ten$100chequesbeingshreddedonaweeklybasisuponreceiptofaprogressupdate. Legitimately,thisisprobablythenumberonereasonthatmydissertationgotdone. Thank you mom for always loving, supporting, and inspiring me to my best. Thank youdadforprettywellbeingasupervisortoo. iii Contents 1 Introduction 1 1.1 TheLimit-OrderBook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 ITCHDataSet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3 OrderImbalance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.4 Roadmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 ExploratoryDataAnalysis 8 2.1 ModellingImbalance: ContinuousTimeMarkovChain . . . . . . . . . . 8 2.2 MaximumLikelihoodEstimateofaMarkov-ModulatedPoissonProcess 10 2.2.1 InfinitesimalGeneratorMatrix . . . . . . . . . . . . . . . . . . . . 10 2.2.2 ArrivalRates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3 Two-DimensionalCTMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.3.1 Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.4 PredictingFuturePriceChanges . . . . . . . . . . . . . . . . . . . . . . . 14 2.5 NaiveTradingStrategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.6 CalibrationandBacktesting . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.7 ConclusionsfromtheNaiveTradingStrategies . . . . . . . . . . . . . . . 20 iv 3 MaximizingWealthviaContinuous-TimeStochasticControl 24 3.1 SystemDescription . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.2 DynamicProgramming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.3 MaximizingTerminalWealth . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.4 InterpretingtheDPE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4 MaximizingWealthviaDiscrete-TimeStochasticControl 37 4.1 SystemDescription . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 4.2 DynamicProgramming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 4.3 MaximizingTerminalWealth . . . . . . . . . . . . . . . . . . . . . . . . . 41 4.4 SimplifyingandInterpretingtheDPE . . . . . . . . . . . . . . . . . . . . 49 5 Results 52 5.1 Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 5.2 DynamicsoftheOptimalPostingDepths . . . . . . . . . . . . . . . . . . 53 5.2.1 ComparingOptimalControlPerformance . . . . . . . . . . . . . 62 5.3 In-SampleBacktesting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.3.1 Same-DayCalibration . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.3.2 WeekOffsetCalibration . . . . . . . . . . . . . . . . . . . . . . . . 72 5.3.3 AnnualCalibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 5.4 Out-of-SampleBacktesting . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 6 Conclusion 79 v List of Figures 1.1 Structureandmechanicsofthelimit-orderbook . . . . . . . . . . . . . . 4 2.1 Hypotheticaltimelineofmarketordersandimbalanceregimeswitches . 10 2.2 Time intervals for time-weighted averaging of imbalance and for price change. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.3 Comparisonofthenaivetradingstrategies . . . . . . . . . . . . . . . . . 21 5.1 OptimalbuyLOdepthsforsellpressureimbalance . . . . . . . . . . . . 55 5.2 OptimalbuyLOdepthsforneutralimbalance . . . . . . . . . . . . . . . 56 5.3 OptimalbuyLOdepthsforbuypressureimbalance . . . . . . . . . . . . 57 5.4 OptimalsellLOdepthsforsellpressureimbalance . . . . . . . . . . . . . 59 5.5 OptimalsellLOdepthsforneutralimbalance . . . . . . . . . . . . . . . . 60 5.6 OptimalsellLOdepthsforbuypressureimbalance . . . . . . . . . . . . 61 5.7 Comparisonofthefourstochasticoptimalcontrolmethods . . . . . . . . 63 5.8 Cointegrationrelationofthefourstochasticcontrolmethods . . . . . . . 64 5.9 Detailedsamplepathsofthestochasticoptimalcontrolstrategies . . . . 66 5.10 Comparisonofoptimalpostingdepthsonthesamplepath . . . . . . . . 67 5.11 ComparisonofP&Landinventoryonthesamplepath . . . . . . . . . . 68 vi 5.12 In-samplebacktestingperformanceusingsame-daycalibration . . . . . 70 5.13 In-samplebacktestingperformanceusingweek-offsetcalibration . . . . 73 5.14 In-samplebacktestingperformanceusingannualcalibration . . . . . . . 76 5.15 Out-of-samplebacktestingperformanceusingannualcalibration . . . . 78 vii List of Tables 2.1 Stocksusedintheexploratorydataanalysis . . . . . . . . . . . . . . . . . 8 2.2 One-dimensionalencodingoftwo-dimensionalCTMC . . . . . . . . . . 13 2.3 p-valuesfortestingthetimehomogeneityhypothesis . . . . . . . . . . . 15 2.4 Probabilitiesoffuturepricechanges . . . . . . . . . . . . . . . . . . . . . 18 2.5 Hypotheticaltimelineofadverseselectionwithmarketorders . . . . . . 23 5.1 Stocksusedinthestochasticoptimalcontrolbacktesting . . . . . . . . . 52 5.2 Fixedparametersforbacktesting . . . . . . . . . . . . . . . . . . . . . . . 53 5.3 Parameterformulaeforbacktesting . . . . . . . . . . . . . . . . . . . . . . 53 5.4 Correlationmatrixofstochasticoptimalcontrolreturns . . . . . . . . . . 62 5.5 Numberoftradescomparisonofthefourstochasticcontrolmethods . . 64 5.6 In-samplebacktestingperformanceusingsame-daycalibration . . . . . 71 5.7 In-samplebacktestingperformanceusingweek-offsetcalibration . . . . 74 5.8 In-samplebacktestingperformanceusingannualcalibration . . . . . . . 76 5.9 Out-of-samplebacktestingperformanceusingannualcalibration . . . . 78 viii Nomenclature # Numberofstatesintowhichthesmoothedimbalanceisdiscretized bins Filtrationofaprobabilityspace F Infinitesimalgeneratoroperator L ∆S(t) Signoftheobservedpricechangeattime t ∆ t Sizeofrollingwindowforimbalancesmoothing I ∆ t Sizeofwindowforpricechangecalculation S δ Thedepthatwhichwepostabuy(δ+)andsell(δ )limitorder ± − η Magnitudeofthemidpricechange(arandomvariable) 0,z 1 Indicatorfunction κ Limitorderfillprobabilityconstant λ Arrivalratesofbuy(λ+)andsell(λ )orders ± − G Continuous-timeMarkovchaingeneratormatrix P Continuous-timeMarkovchaintransitionprobabilitymatrix µ Rateofarrivalofotheragents’buy(µ+)andsell(µ )marketorders ± − ∂ Partialderivativewithrespecttox x ρ(t) Imbalanceratio,smoothedbyaveragingoverarollingwindow τ Astoppingtimeatwhichweexecuteamarketorder ξ Bid-AskHalf-Spread ix H(t,x,s,z,q) Continuous-timevaluefunction Hτ,δ±(t,x,s,z,q) Continuous-timeperformancecriterionofthecontrolsτ and δ ± I(t) Imbalanceratio,raw K Otheragents’buy(K+)andsell(K )marketorders ± − L A counting process of how many of our buy (L+) and sell (L ) limit orders are ± − filled M Executionofourbuy(M+)andsell(M )marketorders ± − Qτ,δ Ourinventoryattime t havingusedcontrolsτ and δ t S(t) Midprice V (x,s,z,q) Discrete-timevaluefunction k Vδ±(x,s,z,q) Discrete-timeperformancecriterionofthecontrolsτ and δ k ± τ,δ W Ourwealth(cash,valueofassets,andliquidationpenalty)attimethavingused t controlsτ and δ Xτ,δ Ourcashattime t havingusedcontrolsτ and δ t Z(t) Continuous-timeMarkovchainvariable AAPL StocktickerforAppleInc. FARO StocktickerforFAROTechnologiesInc. INTC StocktickerforIntelCorporation MMM Stocktickerfor3MCompany NTAP StocktickerforNetApp,Inc. ORCL StocktickerforOracleCorporation CTMC Continuous-timeMarkovchain DPE Dynamicprogrammingequation ITCH MarketdatafeedprovidedbyNASDAQ x
Description: