Stochastic Modelling of Big Data in Finance

Stochastic Modelling of Big Data in Finance provides a rigorous overview and exploration of stochastic modelling of big data in finance (BDF). The book describes various stochastic models, including multivariate models, to deal with big data in finance. This includes data in high-frequency and algorithmic trading, specifically in limit order books (LOB). The book shows how those models can be applied to different datasets to describe the dynamics of LOB, and to determine which model is the best with respect to a specific data set. The results of the book may also be used to solve acquisition, liquidation and market making problems, and other optimization problems in finance.

Features

• Self-contained book suitable for graduate students and post-doctoral fellows in financial mathematics and data science, as well as for practitioners working in the financial industry who deal with big data
• All results are presented visually to aid in understanding of concepts

Dr. Anatoliy Swishchuk is a Professor in Mathematical Finance at the Department of Mathematics and Statistics, University of Calgary, Calgary, AB, Canada. He received his B.Sc. and M.Sc. degrees from Kyiv State University, Kyiv, Ukraine. He earned two doctorate degrees in Mathematics and Physics (PhD and DSc) from the prestigious National Academy of Sciences of Ukraine (NASU), Kiev, Ukraine, and is a recipient of the NASU award for young scientists, with a gold medal, for a series of research publications in random evolutions and their applications. Dr. Swishchuk is chair and organizer of the finance and energy finance seminar ‘Lunch at the Lab’ at the Department of Mathematics and Statistics. Dr. Swishchuk is Director of the Mathematical and Computational Finance Laboratory at the University of Calgary.
He was a steering committee member of the Professional Risk Managers International Association (PRMIA), Canada (2006-2015), and is a steering committee member of the Global Association of Risk Professionals (GARP), Canada (since 2015). Dr. Swishchuk is a creator of the mathematical finance program at the Department of Mathematics & Statistics. He is also a proponent of a new specialization “Financial and Energy Markets Data Modelling” in the Data Science and Analytics program. His research areas include financial mathematics, random evolutions and their applications, biomathematics, and stochastic calculus, and he serves on the editorial boards of four research journals. He is the author of more than 200 publications, including 15 books and more than 150 articles in peer-reviewed journals. In 2018 he received a Peak Scholar award.

Chapman & Hall/CRC Financial Mathematics Series

Aims and scope:
The field of financial mathematics forms an ever-expanding slice of the financial sector. This series aims to capture new developments and summarize what is known over the whole spectrum of this field. It will include a broad range of textbooks, reference works and handbooks that are meant to appeal to both academics and practitioners. The inclusion of numerical code and concrete real-world examples is highly encouraged.

Series Editors

M.A.H. Dempster
Centre for Financial Research
Department of Pure Mathematics and Statistics
University of Cambridge, UK

Dilip B. Madan
Robert H. Smith School of Business
University of Maryland, USA

Rama Cont
Department of Mathematics
Imperial College, UK

Robert A. Jarrow
Lynch Professor of Investment Management
Johnson Graduate School of Management
Cornell University, USA

Machine Learning for Factor Investing: R Version
Guillaume Coqueret, Tony Guida

Malliavin Calculus in Finance: Theory and Practice
Elisa Alos, David Garcia Lorite

Risk Measures and Insurance Solvency Benchmarks: Fixed-Probability Levels in Renewal Risk Models
Vsevolod K. Malinovskii

Financial Mathematics: A Comprehensive Treatment in Discrete Time, Second Edition
Giuseppe Campolieti, Roman N. Makarov

Pricing Models of Volatility Products and Exotic Variance Derivatives
Yue Kuen Kwok, Wendong Zheng

Quantitative Finance with Python: A Practical Guide to Investment Management, Trading, and Financial Engineering
Chris Kelliher

Stochastic Modelling of Big Data in Finance
Anatoliy Swishchuk

Introduction to Stochastic Finance with Market Examples, Second Edition
Nicolas Privault

Commodities: Fundamental Theory of Futures, Forwards, and Derivatives Pricing, Second Edition
M.A.H. Dempster, Ke Tang

For more information about this series please visit: https://www.crcpress.com/Chapman-and-HallCRC-Financial-Mathematics-Series/book-series/CHFINANCMTH

Stochastic Modelling of Big Data in Finance

Anatoliy Swishchuk
Department of Mathematics and Statistics, University of Calgary, Calgary, Canada

First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2023 Anatoliy Swishchuk

CRC Press is an imprint of Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermis- [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Names: Swishchuk, Anatoliy, author.
Title: Stochastic modelling of big data in finance / Anatoliy Swishchuk, Department of Mathematics and Statistics, University of Calgary, Calgary, Canada.
Description: 1 Edition. | Boca Raton, FL : Chapman & Hall, CRC Press, 2023. | Series: Chapman and Hall/CRC financial mathematics series | Includes bibliographical references and index.
Identifiers: LCCN 2022022045 (print) | LCCN 2022022046 (ebook) | ISBN 9781032209265 (hardback) | ISBN 9781032209289 (paperback) | ISBN 9781003265986 (ebook)
Subjects: LCSH: Finance--Mathematical models. | Stochastic models. | Big data.
Classification: LCC HG106 .S95 2023 (print) | LCC HG106 (ebook) | DDC 332.01/5195--dc23/eng/20220801
LC record available at https://lccn.loc.gov/2022022045
LC ebook record available at https://lccn.loc.gov/2022022046

ISBN: 978-1-032-20926-5 (hbk)
ISBN: 978-1-032-20928-9 (pbk)
ISBN: 978-1-003-26598-6 (ebk)

DOI: 10.1201/9781003265986

Typeset in CMR10 font
by KnowledgeWorks Global Ltd.

Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
To My Family and My Motherland, Ukraine

Contents

Foreword
Preface
Symbols
Acknowledgements

1 A Brief Introduction: Stochastic Modelling of Big Data in Finance
  1.1 Introduction
  1.2 Big Data in Finance: Limit Order Books
    1.2.1 Description of Limit Order Books Mechanism
    1.2.2 Big Data in Finance: Lobster Data
    1.2.3 More Big Data in Finance: Xetra and Frankfurt Markets (Deutsche Boerse Group), on September 23, 2013 and CISCO Data on November 3, 2014
  1.3 Stochastic Modelling of Big Data in Finance: Limit Order Books (LOB)
    1.3.1 Semi-Markov Modelling of LOB
    1.3.2 General Semi-Markov Modelling of LOB
    1.3.3 Modelling of LOB with a Compound Hawkes Processes
    1.3.4 Modelling of LOB with a General Compound Hawkes Processes
    1.3.5 Modelling of LOB with a Non-linear General Compound Hawkes Processes
    1.3.6 Modelling of LOB with a Multivariable General Compound Hawkes Processes
  1.4 Illustration and Justification of Our Method to Study Big Data in Finance
    1.4.1 Numerical Results: Lobster Data (Apple, Google and Microsoft Stocks)
    1.4.2 Numerical Results: Xetra and Frankfurt Markets stocks (Deutsche Boerse Group), on September 23, 2013
    1.4.3 Numerical Results: CISCO Data, November 3, 2014
  1.5 Methodological Aspects of Using the Models
  1.6 Conclusion
  Bibliography

I Semi-Markovian Modelling of Big Data in Finance

2 A Semi-Markovian Modelling of Big Data in Finance
  2.1 Introduction
  2.2 A Semi-Markovian Modelling of Limit Order Markets
    2.2.1 Markov Renewal and Semi-Markov Processes
    2.2.2 Semi-Markovian Modelling of Limit Order Books
  2.3 Main Probabilistic Results
    2.3.1 Duration until the next price change
    2.3.2 Probability of Price Increase
    2.3.3 The stock price seen as a functional of a Markov renewal process
  2.4 Diffusion Limit of the Price Process
    2.4.1 Balanced Order Flow case: Pa(1,1) = Pa(−1,−1) and Pb(1,1) = Pb(−1,−1)
    2.4.2 Other cases: either Pa(1,1) < Pa(−1,−1) or Pb(1,1) < Pb(−1,−1)
  2.5 Numerical Results
  2.6 More Big Data
    2.6.1 More Data
    2.6.2 Estimated Probabilities
    2.6.3 Assumption on Distributions f and f̃
    2.6.4 Diffusion Limit (Not-Fixed Spread)
    2.6.5 The Optimal Liquidation/Acquisition Problems
    2.6.6 Market Making
  2.7 Conclusion
  Bibliography

3 General Semi-Markovian Modelling of Big Data in Finance
  3.1 Introduction
    3.1.1 Motivation for Generalizing the Model
    3.1.2 Data
  3.2 Reviewing the Assumptions with Our New Data Sets
    3.2.1 Liquidity of Our Data
    3.2.2 Empirical Distributions of Initial Queue Sizes and Calculated Conditional Probabilities
    3.2.3 Inter-arrival Times of Book Events
    3.2.4 Asymptotic Analysis
  3.3 General Semi-Markov Model for the Limit Order Book with Two States
    3.3.1 Diffusion Limits
    3.3.2 Implementation
    3.3.3 Numerical Results
    3.3.4 Application of the Model
      3.3.4.1 Examination of the Data
      3.3.4.2 Model Implementation
      3.3.4.3 Results for Constructed Sample Day
  3.4 General Semi-Markov Model for the Limit Order Book with arbitrary number of states
    3.4.1 Justification
    3.4.2 Diffusion Limits
    3.4.3 Implementation
    3.4.4 Numerical Results
  3.5 Discussion on Price Spreads
  3.6 Conclusion
  Bibliography

II Modelling of Big Data in Finance with Hawkes Processes

4 A Brief Introduction to Hawkes Processes
  4.1 Introduction
  4.2 Definition of Hawkes Processes (HPs)
  4.3 Compound Hawkes Processes
    4.3.1 Special Cases of Compound Hawkes Processes in Limit Order Books
  4.4 Limit Theorems for Hawkes Processes: LLN and FCLT
    4.4.1 Law of Large Numbers (LLN) for Hawkes Processes
    4.4.2 Functional Central Limit Theorems (FCLT) for Hawkes Processes
  4.5 Limit Theorems for Poisson Processes: LLN and FCLT
    4.5.1 Law of Large Numbers (LLN) for Poisson Processes
    4.5.2 Functional Central Limit Theorems (FCLT) for Hawkes Processes
  4.6 Stylized Properties of Hawkes Process
    4.6.1 Non-exponential Inter-arrival Times
    4.6.2 Clustering Effect of Trades
    4.6.3 Non-independency of Mid-price Changes
  4.7 Conclusion
  Bibliography

5 Stochastic Modelling of Big Data in Finance with CHP
  5.1 Introduction
  5.2 Definitions of HP, CHP and RSCHP
    5.2.1 One-dimensional Hawkes Process
    5.2.2 Compound Hawkes Process (CHP)