ABSTRACT

Title of dissertation: STOCHASTIC SIMULATION: NEW STOCHASTIC APPROXIMATION METHODS AND SENSITIVITY ANALYSES

Marie Chau, Doctor of Philosophy, 2015

Dissertation directed by: Prof. Michael C. Fu, R.H. Smith School of Business & Institute for Systems Research

In this dissertation, we propose two new types of stochastic approximation (SA) methods and study the sensitivity of SA and of a stochastic gradient method to various input parameters. First, we summarize the most common stochastic gradient estimation techniques, both direct and indirect, as well as the two classical SA algorithms, Robbins-Monro (RM) and Kiefer-Wolfowitz (KW), followed by some well-known modifications to the step size, output, gradient, and projection operator.

Second, we introduce two new stochastic gradient methods in SA for univariate and multivariate stochastic optimization problems. Under a setting where both direct and indirect gradients are available, our new SA algorithms estimate the gradient using a hybrid estimator, which is a convex combination of a symmetric finite difference-type gradient estimate and an average of two associated direct gradient estimates (sketched after this abstract). We derive variance-minimizing weights that lead to desirable theoretical properties and prove convergence of the SA algorithms.

Next, we study the finite-time performance of the KW algorithm and its sensitivity to the step size parameter, along with two of its adaptive variants, namely Kesten's rule and scaled-and-shifted KW (SSKW). We conduct a sensitivity analysis of KW and explore the tightness of a mean-squared error (MSE) bound for quadratic functions, a relevant issue for determining how long to run an SA algorithm. Then, we propose two new adaptive step size sequences inspired by both Kesten's rule and SSKW, which address some of their weaknesses. Instead of using one step size sequence, our adaptive step size is based on two deterministic sequences, and the step size used in the current iteration depends on the perceived proximity of the current iterate to the optimum (see the second sketch below). In addition, we introduce a method to adaptively adjust the two deterministic sequences.

Lastly, we investigate the performance of a modified pathwise gradient estimation method that is applied to financial options with discontinuous payoffs and, in particular, used to estimate the Greeks, which measure the rate of change of (financial) derivative prices with respect to underlying market parameters and are central to financial risk management. The newly proposed kernel estimator relies on a smoothing bandwidth parameter (see the final sketch below). We explore the accuracy of the Greeks with varying bandwidths and investigate the sensitivity of a proposed iterative scheme that generates an estimate of the optimal bandwidth.
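To make the hybrid estimator concrete, the convex combination described above can be written as follows. The notation here is illustrative rather than the dissertation's own (whose definitions appear in Chapter 3): Y(·) denotes a noisy function observation, D(·) a direct (e.g., IPA or LR/SF) gradient observation, c_n > 0 the finite-difference perturbation, and α_n the convex weight.

```latex
\[
  \widehat{g}_n(x_n)
    \;=\; \alpha_n \,
      \underbrace{\frac{Y(x_n + c_n) - Y(x_n - c_n)}{2 c_n}}_{\text{symmetric finite difference}}
    \;+\; (1 - \alpha_n)\,
      \underbrace{\frac{D(x_n + c_n) + D(x_n - c_n)}{2}}_{\text{averaged direct gradients}},
  \qquad \alpha_n \in [0,1].
\]
```

Under the simplifying assumption that the two component estimates are uncorrelated with variances $\sigma^2_{FD,n}$ and $\sigma^2_{D,n}$, minimizing the variance $\alpha_n^2 \sigma^2_{FD,n} + (1-\alpha_n)^2 \sigma^2_{D,n}$ over $\alpha_n$ gives $\alpha_n^* = \sigma^2_{D,n} / (\sigma^2_{FD,n} + \sigma^2_{D,n})$, i.e., each component is weighted inversely to its noise.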
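The two-sequence step size idea can be caricatured in a few lines of code. The following is a minimal sketch, not the dissertation's PROX-step specification: it assumes, purely for illustration, that a sign change between successive gradient estimates (as in Kesten's rule) signals that the iterate is near the optimum, in which case the more conservative of two deterministic sequences is used. The function name, constants, and switching rule are all hypothetical.

```python
import numpy as np

def two_sequence_step(n, grad, prev_grad, a_far=1.0, a_near=0.1):
    """Choose between two deterministic step size sequences, a_far/n and
    a_near/n. A sign change in successive gradient estimates is taken as
    a (noisy) signal that the iterate is near the optimum, where the
    smaller sequence is used. Illustrative rule only."""
    near = prev_grad is not None and grad * prev_grad < 0
    return (a_near if near else a_far) / n

# Toy SA run on f(x) = (x - 2)^2 with noisy gradient observations.
rng = np.random.default_rng(0)
x, prev_g = 10.0, None
for n in range(1, 501):
    g = 2.0 * (x - 2.0) + rng.normal(scale=0.5)   # noisy gradient of f
    x -= two_sequence_step(n, g, prev_g) * g      # Robbins-Monro update
    prev_g = g
print(f"final iterate: {x:.3f} (optimum at x = 2)")
```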
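Finally, the kernel idea behind the Greek estimators of Chapter 5 can be illustrated on the simplest problematic case: the delta of a European digital call, whose payoff 1{S_T > K} is discontinuous, so the plain pathwise derivative is zero almost everywhere. The sketch below assumes Black-Scholes dynamics and a Gaussian kernel, and replaces the Dirac delta arising from differentiating the indicator with a kernel of bandwidth h; all parameter values are illustrative.

```python
import numpy as np

def digital_call_delta(S0, K, r, sigma, T, h, n_paths=100_000, seed=1):
    """Kernel-smoothed pathwise estimator of a digital call's delta.

    Differentiating the discounted payoff e^{-rT} 1{S_T > K} pathwise
    yields a Dirac delta at S_T = K; here it is replaced by a Gaussian
    kernel with bandwidth h, the smoothing parameter whose choice the
    dissertation studies. Under geometric Brownian motion,
    dS_T/dS_0 = S_T/S_0.
    """
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal(n_paths)
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
    kernel = np.exp(-0.5 * ((ST - K) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    return np.exp(-r * T) * np.mean(kernel * ST / S0)

# Illustrative parameters; the Black-Scholes reference value is
# e^{-rT} * N'(d2) / (S0 * sigma * sqrt(T)).
print(digital_call_delta(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0, h=1.0))
```

A small h reduces the smoothing bias but inflates the variance, and vice versa; this is precisely the trade-off that the bandwidth-selection scheme examined in the dissertation must balance.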
STOCHASTIC SIMULATION: NEW STOCHASTIC APPROXIMATION METHODS AND SENSITIVITY ANALYSES

by

Marie Chau

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy
2015

Advisory Committee:
Prof. Michael C. Fu, Chair/Advisor
Prof. Steven I. Marcus
Prof. Kasso A. Okoudjou
Prof. Ilya O. Ryzhov
Prof. Paul J. Smith

© Copyright by Marie Chau 2015

To my loving and supportive parents.

Acknowledgments

First, I would like to express my sincerest gratitude and deepest appreciation to my advisor, Prof. Michael C. Fu, for the tremendous amount of support, guidance, feedback, and faith throughout this entire journey. Without him, this dissertation would not be possible. Prof. Fu is not only a highly-respected scholar and an excellent teacher who genuinely cares about his students, but also an amazing mentor, always with the best intentions. I admire his breadth and depth of knowledge, patience, humility, authenticity, honesty, compassion, optimism, and innate good-hearted nature. For this experience and the opportunity to work with him, I will be forever grateful.

Next, I would like to thank the other committee members - Professors Steve I. Marcus, Kasso A. Okoudjou, Ilya O. Ryzhov, and Paul J. Smith - for taking the time to read this dissertation and attend my defense. Special thanks to Professors Patrick M. Fitzpatrick, Eric V. Slud, and David H. Hamilton for their encouragement and willingness to help in their classes as well as in my job search.

Last, but certainly not least, I would like to thank the people I've met at UMD who have since become some of my nearest and dearest friends. A very special thanks to Patrick Sodré Carlos, whom I met randomly in differential equations during summer school in 2007 and who is now one of my closest friends and biggest supporters. He's always willing to help not only myself but others as well, and is one of the main reasons why I'm still here. Another special thanks to Karamatou Yacoubou Djima for her loyal friendship from day one of boot camp. Graduate school would not have been the same without her, from our countless food outings, endless hours of studying for quals, interesting conversations, fun times, and many laughs. Also a very special thanks to Huashuai Qu, Xuan Liu, Jong Jun Lee, Zhixin Lu, Ran Ji, and Anusha Dandapani for being amazing! Each of you has made a dent in my life in your own special way. Thanks to the rest of my 1305 officemates. Many thanks to Rhyneta Gumbs, who has a very kind heart, always looking out for the students. Thanks to the other girls' night math ladies, Jennifer Clarkson, Clare Wickman, and Hana Ueda, for your emotional support and regular dinner outings. Also, thanks to the rest of the boot camp class of 2008 for unforgettable memories. Thanks to Hisham Talukder for the many laughs, Temba and Joe for our fun outings, Lucia Simonelli for our salsa nights, Yimei Fan for fun times, Dana Botesteanu for good conversations over dinner/drinks, and Changhui Tan and Wenqing Hu for their patience and invaluable math discussions. Also, thanks to Alverda McCoy for always being so pleasant and flexible, saving me from many administrative disasters.

Most of all, I thank my parents. Words can't begin to explain how grateful I am for all they've done throughout my entire life. Their morals, values, determination, patience, and supportive nature have shaped me into the person I am today.

Table of Contents

List of Figures viii
List of Notations x

1 Introduction 1
  1.1 Motivating Examples 5
    1.1.1 Stochastic Approximation 7
    1.1.2 Sensitivity Analysis 8
  1.2 Contributions 8
  1.3 Outline 10

2 Background 11
  2.1 Stochastic Gradient Estimation Methods 11
    2.1.1 Indirect Gradients 12
      2.1.1.1 Finite Differences 12
      2.1.1.2 Simultaneous Perturbation 13
      2.1.1.3 Random Directions 14
    2.1.2 Direct Gradients 14
      2.1.2.1 Infinitesimal Perturbation Analysis 16
      2.1.2.2 Likelihood Ratio/Score Function 17
      2.1.2.3 Weak Derivatives 18
      2.1.2.4 General Extension 19
  2.2 Stochastic Approximation 20
    2.2.1 Classical Methods 20
      2.2.1.1 Robbins-Monro 20
      2.2.1.2 Kiefer-Wolfowitz 22
    2.2.2 Robust Gradient 25
      2.2.2.1 Simultaneous Perturbation Stochastic Approximation 25
      2.2.2.2 Gradient Averaging 27
    2.2.3 Adaptive Step Sizes 29
      2.2.3.1 Kesten's Rule 29
      2.2.3.2 Scaled-and-Shifted Kiefer-Wolfowitz 30
    2.2.4 Robust Output 35
      2.2.4.1 Averaging Iterates 35
      2.2.4.2 Robust Stochastic Approximation 37
      2.2.4.3 Accelerated Stochastic Approximation 38
      2.2.4.4 Numerical Comparison 43
    2.2.5 Varying Bounds 45

3 New Hybrid Stochastic Approximation Methods 49
  3.1 Motivation 49
  3.2 Secant-Tangents AveRaged Stochastic Approximation 50
    3.2.1 Optimal Convex Weight 52
      3.2.1.1 Homogeneous Noise 52
      3.2.1.2 Non-homogeneous Noise 54
    3.2.2 Convergence 57
    3.2.3 Numerical Experiments 64
      3.2.3.1 Experiment 1: vary initial value 65
      3.2.3.2 Experiment 2: vary steepness level 67
      3.2.3.3 Results Summary 69
  3.3 STAR-SPSA 70
    3.3.1 Optimal Deterministic Weights 71
    3.3.2 Convergence 74
    3.3.3 Numerical Experiments 78
      3.3.3.1 9-station Closed Jackson Network 78
  3.4 Summary and Future Work 85

4 Step Size Selection in Stochastic Approximation 87
  4.1 Sensitivity of Finite-time Performance to Step Size 87
    4.1.1 KW and its Variants 88
  4.2 Finite-time MSE Bound 90
  4.3 Numerical Experiments 92
    4.3.1 Tightness of the Finite-time MSE Bound for Quadratics 92
    4.3.2 Sensitivity of KW and its Variants 95
  4.4 PROX-step 101
  4.5 Adaptive PROX-step 106
  4.6 Numerical Experiments 109
    4.6.1 Deterministic Problem with Added Noise 110
    4.6.2 9-station Closed Jackson Queueing Network 116
  4.7 Summary and Future Work 117

5 Greek Kernel Estimators 120
  5.1 Introduction 120
  5.2 Problem Setting 123
  5.3 Generalized Pathwise Method 125
    5.3.1 First-Order Greeks 125
    5.3.2 Second-Order Greeks 126
    5.3.3 Kernel 128
    5.3.4 First- and Second-Order Greek Estimators 129
  5.4 Pilot Simulation 129
  5.5 Numerical Experiments 131
    5.5.1 Sensitivity to Bandwidth 131
    5.5.2 Sensitivity of Bandwidth Generator to Input Parameters 136
  5.6 Summary and Future Work 141

Bibliography 142