
Randomized methods for computing low-rank approximations of matrices (PDF)

200 pages · 2012 · 4.38 MB · English
by Nathan P. Halko

Preview: Randomized methods for computing low-rank approximations of matrices

Randomized methods for computing low-rank approximations of matrices

by Nathan P. Halko
B.S., University of California, Santa Barbara, 2007
M.S., University of Colorado, Boulder, 2010

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Department of Applied Mathematics, 2012.

This thesis entitled "Randomized methods for computing low-rank approximations of matrices," written by Nathan P. Halko, has been approved for the Department of Applied Mathematics by:

Per-Gunnar Martinsson
Keith Julien
David M. Bortz
Francois G. Meyer

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline.

Halko, Nathan P. (Ph.D., Applied Mathematics)
Randomized methods for computing low-rank approximations of matrices
Thesis directed by Professor Per-Gunnar Martinsson

Randomized sampling techniques have recently proved capable of efficiently solving many standard problems in linear algebra, and of enabling computations at scales far larger than what was previously possible. The new algorithms are designed from the bottom up to perform well in modern computing environments where the expense of communication is the primary constraint. In extreme cases, the algorithms can even be made to work in a streaming environment where the matrix is not stored at all, and each element can be seen only once.

The dissertation describes a set of randomized techniques for rapidly constructing a low-rank approximation to a matrix. The algorithms are presented in a modular framework that first computes an approximation to the range of the matrix via randomized sampling. Secondly, the matrix is projected onto the approximate range, and a factorization (SVD, QR, LU, etc.) of the resulting low-rank matrix is computed via variations of classical deterministic methods. Theoretical performance bounds are provided.

Particular attention is given to very large scale computations where the matrix does not fit in RAM on a single workstation. Algorithms are developed for the case where the original matrix must be stored out-of-core but the factors of the approximation fit in RAM. Numerical examples are provided that perform Principal Component Analysis of a data set so large that less than one hundredth of it can fit in the RAM of a standard laptop computer.

Furthermore, the dissertation presents a parallelized randomized scheme for computing a reduced-rank Singular Value Decomposition. By parallelizing and distributing both the randomized sampling stage and the processing of the factors in the approximate factorization, the method requires an amount of memory per node which is independent of both dimensions of the input matrix. Numerical experiments are performed on Hadoop clusters of computers in Amazon's Elastic Compute Cloud with up to 64 total cores.

Finally, we directly compare the performance and accuracy of the randomized algorithm with the classical Lanczos method on extremely large, sparse matrices and substantiate the claim that randomized methods are superior in this environment.
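To make the two-stage framework concrete, here is a minimal NumPy sketch of the procedure the abstract describes: Stage A approximates the range of the input matrix by randomized sampling, and Stage B projects the matrix onto that range and factors the small projected matrix with a classical deterministic method. The function name, the Gaussian test matrix, and the oversampling parameter p are illustrative choices for this sketch, not the dissertation's exact formulation.

```python
import numpy as np

def randomized_svd(A, k, p=10):
    """Minimal sketch of a two-stage randomized SVD.

    k is the target rank; p is an oversampling parameter (an assumed
    default here) that improves how well the samples capture the range.
    """
    m, n = A.shape
    # Stage A: draw a Gaussian test matrix and form the sample matrix
    # Y = A @ Omega, whose columns are random combinations of columns of A.
    Omega = np.random.standard_normal((n, k + p))
    Y = A @ Omega
    # Orthonormalize the samples: the columns of Q span the approximate range of A.
    Q, _ = np.linalg.qr(Y)
    # Stage B: project A onto the approximate range and factor the small
    # (k + p) x n matrix B = Q* A with a classical deterministic SVD.
    B = Q.conj().T @ A
    U_small, sigma, Vt = np.linalg.svd(B, full_matrices=False)
    # Map the left factor back to the original space and truncate to rank k.
    U = Q @ U_small
    return U[:, :k], sigma[:k], Vt[:k, :]
```

Because Q has only k + p columns, the deterministic factorization in Stage B operates on a (k + p) × n matrix rather than the full m × n matrix, which is where the savings come from.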
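The out-of-core setting can be sketched the same way: only the test matrix and the sample matrix need to fit in RAM, while the input matrix is read from slow storage one block of rows at a time. The helper below is a hypothetical illustration, assuming row_blocks is any iterable that yields consecutive row blocks of A (for example, read from disk).

```python
import numpy as np

def sample_range_out_of_core(row_blocks, n, ell, seed=0):
    """Hypothetical sketch: form Y = A @ Omega without holding A in RAM.

    row_blocks : iterable yielding consecutive row blocks of A, each a
                 NumPy array with n columns (e.g. streamed from disk).
    n          : number of columns of A.
    ell        : number of samples (target rank plus oversampling).
    Only Omega (n x ell) and Y (m x ell) reside in memory.
    """
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((n, ell))
    # Each row block of A contributes the corresponding rows of Y and is
    # then discarded, so A is touched in a single pass.
    Y = np.vstack([block @ Omega for block in row_blocks])
    return Y, Omega
```

In the dissertation's setting the factors of the approximation fit in RAM even when A does not; the basis Q can then be built from Y with in-core operations, with the projection supplied by a second pass over A or by a single-pass variant.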
Dedication

Acknowledgements

Contents

1 Introduction
  1.1 Approximation by low rank matrices
  1.2 Existing methods for computing low-rank approximations
    1.2.1 Truncated Factorizations
    1.2.2 Direct Access
    1.2.3 Iterative methods
  1.3 Changes in computer architecture, and the need for new algorithms
  1.4 Computational framework and problem formulation
  1.5 Randomized algorithms for approximating the range of a matrix
    1.5.1 Intuition
    1.5.2 Basic Algorithm
    1.5.3 Probabilistic error bounds
    1.5.4 Computational cost
  1.6 Variations of the basic theme
    1.6.1 Structured Random Matrices
    1.6.2 Increasing accuracy
    1.6.3 Numerical Stability
    1.6.4 Adaptive methods
  1.7 Numerical linear algebra in a distributed computing environment
    1.7.1 Distributed computing environment
    1.7.2 Distributed operations
    1.7.3 Stochastic Singular Value Decomposition in MapReduce
    1.7.4 Lanczos comparison
    1.7.5 Numerical Experiments
  1.8 Survey of prior work on randomized algorithms in linear algebra
  1.9 Structure of dissertation and overview of principal contributions

Bibliography

2 Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions
  2.1 Overview
    2.1.1 Approximation by low-rank matrices
    2.1.2 Matrix approximation framework
    2.1.3 Randomized algorithms
    2.1.4 A comparison between randomized and traditional techniques
    2.1.5 Performance analysis
    2.1.6 Example: Randomized SVD
    2.1.7 Outline of paper
  2.2 Related work and historical context
    2.2.1 Randomized matrix approximation
    2.2.2 Origins
  2.3 Linear algebraic preliminaries
    2.3.1 Basic definitions
    2.3.2 Standard matrix factorizations
    2.3.3 Techniques for computing standard factorizations
  2.4 Stage A: Randomized schemes for approximating the range
    2.4.1 The proto-algorithm revisited
    2.4.2 The number of samples required
    2.4.3 A posteriori error estimation
    2.4.4 Error estimation (almost) for free
    2.4.5 A modified scheme for matrices whose singular values decay slowly
    2.4.6 An accelerated technique for general dense matrices
  2.5 Stage B: Construction of standard factorizations
    2.5.1 Factorizations based on forming Q∗A directly
    2.5.2 Postprocessing via row extraction
    2.5.3 Postprocessing an Hermitian matrix
    2.5.4 Postprocessing a positive semidefinite matrix
    2.5.5 Single-pass algorithms
  2.6 Computational costs
    2.6.1 General matrices that fit in core memory
    2.6.2 Matrices for which matrix–vector products can be rapidly evaluated
    2.6.3 General matrices stored in slow memory or streamed
    2.6.4 Gains from parallelization
