Matrix Approximation for Large-scale Learning
by
Ameet Talwalkar
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
May 2010
Mehryar Mohri—Advisor
© Ameet Talwalkar
All Rights Reserved, 2010
For Aai and Baba
Acknowledgments
I would first like to thank my advisor, Mehryar Mohri, for his guidance
throughout my doctoral studies. He gave me an opportunity to pursue a
PhD, patiently taught me about the field of machine learning and guided me
towards exciting research questions. He also introduced me to my mentors and
collaborators at Google Research, Sanjiv Kumar and Corinna Cortes, both of
whom have been tremendous role models for me throughout my studies. I
would also like to thank the final two members of my thesis committee, Den-
nis Shasha and Mark Tygert, as well as Subhash Khot, who sat on my DQE
and thesis proposal, for their encouragement and helpful advice.
During my time at Courant and my summers at Google, I have had the
good fortune to work and interact with several other exceptional people. In
particular, I would like to thank Eugene Weinstein, Ameesh Makadia, Cyril
Allauzen, Dejan Jovanović, Shaila Musharoff, Ashish Rastogi, Rosemary Amico,
Michael Riley, Henry Rowley and Jeremy Shute for helping me along the way
and making my studies and research more enjoyable over these past four years.
I would especially like to thank my partner in crime, Afshin Rostamizadeh, for
being a supportive officemate and a considerate friend throughout our count-
less hours working together.
Last, but not least, I would like to thank my friends and family for their
unwavering support. In particular, I have consistently drawn strength from
my lovely girlfriend Jessica, my brother Jaideep, my sister-in-law Kristen and
the three cutest little men in the world, my nephews Kavi, Nayan and Dev.
And to my parents, Rohini and Shrirang, to whom this thesis is dedicated,
I am infinitely grateful. They are my sources of inspiration and my greatest
teachers, and any achievement I may have is a credit to them. Thank you,
Aai and Baba.
Abstract
Modern learning problems in computer vision, natural language processing,
computational biology, and other areas are often based on large data sets
of tens of thousands to millions of training instances. However, several stan-
dard learning algorithms, such as kernel-based algorithms, e.g., Support Vector
Machines, Kernel Ridge Regression and Kernel PCA, do not easily scale to such
orders of magnitude. This thesis focuses on sampling-based matrix approxima-
tion techniques that help scale kernel-based algorithms to large-scale datasets.
We address several fundamental theoretical and empirical questions including:
1. What approximation should be used? We discuss two common sampling-
based methods, providing novel theoretical insights regarding their suit-
ability for various applications and experimental results motivated by
this theory. Our results show that one of these methods, the Nyström
method, is superior in the context of large-scale learning.
2. Do these approximations work in practice? We show the effectiveness of
approximation techniques on a variety of problems. In the largest study
to date for manifold learning, we use the Nyström method to extract low-
dimensional structure from high-dimensional data to effectively cluster
face images. We also report good empirical results for Kernel Ridge
Regression and Kernel Logistic Regression.
3. How should we sample columns? A key aspect of sampling-based algo-
rithms is the distribution according to which columns are sampled. We
study both fixed and adaptive sampling schemes as well as a promising
ensemble technique that can be easily parallelized and generates superior
approximations, both in theory and in practice.
4. How well do these approximations work in theory? We provide theoret-
ical analyses of the Nyström method to understand when this technique
should be used. We present guarantees on approximation accuracy based
on various matrix properties and analyze the effect of matrix approxi-
mation on actual kernel-based algorithms.
This work has important consequences for the machine learning commu-
nity since it extends to large-scale applications the benefits of kernel-based
algorithms. The core technique underlying this research, low-rank matrix
approximation, is also of independent interest within the field of linear algebra.
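
As a brief illustration of the sampling-based approach summarized above, the following is a minimal sketch of the Nyström approximation in Python with NumPy. It assumes a precomputed symmetric positive semidefinite kernel matrix K and uniform column sampling; the function name, toy kernel, and parameter choices are illustrative only and are not drawn from the experiments in this thesis.

    import numpy as np

    def nystrom_approximation(K, l, seed=0):
        """Rank-l Nystrom approximation of a symmetric PSD matrix K,
        built from l uniformly sampled columns: K ~= C pinv(W) C^T."""
        rng = np.random.default_rng(seed)
        n = K.shape[0]
        idx = rng.choice(n, size=l, replace=False)  # sampled column indices
        C = K[:, idx]                               # n x l block of sampled columns
        W = K[np.ix_(idx, idx)]                     # l x l intersection block
        return C @ np.linalg.pinv(W) @ C.T

    # Toy example: Gaussian (RBF) kernel matrix on random points
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dists / 2.0)
    K_approx = nystrom_approximation(K, l=40)
    rel_err = np.linalg.norm(K - K_approx, "fro") / np.linalg.norm(K, "fro")
    print(f"relative Frobenius error: {rel_err:.4f}")

Because only the n x l block C and the l x l block W are needed, the method never has to decompose the full n x n kernel matrix, which is the source of the scalability gains discussed in the chapters that follow.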
Contents
Dedication
Acknowledgments
Abstract
List of Figures
List of Tables

1 Introduction
1.1 Motivation
1.2 Related Work
1.3 Contributions

2 Low Rank Approximations
2.1 Preliminaries
2.1.1 Notation
2.1.2 Nyström method
2.1.3 Column-sampling method
2.2 Nyström vs. Column-sampling
2.2.1 Singular values and singular vectors
2.2.2 Low-rank approximation
2.2.3 Empirical comparison
2.3 Summary

3 Applications
3.1 Large-scale Manifold Learning
3.1.1 Manifold learning
3.1.2 Approximation experiments
3.1.3 Large-scale learning
3.1.4 Manifold evaluation
3.2 Woodbury Approximation
3.2.1 Nyström Logistic Regression
3.2.2 Kernel Ridge Regression
3.3 Summary

4 Sampling Schemes
4.1 Fixed Sampling
4.1.1 Datasets
4.1.2 Experiments
4.2 Adaptive Sampling
4.2.1 Adaptive Nyström sampling
4.2.2 Experiments
4.3 Ensemble Sampling