Joint Optimization of Fidelity and Commensurability for Manifold Alignment and Graph Matching

by Sancar Adali

A dissertation submitted to The Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy.

Baltimore, Maryland
March, 2014

© Sancar Adali 2014
All rights reserved

Abstract

In this thesis, we investigate how to perform inference in settings in which the data consist of different modalities or views. For effective learning utilizing the information available, data fusion that considers all views of these multiview data settings is needed. We also require dimensionality reduction to address the problems associated with high dimensionality, or “the curse of dimensionality.” We are interested in the type of information that is available in the multiview data that is essential for the inference task. We also seek to determine the principles to be used throughout the dimensionality reduction and data fusion steps to provide acceptable task performance. Our research focuses on exploring how these queries and their solutions are relevant to particular data problems of interest.

Primary Reader: Carey E Priebe
Secondary Reader: Donniell E Fishkind

Dedication

This thesis is dedicated to myself because I did all the hard work and to my family who supported me in every way, especially my mother, from whom I inherit my love of science.

Contents

Abstract
List of Tables
List of Figures
1 Introduction
  1.1 Data Settings
    1.1.1 Exploitation Task
  1.2 Dissimilarity representation
  1.3 Match Detection
2 Related Work
  2.1 Multiple View Learning
  2.2 Transfer Learning and Domain Adaptation
  2.3 Manifold Matching
3 Variants of Multidimensional Scaling and Principal Components Analysis
  3.1 Multidimensional Scaling
  3.2 Different criteria for MDS
    3.2.1 Metric MDS
      3.2.1.1 Stress Criterion
      3.2.1.2 Sammon Mapping Criterion
    3.2.2 Ordinal (Nonmetric) MDS
    3.2.3 Classical MDS and the Strain Criterion
    3.2.4 Relationship with other embedding methods
    3.2.5 Effect of Perturbations
    3.2.6 Maximum Likelihood MDS and MULTISCALE
    3.2.7 Three-way MDS
  3.3 Principal Components Analysis
    3.3.1 Principal Components Analysis and Classical Multidimensional Scaling
4 An expository problem for Multiview Learning: Match detection
  4.1 Problem Description
  4.2 Definition of an optimal embedding weight parameter: w∗
    4.2.1 Continuity of AUC(·)
5 Fidelity and Commensurability
  5.1 The concepts of Fidelity and Commensurability
  5.2 Fidelity and Commensurability Tradeoff
6 Data Models for the Match Detection Task
  6.1 Two data settings for Match Detection
    6.1.1 Gaussian setting
    6.1.2 Dirichlet setting
    6.1.3 Noise
7 Procrustes Analysis for Data Fusion
  7.1 Procrustes Analysis
  7.2 Procrustes Analysis for Manifold Matching
    7.2.1 Relation of P◦M and JOFC
  7.3 Generalized Procrustes Analysis (K > 2)
8 Canonical Correlation Analysis for Data Fusion
  8.1 Canonical Correlational Analysis on Multidimensional Scaling embeddings
  8.2 Canonical Correlational Analysis
  8.3 Geometric Interpretation of Canonical Correlational Analysis
  8.4 Relationship between CCA and Commensurability
  8.5 Spectral Embedding Generalization of CCA
  8.6 Generalized CCA: K > 2
9 Multiple Minima in Multidimensional Scaling
  9.1 Discontinuity in weighted raw stress OOS configurations
10 Simulations and Experiments
  10.1 Simulation Results
    10.1.1 McNemar’s Test
  10.2 Effects of the parameters of the data model
  10.3 Match Testing when the number of conditions, K, is larger than 2
  10.4 Experiments on Wiki Data
  10.5 Model Selection
11 Seeded Graph Matching and Fast Approximate Quadratic Programming
  11.1 Introduction to Graph Matching
    11.1.1 Graph Matching
  11.2 Fast Approximate Quadratic Programming for the Seeded Graph Matching problem
    11.2.1 Frank-Wolfe algorithm
    11.2.2 rQAP_1 formulation of the Seeded Graph Matching problem and the FAQ Algorithm
      11.2.2.1 Demonstration of the FAQ algorithm on simulated data
    11.2.3 Relaxations of alternate formulations of the approximate seeded graph matching problem
    11.2.4 The comparison of the rQAP_1 against the alternative formulation rQAP_2
    11.2.5 A hybrid formulation: FAQ programming with a smooth transition from rQAP_2 to rQAP_1
12 The Joint Optimization of Fidelity and Commensurability solution to Seeded Graph Matching
  12.1 Overview
  12.2 Joint Embedding of Graphs via JOFC for Seeded Graph Matching
    12.2.1 Dissimilarity Measures for Vertices
  12.3 Demonstrations
    12.3.1 Simulations
    12.3.2 Experiments on real data
      12.3.2.1 C. elegans connectome
      12.3.2.2 Enron communication graph
      12.3.2.3 Wikipedia hyperlink subgraph
      12.3.2.4 Charitynet graph
    12.3.3 One-to-k matching of vertices
13 Conclusion
  13.1 Conclusion
  13.2 Future Directions
Bibliography
Vita

List of Tables

9.1 The entries of the dissimilarity matrix (rounded to two decimal digits)
9.2 Final stress values for the two local minima configurations