The autoregressive stochastic block model with changes in structure

179 Pages·2017·3.36 MB·English
by Matthew Ludkin

The autoregressive stochastic block model with changes in structure

Matthew Ludkin, M.Sci.(Hons.), M.Res

Submitted for the degree of Doctor of Philosophy at Lancaster University.

November 2017

Declaration

I declare that the work in this thesis has been done by myself and has not been submitted elsewhere for the award of any other degree.

Matthew Robert Ludkin

A version of Chapter 3 has been published as:
Ludkin, M., Eckley, I., and Neal, P. (2017). Dynamic stochastic block models: parameter estimation and detection of changes in community structure. Statistics and Computing. https://doi.org/10.1007/s11222-017-9788-9

Acknowledgements

I would like to thank my supervisors: Peter Neal and Idris Eckley. Their help and guidance have been crucial to the completion of this thesis.

I would like to thank the STOR-i Centre for Doctoral Training and the EPSRC for providing a vibrant community in which to conduct research. Thanks also to DSTL and Ralph Mansson, who provided funding for the PhD program and helpful discussions about the research within this thesis.

Finally, I would like to thank Helen Eden for the care and support provided during my PhD, without which I know it would never have been possible.

Abstract

Network science has been a growing subject for the last three decades, with statistical analysis of networks seeing an explosion since the advent of online social networks. An important model within network analysis is the stochastic block model, which aims to partition the set of nodes of a network into groups which behave in a similar way. This thesis proposes Bayesian inference methods for problems related to the stochastic block model for network data. The presented research is formed of three parts.

Firstly, two Markov chain Monte Carlo samplers are proposed to sample from the posterior distribution of the number of blocks, block memberships and edge-state parameters in the stochastic block model.
These allow for non-binary and non-conjugate edge models, something not considered in the literature.

Secondly, a dynamic extension to the stochastic block model is presented which includes autoregressive terms. This novel approach to dynamic network models allows the present state of an edge to influence future states, and is therefore named the autoregressive stochastic block model. Furthermore, an algorithm to perform inference on changes in block membership is given. This problem has gained some attention in the literature, but not with autoregressive features to the edge-state distribution as presented in this thesis.

Thirdly, an online procedure to detect changes in block membership in the autoregressive stochastic block model is presented. This allows networks to be monitored through time, drastically reducing the data storage requirements. On top of this, the network parameters can be estimated together with the block memberships.

Finally, conclusions are drawn from the above contributions in the context of the network analysis literature and future directions for research are identified.

Contents

Declaration
Acknowledgements
Abstract
List of Figures
List of Tables

1. Introduction
    1.1. Modelling networks
        1.1.1. Notation and definitions
        1.1.2. Mathematical models for networks
        1.1.3. Statistical network models
        1.1.4. The stochastic block model
    1.2. Dynamic stochastic block models
    1.3. Contributions

2. Arbitrary edge-states and unknown number of blocks in the stochastic block model
    2.1. Introduction
    2.2. Model
        2.2.1. Prior for block structure
    2.3. Dirichlet process sampler
    2.4. Split-merge sampler
    2.5. Simulated data
        2.5.1. Example networks
        2.5.2. Assessing convergence
    2.6. Real data
        2.6.1. Macaque
        2.6.2. Enron
        2.6.3. Stack Overflow
    2.7. Concluding remarks

3. Autoregressive stochastic block model with changes in block membership
    3.1. Introduction
    3.2. The autoregressive stochastic block model
        3.2.1. Model
        3.2.2. Posterior distribution
        3.2.3. Identifiability
    3.3. Reversible jump MCMC
        3.3.1. Sampling scheme
        3.3.2. Updating change points and augmented edge states
    3.4. Initialisation of sampler state
    3.5. Simulation study
    3.6. Application: Communities of mice
    3.7. Concluding remarks

4. Online monitoring of block membership in the autoregressive stochastic block model
    4.1. Introduction
    4.2. Model
    4.3. SMC
        4.3.1. Sufficient statistics
        4.3.2. SMC algorithm
    4.4. Simulated data
        4.4.1. Comparison to offline methods
    4.5. Application to dynamic contact network
        4.5.1. Mice network
    4.6. Closing remarks

5. Perspectives and future directions

A. Appendix for arbitrary edge-states and unknown number of blocks in the stochastic block model
    A.1. Enron

B. Appendix for autoregressive stochastic block model with changes in block membership

C. Appendix for monitoring block membership in the autoregressive stochastic block model
    C.1. Posterior plots
    C.2. Trace of block membership per mouse

Bibliography

List of Figures

2.1. Comparison of block structures. Top left: CRP(1), top right: CRP(5). Bottom left: DMA(1,5), bottom right: DMA(5,5).
2.2. Posterior summaries for block membership in Bernoulli example network.
2.3. Posterior summaries for block membership in Poisson example network.
2.4. Posterior summaries for block membership in normal example network.
2.5. Posterior summaries for block membership in negative binomial example network.
2.6. Trace plots for number of blocks in example networks. Two chains are simulated in each case: the “lower chain” with all nodes initially in one block (orange line) and the “upper chain” with all nodes initially assigned to different blocks (blue line).
2.7. Posterior summaries for block membership in Macaque brain network using Dirichlet process sampler.
2.8. Posterior summaries for block membership in Macaque brain network using Dirichlet process sampler.
2.9. Posterior summaries for block membership in Enron network with Poisson edge-state model.
2.10. Posterior summaries for block membership in Enron network with negative binomial edge-state model.
2.11. Posterior summaries for block membership in Stack Overflow network.
3.1. Possible changes.
3.2. Elbow plot for determining the number of communities with which to initialise the sampler.
3.3. ARSBM: Maximum a posteriori community membership of each mouse through time. Community labels: 1 - red, 2 - yellow, 3 - green, 4 - sky blue, 5 - dark blue, 6 - purple.
3.4. dynSBM: Maximum a posteriori community membership of each mouse through time. Community labels: 1 - red, 2 - yellow, 3 - green, 4 - sky blue, 5 - dark blue, 6 - purple.
4.1. v-measure for simulated networks against method. 1 - Gibbs from mean, 2 - Gibbs from previous particle, 3 - store augmented states.
4.2. Mean absolute deviation for simulated networks against method. 1 - Gibbs from mean, 2 - Gibbs from previous particle, 3 - store augmented states.
4.3. Bias for simulated networks against method. 1 - Gibbs from mean, 2 - Gibbs from previous particle, 3 - store augmented states.
4.4. v-measure against true λ values with method 2.
4.5. Comparison of v-measure for simulated networks under SMC and RJMCMC algorithms.
4.6. Comparison of bias in parameters under SMC (left) and RJMCMC (right) for example networks.
4.7. Maximum a posteriori block memberships of mice.
A.1. Posterior summaries for block membership in Enron network with Poisson edge-state model and strong prior.
A.2. Posterior summaries for block membership in Enron network with negative binomial edge-state model and strong prior.
