Advanced Mapping of Environmental Data
Geostatistics, Machine Learning and Bayesian Maximum Entropy

Edited by Mikhail Kanevski
Series Editor Pierre Dumolard

First published in Great Britain and the United States in 2008 by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned addresses:

ISTE Ltd
6 Fitzroy Square
London W1T 5DX
UK
www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com

© ISTE Ltd, 2008

The rights of Mikhail Kanevski to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Cataloging-in-Publication Data
Advanced mapping of environmental data : geostatistics, machine learning, and Bayesian maximum entropy / edited by Mikhail Kanevski.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-84821-060-8
1. Geology--Statistical methods. 2. Machine learning. 3. Bayesian statistical decision theory. I. Kanevski, Mikhail.
QE33.2.S82A35 2008
550.1'519542--dc22
2008016237

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN: 978-1-84821-060-8

Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire.

Table of Contents

Preface

Chapter 1. Advanced Mapping of Environmental Data: Introduction
M. KANEVSKI
1.1. Introduction
1.2. Environmental data analysis: problems and methodology
1.2.1. Spatial data analysis: typical problems
1.2.2. Spatial data analysis: methodology
1.2.3. Model assessment and model selection
1.3. Resources
1.3.1. Books, tutorials
1.3.2. Software
1.4. Conclusion
1.5. References

Chapter 2. Environmental Monitoring Network Characterization and Clustering
D. TUIA and M. KANEVSKI
2.1. Introduction
2.2. Spatial clustering and its consequences
2.2.1. Global parameters
2.2.2. Spatial predictions
2.3. Monitoring network quantification
2.3.1. Topological quantification
2.3.2. Global measures of clustering
2.3.2.1. Topological indices
2.3.2.2. Statistical indices
2.3.3. Dimensional resolution: fractal measures of clustering
2.3.3.1. Sandbox method
2.3.3.2. Box-counting method
2.3.3.3. Lacunarity
2.4. Validity domains
2.5. Indoor radon in Switzerland: an example of a real monitoring network
2.5.1. Validity domains
2.5.2. Topological index
2.5.3. Statistical indices
2.5.3.1. Morisita index
2.5.3.2. K-function
2.5.4. Fractal dimension
2.5.4.1. Sandbox and box-counting fractal dimension
2.5.4.2. Lacunarity
2.6. Conclusion
2.7. References

Chapter 3. Geostatistics: Spatial Predictions and Simulations
E. SAVELIEVA, V. DEMYANOV and M. MAIGNAN
3.1. Assumptions of geostatistics
3.2. Family of kriging models
3.2.1. Simple kriging
3.2.2. Ordinary kriging
3.2.3. Basic features of kriging estimation
3.2.4. Universal kriging (kriging with trend)
3.2.5. Lognormal kriging
3.3. Family of co-kriging models
3.3.1. Kriging with linear regression
3.3.2. Kriging with external drift
3.3.3. Co-kriging
3.3.4. Collocated co-kriging
3.3.5. Co-kriging application example
3.4. Probability mapping with indicator kriging
3.4.1. Indicator coding
3.4.2. Indicator kriging
3.4.3. Indicator kriging applications
3.4.3.1. Indicator kriging for ²⁴¹Am analysis
3.4.3.2. Indicator kriging for aquifer layer zonation
3.4.3.3. Indicator kriging for localization of crab crowds
3.5. Description of spatial uncertainty with conditional stochastic simulations
3.5.1. Simulation vs. estimation
3.5.2. Stochastic simulation algorithms
3.5.3. Sequential Gaussian simulation
3.5.4. Sequential indicator simulations
3.5.5. Co-simulations of correlated variables
3.6. References

Chapter 4. Spatial Data Analysis and Mapping Using Machine Learning Algorithms
F. RATLE, A. POZDNOUKHOV, V. DEMYANOV, V. TIMONIN and E. SAVELIEVA
4.1. Introduction
4.2. Machine learning: an overview
4.2.1. The three learning problems
4.2.2. Approaches to learning from data
4.2.3. Feature selection
4.2.4. Model selection
4.2.5. Dealing with uncertainties
4.3. Nearest neighbor methods
4.4. Artificial neural network algorithms
4.4.1. Multi-layer perceptron neural network
4.4.2. General Regression Neural Networks
4.4.3. Probabilistic Neural Networks
4.4.4. Self-organizing (Kohonen) maps
4.5. Statistical learning theory for spatial data: concepts and examples
4.5.1. VC dimension and structural risk minimization
4.5.2. Kernels
4.5.3. Support vector machines
4.5.4. Support vector regression
4.5.5. Unsupervised techniques
4.5.5.1. Clustering
4.5.5.2. Nonlinear dimensionality reduction
4.6. Conclusion
4.7. References

Chapter 5. Advanced Mapping of Environmental Spatial Data: Case Studies
L. FORESTI, A. POZDNOUKHOV, M. KANEVSKI, V. TIMONIN, E. SAVELIEVA, C. KAISER, R. TAPIA and R. PURVES
5.1. Introduction
5.2. Air temperature modeling with machine learning algorithms and geostatistics
5.2.1. Mean monthly temperature
5.2.1.1. Data description
5.2.1.2. Variography
5.2.1.3. Step-by-step modeling using a neural network
5.2.1.4. Overfitting and undertraining
5.2.1.5. Mean monthly air temperature prediction mapping
5.2.2. Instant temperatures with regionalized linear dependencies
5.2.2.1. The Föhn phenomenon
5.2.2.2. Modeling of instant air temperature influenced by Föhn
5.2.3. Instant temperatures with nonlinear dependencies
5.2.3.1. Temperature inversion phenomenon
5.2.3.2. Terrain feature extraction using Support Vector Machines
5.2.3.3. Temperature inversion modeling with MLP
5.3. Modeling of precipitation with machine learning and geostatistics
5.3.1. Mean monthly precipitation
5.3.1.1. Data description
5.3.1.2. Precipitation modeling with MLP
5.3.2. Modeling daily precipitation with MLP
5.3.2.1. Data description
5.3.2.2. Practical issues of MLP modeling
5.3.2.3. The use of elevation and analysis of the results
5.3.3. Hybrid models: NNRK and NNRS
5.3.3.1. Neural network residual kriging
5.3.3.2. Neural network residual simulations
5.3.4. Conclusions
5.4. Automatic mapping and classification of spatial data using machine learning
5.4.1. k-nearest neighbor algorithm
5.4.1.1. Number of neighbors with cross-validation
5.4.2. Automatic mapping of spatial data
5.4.2.1. KNN modeling
5.4.2.2. GRNN modeling
5.4.3. Automatic classification of spatial data
5.4.3.1. KNN classification
5.4.3.2. PNN classification
5.4.3.3. Indicator kriging classification
5.4.4. Automatic mapping – conclusions
5.5. Self-organizing maps for spatial data – case studies
5.5.1. SOM analysis of sediment contamination
5.5.2. Mapping of socio-economic data with SOM
5.6. Indicator kriging and sequential Gaussian simulations for probability mapping. Indoor radon case study
5.6.1. Indoor radon measurements
5.6.2. Probability mapping
5.6.3. Exploratory data analysis
5.6.4. Radon data variography
5.6.4.1. Variogram for indicators
5.6.4.2. Variogram for Nscores
5.6.5. Neighborhood parameters
5.6.6. Prediction and probability maps
5.6.6.1. Probability maps with IK
5.6.6.2. Probability maps with SGS
5.6.7. Analysis and validation of results
5.6.7.1. Influence of the simulation net and the number of neighbors
5.6.7.2. Decision maps and validation of results
5.6.8. Conclusions
5.7. Natural hazards forecasting with support vector machines – case study: snow avalanches
5.7.1. Decision support systems for natural hazards
5.7.2. Reminder on support vector machines
5.7.2.1. Probabilistic interpretation of SVM
5.7.3. Implementing an SVM for avalanche forecasting
5.7.4. Temporal forecasts
5.7.4.1. Feature selection
5.7.4.2. Training the SVM classifier
5.7.4.3. Adapting SVM forecasts for decision support
5.7.5. Extending the SVM to spatial avalanche predictions
5.7.5.1. Data preparation
5.7.5.2. Spatial avalanche forecasting
5.7.6. Conclusions
5.8. Conclusion
5.9. References

Chapter 6. Bayesian Maximum Entropy – BME
G. CHRISTAKOS
6.1. Conceptual framework
6.2. Technical review of BME
6.2.1. The spatiotemporal continuum
6.2.2. Separable metric structures
6.2.3. Composite metric structures
6.2.4. Fractal metric structures
6.3. Spatiotemporal random field theory
6.3.1. Pragmatic S/TRF tools
6.3.2. Space-time lag dependence: ordinary S/TRF
6.3.3. Fractal S/TRF
6.3.4. Space-time heterogeneous dependence: generalized S/TRF
6.4. About BME
6.4.1. The fundamental equations
6.4.2. A methodological outline
6.4.3. Implementation of BME: the SEKS-GUI
6.5. A brief review of applications
6.5.1. Earth and atmospheric sciences
6.5.2. Health, human exposure and epidemiology
6.6. References

List of Authors

Index

Preface

This volume is a collection of lectures and seminars given at two workshops organized by the Institute of Geomatics and Analysis of Risk (IGAR) at the Faculty of Geosciences and Environment of the University of Lausanne (www.unil.ch/igar):

– Workshop I, October 2005: “Data analysis and modeling in environmental sciences towards risk assessment and impact on society”;
– Workshop II, October 2006 (S4 network modeling tour): “Machine Learning Algorithms for Spatial Data”.

The first workshop covered many topics related to natural hazards. One of the lectures was given by Professor G. Christakos on the theory and applications of Bayesian Maximum Entropy (BME). The second workshop was organized within the framework of the S4 (Spatial Simulation for Social Sciences, http://s4.parisgeo.cnrs.fr/index.htm) network modeling tour of young researchers. Its main topics concerned machine learning algorithms (neural networks of different architectures and statistical learning theory) and their applications in geosciences.

The book is therefore a composition of three topics concerning the analysis, modeling and presentation of spatiotemporal data: geostatistical methods and models, machine learning algorithms and the Bayesian maximum entropy approach. These three topics rest on quite different theoretical hypotheses and background assumptions and are usually published in separate volumes. Given the limits of the book, it was of course not possible to cover both introductory and advanced material. Authors were free to select their topics and to present some theoretical concepts along with simulated/illustrative and real case studies. There are some traditional examples of environmental data mapping using different techniques but also
