
Assessing and Improving Prediction and Classification: Theory and Algorithms in C++ PDF

529 pages · 2017 · 5.378 MB · English

Preview: Assessing and Improving Prediction and Classification: Theory and Algorithms in C++

Assessing and Improving Prediction and Classification: Theory and Algorithms in C++

Timothy Masters
Ithaca, New York, USA

ISBN-13 (pbk): 978-1-4842-3335-1
ISBN-13 (electronic): 978-1-4842-3336-8
https://doi.org/10.1007/978-1-4842-3336-8

Library of Congress Control Number: 2017962869

Copyright © 2018 by Timothy Masters

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image, we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Cover image by Freepik (www.freepik.com)

Managing Director: Welmoed Spahr
Editorial Director: Todd Green
Acquisitions Editor: Steve Anglin
Development Editor: Matthew Moodie
Technical Reviewers: Massimo Nardone and Matt Wiley
Coordinating Editor: Mark Powers
Copy Editor: Kim Wimpsett

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com, or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

For information on translations, please e-mail [email protected], or visit www.apress.com/rights-permissions.

Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales web page at www.apress.com/bulk-sales.

Any source code or other supplementary material referenced by the author in this book is available to readers on GitHub via the book's product page, located at www.apress.com/9781484233351. For more detailed information, please visit www.apress.com/source-code.

Printed on acid-free paper.

This book is dedicated to Master Hidy Ochiai with the utmost respect, admiration, and gratitude. His incomparable teaching of Washin-Ryu karate has raised my confidence, my physical ability, and my mental acuity far beyond anything I could have imagined. For this I will ever be grateful.
Table of Contents

About the Author  xiii
About the Technical Reviewers  xv
Preface  xvii

Chapter 1: Assessment of Numeric Predictions  1
  Notation  2
  Overview of Performance Measures  3
  Consistency and Evolutionary Stability  6
  Selection Bias and the Need for Three Datasets  9
  Cross Validation and Walk-Forward Testing  14
  Bias in Cross Validation  15
  Overlap Considerations  15
  Assessing Nonstationarity Using Walk-Forward Testing  17
  Nested Cross Validation Revisited  18
  Common Performance Measures  20
  Mean Squared Error  20
  Mean Absolute Error  23
  R-Squared  23
  RMS Error  24
  Nonparametric Correlation  24
  Success Ratios  26
  Alternatives to Common Performance Measures  27
  Stratification for Consistency  27
  Confidence Intervals  29
  The Confidence Set  30
  Serial Correlation  32
  Multiplicative Data  32
  Normally Distributed Errors  33
  Empirical Quantiles as Confidence Intervals  35
  Confidence Bounds for Quantiles  37
  Tolerance Intervals  40

Chapter 2: Assessment of Class Predictions  45
  The Confusion Matrix  46
  Expected Gain/Loss  46
  ROC (Receiver Operating Characteristic) Curves  48
  Hits, False Alarms, and Related Measures  48
  Computing the ROC Curve  50
  Area Under the ROC Curve  56
  Cost and the ROC Curve  59
  Optimizing ROC-Based Statistics  60
  Optimizing the Threshold: Now or Later?  61
  Maximizing Precision  64
  Generalized Targets  65
  Maximizing Total Gain  66
  Maximizing Mean Gain  67
  Maximizing the Standardized Mean Gain  67
  Confidence in Classification Decisions  69
  Hypothesis Testing  70
  Confidence in the Confidence  75
  Bayesian Methods  81
  Multiple Classes  85
  Hypothesis Testing vs Bayes' Method  86
  Final Thoughts on Hypothesis Testing  91
  Confidence Intervals for Future Performance  98

Chapter 3: Resampling for Assessing Parameter Estimates  101
  Bias and Variance of Statistical Estimators  102
  Plug-in Estimators and Empirical Distributions  103
  Bias of an Estimator  104
  Variance of an Estimator  105
  Bootstrap Estimation of Bias and Variance  106
  Code for Bias and Variance Estimation  109
  Plug-in Estimators Can Provide Better Bootstraps  112
  A Model Parameter Example  116
  Confidence Intervals  120
  Is the Interval Backward?  125
  Improving the Percentile Method  128
  Hypothesis Tests for Parameter Values  135
  Bootstrapping Ratio Statistics  137
  Jackknife Estimates of Bias and Variance  148
  Bootstrapping Dependent Data  151
  Estimating the Extent of Autocorrelation  152
  The Stationary Bootstrap  155
  Choosing a Block Size for the Stationary Bootstrap  158
  The Tapered Block Bootstrap  163
  Choosing a Block Size for the Tapered Block Bootstrap  170
  What If the Block Size Is Wrong?  172

Chapter 4: Resampling for Assessing Prediction and Classification  185
  Partitioning the Error  186
  Cross Validation  189
  Bootstrap Estimation of Population Error  191
  Efron's E0 Estimate of Population Error  195
  Efron's E632 Estimate of Population Error  198
  Comparing the Error Estimators for Prediction  199
  Comparing the Error Estimators for Classification  201
  Summary  203

Chapter 5: Miscellaneous Resampling Techniques  205
  Bagging  206
  A Quasi-theoretical Justification  207
  The Component Models  209
  Code for Bagging  210
  AdaBoost  215
  Binary AdaBoost for Pure Classification Models  215
  Probabilistic Sampling for Inflexible Models  223
  Binary AdaBoost When the Model Provides Confidence  226
  AdaBoostMH for More Than Two Classes  234
  AdaBoostOC for More Than Two Classes  243
  Comparing the Boosting Algorithms  259
  A Binary Classification Problem  259
  A Multiple-Class Problem  262
  Final Thoughts on Boosting  263
  Permutation Training and Testing  264
  The Permutation Training Algorithm  266
  Partitioning the Training Performance  267
  A Demonstration of Permutation Training  270

Chapter 6: Combining Numeric Predictions  279
  Simple Average  279
  Code for Averaging Predictions  281
  Unconstrained Linear Combinations  283
  Constrained Linear Combinations  286
  Constrained Combination of Unbiased Models  291
  Variance-Weighted Interpolation  293
  Combination by Kernel Regression Smoothing  295
  Code for the GRNN  300
  Comparing the Combination Methods  305

Chapter 7: Combining Classification Models  309
  Introduction and Notation  310
  Reduction vs Ordering  311
  The Majority Rule  312
  Code for the Majority Rule  313
  The Borda Count  316
  The Average Rule  318
  Code for the Average Rule  318
  The Median Alternative  320
  The Product Rule  320
  The MaxMax and MaxMin Rules  320
  The Intersection Method  321
  The Union Rule  328
  Logistic Regression  332
  Code for the Combined Weight Method  335
  The Logit Transform and Maximum Likelihood Estimation  340
  Code for Logistic Regression  344
  Separate Weight Sets  348
  Model Selection by Local Accuracy  350
  Code for Local Accuracy Selection  352
  Maximizing the Fuzzy Integral  362
  What Does This Have to Do with Classifier Combination?  364
  Code for the Fuzzy Integral  366
  Pairwise Coupling  374
  Pairwise Threshold Optimization  383
  A Cautionary Note  384
  Comparing the Combination Methods  385
  Small Training Set, Three Models  386
  Large Training Set, Three Models  387
  Small Training Set, Three Good Models, One Worthless  388
  Large Training Set, Three Good Models, One Worthless  389
  Small Training Set, Worthless and Noisy Models Included  390
  Large Training Set, Worthless and Noisy Models Included  391
  Five Classes  392

Chapter 8: Gating Methods  393
  Preordained Specialization  393
  Learned Specialization  395
  After-the-Fact Specialization  395
  Code for After-the-Fact Specialization  395
  Some Experimental Results  403
  General Regression Gating  405
  Code for GRNN Gating  408
  Experiments with GRNN Gating  415

Chapter 9: Information and Entropy  417
  Entropy  417
  Entropy of a Continuous Random Variable  420
  Partitioning a Continuous Variable for Entropy  421
  An Example of Improving Entropy  426
  Joint and Conditional Entropy  428
  Code for Conditional Entropy  432
  Mutual Information  433
  Fano's Bound and Selection of Predictor Variables  436
  Confusion Matrices and Mutual Information  438
  Extending Fano's Bound for Upper Limits  440
  Simple Algorithms for Mutual Information  444
  The TEST_DIS Program  449
  Continuous Mutual Information  452
  The Parzen Window Method  452
  Adaptive Partitioning  461
  The TEST_CON Program  475
  Predictor Selection Using Mutual Information  476
  Maximizing Relevance While Minimizing Redundancy  476
  The MI_DISC and MI_CONT Programs  479
  A Contrived Example of Information Minus Redundancy  480
  A Superior Selection Algorithm for Binary Variables  482
  Screening Without Redundancy  487
  Asymmetric Information Measures  495
  Uncertainty Reduction  495
  Transfer Entropy: Schreiber's Information Transfer  499

References  509
Index  513
