AN INTRODUCTION TO
BOOTSTRAP METHODS
WITH APPLICATIONS
TO R
Michael R. Chernick
Lankenau Institute for Medical Research, Wynnewood, PA
Thomas Jefferson University, Philadelphia, PA
Robert A. LaBudde
Least Cost Formulations Ltd., Norfolk, VA
Old Dominion University, Norfolk, VA
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be
addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030,
(201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be suitable
for your situation. You should consult with a professional where appropriate. Neither the publisher nor
author shall be liable for any loss of profit or any other commercial damages, including but not limited to
special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at
(317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic formats. For more information about Wiley products, visit our web site at
www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Chernick, Michael R.
An introduction to bootstrap methods with applications to R / Michael R. Chernick, Robert A. LaBudde.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-46704-6 (hardback)
1. Bootstrap (Statistics) 2. R (Computer program language) I. LaBudde, Robert A., 1947– II. Title.
QA276.8.C478 2011
519.5'4–dc22
2011010972
Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1
CONTENTS
PREFACE xi
ACKNOWLEDGMENTS xv
LIST OF TABLES xvii
1 INTRODUCTION 1
1.1 Historical Background 1
1.2 Definition and Relationship to the Delta Method and Other
Resampling Methods 3
1.2.1 Jackknife 6
1.2.2 Delta Method 7
1.2.3 Cross-Validation 7
1.2.4 Subsampling 8
1.3 Wide Range of Applications 8
1.4 The Bootstrap and the R Language System 10
1.5 Historical Notes 25
1.6 Exercises 26
References 27
2 ESTIMATION 30
2.1 Estimating Bias 30
2.1.1 Bootstrap Adjustment 30
2.1.2 Error Rate Estimation in Discriminant Analysis 32
2.1.3 Simple Example of Linear Discrimination and
Bootstrap Error Rate Estimation 42
2.1.4 Patch Data Example 51
2.2 Estimating Location 53
2.2.1 Estimating a Mean 53
2.2.2 Estimating a Median 54
2.3 Estimating Dispersion 54
2.3.1 Estimating an Estimate’s Standard Error 55
2.3.2 Estimating Interquartile Range 56
2.4 Linear Regression 56
2.4.1 Overview 56
2.4.2 Bootstrapping Residuals 57
2.4.3 Bootstrapping Pairs (Response and Predictor Vector) 58
2.4.4 Heteroscedasticity of Variance: The Wild Bootstrap 58
2.4.5 A Special Class of Linear Regression Models:
Multivariable Fractional Polynomials 60
2.5 Nonlinear Regression 60
2.5.1 Examples of Nonlinear Models 61
2.5.2 A Quasi-Optical Experiment 63
2.6 Nonparametric Regression 63
2.6.1 Examples of Nonparametric Regression Models 64
2.6.2 Bootstrap Bagging 66
2.7 Historical Notes 67
2.8 Exercises 69
References 71
3 CONFIDENCE INTERVALS 76
3.1 Subsampling, Typical Value Theorem, and Efron’s
Percentile Method 77
3.2 Bootstrap-t 79
3.3 Iterated Bootstrap 83
3.4 Bias-Corrected (BC) Bootstrap 85
3.5 BCa and ABC 85
3.6 Tilted Bootstrap 88
3.7 Variance Estimation with Small Sample Sizes 90
3.8 Historical Notes 94
3.9 Exercises 96
References 98
4 HYPOTHESIS TESTING 101
4.1 Relationship to Confidence Intervals 103
4.2 Why Test Hypotheses Differently? 105
4.3 Tendril DX Example 106
4.4 Klingenberg Example: Binary Dose–Response 108
4.5 Historical Notes 109
4.6 Exercises 110
References 111
5 TIME SERIES 113
5.1 Forecasting Methods 113
5.2 Time Domain Models 114
5.3 Can Bootstrapping Improve Prediction Intervals? 115
5.4 Model-Based Methods 118
5.4.1 Bootstrapping Stationary Autoregressive Processes 118
5.4.2 Bootstrapping Explosive Autoregressive Processes 123
5.4.3 Bootstrapping Unstable Autoregressive Processes 123
5.4.4 Bootstrapping Stationary ARMA Processes 123
5.5 Block Bootstrapping for Stationary Time Series 123
5.6 Dependent Wild Bootstrap (DWB) 126
5.7 Frequency-Based Approaches for Stationary Time Series 127
5.8 Sieve Bootstrap 128
5.9 Historical Notes 129
5.10 Exercises 131
References 131
6 BOOTSTRAP VARIANTS 136
6.1 Bayesian Bootstrap 137
6.2 Smoothed Bootstrap 138
6.3 Parametric Bootstrap 139
6.4 Double Bootstrap 139
6.5 The m-Out-of-n Bootstrap 140
6.6 The Wild Bootstrap 141
6.7 Historical Notes 141
6.8 Exercises 142
References 142
7 SPECIAL TOPICS 144
7.1 Spatial Data 144
7.1.1 Kriging 144
7.1.2 Asymptotics for Spatial Data 147
7.1.3 Block Bootstrap on Regular Grids 148
7.1.4 Block Bootstrap on Irregular Grids 148
7.2 Subset Selection in Regression 148
7.2.1 Gong’s Logistic Regression Example 149
7.2.2 Gunter’s Qualitative Interaction Example 153
7.3 Determining the Number of Distributions in a Mixture 155
7.4 Censored Data 157
7.5 P-Value Adjustment 158
7.5.1 The Westfall–Young Approach 159
7.5.2 Passive Plus Example 159
7.5.3 Consulting Example 160
7.6 Bioequivalence 162
7.6.1 Individual Bioequivalence 162
7.6.2 Population Bioequivalence 165
7.7 Process Capability Indices 165
7.8 Missing Data 172
7.9 Point Processes 174
7.10 Bootstrap to Detect Outliers 176
7.11 Lattice Variables 177
7.12 Covariate Adjustment of Area Under the Curve Estimates
for Receiver Operating Characteristic (ROC) Curves 177
7.13 Bootstrapping in SAS 179
7.14 Historical Notes 182
7.15 Exercises 183
References 185
8 WHEN THE BOOTSTRAP IS INCONSISTENT AND HOW TO REMEDY IT 190
8.1 Too Small of a Sample Size 191
8.2 Distributions with Infinite Second Moments 191
8.2.1 Introduction 191
8.2.2 Example of Inconsistency 192
8.2.3 Remedies 193
8.3 Estimating Extreme Values 194
8.3.1 Introduction 194
8.3.2 Example of Inconsistency 194
8.3.3 Remedies 194
8.4 Survey Sampling 195
8.4.1 Introduction 195
8.4.2 Example of Inconsistency 195
8.4.3 Remedies 195
8.5 m-Dependent Sequences 196
8.5.1 Introduction 196
8.5.2 Example of Inconsistency When Independence Is Assumed 196
8.5.3 Remedy 197
Description: An introduction to R programming is also included to prepare the student for the exer-