
Permutation Statistical Methods with R PDF

677 Pages·2021·11.238 MB·English

Preview Permutation Statistical Methods with R

Kenneth J. Berry, Kenneth L. Kvamme, Janis E. Johnston, Paul W. Mielke, Jr.

Permutation Statistical Methods with R

Dr. Paul W. Mielke, Jr., Professor Emeritus of Statistics at Colorado State University and Fellow of the American Statistical Association, passed away on 20 April 2019.

Kenneth J. Berry, Department of Sociology, Colorado State University, Fort Collins, CO, USA
Kenneth L. Kvamme, Department of Anthropology, University of Arkansas, Fayetteville, AR, USA
Janis E. Johnston, Alexandria, VA, USA
Paul W. Mielke, Jr. (Deceased), Fort Collins, CO, USA

ISBN 978-3-030-74360-4
ISBN 978-3-030-74361-1 (eBook)
https://doi.org/10.1007/978-3-030-74361-1

RStudio is a trademark of RStudio, PBC

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication.
Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

For our families: Nancy T. Berry, Ellen E. Berry, and Laura B. Berry; JoAnn Kvamme; Kathie Lowry, Marlin Lowry, Brian Lowry, and Brent and Kappi Lowry; and Roberta R. Mielke, William W. Mielke, Emily (Mielke) Spear, and Lynn (Mielke) Basila.

Preface

Permutation Statistical Methods with R presents exact and Monte Carlo permutation statistical methods for generating probability values and measures of effect size for a variety of tests of differences, measures of correlation and association, and measures of goodness of fit. Throughout the book the emphasis is on permutation statistical methods, although the results of permutation analyses are always compared and contrasted with the results of conventional statistical analyses, with which the reader is assumed to be familiar. R scripts are provided for each conventional test statistic and associated permutation test statistics throughout the book. Included in tests of differences are one-sample tests, tests of differences for two independent samples, tests of differences for two matched samples, tests of differences for multiple independent samples, and tests of differences for multiple matched samples.
Included in measures of correlation and association are simple linear correlation and regression, measures of effect size based on Pearson’s chi-squared test statistic, and several other measures of effect size designed for the analysis of contingency tables. The arrangement of the book follows the structure of a typical introductory textbook in statistics: introduction, central tendency and variability, one-sample tests, tests for two independent samples, tests for two matched samples, completely-randomized analysis of variance designs, randomized-blocks analysis of variance designs, simple linear regression and correlation, and the analysis of goodness of fit and contingency.

Chapter 1 establishes the structure of the book, introduces the following 10 chapters, and provides a brief overview of each chapter. The purpose of Chap. 1 is to familiarize the reader with the structure and content of the book and provide a brief introduction to the various permutation tests and measures presented in the following chapters.

Chapter 2 provides an introduction to the R programming language and to RStudio. Although the book is primarily a book on permutation statistical methods and not a book on R per se, the fundamentals of R, as they pertain to permutation methods, are important to present and explain. While the R programming language provides intrinsic (built-in) functions for most conventional statistical tests and measures, the R scripts for permutation statistical tests and measures are not available as intrinsic functions in R and are, as far as the authors know, unique to this book.

Chapter 3 provides an introduction to two models of statistical inference: the population model and the permutation model.
Most introductory textbooks in statistics and statistical methods present only the Neyman–Pearson population model of statistical inference. The Neyman–Pearson population model of statistical inference is named for Jerzy Neyman (1894–1981) and Egon Pearson (1895–1980) and was designed to make inferences about population parameters and provide approximate probability values under the appropriate null hypotheses. The Neyman–Pearson model is characterized by the assumptions of random sampling, normally-distributed population(s), and homogeneity of variances, where appropriate.

While the Neyman–Pearson population model will be familiar to most readers and needs little or no introduction, the Fisher–Pitman permutation model of statistical inference is less likely to be familiar to many readers. The Fisher–Pitman permutation model of statistical inference is named for R. A. Fisher (1890–1962) and E. J. G. Pitman (1897–1993). In contrast to conventional statistical tests based on the Neyman–Pearson population model, tests based on the Fisher–Pitman permutation model are distribution-free, entirely data-dependent, appropriate for nonrandom samples, provide exact probability values, and are ideal for small data sets. On the other hand, permutation statistical methods can be computationally intensive, often requiring many millions of calculations. Thus, for the Fisher–Pitman permutation model, exact and Monte Carlo permutation methods are described and compared. Under the Neyman–Pearson population model, squared Euclidean scaling functions are mandated, while under the Fisher–Pitman permutation model, ordinary Euclidean scaling functions are shown to provide robust alternatives to conventional squared Euclidean scaling functions.

Chapter 4 presents an introduction to measures of central tendency and variability; specifically, the mode, median, and mean for central tendency and the standard deviation and mean absolute deviation for variability.
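The contrast drawn in the Chapter 3 summary between exact and Monte Carlo permutation methods can be illustrated with a small sketch. It is written in Python purely for illustration and is not one of the book's R scripts. The function names, the absolute difference in means as test statistic, and the default of 10,000 resamples are all assumptions made for this example. The exact version enumerates every way of splitting the pooled data into two groups, which becomes expensive quickly; the Monte Carlo version approximates the same probability value from random shuffles.

```python
import itertools
import random
from statistics import mean


def mean_diff(x, y):
    """Illustrative test statistic (an assumption): absolute difference in means."""
    return abs(mean(x) - mean(y))


def exact_p(x, y):
    """Exact probability value: enumerate every split of the pooled data."""
    pooled = x + y
    n = len(x)
    observed = mean_diff(x, y)
    count = total = 0
    for idx in itertools.combinations(range(len(pooled)), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if mean_diff(gx, gy) >= observed - 1e-12:
            count += 1
    return count / total


def monte_carlo_p(x, y, n_resamples=10000, seed=0):
    """Monte Carlo probability value: sample random arrangements instead."""
    rng = random.Random(seed)
    pooled = x + y  # new list, so the inputs are not modified
    n = len(x)
    observed = mean_diff(x, y)
    count = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if mean_diff(pooled[:n], pooled[n:]) >= observed - 1e-12:
            count += 1
    return count / n_resamples
```

With two groups of three values the exact version visits only 20 arrangements, but the count grows combinatorially with sample size, which is where a Monte Carlo approximation becomes the practical choice.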
Special attention is paid to the mean as a minimizing function for the sum of squared deviations and to the median as a minimizing function for the sum of absolute deviations. Finally, an alternative approach based on paired-squared differences between values is described for the standard deviation and variance. Chapter 4 presents six interactive R scripts for calculating the various measures of central tendency and variability under the Neyman–Pearson population model and the Fisher–Pitman permutation model.

Chapter 5 presents an introduction to the permutation analysis of one-sample tests. In general, one-sample tests attempt to invalidate a hypothesized value of a population parameter, such as a population mean. Under the Neyman–Pearson population model, Student’s conventional one-sample t test is described. Under the Fisher–Pitman model, a permutation alternative to Student’s one-sample t test is presented.

The measurement of effect size (the clinical significance, in contrast to the statistical significance, of a test) has become increasingly important in recent years, with many journals requiring measures of effect size in addition to the usual tests of significance. Two conventional measures of effect size are described in Chap. 5 under the Neyman–Pearson population model: Cohen’s $\hat{d}$ and Pearson’s $r^2$. A permutation-based, chance-corrected $\Re$ measure of effect size for one-sample tests based on permutation test statistic $\delta$ is presented and compared with the two conventional measures of effect size under the Neyman–Pearson population model. Finally, a one-sample permutation test for rank scores is developed and compared with Wilcoxon’s signed-ranks test. Chapter 5 presents 10 interactive R scripts for the various analyses of one-sample data, both conventional analyses under the Neyman–Pearson population model and exact and Monte Carlo analyses under the Fisher–Pitman permutation model.

Chapter 6 introduces permutation-based tests of differences for two independent samples.
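The Chapter 4 summary above mentions the mean and median as minimizing functions, and a paired-squared-differences route to the variance. A minimal numerical check, written in Python for illustration rather than taken from the book's R scripts, is sketched below. The data values, the search grid, and all function names are hypothetical. The pairwise formula used here, the sum of squared differences over all pairs divided by n(n − 1), is the standard identity for the sample variance and is assumed to match the approach the chapter describes.

```python
from itertools import combinations
from statistics import mean, median, variance


def pairwise_variance(values):
    """Sample variance from squared differences between all pairs of values."""
    n = len(values)
    total = sum((a - b) ** 2 for a, b in combinations(values, 2))
    return total / (n * (n - 1))


def sum_sq(values, c):
    """Sum of squared deviations about a candidate center c."""
    return sum((v - c) ** 2 for v in values)


def sum_abs(values, c):
    """Sum of absolute deviations about a candidate center c."""
    return sum(abs(v - c) for v in values)


# Hypothetical data and a coarse grid of candidate centers from 0.00 to 15.00.
data = [2.0, 3.0, 5.0, 7.0, 13.0]
grid = [i / 100 for i in range(0, 1501)]

# The grid point minimizing each criterion should match the mean and the median.
best_sq = min(grid, key=lambda c: sum_sq(data, c))
best_abs = min(grid, key=lambda c: sum_abs(data, c))
```

On this data set the grid search lands on the mean for squared deviations and on the median for absolute deviations, and `pairwise_variance(data)` agrees with `statistics.variance(data)`.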
Two-sample tests are specifically designed to test for experimental differences between two groups, such as a control group and a treatment group. Under the Neyman–Pearson population model, Student’s conventional two-sample t test is described. Under the Fisher–Pitman permutation model, an alternative test for two independent samples is presented. Four conventional measures of effect size for two-sample tests are described under the Neyman–Pearson population model: Cohen’s $\hat{d}$, Pearson’s $r^2$, Kelley’s $\epsilon^2$, and Hays’ $\hat{\omega}^2$. A permutation-based, chance-corrected measure of effect size for two-sample tests is presented and compared with the four conventional measures of effect size. Finally, a two-sample permutation test for rank-score data is developed and compared with the conventional Wilcoxon–Mann–Whitney two-sample rank-sum test. Chapter 6 presents seven interactive R scripts for the various analyses of two-sample data, both conventional analyses under the Neyman–Pearson population model and exact and Monte Carlo analyses under the Fisher–Pitman permutation model.

Chapter 7 introduces permutation tests of differences for two matched samples, often called matched-pairs tests. Matched-pairs tests are designed to test for experimental differences between two matched samples such as twin studies or the same sample at two time periods, i.e., before-and-after research designs. Under the Neyman–Pearson population model, Student’s conventional matched-pairs t test is described. Under the Fisher–Pitman permutation model, an alternative matched-pairs permutation test is presented. Two conventional measures of effect size are described under the Neyman–Pearson population model: Cohen’s $\hat{d}$ and Pearson’s $r^2$. A permutation-based, chance-corrected measure of effect size for matched-pairs is presented and compared with the two conventional measures of effect size. Finally, a matched-pairs permutation test for rank-score data is developed and compared with Wilcoxon’s signed-ranks test.
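As a rough illustration of the matched-pairs idea summarized for Chapter 7, the sketch below enumerates every assignment of signs to the within-pair differences, which is the usual way an exact matched-pairs permutation test is built. It is written in Python for illustration and is not the book's R script. The function name and the absolute sum of differences as test statistic are assumptions for this example and are not claimed to match the book's own test statistic.

```python
from itertools import product


def matched_pairs_exact_p(before, after):
    """Exact two-sided probability value from all 2**n sign assignments."""
    diffs = [a - b for a, b in zip(after, before)]
    observed = abs(sum(diffs))
    count = total = 0
    # Under the null hypothesis each difference is equally likely to carry
    # either sign, so every sign pattern is one equally likely arrangement.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            count += 1
    return count / total
```

For four pairs there are only 16 arrangements; because the number doubles with each added pair, larger samples would call for a Monte Carlo sample of sign patterns instead of full enumeration.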
Chapter 7 presents seven interactive R scripts for the various analyses of matched-pairs data, both conventional analyses under the Neyman–Pearson population model and exact and Monte Carlo analyses under the Fisher–Pitman permutation model.

Chapter 8 introduces permutation tests of differences for multiple independent samples, often called fully- or completely-randomized analysis of variance designs. Completely-randomized designs test for experimental differences among multiple treatment groups, such as color preferences or taste tests in experimental designs or political parties or religious denominations in survey designs. Under the Neyman–Pearson population model, Fisher’s conventional completely-randomized F test is described and, under the Fisher–Pitman permutation model, an alternative completely-randomized test is presented.

Five conventional measures of effect size for multiple samples are described under the Neyman–Pearson population model in Chap. 8: Cohen’s $\hat{d}$, Pearson’s $r^2$, Kelley’s $\hat{\eta}^2$, Hays’ $\hat{\omega}^2_F$ for fixed-effects models, and Hays’ $\hat{\omega}^2_R$ for random-effects models. A permutation-based, chance-corrected measure of effect size for multiple independent samples is presented and compared with the five conventional measures of effect size. Finally, a multi-sample permutation test for rank-score data is developed and compared with the Kruskal–Wallis one-way analysis of variance for ranks test. Chapter 8 presents six interactive R scripts for the various analyses of completely-randomized data, both conventional asymptotic analyses under the Neyman–Pearson population model and Monte Carlo analyses under the Fisher–Pitman permutation model.

Chapter 9 introduces permutation tests of differences for multiple matched samples, often called randomized-blocks designs. Randomized-blocks designs test for experimental differences among the same or matched subjects over multiple treatments. Under the Neyman–Pearson population model, Fisher’s conventional randomized-blocks F test is described.
Under the Fisher–Pitman permutation model, an alternative randomized-blocks test is presented. Two conventional measures of effect size are described under the Neyman–Pearson population model: Hays’ $\hat{\omega}^2$ and Pearson’s $\eta^2$. A permutation-based, chance-corrected measure of effect size for multiple matched pairs is presented and compared with the two conventional measures of effect size. Finally, a multi-sample permutation test for rank-score data is developed and compared with Friedman’s two-way analysis of variance for ranks. Chapter 9 presents six interactive R scripts for the various analyses of randomized-blocks data, both conventional analyses under the Neyman–Pearson population model and Monte Carlo analyses under the Fisher–Pitman permutation model.

Chapter 10 introduces permutation tests for simple linear regression and correlation. Under the Neyman–Pearson population model, Pearson’s conventional product-moment correlation coefficient is described. Under the Fisher–Pitman permutation model, an alternative test of correlation is presented. The conventional measure of effect size for correlation data is Pearson’s $r^2_{xy}$ coefficient of determination for variables x and y. A permutation-based, chance-corrected measure of effect size is presented and compared with Pearson’s $r^2_{xy}$ measure. Finally, a permutation test for rank-score correlation is developed and compared with Spearman’s rank-order correlation coefficient, Kendall’s $\tau_a$ and $\tau_b$ measures of ordinal association, and Spearman’s footrule measure of agreement. Chapter 10
