Jonathon D. Brown

Advanced Statistics for the Behavioral Sciences

A Computational Approach with R
Jonathon D. Brown
Department of Psychology
University of Washington
Seattle, WA, USA
ISBN 978-3-319-93547-8    ISBN 978-3-319-93549-2 (eBook)
https://doi.org/10.1007/978-3-319-93549-2
Library of Congress Control Number: 2018950841
© Springer Nature Switzerland AG 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
“My thinking is first and last and always for the sake of my doing.”
—William James
As insightful as he was, William James was not referring to the twenty-first-century relation between computer-generated statistical analyses and scientific research. Nevertheless, his insistence that thinking is always for doing speaks to that association. In bygone days, statisticians were researchers—pursuing their own line of inquiry or employed by companies to identify productive practices—and the statistical analyses they developed were tools to help them understand the phenomena they were studying. Today, statistical analyses are increasingly developed and refined by individuals who have received training in computer science, and their expertise lies in writing efficient and elegant computer code. As a result, ordinary researchers who lack a background in computer programming are asked to accept on faith the black-box output that emerges from the sophisticated statistical models they increasingly use.
This book is designed to bridge the gap between computer science and research application. Many of the analyses are advanced (e.g., regularization and the lasso, numerical optimization with the Nelder-Mead simplex, and mixed modeling with penalized least squares), but the presentation is relaxed, with an emphasis on understanding where the numbers come from and how they can be interpreted. In short, the focus is on “thinking for the sake of doing.”
Organization
The book is divided into three sections.

Linear algebra
1. Linear equations
2. Least squares estimation
3. Linear regression
4. Eigen decomposition
5. Singular value decomposition

Bias and efficiency
6. Generalized least squares
7. Robust regression
8. Model selection and shrinkage estimators
9. Cubic splines and additive models

Nonlinear models
10. Optimization and nonlinear least squares
11. Generalized linear models
12. Survival analysis
13. Time-series analysis
14. Mixed-effects models
I begin with linear algebra for two reasons. First, and most obviously, linear algebra underlies most statistical analyses; second, understanding the mathematical operations involved in Gaussian elimination and backward substitution provides a basis for understanding how modern statistical software packages approach statistical analyses (e.g., why the QR decomposition is used to solve linear regression problems). An emphasis on numerical analysis, which occurs throughout the text, represents one of the book’s most distinctive features.
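To give a concrete flavor of that emphasis, the sketch below solves a small linear system with Gaussian elimination, partial pivoting, and backward substitution in ℛ. It is my illustration of the ideas just named, not a function reproduced from the book, and the name gauss.solve is mine.

gauss.solve <- function(A, b) {
  n  <- nrow(A)
  Ab <- cbind(A, b)                             # augmented matrix [A | b]
  for (k in 1:(n - 1)) {
    p <- which.max(abs(Ab[k:n, k])) + k - 1     # partial pivoting: largest entry in column k
    if (p != k) Ab[c(k, p), ] <- Ab[c(p, k), ]  # swap rows k and p
    for (i in (k + 1):n) {
      m       <- Ab[i, k] / Ab[k, k]            # elimination multiplier
      Ab[i, ] <- Ab[i, ] - m * Ab[k, ]          # zero the entry below the pivot
    }
  }
  x <- numeric(n)
  for (i in n:1) {                              # backward substitution
    s    <- if (i < n) sum(Ab[i, (i + 1):n] * x[(i + 1):n]) else 0
    x[i] <- (Ab[i, n + 1] - s) / Ab[i, i]
  }
  x
}

A <- matrix(c(2, 1, -1, -3, -1, 2, -2, 1, 2), nrow = 3, byrow = TRUE)
b <- c(8, -11, -3)
gauss.solve(A, b)                               # 2 3 -1, matching solve(A, b)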
Using ℛ
All of the analyses in this book were performed using ℛ, a free programming language and software environment for statistical computing and graphics that can be downloaded at http://www.r-project.org. However, instead of relying on canned functions or user-created packages that must be downloaded and installed, I have provided my own code so that readers can see for themselves how the analyses are performed. Moreover, each analysis uses a small (n = 12) data set to encourage readers to track the operations “in real time,” with each data set telling a coherent story with interpretable results.
The codes I have included are not intended to supplant packaged ones in ℛ. Instead, they are offered as a pedagogical tool, designed to demystify the operations that underlie each analysis. Toward that end, they are written with an eye toward simplicity, occupying no more than one manuscript page of text. Few of them contain checks for anomalous cases, so they should be used only for the particular analyses for which they are intended. At the end of each section, the relevant functions available in ℛ are identified, ensuring that readers can see how each analysis is performed and have access to the state-of-the-art code that is properly used for each statistical model.
Most of the codes are contained within each chapter, allowing readers to copy and paste them into ℛ while they are working through the problems in the book. Occasionally a code is called from a previous chapter, in which case I have specified a folder location, 'C:\\ASBS\\code.R' (Advanced Statistics for the Behavioral Sciences), as a placeholder. I have not, however, created an ℛ package for the codes, as they are meant to be used only for the problems within the book.
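As a hypothetical example of how such a call looks, the line below uses the placeholder path just described; substitute the folder where you have saved the file.

source('C:\\ASBS\\code.R')   # load a code from an earlier chapter before reusing it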
Intended Audience
This book is intended for graduate students in the behavioral sciences who have taken an introductory graduate-level course. It consists of 14 chapters, making it suitable for a 15-week semester or a 10-week quarter. This book should also be of interest to intellectually curious researchers who have been using a particular statistical method in their research (e.g., mixed-effects models) without fully understanding the mathematics behind the approach. My hope is that researchers will more readily embrace advanced statistical analyses once the underlying operations have been illuminated.
Seattle, WA, USA
Jonathon D. Brown
Contents
Part I  Linear Algebra

1  Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  3
   1.1  Row Reduction Methods . . . . . . . . . . . . . . . . . . . . . . . . .  4
        1.1.1   Gaussian Elimination . . . . . . . . . . . . . . . . . . . . .  5
        1.1.2   Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
        1.1.3   R Code: Gaussian Elimination and Backward Substitution . . . .  8
        1.1.4   Gauss-Jordan Elimination . . . . . . . . . . . . . . . . . . .  9
        1.1.5   LU Decomposition . . . . . . . . . . . . . . . . . . . . . . .  10
        1.1.6   R Code: LU Decomposition . . . . . . . . . . . . . . . . . . .  14
        1.1.7   Cholesky Decomposition . . . . . . . . . . . . . . . . . . . .  15
        1.1.8   R Code: Cholesky Decomposition of a Symmetric Matrix . . . . .  18
   1.2  Matrix Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . .  19
        1.2.1   Determinant . . . . . . . . . . . . . . . . . . . . . . . . . .  19
        1.2.2   R Code: Determinant . . . . . . . . . . . . . . . . . . . . . .  21
        1.2.3   Determinants and Linear Dependencies . . . . . . . . . . . . .  21
        1.2.4   R Code: Reduced Row Echelon Form and Linear Dependencies . . .  22
        1.2.5   Using the Determinant to Solve Linear Equations . . . . . . . .  23
        1.2.6   R Code: Cramer’s Rule . . . . . . . . . . . . . . . . . . . . .  24
        1.2.7   Matrix Inverse . . . . . . . . . . . . . . . . . . . . . . . .  24
        1.2.8   R Code: Calculate Inverse Using Reduced Row Echelon Form . . .  26
        1.2.9   Norms, Errors, and the Condition Number of a Matrix . . . . . .  26
        1.2.10  R Code: Condition Number and Norm Ratio . . . . . . . . . . . .  33
   1.3  Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . .  34
        1.3.1   Jacobi’s Method . . . . . . . . . . . . . . . . . . . . . . . .  34
        1.3.2   Gauss-Seidel Method . . . . . . . . . . . . . . . . . . . . . .  35
        1.3.3   Convergence . . . . . . . . . . . . . . . . . . . . . . . . . .  36
        1.3.4   R Code: Gauss-Seidel . . . . . . . . . . . . . . . . . . . . .  37
   1.4  Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . .  38
   References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  38

2  Least Squares Estimation . . . . . . . . . . . . . . . . . . . . . . . . . .  39
   2.1  Line of Best Fit . . . . . . . . . . . . . . . . . . . . . . . . . . .  39
        2.1.1   Deriving a Line of Best Fit . . . . . . . . . . . . . . . . . .  39
        2.1.2   Minimizing the Sum of Squared Differences . . . . . . . . . . .  41
        2.1.3   Normal Equations . . . . . . . . . . . . . . . . . . . . . . .  42
        2.1.4   Analytic Solution . . . . . . . . . . . . . . . . . . . . . . .  42
   2.2  Solving the Normal Equations . . . . . . . . . . . . . . . . . . . . .  43
        2.2.1   The QR Decomposition . . . . . . . . . . . . . . . . . . . . .  43
        2.2.2   Advantages of an Orthonormal System . . . . . . . . . . . . . .  44
        2.2.3   Hat Matrix . . . . . . . . . . . . . . . . . . . . . . . . . .  45
        2.2.4   Coefficients . . . . . . . . . . . . . . . . . . . . . . . . .  47
        2.2.5   Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . .  48
        2.2.6   R Code: QR Solver . . . . . . . . . . . . . . . . . . . . . . .  49
   2.3  Performing the QR Decomposition . . . . . . . . . . . . . . . . . . . .  49
        2.3.1   Gram-Schmidt Orthogonalization . . . . . . . . . . . . . . . .  49
        2.3.2   R Code: QR Decomposition; Gram-Schmidt Orthogonalization . . .  53
        2.3.3   Givens Rotations . . . . . . . . . . . . . . . . . . . . . . .  54
        2.3.4   R Code: QR Decomposition; Givens Rotations . . . . . . . . . .  58
        2.3.5   Householder Reflections . . . . . . . . . . . . . . . . . . . .  58
        2.3.6   R Code: QR Decomposition; Householder Reflectors . . . . . . .  61
        2.3.7   Comparing the Decompositions . . . . . . . . . . . . . . . . .  61
        2.3.8   R Code: QR Decomposition Comparison . . . . . . . . . . . . . .  62
   2.4  Linear Regression and Its Assumptions . . . . . . . . . . . . . . . . .  62
        2.4.1   Linearity . . . . . . . . . . . . . . . . . . . . . . . . . . .  63
        2.4.2   Nature of the Variables . . . . . . . . . . . . . . . . . . . .  64
        2.4.3   Errors and Their Distribution . . . . . . . . . . . . . . . . .  65
        2.4.4   Regression Coefficients . . . . . . . . . . . . . . . . . . . .  67
   2.5  OLS Estimation and the Gauss-Markov Theorem . . . . . . . . . . . . . .  67
        2.5.1   Proving the OLS Estimates Are Unbiased . . . . . . . . . . . .  68
        2.5.2   Proving the OLS Estimates Are Efficient . . . . . . . . . . . .  69
   2.6  Maximum Likelihood Estimation . . . . . . . . . . . . . . . . . . . . .  71
        2.6.1   Log Likelihood Function . . . . . . . . . . . . . . . . . . . .  71
        2.6.2   R Code: Maximum Likelihood Estimation . . . . . . . . . . . . .  74
   2.7  Beyond OLS Estimation . . . . . . . . . . . . . . . . . . . . . . . . .  74
   2.8  Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . .  75
   References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  76
3  Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  77
   3.1  Simple Linear Regression . . . . . . . . . . . . . . . . . . . . . . .  77
        3.1.1   Inspecting the Residuals . . . . . . . . . . . . . . . . . . .  79
        3.1.2   Describing the Model’s Fit to the Data . . . . . . . . . . . .  80
        3.1.3   Testing the Model’s Fit to the Data . . . . . . . . . . . . . .  80
        3.1.4   Variance Estimates . . . . . . . . . . . . . . . . . . . . . .  81
        3.1.5   Tests of Significance . . . . . . . . . . . . . . . . . . . . .  82
        3.1.6   Confidence Intervals . . . . . . . . . . . . . . . . . . . . .  83
        3.1.7   R Code: Confidence Interval Simulation . . . . . . . . . . . .  83
        3.1.8   Confidence Regions . . . . . . . . . . . . . . . . . . . . . .  83
        3.1.9   Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . .  84
        3.1.10  R Code: Simple Linear Regression . . . . . . . . . . . . . . .  87
        3.1.11  R Code: Simple Linear Regression: Graphs . . . . . . . . . . .  88
   3.2  Multiple Regression . . . . . . . . . . . . . . . . . . . . . . . . . .  89
        3.2.1   Regression Model . . . . . . . . . . . . . . . . . . . . . . .  90
        3.2.2   Regression Coefficients . . . . . . . . . . . . . . . . . . . .  92
        3.2.3   Variance Estimates, Significance Tests, and Confidence Intervals . . .  94
        3.2.4   Model Comparisons and Changes in R² . . . . . . . . . . . . . .  95
        3.2.5   Comparing Predictors . . . . . . . . . . . . . . . . . . . . .  97
        3.2.6   Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . .  98
        3.2.7   R Code: Multiple Regression . . . . . . . . . . . . . . . . . .  99
   3.3  Polynomials, Cross-Products, and Categorical Predictors . . . . . . . .  99
        3.3.1   Polynomial Regression . . . . . . . . . . . . . . . . . . . . .  100
        3.3.2   R Code: Polynomial Regression . . . . . . . . . . . . . . . . .  105
        3.3.3   Cross-Product Terms . . . . . . . . . . . . . . . . . . . . . .  105
        3.3.4   R Code: Cross-Product Terms and Simple Slopes . . . . . . . . .  109
        3.3.5   Johnson-Neyman Procedure . . . . . . . . . . . . . . . . . . .  110
        3.3.6   R Code: Johnson-Neyman Procedure . . . . . . . . . . . . . . .  111
        3.3.7   Categorical Predictors . . . . . . . . . . . . . . . . . . . .  111
        3.3.8   R Code: Contrast Codes for Categorical Predictors . . . . . . .  113
        3.3.9   Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . .  114
   3.4  Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . .  114
   References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  115

4  Eigen Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  117
   4.1  Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . .  117
        4.1.1   Eigenvector Multiplication . . . . . . . . . . . . . . . . . .  117
        4.1.2   The Characteristic Equation . . . . . . . . . . . . . . . . . .  119
        4.1.3   R Code: Eigen Decomposition of a 2 × 2 Matrix with Real Eigenvalues . . .  121
        4.1.4   Properties of a Diagonalized Matrix . . . . . . . . . . . . . .  121
   4.2  Eigenvalue Calculation . . . . . . . . . . . . . . . . . . . . . . . .  122
        4.2.1   Basic QR Algorithm . . . . . . . . . . . . . . . . . . . . . .  122
        4.2.2   R Code: QR Algorithm Using Gram-Schmidt Orthogonalization . . .  124