Introduction to Statistical Modelling and Inference

The complexity of large-scale data sets (“Big Data”) has stimulated the development of advanced computational methods for analysing them. These methods are of two kinds: model-based methods, which use probability models together with likelihood and Bayesian theory, and model-free methods, which require no probability model, likelihood or Bayesian theory. The two approaches rest on different philosophical principles of probability theory, espoused by the famous statisticians Ronald Fisher and Jerzy Neyman.

Introduction to Statistical Modelling and Inference covers simple experimental and survey designs, and probability models up to and including generalised linear (regression) models and some extensions of these, including finite mixtures. A wide range of examples from different application fields is also discussed and analysed. No special software is used, beyond that needed for maximum likelihood analysis of generalised linear models. Students are expected to have a basic mathematical background in algebra, coordinate geometry and calculus.

Features

• Probability models are developed from the shape of the sample empirical cumulative distribution function (cdf) or a transformation of it.
• Bounds for the value of the population cumulative distribution function are obtained from the Beta distribution at each point of the empirical cdf.
• Bayes’s theorem is developed from the properties of the screening test for a rare condition.
• The multinomial distribution provides an always-true model for any randomly sampled data.
• The model-free bootstrap method for finding the precision of a sample estimate has a model-based parallel – the Bayesian bootstrap – based on the always-true multinomial distribution.
• The Bayesian posterior distributions of model parameters can be obtained from the maximum likelihood analysis of the model.

This book is aimed at students in a wide range of disciplines, including Data Science. It is based on the model-based theory, used widely by scientists in many fields, and compares it, in less detail, with the model-free theory, popular in computer science, machine learning and official survey analysis. The exposition of the model-based theory is streamlined by recent advances in Bayesian analysis.

Murray Aitkin earned his BSc, PhD and DSc in Mathematical Statistics from Sydney University, Australia. Dr Aitkin completed his post-doctoral work at the Psychometric Laboratory, University of North Carolina, Chapel Hill. He has held teaching/lecturing positions at Virginia Polytechnic Institute, the University of New South Wales and Macquarie University, along with research professor positions at Lancaster University (three years, UK Social Science Research Council) and the University of Western Australia (five years, Australian Research Council). He has been a Professor of Statistics at Lancaster University, Tel Aviv University and the University of Newcastle. He has been a visiting researcher and has held consulting positions at the Educational Testing Service (Fulbright Senior Fellow 1971–1972 and Senior Statistician 1988–1989). From 2000 to 2002 he was Chief Statistician at the Education Statistics Services Institute, American Institutes for Research, Washington DC, and advisor to the National Center for Education Statistics, US Department of Education.
He is a Fellow of the American Statistical Association, an Elected Member of the International Statistical Institute, and an Honorary Member of the Statistical Modelling Society. He is an Honorary Professorial Associate at the University of Melbourne (Department of Psychology 2004–2008, Department [now School] of Mathematics and Statistics 2008–present).

Introduction to Statistical Modelling and Inference
Murray Aitkin
University of Melbourne, Australia

First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2023 Murray Aitkin

CRC Press is an imprint of Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.

ISBN: 9781032105710 (hbk)
ISBN: 9781032105734 (pbk)
ISBN: 9781003216025 (ebk)

DOI: 10.1201/9781003216025

Typeset in CMR by Deanta Global Publishing Services, Chennai, India

Contents

Preface
1 Introduction
    1.1 What is Statistical Modelling?
    1.2 What is Statistical Analysis?
    1.3 What is Statistical Inference?
2 What is (or are) Big Data?
3 Data and research studies
    3.1 Lifetimes of radio transceivers
    3.2 Clustering of V1 missile hits in South London
    3.3 Court case on vaccination risk
    3.4 Clinical trial of Depepsen for the treatment of duodenal ulcers
    3.5 Effectiveness of treatments for respiratory distress in newborn babies
    3.6 Vitamin K
    3.7 Species counts
    3.8 Toxicology in small animal experiments
    3.9 Incidence of Down’s syndrome in four regions
    3.10 Fish species in lakes
    3.11 Absence from school
    3.12 Hostility in husbands of suicide attempters
    3.13 Tolerance of racial intermarriage
    3.14 Hospital bed use
    3.15 Dugong growth
    3.16 Simulated motorcycle collision
    3.17 Global warming
    3.18 Social group membership
4 The StatLab database
    4.1 Types of variables
    4.2 StatLab population questions
5 Sample surveys – should we believe what we read?
    5.1 Women and Love
    5.2 Would you have children?
    5.3 Representative sampling
    5.4 Bias in the Newsday sample
    5.5 Bias in the Women and Love sample
6 Probability
    6.1 Relative frequency
    6.2 Degree of belief
    6.3 StatLab dice sampling
    6.4 Computer sampling
        6.4.1 Natural random processes
    6.5 Probability for sampling
        6.5.1 Extrasensory perception
        6.5.2 Representative sampling
    6.6 Probability axioms
        6.6.1 Dice example
        6.6.2 Coin tossing
    6.7 Screening tests and Bayes’s theorem
    6.8 The misuse of probability in the Sally Clark case
    6.9 Random variables and their probability distributions
        6.9.1 Definitions
    6.10 Sums of independent random variables
7 Statistical inference I – discrete distributions
    7.1 Evidence-based policy
    7.2 The basis of statistical inference
    7.3 The survey sampling approach
    7.4 Model-based inference theories
    7.5 The likelihood function
    7.6 Binomial distribution
        7.6.1 The binomial likelihood function
            7.6.1.1 Sufficient and ancillary statistics
            7.6.1.2 The maximum likelihood estimate (MLE)
    7.7 Frequentist theory
        7.7.1 Parameter transformations
        7.7.2 Ambiguity of notation
    7.8 Bayesian theory
        7.8.1 Bayes’s theorem
        7.8.2 Summaries of the posterior distribution
        7.8.3 Conjugate prior distributions
        7.8.4 Improving frequentist interval coverage
        7.8.5 The bootstrap
        7.8.6 Non-informative prior rules
        7.8.7 Frequentist objections to flat priors
        7.8.8 General prior specifications
        7.8.9 Are parameters really just random variables?
    7.9 Inferences from posterior sampling
        7.9.1 The precision of posterior draws
    7.10 Sample design
    7.11 Parameter transformations
    7.12 The Poisson distribution
        7.12.1 Poisson likelihood and ML
        7.12.2 Bayesian inference
        7.12.3 Prediction of a new Poisson value
        7.12.4 Side effect risk
            7.12.4.1 Frequentist analysis
            7.12.4.2 Bayesian analysis
        7.12.5 A two-parameter binomial distribution
            7.12.5.1 Frequentist analysis
            7.12.5.2 Bayesian analysis
    7.13 Categorical variables
        7.13.1 The multinomial distribution
    7.14 Maximum likelihood
    7.15 Bayesian analysis
        7.15.1 Posterior sampling
        7.15.2 Sampling without replacement
8 Comparison of binomials
    8.1 Definition
    8.2 Example – RCT of Depepsen for the treatment of duodenal ulcers
        8.2.1 Frequentist analysis: confidence interval
        8.2.2 Bayesian analysis: credible interval
    8.3 Monte Carlo simulation
    8.4 RCT continued
    8.5 Bayesian hypothesis testing/model comparison
        8.5.1 The null and alternative hypotheses, and the two models
    8.6 Other measures of treatment difference
        8.6.1 Frequentist analysis: hypothesis testing
        8.6.2 How are the hypothetical samples to be drawn?
        8.6.3 Conditional testing
    8.7 The ECMO trials
        8.7.1 The first trial
        8.7.2 Frequentist analysis
        8.7.3 The likelihood
            8.7.3.1 Bayesian Analysis
        8.7.4 The second ECMO study
9 Data visualisation
    9.1 The histogram
    9.2 The empirical mass and cumulative distribution functions
    9.3 Probability models for continuous variables
10 Statistical inference II – the continuous exponential, Gaussian and uniform distributions
    10.1 The exponential distribution
    10.2 The exponential likelihood
    10.3 Frequentist theory
        10.3.1 Parameter transformations
        10.3.2 Frequentist asymptotics
    10.4 Bayesian theory
        10.4.1 Conjugate priors
    10.5 The Gaussian distribution
    10.6 The Gaussian likelihood function
    10.7 Frequentist inference
    10.8 Bayesian inference
        10.8.1 Prior arguments
    10.9 Hypothesis testing
    10.10 Frequentist hypothesis testing
        10.10.1 µ₁ vs µ₂
        10.10.2 µ₀ vs µ ≠ µ₀
    10.11 Bayesian hypothesis testing
        10.11.1 µ₁ vs µ₂
        10.11.2 µ₀ vs µ ≠ µ₀
            10.11.2.1 Use the credible interval
            10.11.2.2 Use the likelihood ratio
            10.11.2.3 The integrated likelihood
    10.12 Pivotal functions
    10.13 Conjugate priors
    10.14 The uniform distribution
        10.14.1 The location-shifted uniform distribution
11 Statistical Inference III – two-parameter continuous distributions
    11.1 The Gaussian distribution
    11.2 Frequentist analysis
    11.3 Bayesian analysis
        11.3.1 Inference for σ
        11.3.2 Inference for µ
            11.3.2.1 Simulation marginalisation
        11.3.3 Parametric functions
        11.3.4 Prediction of a new observation
    11.4 The lognormal distribution
        11.4.1 The lognormal density
    11.5 The Weibull distribution
        11.5.1 The Weibull likelihood
        11.5.2 Frequentist analysis
        11.5.3 Bayesian analysis
        11.5.4 The extreme value distribution
        11.5.5 Median Rank Regression (MRR)
        11.5.6 Censoring
    11.6 The gamma distribution
    11.7 The gamma likelihood
        11.7.1 Frequentist analysis
        11.7.2 Bayesian analysis
12 Model assessment
    12.1 Gaussian model assessment
    12.2 Lognormal model assessment
    12.3 Exponential model assessment
    12.4 Weibull model assessment
    12.5 Gamma model assessment
13 The multinomial distribution
    13.1 The multinomial likelihood
    13.2 Frequentist analysis
    13.3 Bayesian analysis
    13.4 Criticisms of the Haldane prior
        13.4.1 The Dirichlet process prior
        13.4.2 Posterior sampling
    13.5 Inference for multinomial quantiles
    13.6 Dirichlet posterior weighting
    13.7 The frequentist bootstrap
        13.7.1 Two-category sample
    13.8 Stratified sampling and weighting
14 Model comparison and model averaging
    14.1 Comparison of two fully specified models
    14.2 General model comparison
        14.2.1 Known parameters
        14.2.2 Unknown parameters
    14.3 Posterior distribution of the likelihood
    14.4 The deviance
    14.5 Asymptotic distribution of the deviance
    14.6 Nested models
    14.7 Model choice and model averaging
15 Gaussian linear regression models
    15.1 Simple linear regression
        15.1.1 Vitamin K
    15.2 Model assessment through residual examination
    15.3 Likelihood for the simple linear regression model
    15.4 Maximum likelihood
        15.4.1 Vitamin K example
    15.5 Bayesian and frequentist inferences
    15.6 Model-robust analysis
        15.6.1 The robust variance estimate
    15.7 Correlation and prediction
        15.7.1 Correlation
        15.7.2 Prediction
        15.7.3 Example
        15.7.4 Prediction as a model assessment tool
    15.8 Probability model assessment
    15.9 “Dummy variable” regression
    15.10 Two-variable models
    15.11 Model assumptions
    15.12 The p-variable linear model
    15.13 The Gaussian multiple regression likelihood
        15.13.1 Absence from school
    15.14 Interactions
        15.14.1 ANOVA, ANCOVA and MR
            15.14.1.1 ANOVA
            15.14.1.2 Backward elimination
            15.14.1.3 ANCOVA
    15.15 Ridge regression, the Lasso and the “elastic net”
    15.16 Modelling boy birthweights
    15.17 Modelling girl intelligence at age ten and family income
    15.18 Modelling of the hostility data