Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

Published in Canada by Oxford University Press
8 Sampson Mews, Suite 204, Don Mills, Ontario M3C 0H5 Canada
www.oupcanada.com

Copyright © Oxford University Press Canada 2017
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)

First Edition published in 2008
Second Edition published in 2013

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Permissions Department at the address above or through the following URL: www.oupcanada.com/permission/permission_request.php

Every effort has been made to determine and contact copyright holders. In the case of any omissions, the publisher will be pleased to make suitable acknowledgement in future editions.

Library and Archives Canada Cataloguing in Publication

Haan, Michael, 1974–, author
An introduction to statistics for Canadian social scientists / Michael Haan and Jenny Godley. -- Third edition.

Includes bibliographical references and index.
ISBN 978–0–19–902059–1 (paperback)

1. Social sciences--Statistical methods--Textbooks. I. Godley, Jenny, author II. Title.

HA29.H22 2016    300.72’7    C2016-902883-6

Cover image: © iStock/sorbetto

Oxford University Press is committed to our environment.
Wherever possible, our books are printed on paper which comes from responsible sources.

Printed and bound in the United States of America
1 2 3 4 -- 20 19 18 17

Contents

Boxes
Preface
Highlights of the Third Edition
About the Authors
Acknowledgements

Part I | Introduction and Univariate Statistics

1 Why Should I Want to Learn Statistics?
  Learning Objectives
  Introduction
  Why Do So Many People Dislike Statistics?
  When Did People Start to Think Statistically?
  If I Don’t Plan to Use Statistics in My Career, Should I Still Learn about Them?
  Organization of This Book
  Conclusion
  Glossary Terms

2 How Much Math Do I Need to Learn Statistics?
  Learning Objectives
  BEDMAS and the Order of Operations
  Fractions and Decimals
  Exponents
  Logarithms
  Data, Variables, and Observations
  Levels of Measurement
  When Four Levels of Measurement Become Three . . . or Even Two
  Conclusion
  Glossary Terms
  Practice Questions

3 Univariate Statistics
  Learning Objectives
  Frequencies
  Translating Frequencies
  Rules for Creating Bar Charts
  Rates and Ratios
  Percentages and Percentiles
  Conclusion
  Glossary Terms
  Practice Questions

4 Introduction to Probability
  Learning Objectives
  Introduction
  Some Necessary Terminology
  Sample Space
  Random Variables
  Trials and Experiments
  The Law of Large Numbers
  Types of Probabilities
  Empirical versus Theoretical Probabilities
  Discrete Probabilities
  The Probability of Unrelated Events
  The Probability of Related Events
  Mutually Exclusive Probabilities
  Non–Mutually Exclusive Probabilities
  Continuous Probabilities
  Conclusion
  Glossary Terms
  Practice Questions

5 The Normal Curve
  Learning Objectives
  The History of the Normal (Gaussian) Distribution
  Illustrating the Normal Curve
  Some Useful Terms for Describing Distributions
  Conclusion
  Glossary Terms
  Practice Questions

6 Measures of Central Tendency and Dispersion
  Learning Objectives
  Introduction
  Measures of Central Tendency
  Mode
  Median
  Mean
  Measures of Variability
  Range
  Mean Deviation
  Variance and the Standard Deviation
  Conclusion
  Glossary Terms
  Practice Questions
  Note

7 Standard Deviations, Standard Scores, and the Normal Distribution
  Learning Objectives
  Introduction
  How Does the Standard Deviation Relate to the Normal Curve?
  More on the Normal Distribution
  An Extension of the Standard Deviation: The Standard Score
  One-Tailed Assessments
  Probabilities and the Normal Distribution
  Conclusion
  Glossary Terms
  Practice Questions

8 Sampling
  Learning Objectives
  Introduction
  Probability Samples
  Simple Random Sample
  Systematic Random Sample
  Stratified/Hierarchical Random Sample
  Cluster Sample
  Non-Probability/Non-Random Sampling Strategies
  Convenience Sample
  Snowball Sample
  Quota Sample
  Sampling Error
  Tips for Reducing Sampling Error
  Conclusion
  Glossary Terms
  Practice Questions

9 Generalizing from Samples to Populations
  Learning Objectives
  Introduction
  The Sample Distribution of Means and the Central Limit Theorem
  Confidence Intervals
  The t-Distribution
  What Is a Degree of Freedom?
  One-Tailed versus Two-Tailed Estimates
  The Sample Distribution of Proportions
  Using Degrees of Freedom and the t-Distribution to Estimate Population Proportions
  The Binomial Distribution
  Conclusion
  Glossary Terms
  Practice Questions

Part II | Bivariate Statistics

10 Testing Hypotheses: Comparing Large and Small Samples to a Known Population
  Learning Objectives
  Introduction
  What’s a Hypothesis?
  One-Tailed and Two-Tailed Hypothesis Tests
  The Return of Gosset: Student’s t-Distribution
  Hypothesis Testing with One Small Sample and a Population
  Calculating Confidence Intervals in the One-Sample Case
  Single Sample Proportions
  Measuring Association with the Same Group Measured Twice
  Conclusion
  Glossary Terms
  Practice Questions

11 Testing Hypotheses: Comparing Two Samples
  Learning Objectives
  Introduction
  The Standard Error of the Difference between Means
  Comparing Proportions with Two Samples
  One- and Two-Tailed Tests, Again
  Conclusion
  Glossary Terms
  Practice Questions

12 Bivariate Statistics for Nominal Data
  Learning Objectives
  Introduction
  Analysis with Two Nominal Variables
  The Chi-Square Test of Statistical Significance
  Measures of Association for Nominal Data
  Phi
  Cramer’s V
  The Proportional Reduction of Error: Lambda
  Conclusion
  Glossary Terms
  Practice Questions

13 Bivariate Statistics for Ordinal Data
  Learning Objectives
  Introduction
  Contingency Tables/Cross-Tabulations
  Kruskal’s Gamma (γ)
  Somers’ d
  Kendall’s Tau-b
  Spearman’s rho
  What about Statistical Significance?
  Conclusion: Which One to Use?
  Glossary Terms
  Practice Questions

14 Bivariate Statistics for Interval/Ratio Data
  Learning Objectives
  Introduction
  Pearson’s r: The Correlation Coefficient
  A Rough Interpretation of r
  A Visual Representation of r
  What r Tells Us about Explained Variance
  A More Precise Interpretation of r
  The Correlation Matrix
  Using a t-Test to Assess the Significance of r
  What to Do When Your Independent and Dependent Variables Are Measured at Different Levels of Measurement
  Measuring Association between Interval/Ratio and Nominal or Ordinal Variables: Using the Lowest Common Measure of Association
  Conclusion
  Glossary Terms
  Practice Questions

15 One-Way Analysis of Variance
  Learning Objectives
  Introduction
  What Is ANOVA?
  The Sum of Squares: An Easier Way
  The F-Distribution
  Is This New?
  Limitations of ANOVA
  Conclusion
  Glossary Terms
  Practice Questions

Part III | Multivariate Techniques

16 Regression 1: Modelling Continuous Outcomes
  Learning Objectives
  Introduction
  Ordinary Least-Squares Regression: The Idea
  Onward from Bivariate Correlation: Multivariate Analysis
  Regression: The Formula
  Multiple Regression
  Standardized Partial Slopes (Beta Weights)
  The Multiple Correlation Coefficient
  Requirements/Assumptions of Ordinary Least Squares Regression
  Creating and Working with Dummy Variables
  Interpreting Dummy Variable Coefficients
  Inference and Regression
  Conclusion: A Final Note on OLS Regression
  Glossary Terms
  Practice Questions