The PSPP Guide: An Introduction to Statistical Analysis
Second Edition
Christopher P. Halter
CreativeMinds Press Group
San Diego, CA

PSPP Guide
Copyright © 2017 by Christopher P. Halter
All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission of the CreativeMinds Press Group.
ISBN: 0692866043
ISBN-13: 978-0692866047

CONTENTS

Chapter 1 An Introduction to the Guide Second Edition
    Notes about the statistics guide
    Notes about the data
    The Philosophy Behind This Book and the Open Source Community
Chapter 2 Overview of Statistical Analysis in Social Science
    Why use statistics in Social Science research?
    What is Continuous and Categorical Data?
    Parametric versus Non-Parametric Data
    Confidence Intervals (CI)
    P-Value
    Effect Size
    Effect Size Calculations
Chapter 3 The PSPP Statistical Analysis Environment
    What is PSPP?
    Data Visualization
Chapter 4 Getting Started with PSPP
    Preparing the Data and Making Decisions
    Creating Your Variable Codebook
    Creating Variable/Data Names in PSPP
    Entering Data Directly into PSPP
    Opening Data Files with PSPP (.sav files)
    Importing Data Files into PSPP from a Spreadsheet (.ods files)
Chapter 5 Descriptive Statistics
    What are descriptive statistics?
    Creating Descriptive Statistics in PSPP for Categorical Data
    Creating Visual Representations for Categorical Data
    Creating Descriptive Statistics in PSPP for Continuous Data
    Creating Visual Representations for Continuous Data
    Exploring the Data
Chapter 6 Graphs: Scatterplot, Histogram, Barchart
    Scatterplots
    Histograms
    Barcharts
Chapter 7 Relationship Analysis with Chi-Square
    Chi-Square Analysis (Categorical Differences)
    Using the Chi-Square Function in PSPP
    Interpreting Output Tables: Chi Square
    Chi-Square Crosstabs Table Analysis
Chapter 8 Relationship Analysis with t-Test
    t-Test Analysis (Continuous Differences, two groups)
    One Sample t-Test using PSPP
    Independent Samples t-Test using PSPP
    Paired Samples t-Test using PSPP
Chapter 9 Relationship Analysis with ANOVA
    Analysis of Variance (ANOVA)
    One-Way ANOVA
    Interpreting Output Tables: One-Way ANOVA
    Introduction to Planned Contrasts
    Conducting One-Way ANOVA with Planned Contrasts
    ANOVA with Planned Contrasts for Orthogonal Polynomial Trends
    Analyzing the ANOVA Output Tables for Orthogonal Polynomial Trends
Chapter 10 Univariate Analysis: General Linear Model (GLM)
    Using Univariate Analysis for the General Linear Model (GLM)
    Using Univariate Analysis for Two-Way (Factorial) ANOVA
Chapter 11 Associations with Correlation
    Correlation Analysis with PSPP
Chapter 12 Associations with Regression (Linear)
    Regression Analysis with PSPP
    Interpreting Output Tables: Regression
Chapter 13 Associations with Regression (Binomial Logistic)
    A Simple Binomial Logistic Regression Example with Categorical Data
    A Simple Binomial Logistic Regression Example with Continuous Data
Chapter 14 Reliability
    Reliability Using PSPP for Agreement
    Reliability Using PSPP for Accuracy
Chapter 15 Factor Analysis
    What is Factor Analysis?
    Determining the Number of Factors to Extract
    Conducting Factor Analysis with PSPP
Chapter 16 Why is Statistics So Confusing?
    The Research Process
    Exploring the data
    The General Linear Model (GLM)
    One-Way ANOVA with Confidence Intervals
    One-Way ANOVA with Contrasts for Trends
    Our Findings from the Data
Chapter 17 Concluding Thoughts
Resources
    Analysis Memos
    High School & Beyond Codebook
    High School & Beyond Sample Data Set
    Reliability for Agreement Codebook
    Reliability for Agreement Dataset
    Reliability for Accuracy Codebook
    Reliability for Accuracy Dataset
    Test Scores Codebook
    Test Scores Dataset
    Effect Size Tables
    Box & Whisker Plots Using OpenOffice Spreadsheets
    Additional Resources
References
Index

ACKNOWLEDGMENTS

Whether knowingly or unknowingly, those of us using technology owe a great deal to the open source software community. It is through projects such as PSPP, OpenOffice, Linux, and others that useful applications can be freely distributed. The programmers who make up this community of professionals offer their time and effort for nothing more than the ability to share something worthwhile with the rest of us. THANK YOU.

Chapter 1 An Introduction to the Guide Second Edition

Notes about the statistics guide

So let’s get this out of the way right from the start. This is NOT a math book. The PSPP Guide will not contain beautiful, complex statistical equations. It will not explain the formulas and mathematics behind the statistical tests. It will not provide step by step mathematical guidance to reproduce the statistical results by hand.

So what is the purpose of this book? I am glad you asked. The purpose of this guide is to assist the novice social science and educational researcher in interpreting statistical output data using the PSPP Statistical Analysis application.
Through the examples and guidance, you will be able to select the statistical test that is appropriate for your data, apply the inferential test to your data, and interpret a statistical test’s output table.

The Guide covers the uses of some of the most commonly used statistical tests (Chi-square, t-Test, ANOVA, Correlation, and Linear and Binomial Regression) and discusses some of the limitations of those tests. The ANOVA description includes procedures for conducting the One-Way ANOVA with Planned Contrasts so that you may test a specific hypothesis concerning group interactions, as well as the General Linear Model (GLM) for other types of ANOVA analysis. Exploratory Factor Analysis has been included in this guide as a valuable procedure for data reduction. The use of Reliability tests will be discussed as a way to verify the reliability of coding data between researchers.

Statistical tests are designed to handle either parametric or non-parametric data. The differences between these types of data will be discussed in a later chapter. The majority of the tests included in PSPP are designed for parametric data analysis, and those tests will be the focus of this book. PSPP also contains a handful of non-parametric data tools, but these will not be discussed here.

The sample window views and output tables shown in this guide were mainly created with PSPP 0.10.x, using PSPPIRE, the graphical user interface version of PSPP. PSPP is officially described as a “replacement” application for IBM’s Statistical Package for the Social Sciences (SPSS). However, PSPP does not have any official acronym expansion. The developers of PSPP have some suggestions, such as:
• Perfect Statistics Professionally Presented.
• Probabilities Sometimes Prevent Problems.
• People Should Prefer PSPP.

The examples shown in this guide represent a subset of the data obtained in the 1988-2000 High School & Beyond (HS&B) study commissioned by the Center on Education Policy (CEP).
The sample dataset contains 200 cases and is intended to provide statistical analysis practice, not to support any conclusions about the sample population.

Notes about the data

The High School & Beyond study was commissioned by the Center on Education Policy (CEP) and conducted by researcher Harold Wenglinsky. The study was based on statistical analyses of a nationally representative, longitudinal database of students and schools from the National Educational Longitudinal Study of 1988-2000 (NELS). The study focused on a sample of low-income students from inner-city high schools. It compared achievement and other education-related outcomes for students in different types of public and private schools, including comprehensive public high schools (the typical model for the traditional high school); public magnet schools and “schools of choice”; various types of Catholic parochial schools and other religious schools; and independent, secular private