A Handbook of Statistical Analyses using Stata
Third Edition
© 2004 by CRC Press LLC

Sophia Rabe-Hesketh
Brian Everitt

CHAPMAN & HALL/CRC
A CRC Press Company
Boca Raton London New York Washington, D.C.

Library of Congress Cataloging-in-Publication Data

Rabe-Hesketh, S.
A handbook of statistical analyses using Stata / Sophia Rabe-Hesketh, Brian S. Everitt.-- [3rd ed.].
p. cm.
Includes bibliographical references and index.
ISBN 1-58488-404-5 (alk. paper)
1. Stata. 2. Mathematical statistics--Data processing. I. Everitt, Brian. II. Title.
QA276.4.R33 2003
519.5′0285′5369--dc22 2003065361

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

Visit the CRC Press Web site at www.crcpress.com

© 2004 by CRC Press LLC
No claim to original U.S.
Government works
International Standard Book Number 1-58488-404-5
Library of Congress Card Number 2003065361
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper

Preface

Stata is an exciting statistical package that offers all standard and many non-standard methods of data analysis. In addition to general methods such as linear, logistic and Poisson regression and generalized linear models, Stata provides many more specialized analyses, such as generalized estimating equations from biostatistics and the Heckman selection model from econometrics. Stata has extensive capabilities for the analysis of survival data, time series, panel (or longitudinal) data, and complex survey data. For all estimation problems, inferences can be made more robust to model misspecification using bootstrapping or robust standard errors based on the sandwich estimator. In each new release of Stata, its capabilities are significantly enhanced by a team of excellent statisticians and developers at Stata Corporation.

Although extremely powerful, Stata is easy to use, either by point-and-click or through its intuitive command syntax. Applied researchers, students, and methodologists therefore all find Stata a rewarding environment for manipulating data, carrying out statistical analyses, and producing publication quality graphics.

Stata also provides a powerful programming language making it easy to implement a ‘tailor-made’ analysis for a particular application or to write more general commands for use by the wider Stata community. In fact we consider Stata an ideal environment for developing and disseminating new methodology. First, the elegance and consistency of the programming language appeals to the esthetic sense of methodologists. Second, it is simple to make new commands behave in every way like Stata’s own commands, making them accessible to applied researchers and students.
Third, Stata’s emailing list Statalist, The Stata Journal, the Stata Users’ Group Meetings, and the Statistical Software Components (SSC) archive on the internet all make exchange and discussion of new commands extremely easy. For these reasons Stata is constantly kept up-to-date with recent developments, not just by its own developers, but also by a very active Stata community.

This handbook follows the format of its two predecessors, A Handbook of Statistical Analysis using S-PLUS and A Handbook of Statistical Analysis using SAS. Each chapter deals with the analysis appropriate for a particular application. A brief account of the statistical background is included in each chapter including references to the literature, but the primary focus is on how to use Stata, and how to interpret results. Our hope is that this approach will provide a useful complement to the excellent but very extensive Stata manuals. The majority of the examples are drawn from areas in which the authors have most experience, but we hope that current and potential Stata users from outside these areas will have little trouble in identifying the relevance of the analyses described for their own data.

This third edition contains new chapters on random effects models, generalized estimating equations, and cluster analysis. We have also thoroughly revised all chapters and updated them to make use of new features introduced in Stata 8, in particular the much improved graphics.

Particular thanks are due to Nick Cox who provided us with extensive general comments for the second and third editions of our book, and also gave us clear guidance as to how best to use a number of Stata commands. We are also grateful to Anders Skrondal for commenting on several drafts of the current edition. Various people at Stata Corporation have been very helpful in preparing both the second and third editions of this book.
We would also like to acknowledge the usefulness of the Stata Netcourses in the preparation of the first edition of this book.

All the datasets can be accessed on the internet at the following Web sites:
• http://www.stata.com/texts/stas3
• http://www.iop.kcl.ac.uk/IoP/Departments/BioComp/stataBook.shtml

S. Rabe-Hesketh
B. S. Everitt
London

Dedication

To my parents, Birgit and Georg Rabe
Sophia Rabe-Hesketh

To my wife, Mary Elizabeth
Brian S. Everitt

Contents

1 A Brief Introduction to Stata
  1.1 Getting help and information
  1.2 Running Stata
  1.3 Conventions used in this book
  1.4 Datasets in Stata
  1.5 Stata commands
  1.6 Data management
  1.7 Estimation
  1.8 Graphics
  1.9 Stata as a calculator
  1.10 Brief introduction to programming
  1.11 Keeping Stata up to date
  1.12 Exercises
2 Data Description and Simple Inference: Female Psychiatric Patients
  2.1 Description of data
  2.2 Group comparison and correlations
  2.3 Analysis using Stata
  2.4 Exercises
3 Multiple Regression: Determinants of Pollution in U.S. Cities
  3.1 Description of data
  3.2 The multiple regression model
  3.3 Analysis using Stata
  3.4 Exercises
4 Analysis of Variance I: Treating Hypertension
  4.1 Description of data
  4.2 Analysis of variance model
  4.3 Analysis using Stata
  4.4 Exercises
5 Analysis of Variance II: Effectiveness of Slimming Clinics
  5.1 Description of data
  5.2 Analysis of variance model
  5.3 Analysis using Stata
  5.4 Exercises
6 Logistic Regression: Treatment of Lung Cancer and Diagnosis of Heart Attacks
  6.1 Description of data
  6.2 The logistic regression model
  6.3 Analysis using Stata
  6.4 Exercises
7 Generalized Linear Models: Australian School Children
  7.1 Description of data
  7.2 Generalized linear models
  7.3 Analysis using Stata
  7.4 Exercises
8 Summary Measure Analysis of Longitudinal Data: The Treatment of Post-Natal Depression
  8.1 Description of data
  8.2 The analysis of longitudinal data
  8.3 Analysis using Stata
  8.4 Exercises
9 Random Effects Models: Thought disorder and schizophrenia
  9.1 Description of data
  9.2 Random effects models
  9.3 Analysis using Stata
  9.4 Thought disorder data
  9.5 Exercises
10 Generalized Estimating Equations: Epileptic Seizures and Chemotherapy
  10.1 Introduction
  10.2 Generalized estimating equations
  10.3 Analysis using Stata
  10.4 Exercises
11 Some Epidemiology
  11.1 Description of data
  11.2 Introduction to epidemiology
  11.3 Analysis using Stata
  11.4 Exercises
12 Survival Analysis: Retention of Heroin Addicts in Methadone Maintenance Treatment
  12.1 Description of data
  12.2 Survival analysis
  12.3 Analysis using Stata
  12.4 Exercises
13 Maximum Likelihood Estimation: Age of Onset of Schizophrenia
  13.1 Description of data
  13.2 Finite mixture distributions
  13.3 Analysis using Stata
  13.4 Exercises
14 Principal Components Analysis: Hearing Measurement using an Audiometer
  14.1 Description of data
  14.2 Principal component analysis
  14.3 Analysis using Stata
  14.4 Exercises
15 Cluster Analysis: Tibetan Skulls and Air Pollution in the USA
  15.1 Description of data
  15.2 Cluster analysis
  15.3 Analysis using Stata
  15.4 Exercises
Appendix: Answers to Selected Exercises
References

Distributors for Stata

The distributor for Stata in the United States is:

Stata Corporation
4905 Lakeway Drive
College Station, TX 77845
email: [email protected]
Web site: http://www.stata.com
Telephone: 979-696-4600

In the United Kingdom the distributor is:

Timberlake Consultants
Unit B3, Broomsleigh Business Park
Worsley Bridge Road
London SE26 5BN
email: [email protected]
Web site: http://www.timberlake.co.uk
Telephone: 44(0)-20-8697-3377

For a list of distributors in other countries, see the Stata Web page.