
Exploratory Data Analysis in Business and Economics: An Introduction Using SPSS, Stata, and Excel PDF

234 Pages·2014·10.06 MB·English

Preview Exploratory Data Analysis in Business and Economics: An Introduction Using SPSS, Stata, and Excel

Thomas Cleff
Exploratory Data Analysis in Business and Economics
An Introduction Using SPSS, Stata, and Excel

Thomas Cleff
Pforzheim University
Pforzheim, Germany

Chapters 1–6 translated from the German original, Cleff, T. (2011). Deskriptive Statistik und moderne Datenanalyse: Eine computergestützte Einführung mit Excel, PASW (SPSS) und Stata. 2. überarb. u. erw. Auflage 2011 © Gabler Verlag, Springer Fachmedien Wiesbaden GmbH, 2011

ISBN 978-3-319-01516-3
ISBN 978-3-319-01517-0 (eBook)
DOI 10.1007/978-3-319-01517-0
Springer Cham Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013951433
© Springer International Publishing Switzerland 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

This textbook, Exploratory Data Analysis in Business and Economics: An Introduction Using SPSS, Stata, and Excel, aims to familiarize students of economics and business as well as practitioners in firms with the basic principles, techniques, and applications of descriptive statistics and data analysis. Drawing on practical examples from business settings, it demonstrates the basic descriptive methods of univariate and bivariate analyses. The textbook covers a range of subject matter, from data collection and scaling to the presentation and univariate analysis of quantitative data, and also includes analytic procedures for assessing bivariate relationships. In this way, it addresses all of the topics typically covered in a university course on descriptive statistics.

In writing this book, I have consistently endeavoured to provide readers with an understanding of the thinking processes underlying descriptive statistics. I believe this approach will be particularly valuable to those who might otherwise have difficulty with the formal method of presentation used by many textbooks. In numerous instances, I have tried to avoid unnecessary formulas, attempting instead to provide the reader with an intuitive grasp of a concept before deriving or introducing the associated mathematics. Nevertheless, a book about statistics and data analysis that omits formulas would be neither possible nor desirable.
Indeed, whenever ordinary language reaches its limits, the mathematical formula has always been the best tool to express meaning. To provide further depth, I have included practice problems and solutions at the end of each chapter, which are intended to make it easier for students to pursue effective self-study.

The broad availability of computers now makes it possible to teach statistics in new ways. Indeed, students now have access to a range of powerful computer applications, from Excel to various statistics programmes. Accordingly, this textbook does not confine itself to presenting descriptive statistics, but also addresses the use of programmes such as Excel, SPSS, and Stata. To aid the learning process, datasets have been made available at springer.com, along with other supplemental materials, allowing all of the examples and practice problems to be recalculated and reviewed.

I want to take this opportunity to thank all those who have collaborated in making this book possible. First and foremost, I would like to thank Lucais Sewell ([email protected]) for translating this work from German into English. It is no small feat to render an academic text such as this into precise but readable English. Well-deserved gratitude for their critical review of the manuscript and valuable suggestions goes to Birgit Aschhoff, Christoph Grimpe, Bernd Kuppinger, Bettina Müller, Bettina Peters, Wolfgang Schäfer, Katja Specht, Fritz Wegner, and Kirsten Wüst, as well as many other unnamed individuals. Any errors or shortcomings that remain are entirely my own. I would also like to express my thanks to Alice Blanck at Springer Science + Business Media for her assistance with this project. Finally, this book could not have been possible without the ongoing support of my family. They deserve my very special gratitude.

Please do not hesitate to contact me directly with feedback or any suggestions you may have for improvements ([email protected]).

Thomas Cleff
Pforzheim, March 2013

Contents

1 Statistics and Empirical Research
  1.1 Do Statistics Lie?
  1.2 Two Types of Statistics
  1.3 The Generation of Knowledge Through Statistics
  1.4 The Phases of Empirical Research
    1.4.1 From Exploration to Theory
    1.4.2 From Theories to Models
    1.4.3 From Models to Business Intelligence
2 Disarray to Dataset
  2.1 Data Collection
  2.2 Level of Measurement
  2.3 Scaling and Coding
  2.4 Missing Values
  2.5 Outliers and Obviously Incorrect Values
  2.6 Chapter Exercises
3 Univariate Data Analysis
  3.1 First Steps in Data Analysis
  3.2 Measures of Central Tendency
    3.2.1 Mode or Modal Value
    3.2.2 Mean
    3.2.3 Geometric Mean
    3.2.4 Harmonic Mean
    3.2.5 The Median
    3.2.6 Quartile and Percentile
  3.3 The Boxplot: A First Look at Distributions
  3.4 Dispersion Parameters
    3.4.1 Standard Deviation and Variance
    3.4.2 The Coefficient of Variation
  3.5 Skewness and Kurtosis
  3.6 Robustness of Parameters
  3.7 Measures of Concentration
  3.8 Using the Computer to Calculate Univariate Parameters
    3.8.1 Calculating Univariate Parameters with SPSS
    3.8.2 Calculating Univariate Parameters with Stata
    3.8.3 Calculating Univariate Parameters with Excel 2010
  3.9 Chapter Exercises
4 Bivariate Association
  4.1 Bivariate Scale Combinations
  4.2 Association Between Two Nominal Variables
    4.2.1 Contingency Tables
    4.2.2 Chi-Square Calculations
    4.2.3 The Phi Coefficient
    4.2.4 The Contingency Coefficient
    4.2.5 Cramer’s V
    4.2.6 Nominal Associations with SPSS
    4.2.7 Nominal Associations with Stata
    4.2.8 Nominal Associations with Excel
    4.2.9 Chapter Exercises
  4.3 Association Between Two Metric Variables
    4.3.1 The Scatterplot
    4.3.2 The Bravais-Pearson Correlation Coefficient
  4.4 Relationships Between Ordinal Variables
    4.4.1 Spearman’s Rank Correlation Coefficient (Spearman’s rho)
    4.4.2 Kendall’s Tau (τ)
  4.5 Measuring the Association Between Two Variables with Different Scales
    4.5.1 Measuring the Association Between Nominal and Metric Variables
    4.5.2 Measuring the Association Between Nominal and Ordinal Variables
    4.5.3 Association Between Ordinal and Metric Variables
  4.6 Calculating Correlation with a Computer
    4.6.1 Calculating Correlation with SPSS
    4.6.2 Calculating Correlation with Stata
    4.6.3 Calculating Correlation with Excel
  4.7 Spurious Correlations
    4.7.1 Partial Correlation
    4.7.2 Partial Correlations with SPSS
    4.7.3 Partial Correlations with Stata
    4.7.4 Partial Correlation with Excel
  4.8 Chapter Exercises
5 Regression Analysis
  5.1 First Steps in Regression Analysis
  5.2 Coefficients of Bivariate Regression
  5.3 Multivariate Regression Coefficients
  5.4 The Goodness of Fit of Regression Lines
  5.5 Regression Calculations with the Computer
    5.5.1 Regression Calculations with Excel
    5.5.2 Regression Calculations with SPSS and Stata
  5.6 Goodness of Fit of Multivariate Regressions
  5.7 Regression with an Independent Dummy Variable
  5.8 Leverage Effects of Data Points
  5.9 Nonlinear Regressions
  5.10 Approaches to Regression Diagnostics
  5.11 Chapter Exercises
6 Time Series and Indices
  6.1 Price Indices
  6.2 Quantity Indices
  6.3 Value Indices (Sales Indices)
  6.4 Deflating Time Series by Price Indices
  6.5 Shifting Bases and Chaining Indices
  6.6 Chapter Exercises
7 Cluster Analysis
  7.1 Hierarchical Cluster Analysis
  7.2 K-Means Cluster Analysis
  7.3 Cluster Analysis with SPSS and Stata
  7.4 Chapter Exercises
8 Factor Analysis
  8.1 Factor Analysis: Foundations, Methods, Interpretations
  8.2 Factor Analysis with SPSS and Stata
  8.3 Chapter Exercises
9 Solutions to Chapter Exercises
References
Index
