
The basic practice of statistics PDF

152 Pages·2003·1.68 MB·english

Preview The basic practice of statistics

W. H. Freeman Publishers - The Basic Practi... http://www.whfreeman.com/highschool/book.as...

The Basic Practice of Statistics
Third Edition
David S. Moore (Purdue U.)
June 2003, cloth, 0-7167-9623-6

Download Text chapters in .PDF format. You will need Adobe Acrobat Reader version 3.0 or above to view these preview materials. (Additional instructions below.)

Exploring Data: Variables and Distributions
Chapter 1 - Picturing Distributions with Graphs (CH 01.pdf; 300KB)
Chapter 2 - Describing Distributions with Numbers (CH 02.pdf; 212KB)
Chapter 3 - Normal Distributions (CH 03.pdf; 328KB)

Exploring Data: Relationships
Chapter 4 - Scatterplots and Correlation (CH 04.pdf; 300KB)
Chapter 5 - Regression (CH 05.pdf; 212KB)
Chapter 6 - Two-Way Tables (CH 06.pdf; 328KB)

Preview Materials
These copyrighted materials are for promotional purposes only. They may not be sold, copied, or distributed.

Download Instructions for Preview Materials in .PDF Format
We recommend saving these files to your hard drive by following the instructions below.

PC users
1. Right-click on a chapter link below.
2. From the pop-up menu, select "Save Link" (if you are using Netscape) or "Save Target" (if you are using Internet Explorer).
3. In the "Save As" dialog box, select a location on your hard drive and rename the file, if you would like, then click "save". Note the name and location of the file so you can open it later.

Macintosh users
1. Click and hold your mouse on a chapter link below.
2. From the pop-up menu, select "Save Link As" (if you are using Netscape) or "Save Target As" (if you are using Internet Explorer).
3. In the "Save As" dialog box, select a location on your hard drive and rename the file, if you would like, then click "save". Note the name and location of the file so you can open it later.
P1:FBQ PB286A-01 PB286-Moore-V3.cls March 4, 2003 18:19

Exploring Data

The first step in understanding data is to hear what the data say, to “let the statistics speak for themselves.” But numbers speak clearly only when we help them speak by organizing, displaying, summarizing, and asking questions. That’s data analysis. The six chapters in Part I present the ideas and tools of statistical data analysis. They equip you with skills that are immediately useful whenever you deal with numbers.

These chapters reflect the strong emphasis on exploring data that characterizes modern statistics. Although careful exploration of data is essential if we are to trust the results of inference, data analysis isn’t just preparation for inference. To think about inference, we carefully distinguish between the data we actually have and the larger universe we want conclusions about. The Bureau of Labor Statistics, for example, has data about employment in the 55,000 households contacted by its Current Population Survey. The bureau wants to draw conclusions about employment in all 110 million U.S. households. That’s a complex problem. From the viewpoint of data analysis, things are simpler. We want to explore and understand only the data in hand. The distinctions that inference requires don’t concern us in Chapters 1 to 6. What does concern us is a systematic strategy for examining data and the tools that we use to carry out that strategy.

Part of that strategy is to first look at one thing at a time and then at relationships. In Chapters 1, 2, and 3 you will study variables and their distributions. Chapters 4, 5, and 6 concern relationships among variables.
PART I

EXPLORING DATA: VARIABLES AND DISTRIBUTIONS
Chapter 1 Picturing Distributions with Graphs
Chapter 2 Describing Distributions with Numbers
Chapter 3 The Normal Distributions

EXPLORING DATA: RELATIONSHIPS
Chapter 4 Scatterplots and Correlation
Chapter 5 Regression
Chapter 6 Two-Way Tables

EXPLORING DATA REVIEW

CHAPTER 1
Picturing Distributions with Graphs

(Darrell Ingham/Allsport Concepts/Getty Images)

In this chapter we cover...
Individuals and variables
Categorical variables: pie charts and bar graphs
Quantitative variables: histograms
Interpreting histograms
Quantitative variables: stemplots
Time plots

Statistics is the science of data. The volume of data available to us is overwhelming. Each March, for example, the Census Bureau collects economic and employment data from more than 200,000 people. From the bureau’s Web site you can choose to examine more than 300 items of data for each person (and more for households): child care assistance, child care support, hours worked, usual weekly earnings, and much more. The first step in dealing with such a flood of data is to organize our thinking about data.

Individuals and variables

Any set of data contains information about some group of individuals. The information is organized in variables.

INDIVIDUALS AND VARIABLES
Individuals are the objects described by a set of data. Individuals may be people, but they may also be animals or things.
A variable is any characteristic of an individual. A variable can take different values for different individuals.

A college’s student data base, for example, includes data about every currently enrolled student. The students are the individuals described by the data set.
For each individual, the data contain the values of variables such as date of birth, gender (female or male), choice of major, and grade point average. In practice, any set of data is accompanied by background information that helps us understand the data. When you plan a statistical study or explore data from someone else’s work, ask yourself the following questions:

1. Who? What individuals do the data describe? How many individuals appear in the data?
2. What? How many variables do the data contain? What are the exact definitions of these variables? In what units of measurement is each variable recorded? Weights, for example, might be recorded in pounds, in thousands of pounds, or in kilograms.
3. Why? What purpose do the data have? Do we hope to answer some specific questions? Do we want to draw conclusions about individuals other than the ones we actually have data for? Are the variables suitable for the intended purpose?

Some variables, like gender and college major, simply place individuals into categories. Others, like height and grade point average, take numerical values for which we can do arithmetic. It makes sense to give an average income for a company’s employees, but it does not make sense to give an “average” gender. We can, however, count the numbers of female and male employees and do arithmetic with these counts.

Are data artistic?
David Galenson, an economist at the University of Chicago, uses data and statistical analysis to study innovation among painters from the nineteenth century to the present. Economics journals publish his work. Art history journals send it back unread. “Fundamentally antagonistic to the way humanists do their work,” said the chair of art history at Chicago. If you are a student of the humanities, reading this statistics text may help you start a new wave in your field.

CATEGORICAL AND QUANTITATIVE VARIABLES
A categorical variable places an individual into one of several groups or categories.
A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense.
The distribution of a variable tells us what values it takes and how often it takes these values.
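As a rough illustration of these definitions, the following Python sketch tabulates the distribution of one categorical variable and averages one quantitative variable. The records, field names, and values are hypothetical, invented only for this illustration; they are not data from the text.

```python
from collections import Counter

# Hypothetical records: each dict is one individual, each key is one variable.
students = [
    {"name": "A", "major": "Biology", "gpa": 3.2},
    {"name": "B", "major": "History", "gpa": 3.6},
    {"name": "C", "major": "Biology", "gpa": 2.8},
    {"name": "D", "major": "Physics", "gpa": 3.4},
]


def distribution(records, variable):
    """Return {value: (count, percent)} for a categorical variable."""
    counts = Counter(r[variable] for r in records)
    n = len(records)
    return {value: (count, 100 * count / n) for value, count in counts.items()}


# Categorical variable: counting its values makes sense, averaging does not.
major_dist = distribution(students, "major")

# Quantitative variable: arithmetic such as averaging makes sense.
mean_gpa = sum(r["gpa"] for r in students) / len(students)

for value, (count, percent) in major_dist.items():
    print(f"{value}: {count} ({percent:.1f}%)")
print(f"Mean GPA: {mean_gpa:.2f}")
```

The sketch keeps the two kinds of variable apart on purpose: the categorical variable is summarized only by counts and percents, while the quantitative variable is summarized by an average.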
EXAMPLE 1.1 A professor’s data set

Here is part of the data set in which a professor records information about student performance in a course:

The individuals described are the students. Each row records data on one individual. Each column contains the values of one variable for all the individuals. In addition to the student’s name, there are 7 variables. School and major are categorical variables. Scores on homework, the midterm, and the final exam and the total score are quantitative. Grade is recorded as a category (A, B, and so on), but each grade also corresponds to a quantitative score (A = 4, B = 3, and so on) that is used to calculate student grade point averages.

Most data tables follow this format: each row is an individual, and each column is a variable. This data set appears in a spreadsheet program that has rows and columns ready for your use. Spreadsheets are commonly used to enter and transmit data and to do simple calculations such as adding homework, midterm, and final scores to get total points.

APPLY YOUR KNOWLEDGE

1.1 Fuel economy. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2002 model motor vehicles:

Make and model    Vehicle type            Transmission type   Number of cylinders   City MPG   Highway MPG
· · ·
Acura NSX         Two-seater              Automatic           6                     17         24
Audi A4           Compact                 Manual              4                     22         31
Buick Century     Midsize                 Automatic           6                     20         29
Dodge Ram 1500    Standard pickup truck   Automatic           8                     15         20
· · ·

(a) What are the individuals in this data set?
(b) For each individual, what variables are given? Which of these variables are categorical and which are quantitative?

1.2 A medical study. Data from a medical study contain values of many variables for each of the people who were the subjects of the study. Which of the following variables are categorical and which are quantitative?
(a) Gender (female or male)
(b) Age (years)
(c) Race (Asian, black, white, or other)
(d) Smoker (yes or no)
(e) Systolic blood pressure (millimeters of mercury)
(f) Level of calcium in the blood (micrograms per milliliter)

Categorical variables: pie charts and bar graphs

Statistical tools and ideas help us examine data in order to describe their main features. This examination is called exploratory data analysis. Like an explorer crossing unknown lands, we want first to simply describe what we see. Here are two basic strategies that help us organize our exploration of a set of data:

• Begin by examining each variable by itself. Then move on to study the relationships among the variables.
• Begin with a graph or graphs. Then add numerical summaries of specific aspects of the data.

We will follow these principles in organizing our learning. Chapters 1 to 3 present methods for describing a single variable. We study relationships among several variables in Chapters 4 to 6. In each case, we begin with graphical displays, then add numerical summaries for more complete description.

The proper choice of graph depends on the nature of the variable. The values of a categorical variable are labels for the categories, such as “male” and “female.” The distribution of a categorical variable lists the categories and gives either the count or the percent of individuals who fall in each category.

EXAMPLE 1.2 Garbage

The formal name for garbage is “municipal solid waste.” Here is a breakdown of the materials that made up American municipal solid waste in 2000.[1]

Material                    Weight (million tons)   Percent of total
Food scraps                  25.9                    11.2%
Glass                        12.8                     5.5%
Metals                       18.0                     7.8%
Paper, paperboard            86.7                    37.4%
Plastics                     24.7                    10.7%
Rubber, leather, textiles    15.8                     6.8%
Wood                         12.7                     5.5%
Yard trimmings               27.7                    11.9%
Other                         7.5                     3.2%
Total                       231.9                   100.0

It’s a good idea to check data for consistency.
The weights of the nine materials add to 231.8 million tons, not exactly equal to the total of 231.9 million tons given in the table. What happened? Roundoff error: Each entry is rounded to the nearest tenth, and the total is rounded separately. The exact values would add exactly, but the rounded values don’t quite.

The pie chart in Figure 1.1 shows us each material as a part of the whole. For example, the “plastics” slice makes up 10.7% of the pie because 10.7% of municipal solid waste consists of plastics. The graph shows more clearly than the numbers the predominance of paper and the importance of food scraps, plastics, and yard trimmings in our garbage. Pie charts are awkward to make by hand, but software will do the job for you.

Figure 1.1 Pie chart of materials in municipal solid waste, by weight. (Slice labels: Metals; Glass; Paper; Food scraps; Other; Yard trimmings; Plastics; Wood; Rubber, leather, textiles.)

We could also make a bar graph that represents each material’s weight by the height of a bar. To make a pie chart, you must include all the categories that make up a whole. Bar graphs are more flexible. Figure 1.2(a) is a bar graph of the percent of each material that was recycled or composted in 2000. These percents are not part of a whole because each refers to a different material. We could replace the pie chart in Figure 1.1 by a bar graph, but we can’t make a pie chart to replace Figure 1.2(a). We can often improve a bar graph by changing the order of the groups we are comparing. Figure 1.2(b) displays the recycling data with the materials in order of percent recycled or composted. Figures 1.1 and 1.2 together suggest that we might pay more attention to recycling plastics.

Bar graphs and pie charts help an audience grasp the distribution quickly. They are, however, of limited use for data analysis because it is easy to understand data on a single categorical variable without a graph. We will move on to quantitative variables, where graphs are essential tools.

APPLY YOUR KNOWLEDGE

1.3 The color of your car.
Here is a breakdown of the most popular colors for vehicles made in North America during the 2001 model year:[2]

Color    Percent    Color        Percent
Silver   21.0%      Medium red   6.9%
White    15.6%      Brown        5.6%
Black    11.2%      Gold         4.5%
Blue      9.9%      Bright red   4.3%
Green     7.6%      Grey         2.0%

(a) What percent of vehicles are some other color?
(b) Make a bar graph of the color data. Would it be correct to make a pie chart if you added an “Other” category?

Figure 1.2 Bar graphs comparing the percents of each material in municipal solid waste that were recycled or composted. (Both panels: vertical axis "Percent recycled," 0 to 60; horizontal axis "Material." Panel (a) lists the materials as Food, Glass, Metals, Paper, Plastics, Textiles, Wood, Yard, Other, with the annotation "The height of this bar is 45.4 because 45.4% of paper municipal waste was recycled." Panel (b) lists them as Yard, Paper, Metals, Glass, Textiles, Other, Plastics, Wood, Food.)
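The consistency check described in Example 1.2 can be sketched in a few lines of Python. The weights and the published total are taken from the table in that example; the variable names are illustrative, and the final sort only mimics the idea of reordering categories before drawing a bar graph (it orders the materials by weight, since the recycling percents behind Figure 1.2 are not tabulated in this excerpt).

```python
# Weights (million tons) from the table in Example 1.2.
weights = {
    "Food scraps": 25.9,
    "Glass": 12.8,
    "Metals": 18.0,
    "Paper, paperboard": 86.7,
    "Plastics": 24.7,
    "Rubber, leather, textiles": 15.8,
    "Wood": 12.7,
    "Yard trimmings": 27.7,
    "Other": 7.5,
}
published_total = 231.9  # total as printed in the table

# Sum of the rounded entries; round again to suppress floating-point noise.
entry_sum = round(sum(weights.values()), 1)
discrepancy = round(published_total - entry_sum, 1)

# Percent of total for each material, computed from the published total.
percents = {m: round(100 * w / published_total, 1) for m, w in weights.items()}

# Order the categories from largest to smallest, as one might for a bar graph.
ordered = sorted(weights, key=weights.get, reverse=True)

print(f"Sum of entries: {entry_sum}, published total: {published_total}")
print(f"Roundoff discrepancy: {discrepancy}")
for material in ordered:
    print(f"{material:28s}{weights[material]:6.1f}{percents[material]:7.1f}%")
```

A small discrepancy of this kind points to separate rounding of the entries and the total rather than to a mistake in the data; a large one would be a reason to recheck the table.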
