Business statistics in practice : using modeling, data, and analytics PDF

911 Pages·2017·36.947 MB·English

Preview Business statistics in practice : using modeling, data, and analytics

Bruce L. Bowerman, Miami University
Richard T. O’Connell, Miami University
Emily S. Murphree, Miami University

Business Statistics in Practice
Using Modeling, Data, and Analytics
EIGHTH EDITION

with major contributions by
Steven C. Huchendorf, University of Minnesota
Dawn C. Porter, University of Southern California
Patrick J. Schur, Miami University

BUSINESS STATISTICS IN PRACTICE: USING DATA, MODELING, AND ANALYTICS, EIGHTH EDITION

Published by McGraw-Hill Education, 2 Penn Plaza, New York, NY 10121. Copyright © 2017 by McGraw-Hill Education. All rights reserved. Printed in the United States of America. Previous editions © 2014, 2011, and 2009. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of McGraw-Hill Education, including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning.

Some ancillaries, including electronic and print components, may not be available to customers outside the United States.

This book is printed on acid-free paper.

1 2 3 4 5 6 7 8 9 0 DOW/DOW 1 0 9 8 7 6

ISBN 978-1-259-54946-5
MHID 1-259-54946-1

Senior Vice President, Products & Markets: Kurt L. Strand
Vice President, General Manager, Products & Markets: Marty Lange
Vice President, Content Design & Delivery: Kimberly Meriwether David
Managing Director: James Heine
Senior Brand Manager: Dolly Womack
Director, Product Development: Rose Koos
Product Developer: Camille Corum
Marketing Manager: Britney Hermsen
Director of Digital Content: Doug Ruby
Digital Product Developer: Tobi Philips
Director, Content Design & Delivery: Linda Avenarius
Program Manager: Mark Christianson
Content Project Managers: Harvey Yep (Core) / Bruce Gin (Digital)
Buyer: Laura M. Fuller
Design: Srdjan Savanovic
Content Licensing Specialists: Ann Marie Jannette (Image) / Beth Thole (Text)
Cover Image: ©Sergei Popov, Getty Images and ©teekid, Getty Images
Compositor: MPS Limited
Printer: R. R. Donnelley

All credits appearing on page or at the end of the book are considered to be an extension of the copyright page.

Library of Congress Control Number: 2015956482

The Internet addresses listed in the text were accurate at the time of publication. The inclusion of a website does not indicate an endorsement by the authors or McGraw-Hill Education, and McGraw-Hill Education does not guarantee the accuracy of the information presented at these sites.

www.mhhe.com

ABOUT THE AUTHORS

Bruce L. Bowerman
Bruce L. Bowerman is emeritus professor of information systems and analytics at Miami University in Oxford, Ohio. He received his Ph.D. degree in statistics from Iowa State University in 1974, and he has over 40 years of experience teaching basic statistics, regression analysis, time series forecasting, survey sampling, and design of experiments to both undergraduate and graduate students. In 1987 Professor Bowerman received an Outstanding Teaching award from the Miami University senior class, and in 1992 he received an Effective Educator award from the Richard T. Farmer School of Business Administration. Together with Richard T. O’Connell, Professor Bowerman has written 23 textbooks. These include Forecasting, Time Series, and Regression: An Applied Approach (also coauthored with Anne B. Koehler); Linear Statistical Models: An Applied Approach; Regression Analysis: Unified Concepts, Practical Applications, and Computer Implementation (also coauthored with Emily S. Murphree); and Experimental Design: Unified Concepts, Practical Applications, and Computer Implementation (also coauthored with Emily S. Murphree). The first edition of Forecasting and Time Series earned an Outstanding Academic Book award from Choice magazine. Professor Bowerman has also published a number of articles in applied stochastic process, time series forecasting, and statistical education. In his spare time, Professor Bowerman enjoys watching movies and sports, playing tennis, and designing houses.

Richard T. O’Connell
Richard T. O’Connell is emeritus professor of information systems and analytics at Miami University in Oxford, Ohio. He has more than 35 years of experience teaching basic statistics, statistical quality control and process improvement, regression analysis, time series forecasting, and design of experiments to both undergraduate and graduate business students. He also has extensive consulting experience and has taught workshops dealing with statistical process control and process improvement for a variety of companies in the Midwest. In 2000 Professor O’Connell received an Effective Educator award from the Richard T. Farmer School of Business Administration. Together with Bruce L. Bowerman, he has written 23 textbooks. These include Forecasting, Time Series, and Regression: An Applied Approach (also coauthored with Anne B. Koehler); Linear Statistical Models: An Applied Approach; Regression Analysis: Unified Concepts, Practical Applications, and Computer Implementation (also coauthored with Emily S. Murphree); and Experimental Design: Unified Concepts, Practical Applications, and Computer Implementation (also coauthored with Emily S. Murphree). Professor O’Connell has published a number of articles in the area of innovative statistical education. He is one of the first college instructors in the United States to integrate statistical process control and process improvement methodology into his basic business statistics course. He (with Professor Bowerman) has written several articles advocating this approach. He has also given presentations on this subject at meetings such as the Joint Statistical Meetings of the American Statistical Association and the Workshop on Total Quality Management: Developing Curricula and Research Agendas (sponsored by the Production and Operations Management Society). Professor O’Connell received an M.S. degree in decision sciences from Northwestern University in 1973. In his spare time, Professor O’Connell enjoys fishing, collecting 1950s and 1960s rock music, and following the Green Bay Packers and Purdue University sports.

Emily S. Murphree
Emily S. Murphree is emerita professor of statistics at Miami University in Oxford, Ohio. She received her Ph.D. degree in statistics from the University of North Carolina and does research in applied probability. Professor Murphree received Miami’s College of Arts and Science Distinguished Educator Award in 1998. In 1996, she was named one of Oxford’s Citizens of the Year for her work with Habitat for Humanity and for organizing annual Sonia Kovalevsky Mathematical Sciences Days for area high school girls. In 2012 she was recognized as “A Teacher Who Made a Difference” by the University of Kentucky.

AUTHORS’ PREVIEW

Business Statistics in Practice: Using Data, Modeling, and Analytics, Eighth Edition, provides a unique and flexible framework for teaching the introductory course in business statistics. This framework features:

• A new theme of statistical modeling introduced in Chapter 1 and used throughout the text.
• A substantial and innovative presentation of business analytics and data mining that provides instructors with a choice of different teaching options.
• Improved and easier to understand discussions of probability, probability modeling, traditional statistical inference, and regression and time series modeling.
• Continuing case studies that facilitate student learning by presenting new concepts in the context of familiar situations.
• Business improvement conclusions—highlighted in yellow and designated by BI icons in the page margins—that explicitly show how statistical analysis leads to practical business decisions.
• Many new exercises, with increased emphasis on students doing complete statistical analyses on their own.
• Use of Excel (including the Excel add-in MegaStat) and Minitab to carry out traditional statistical analysis and descriptive analytics. Use of JMP and the Excel add-in XLMiner to carry out predictive analytics.

We now discuss how these features are implemented in the book’s 18 chapters.

Chapters 1, 2, and 3: Introductory concepts and statistical modeling. Graphical and numerical descriptive methods. In Chapter 1 we discuss data, variables, populations, and how to select random and other types of samples (a topic formerly discussed in Chapter 7). A new section introduces statistical modeling by defining what a statistical model is and by using The Car Mileage Case to preview specifying a normal probability model describing the mileages obtained by a new midsize car model (see Figure 1.6):

1.4 Random Sampling, Three Case Studies That Illustrate Statistical Inference (p. 17)

Figure 1.6 A Histogram of the 50 Mileages and the Normal Probability Curve: (a) A histogram of the 50 mileages; (b) The normal probability curve [axes: Percent, Mpg]

all mileages achieved by the new midsize cars, the population histogram would look “bell-shaped.” This leads us to “smooth out” the sample histogram and represent the population of all mileages by the bell-shaped probability curve in Figure 1.6 (b). One type of bell-shaped probability curve is a graph of what is called the normal probability distribution (or normal probability model), which is discussed in Chapter 6. Therefore, we might conclude that the statistical model describing the sample of 50 mileages in Table 1.7 states that this sample has been (approximately) randomly selected from a population of car mileages that is described by a normal probability distribution. We will see in Chapters 7 and 8 that this statistical model and probability theory allow us to conclude that we are “95 percent” confident that the sampling error in estimating the population mean mileage by the sample mean mileage is no more than .23 mpg. Because we have seen in Example 1.4 that the mean of the sample of $n = 50$ mileages in Table 1.7 is 31.56 mpg, this implies that we are 95 percent confident that the true population mean EPA combined mileage for the new midsize model is between $31.56 - .23 = 31.33$ mpg and $31.56 + .23 = 31.79$ mpg. [10] [BI] Because we are 95 percent confident that the population mean EPA combined mileage is at least 31.33 mpg, we have strong statistical evidence that this not only meets, but slightly exceeds, the tax credit standard of 31 mpg and thus that the new midsize model deserves the tax credit.

Throughout this book we will encounter many situations where we wish to make a statistical inference about one or more populations by using sample data. Whenever we make assumptions about how the sample data are selected and about the population(s) from which the sample data are selected, we are specifying a statistical model that will lead to making what we hope are valid statistical inferences. In Chapters 13, 14, and 15 these models become complex and not only specify the probability distributions describing the sampled populations but also specify how the means of the sampled populations are related to each other through one or more predictor variables. For example, we might relate mean, or expected, sales of a product to the predictor variables advertising expenditure and price. In order to explain and predict values of the response variable, we sometimes use a statistical technique called regression analysis and specify a regression model.

The idea of building a model to help explain and predict is not new. Sir Isaac Newton’s equations describing motion and gravitational attraction help us understand bodies in motion and are used today by scientists plotting the trajectories of spacecraft. Despite their successful use, however, these equations are only approximations to the exact nature of motion. Seventeenth-century Newtonian physics has been superseded by the more sophisticated twentieth-century physics of Einstein and Bohr. But even with the refinements of

[10] The exact reasoning behind and meaning of this statement is given in Chapter 8, which discusses confidence intervals.

In Chapters 2 and 3 we begin to formally discuss the statistical analysis used in statistical modeling and the statistical inferences that can be made using statistical models. For example, in Chapter 2 (graphical descriptive methods) we show how to construct the histogram of car mileages shown in Chapter 1, and in Chapter 3 (numerical descriptive methods) we use this histogram to help explain the Empirical Rule. As illustrated in Figure 3.15, this rule gives tolerance intervals providing estimates of the “lowest” and “highest” mileages that the new midsize car model should be expected to get in combined city and highway driving:

Chapter 3 Descriptive Statistics: Numerical Methods and Some Predictive Analytics (p. 150)

Figure 3.15 Estimated Tolerance Intervals in the Car Mileage Case [Histogram of the 50 Mileages; axes: Percent, Mpg]

Estimated tolerance interval for the mileages of 68.26 percent of all individual cars: [30.8, 32.4]
Estimated tolerance interval for the mileages of 95.44 percent of all individual cars: [30.0, 33.2]
Estimated tolerance interval for the mileages of 99.73 percent of all individual cars: [29.2, 34.0]

Figure 3.15 depicts these estimated tolerance intervals, which are shown below the histogram. [BI] Because the difference between the upper and lower limits of each estimated tolerance interval is fairly small, we might conclude that the variability of the individual car mileages around the estimated mean mileage of 31.6 mpg is fairly small. Furthermore, the interval $[\bar{x} \pm 3s] = [29.2, 34.0]$ implies that almost any individual car that a customer might purchase this year will obtain a mileage between 29.2 mpg and 34.0 mpg.

Before continuing, recall that we have rounded $\bar{x}$ and $s$ to one decimal point accuracy in order to simplify our initial example of the Empirical Rule. If, instead, we calculate the Empirical Rule intervals by using $\bar{x} = 31.56$ and $s = .7977$ and then round the interval endpoints to one decimal place accuracy at the end of the calculations, we obtain the same intervals as obtained above. In general, however, rounding intermediate calculated results can lead to inaccurate final results. Because of this, throughout this book we will avoid greatly rounding intermediate results.

We next note that if we actually count the number of the 50 mileages in Table 3.1 that are contained in each of the intervals $[\bar{x} \pm s] = [30.8, 32.4]$, $[\bar{x} \pm 2s] = [30.0, 33.2]$, and $[\bar{x} \pm 3s] = [29.2, 34.0]$, we find that these intervals contain, respectively, 34, 48, and 50 of the 50 mileages. The corresponding sample percentages—68 percent, 96 percent, and 100 percent—are close to the theoretical percentages—68.26 percent, 95.44 percent, and 99.73 percent—that apply to a normally distributed population. This is further evidence that the population of all mileages is (approximately) normally distributed and thus that the Empirical Rule holds for this population.

To conclude this example, we note that the automaker has studied the combined city and highway mileages of the new model because the federal tax credit is based on these combined mileages. When reporting fuel economy estimates for a particular car model to the public, however, the EPA realizes that the proportions of city and highway driving vary from purchaser to purchaser. Therefore, the EPA reports both a combined mileage estimate and separate city and highway mileage estimates to the public (see Table 3.1(b) on page 137).

Chapters 1, 2, and 3: Six optional sections discussing business analytics and data mining. The Disney Parks Case is used in an optional section of Chapter 1 to introduce how business analytics and data mining are used to analyze big data. This case considers how Walt Disney World in Orlando, Florida, uses MagicBands worn by many of its visitors to collect massive amounts of real-time location, riding pattern, and purchase history data. These data help Disney improve visitor experiences and tailor its marketing messages to different types of visitors. At its Epcot park, Disney

[Figure 2.35: A Dashboard of the Key Performance Indicators for an Airline (Section 2.8, Descriptive Analytics (Optional), p. 93). Panels: Flights on Time (Arrival and Departure for Midwest, Northeast, Pacific, South); Average Load (Average Load Factor, Breakeven Load Factor, by month); Fleet Utilization (Regional, Short-Haul, International); Costs (Fuel Costs, Total Costs, by month)]
[Figure 2.36: Excel Output of a Bullet Graph of Disney’s Predicted Waiting Times (in minutes) for the Seven Epcot Rides Posted at 3 p.m. on February 21, 2015 DS DisneyTimes]

[Figure 2.37: The Number of Ratings and the Mean Rating for Each of Seven Rides at Epcot (0 = Poor, 1 = Fair, 2 = Good, 3 = Very Good, 4 = Excellent, 5 = Superb) and an Excel Output of a Treemap of the Numbers of Ratings and the Mean Ratings. (a) The number of ratings and the mean ratings DS DisneyRatings; (b) Excel output of the treemap (Chapter 2 Descriptive Statistics: Tabular and Graphical Methods and Descriptive Analytics, p. 94)]

treemaps are frequently used to display hierarchical information (information that could be displayed as a tree, where different branchings would be used to show the hierarchical information). For example, Disney could have visitors voluntarily rate the rides in each of its four Orlando parks—Disney’s Magic Kingdom, Epcot, Disney’s Animal Kingdom, and Disney’s Hollywood Studios. A treemap would be constructed by breaking a large

… data drill-down graphics and dashboards combining graphics illustrating a business’ key performance indicators. For example, Figure 2.35 is a dashboard showing eight “flights on time” bullet graphs and three “fleet utilization” gauges for an airline.

Chapter 3 contains four optional sections that discuss six methods of predictive analytics. The methods discussed are explained in an applied and practical way by using the numerical descriptive statistics previously discussed in Chapter 3. These methods are:

• Classification tree modeling and regression tree modeling (see Section 3.7 and the following figures):

[Figures from Section 3.7, Classification Trees and Regression Trees (Optional): (e) XLMiner classification tree using coupon redemption training data; (f) Growing the tree in (e); (g) XLMiner best pruned classification tree using the validation data; (h) Pruning the tree in (e); JMP classification tree output for the card upgrade data DS CardUpgrade; (e) XLMiner training data and best pruned Fresh demand regression tree (for Exercise 3.57(a)) DS Fresh2]

• Hierarchical clustering and k-means clustering (see Section 3.8 and the following figures):

[Figure 3.31: Minitab and JMP Outputs of Hierarchical Clustering of the Sports Perception Data (Section 3.8, Cluster Analysis and Multidimensional Scaling (Optional), p. 189). (a) The Minitab output (dendrogram, complete linkage; axis: Similarity); (b) The JMP output (dendrogram, method = complete)]

[XLMiner Output for Exercise 3.61 (Chapter 3, p. 192). Columns: Sport, Cluster ID, Dist. Clust-1, Dist. Clust-2, Dist. Clust-3, Dist. Clust-4, Dist. Clust-5]

the centroids of each cluster (that is, the six mean values on the six perception scales of the cluster’s members), the average distance of each cluster’s members from the cluster centroid, and the distances between the cluster centroids DS SportsRatings
a Use the output to summarize the members of each cluster.
b By using the members of each cluster and the cluster centroids, discuss the basic differences between the clusters. Also, discuss how this k-means cluster analysis leads to the same practical conclusions about how to improve the popularities of baseball and tennis that have been obtained using the previously discussed hierarchical clustering.

[Figure 3.35: Minitab Output of a Factor Analysis of the Applicant Data (Chapter 3, p. 196)]
l oe5886eoodaofsa92303rb,nm 5 n(6956.tcdcr5.2faipe06u3o94 on.i5nafi976s44nm839rg3ntt 987..ni256efo 873npitudxemh293gal ason 563 ae amauibsrssesssrpssseao.ool ane1cscIc33k,udi.nii ..46aaca c46s 16pt.0tot ho736i5iio.7.r6omo ro325825.ora.d06nnn 199.dese976e 510urdrrrv987Auu uc673ie5elttmll979neosse.e7273s ssa 7 iz2mnb3o1.ta43nao.s71333n eav183..4ad46na.g 736dreoi582 oanN5432bu ....elsm1523et 3c42fl65,i3 al0633.it3.l7xth03796ie.7 oe9232899gs3nse2231.o217e43sl.r l 713ic oeo183ofsrm (crfpeuTto2nashrh.ntt 3eo eitex6emh3 as3os3.ce m03ube.re923ssrp 223ea.nl aaneIk,dtnr c sr epo ooor mror dTti dedeehhudrv scieeeettnoss s cee natrreo ids P••r PBiniansgce-bPiapolnlagl Component Factor Analysis of the CSkoirinreglation Matroirx thrillers) and5 hierarchiCe5slu. 5(sf2toe8rr 1e-57x3ample,C alu hs42itee..60rrao4-rr59c th8hy8ri 3rlleelr4ast.)6e d1a n2tod.6 hhoi2ew.r5a 8nr6c9e.h2w7i946e .ts6h 1(ef oprr oedxua6cmt. 156pis0.l2)e7.9,4 a3 h9ie4r.2ar6c5hy 5r2e.l3at8e2d39 .t244o2.5 2h6o5w new the3 .p2r2oduct is). U•n Hranodtbaallted Factor Loadings and CommunalitiBeasseball 1 0 4.573068 3.76884 3.928169 5.052507 V••a STrewniimnaimsbinleg Factor1 Factor2 Ping-PFoancgtor3 ExeFrcaicstoers43 for SeCCco4l.mtu0is8omt1en8ur8 n73a.l1it0yClus3#t.1eO1rEb 5s7x7e4rAcivsge. s#D Oi1fsb.ot2s8r6 S46eA6cvgti.o DniDs t3is6.t.10a0n05c3e7 9 Dista3n.4ce3 4243 ccoonnWVVVVVVcfi••laaaaaade uST errrrrrsrknwa ii12345o6ictniikng la ls&l i tFriyleiel auldcpshutrerapdto ea srekes- ,mr eweaalen.000000 sCw...... 
451678oci480196lnllu s739787sicdtoeenrris,n itgdh eerbn y,a tuhfiseci n22tJiguo000000 snta......a 603131 lR 153858rieg909165garhlo tc degarrytoaHHTBT c receooamahrnwnycaicnn kidklncieiinb h.ns &ygaagH22i nlFpol000000,i rwe......ow304521ejlhedv729898ciec604094trh., Fhthoaesr C33..O66N67 CE222WDcPoiT000000nhsSc......fia127310utd s281576iessn124799 t4332t3chheee p mpeurecrapennoitsnaegCCCCCg s43344oelllll fuuuuu..o..., 36301afasssss 52687nsttttttshdeeeee71281oe rrrrr52225lc -----itif4410238455eatr t53288rm000000iao......stn748888i o s328782ru.u921755plepCCCCCsolllll?ruuuuut sssssp303ttttteeeeee...r893rrrrr3c369-----e7.12345C33n4406...Ot125325735666aN67g94145e 32173C, EWDcPo0102iTnhs....Scfia93933utd6188 .sie60932ssn 95799ttchh 5012e14834ee .....07235p 49253mpTtccae12532heeu02064rh eenncre44149a pettSrD naa68881aXnouggiVtts89915nimLaeeeoDg68794 gM 0102moosoge ffr....fro,ia 9393e en47rafana6188ie 00nsztttr0932sh eede ppoorre 5799 eet slc uthi4834 trrhBCCCCifdtecceatp07235ara lllee eetrrnumuuuitnne04475aoa tntttsss1c st n..... wotttiuo ai.t57604o sseeefsnrmeu 90 001.uiearrrsdnprm374l57h---n eegps123ao 53448es ao nwas?n5s9304r p dsnts86782e opa pcbcteeiieicofiralicnCotefiiesBCCCCdwoeln bunllld.eectuuu 4aa osrtns s.3gsssunu5twetttDet.lfipdeeee77e,Se52242 d p errrr36o aer.o....D----n166n27e08ns123r1 Vat68c14451an lpe 084Dy03770l 3 eispf01690Rr.ite6-s 50227er 9C-on19658 Cftl ul43uTtccash4..eeh s51te.3nne5et 71ttSr.D eaa77aXr32uggVt-r3601imLee2o-D08 63M1 moog68 805ffrria 084een47rnaie00zttr Ceee pporrC leet suthu rrlhd3tcceupas43aee 3.rnus1ttnn..e.a t51ettt711c oeuo71ai.r62sfsnm-r32 81ias3d-n10m83hn2 g a63o 450e a wa085sns pdsnseopa CcbcteCiieilcofiaulilnote375fiuissdwo3e... s 942tbnd..3c 1eta 217 o.rs se17ru836nueD-r26lfipd153e4S-d81 p654 o3 aeoD38936nn nr504 Vatca lpe Dyl eiCspfRrCite-ls eru l-on542usf753t... s026t...942et592e172r282-r8365815-5130264 546737369 Clu425s...206te925r282-8155260 737 2.3V mairll i7on store loyalty card0 h.4ol3d3ers. Store manage0rs. 
5a8re2 intereFstoedo tinb ac2lluls0te.3ri6ng0 their METHOD0S. 4A4N64D APPLICAO4Tv.I3eO8rNa7Sl0l040.855Ove7ra.9ll1M1134E1T3HOD1S.2 A49N0D65 .3A10b3P2 P7CL0IoC0nA6s1iTd.2IeOr4 Nt9h0Se 5rCe3lcuosm1tme.0re4-n42da5t5ion ofC Dlu3V.sD9t e2B5r8 .-b714a61s9e2d4 o0n1 7b3. 4.91C23o85n15si63d9er the r5e7c.2o.47m16m33e54n56d3ation of 5D.V27D6 B3 40b6ased on5 .2249020 5.224902 cusVtoamre r8s into various subgrou0p.s8 w82hose shopping hab0its.0 te5n6d to be similar. Th0ey.2 e4x8pect to 3.68 2In0 th.2e 2p8revious XLMiner outpu0t,. s8h9o5w how the 3li.f6t 8 In the previous XhaLvMingin reern oteudtCp Culu t&,s s tEheo.r w(-15 )h oIwde nthtief yCli lafu5tn .sd0t ie5nr2te-55rp0r7et the 45.2.09h58a2v8i52n0g3 7rented 2C4. 6.&22 92E81. 8(6127) 3Identi5fy2. 2a.62n2d4 29in10te62r7pret the 5.2249002 0 firfoaornwo dpVVVd rct eaaaahhmiarrrisct il 911ukec.me01e nPr-tseqa,r iuhwnaa lhcpiiutslys et t hsoohemtorheepe rprassre etrae sron.e0t00d hb... et388uory 667 ibim532nugpy o pmrrteaapnnayt r cecadot eoigkteoimnr22igse 000sbf ra...lo700iskmi969ces 498ct hlaielko edr ieoelCi-il,c, l osuflanolssauctdrie ,ob eur2agsr,g000, vsa...,e012n grd905iec 306ftear,or iazanendnF, ast 22rtroaatt 000riiCooe...012n .oot760fem 469r1s. p1o1fl 1B1 (hraosu nbTdeeeenda )c mfaolcr uthlae000t er...d778ec.787 oI979nEmteamrsperynedt atthioisn l iofftN Cc ortroaantt riiooen .otfe r1s. 1o1f 1BO1 (hprsChaouap sup&s nbpb deoEeeere tn&d}n f ) co cBfarao l.lC crc( uu 2t&hll)aa e ttES eer.hdde Do.c. wo(Io3nm )tth ehmorSewpe hsrn oaetdhwmt aet et hhiC oiofsoown nrl iofittfhhftd eeCe nsL ucipfet%p Ro raottf if o8o 0ro f sChua p&s pb oEere t&n f o cBra .lC c(u 2&l)a tESe.hd Do. 
w(o3 )th hoSew hs oathwme e hC ofoownr fitthhdeee nsLucipfet%p Ro raottf if o8o 0ro f 196 •TpurchC oteT VVVVV%d4hyFhu aaaaa ieFaccVisrrrrr ht egp p ia1oax1111ao otreui2345tncsseac etufcrro trttee iooecv3s hhec ris ops 3 ntdsrac.oa .eFnwnn3o’bttrr ae5 akertnar elioc ey ohws7nM ,c s 0000.0 pw5u1.....iriDs0h997605nots0aod011440iet mtu i 830600smtacahectbper n2r oh c iOirapldstuva uotsni nltittieav gvprp esearsu o oa JtsdriSu leuo,sot ctpa22a f2tRnrsc ot0000.0 daio0dig i..... sfhu001661hFaftoctea331013 iptrlt cc0254e57oe3idsy t ito o:ashi l nehrNtn y atdA hmucCCCCCa eantrbmilllllarruuuuuuad u wsr,sssssle gtyttttttiolehlreeeeesrlere ie irrrrreycbss -----22n .ua112345 w mlnoSi0000.0lgisu4 lf,.....hlMp 105160 iattkpen375079nhnonedo538778se itetwph nhrAte othmhopd2gae-t.4.1pRdR 8 .5.2uols(759il..w ce86896saa: eInInDf1 deta22 13l D ClS0000.0. S2Ao2ao.....7n0220005nte144mfia913298t3.4..4e4d c7859118732c(e..e129364e8t n5d85613 PicF7eoer1ane%4ctn3 dti1tosie.BIib(Ac6rLnnO mystn03fOt5 p isoet6aUv e25.6t.rra3 06iecpms9f...or-e19726aeer n1daeAc69897 dpae0ttat1)ioun lnot)2nrrth.anc 0000.0 a(e2hd lx.....pyn a888748)r sta588921oe3ilA28443633Iib(cyvdCLnnO3..4ss.yi,o46fd Oti1 p..tonesa3736eht srr3 0d35852eipmefo-na qrn1 a ecuwaa0ttteRRio li5notn)tuorh.h.nt lw7d aee (C7p:yn 3Fuc24 oI I3)rDafo1..1tnnoa 423lah3.fi.rySvd7163clrs.uiCdl9eed61853 teipAoesoerl7np nnld a1rfcofiyt. ete 4ord2aei FS t2cpen .e8dlnf3en5odgale r6c7aervoc3 e1 3nelux7c23.3%4acwyn03t3Fuc3n... r1t92632itstSotn.iaca122732eiBauiAg.or4dosimcnpbern29 oertp es9stlCgrro reo etla(oelac arrnrsr nyoetlertfi . 
AuasdTtfaei t FSenpo neTsthdgdnquehrn3 gae euown rdtyae7ducve e )(h .luinc xi acfrhySasatl4c)tnaarhusle etstsieiueccptA lia)sbdeCa opossitnyebam,oo ) o t oerltsnrlnrt:srrrhraestats serewc earenfs t anlogioqreti.lA a lctr husewl ui T stfexa adeli3 anan ltown&tsdth(ctesu o d.dy(ti5CO imfyb.aotl4eoa)hlernsubc l p)1fiSpcaaste.yuud0o r Ltnrrl2peirictr af0banpsaohts4r cioleat 0Rfgeyiertos8 a thescet1s p thdff6ioie aoon a3n.sretrrc(c,u r e x7d“taOesmnoee1ultS.aarlssb1pu gtpc4s aeppee2eranpr 9doiCnt o ba(oo sntrrvilenoetify aauos fcdentsotrlhdqh oiirnea saueay7 rd,sebr )tR n p“a eshlSateaeeulu s eiaslspctp pbet sp tapeeemqseoaendro”nrsnst d suwc ntvofoaei oilaaanflc rlidtt u rl nxahttr liahe laea&sat eemsoleb tRd y5 p olb.salpeeeaf ets c fiprps1 tg.ua ec0qsLtre2nreio”cf0s hrtd 4un ofi oa0R husa8nafne1iat tmndd6ittrso3.hee befmileene owsrlptae fto efirrsgf-r ct eeo r nfi hunamds bfienewrte oerf-r Rotated Factor Loadings and Communalities 2 71.42857143 A B 2 71.428571743 A 7 B 5 1.020408163 7 7 5 1.020408163 PVrairnicmipaaxl RCootmatpioonnent Factor Analysis of the CColurrsetelart ion Matrix#Ob3s 85A.7v14g28. 5D71isAt DCistance v3iewe85d.7 a14n2d85 77rv1aitAeewd e4d8 ajonbd9 araptCpeldic 4a8n tjso bfo 6ar psp0a.l9lie5c2s3a 8pn09ots5s2 ifto7ior nssa loens pthoes 9iftoiollnosw oinn gth 1e5 f ov6lalro0i.wa95bi2ln3e8g0s9 .1525 variables. 
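The confidence and lift ratio reported in an association rules table can be checked by hand from the support counts. The sketch below is illustrative only: it uses the support counts 7, 7, 5 and 7, 9, 6 that are legible in the XLMiner output, and it assumes a total of 10 transactions, a figure that is inferred from the printed lift ratios rather than stated in the text. The function name `rule_metrics` is hypothetical.

```python
from fractions import Fraction


def rule_metrics(n_transactions, n_antecedent, n_consequent, n_both):
    """Return (confidence, lift) for the rule antecedent -> consequent.

    confidence = support(antecedent and consequent) / support(antecedent)
    lift       = confidence / (support(consequent) / n_transactions)
    """
    confidence = Fraction(n_both, n_antecedent)
    benchmark = Fraction(n_consequent, n_transactions)
    return confidence, confidence / benchmark


N = 10  # assumption: total number of transactions, inferred from the lift ratios

# (antecedent, consequent): (support x, support y, support x & y)
rules = {("A", "B"): (7, 7, 5), ("A", "C"): (7, 9, 6)}

for (x, y), (sx, sy, sxy) in rules.items():
    conf, lift = rule_metrics(N, sx, sy, sxy)
    print(f"{x} -> {y}: confidence = {float(conf):.4%}, lift = {float(lift):.9f}")
```

A lift ratio above 1 means the consequent is rented more often among renters of the antecedent than among all customers; a ratio below 1 means the opposite.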
viewed and rated 48 job applicants for sales positions on the following 15 variables.

1 Form of application letter    6 Lucidity        11 Ambition
2 Appearance                    7 Honesty         12 Grasp
3 Academic ability              8 Salesmanship    13 Potential
4 Likability                    9 Experience      14 Keenness to join
5 Self-confidence               10 Drive          15 Suitability

[Minitab output: Principal Component Factor Analysis of the Correlation Matrix, with unrotated and varimax-rotated factor loadings and communalities for Var 1 through Var 15 on Factor1 through Factor4, plus the Variance and % Var rows. Values are not recoverable.]

as follows: Factor 1, "extroverted personality"; Factor 2, "experience"; Factor 3, "agreeable personality"; Factor 4, "academic ability." Variable 2 (appearance) does not load heavily on any factor and thus is its own factor, as Factor 6 on the Minitab output in Figure 3.34 indicated is true. Variable 1 (form of application letter) loads heavily on Factor 2 ("experience"). In summary, there is not much difference between the 7-factor and 4-factor solutions. We might therefore conclude that the 15 variables can be reduced to the following five uncorrelated factors: "extroverted personality," "experience," "agreeable personality," "academic ability," and "appearance." This conclusion helps the personnel officer focus on the "essential characteristics" of a job applicant. Moreover, if a company analyst wishes at a later date to use a tree diagram or regression analysis to predict sales performance on the basis of the characteristics of salespeople, the analyst can simplify the prediction modeling procedure by using the five uncorrelated factors instead of the original 15 correlated variables as potential predictor variables.

In general, in a data mining project where we wish to predict a response variable and in which there are an extremely large number of potential correlated predictor variables, it can be useful to first employ factor analysis to reduce the large number of potential correlated predictor variables to fewer uncorrelated factors that we can use as potential predictor variables.

Chapter Summary

We began this chapter by presenting and comparing several measures of central tendency. We defined the population mean and we saw how to estimate the population mean by using a sample mean. We also defined the median and mode, and we compared the mean, median, and mode for symmetrical distributions and for distributions that are skewed to the right or left. We then studied measures of variation (or spread). We defined the range, variance, and standard deviation, and we saw how to estimate a population variance and standard deviation by using a sample. We learned that a good way to interpret the standard deviation when a population is (approximately) normally distributed is to use the Empirical Rule, and we studied Chebyshev's Theorem, which gives us intervals containing reasonably large fractions of the population units no matter what the population's shape might be. We also saw that, when a data set is highly skewed, it is best to use percentiles and quartiles to measure variation, and we learned how to construct a box-and-whiskers plot by using the quartiles.

After learning how to measure and depict central tendency and variability, we presented various optional topics. First, we discussed several numerical measures of the relationship between two variables. These included the covariance, the correlation coefficient, and the least squares line. We then introduced the concept of a weighted mean and also explained how to compute descriptive statistics for grouped data. In addition, we showed how to calculate the geometric mean and demonstrated its interpretation. Finally, we used the numerical methods of this chapter to give an introduction to four important techniques of predictive analytics: decision trees, cluster analysis, factor analysis, and association rules.

We believe that an early introduction to predictive analytics (in Chapter 3) will make statistics seem more useful and relevant from the beginning and thus motivate students to be more interested in the entire course. However, our presentation gives instructors various choices. This is because, after covering the optional introduction to business analytics in Chapter 1, the five optional sections on descriptive and predictive analytics in Chapters 2 and 3 can be covered in any order without loss of continuity. Therefore, the instructor can choose which of the six optional business analytics sections to cover early, as part of the main flow of Chapters 1–3, and which to discuss later. We recommend that sections chosen to be discussed later be covered after Chapter 14, which presents the further predictive analytics topics of multiple linear regression, logistic regression, and neural networks.

Chapters 4–8: Probability and probability modeling. Discrete and continuous probability [...] and using motivating examples (The Crystal Cable Case and a real-world example of gender discrimination at a pharmaceutical company) to illustrate the probability rules. Chapters 5 and 6 give more concise discussions of discrete and continuous probability distributions (models) and feature practical examples illustrating the "rare event approach" to making a statistical inference. In Chapter 7, The Car Mileage Case is used to introduce sampling distributions and motivate the Central Limit Theorem (see Figures 7.1, 7.3, and 7.5). In Chapter 8, the automaker in The Car Mileage Case uses a confidence interval procedure specified by the Environmental Protection Agency (EPA) to find the EPA estimate of a new midsize model's true mean mileage and determine if the new midsize model deserves a federal tax credit (see Figure 8.2).

Chapter 7 Sampling Distributions

In order to obtain a preliminary estimate (to be reported at the auto shows) of the midsize model's combined city and highway driving mileage, the automaker subjected the two cars selected for testing to the EPA mileage test. When this was done, the cars obtained mileages of 30 mpg and 32 mpg. The mean of this sample of mileages is

$\bar{x} = \frac{30 + 32}{2} = 31$ mpg

This sample mean is the point estimate of the mean mileage $\mu$ for the population of six preproduction cars and is the preliminary mileage estimate for the new midsize model that was reported at the auto shows.

When the auto shows were over, the automaker decided to further study the new midsize model by subjecting the four auto show cars to various tests. When the EPA mileage test was performed, the four cars obtained mileages of 29 mpg, 31 mpg, 33 mpg, and 34 mpg. Thus, the mileages obtained by the six preproduction cars were 29 mpg, 30 mpg, 31 mpg, 32 mpg, 33 mpg, and 34 mpg. The probability distribution of this population of six individual car mileages is given in Table 7.1 and graphed in Figure 7.1(a). The mean of the population of

has been working to improve gas mileages, we cannot assume that we know the true value of the population mean mileage $\mu$ for the new midsize model. However, engineering data might indicate that the spread of individual car mileages for the automaker's midsize cars is the same from model to model and year to year. Therefore, if the mileages for previous models had a standard deviation equal to .8 mpg, it might be reasonable to assume that the standard deviation of the mileages for the new model will also equal .8 mpg. Such an assumption would, of course, be questionable, and in most real-world situations there would probably not be an actual basis for knowing $\sigma$. However, assuming that $\sigma$ is known will help us to illustrate sampling distributions, and in later chapters we will see what to do when $\sigma$ is unknown.

EXAMPLE 7.2 The Car Mileage Case: Estimating Mean Mileage

Part 1: Basic Concepts Consider the infinite population of the mileages of all of the new midsize cars that could potentially be produced by this year's manufacturing process.
If we assume that this population is normally distributed with mean $\mu$ and standard deviation

Table 7.1 A Probability Distribution Describing the Population of Six Individual Car Mileages

Individual Car Mileage   29    30    31    32    33    34
Probability              1/6   1/6   1/6   1/6   1/6   1/6

Figure 7.1 A Comparison of Individual Car Mileages and Sample Means
(a) A graph of the probability distribution describing the population of six individual car mileages [bar chart; horizontal axis: Individual Car Mileage, 29 to 34; vertical axis: Probability; each bar has height 1/6]
(b) A graph of the probability distribution describing the population of 15 sample means [bar chart; horizontal axis: Sample Mean, 29 to 34; vertical axis: Probability]

Table 7.2 The Population of Sample Means

(a) The population of the 15 samples of n = 2 car mileages and corresponding sample means

Sample   Car Mileages   Sample Mean
1        29, 30         29.5
2        29, 31         30
3        29, 32         30.5
4        29, 33         31
5        29, 34         31.5
6        30, 31         30.5
7        30, 32         31
8        30, 33         31.5
9        30, 34         32
10       31, 32         31.5
11       31, 33         32
12       31, 34         32.5
13       32, 33         32.5
14       32, 34         33
15       33, 34         33.5

(b) A probability distribution describing the population of 15 sample means: the sampling distribution of the sample mean

Sample Mean   Frequency   Probability
29.5          1           1/15
30            1           1/15
30.5          2           2/15
31            2           2/15
31.5          3           3/15
32            2           2/15
32.5          2           2/15
33            1           1/15
33.5          1           1/15

7.1 The Sampling Distribution of the Sample Mean

Figure 7.3 A Comparison of (1) the Population of All Individual Car Mileages, (2) the Sampling Distribution of the Sample Mean $\bar{x}$ When n = 5, and (3) the Sampling Distribution of the Sample Mean $\bar{x}$ When n = 50
(a) The population of individual mileages: the normal distribution describing the population of all individual car mileages, which has mean $\mu$ and standard deviation $\sigma = .8$ [axis: scale of gas mileages]
(b) The sampling distribution of the sample mean $\bar{x}$ when n = 5: the normal distribution describing the population of all possible sample means when the sample size is 5, where $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{.8}{\sqrt{5}} = .358$ [axis: scale of sample means, $\bar{x}$]
(c) The sampling distribution of the sample mean $\bar{x}$ when n = 50: the normal distribution describing the population of all possible sample means when the sample size is 50, where $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{.8}{\sqrt{50}} = .113$ [axis: scale of sample means, $\bar{x}$]

Figure 7.5 The Central Limit Theorem Says That the Larger the Sample Size Is, the More Nearly Normally Distributed Is the Population of All Possible Sample Means
(a) Several sampled populations
(b) Corresponding populations of all possible sample means for different sample sizes [panels for n = 2, n = 6, and n = 30]

8.1 z-Based Confidence Intervals for a Population Mean: $\sigma$ Known

Figure 8.2 Three 95 Percent Confidence Intervals for $\mu$
[Diagram: the population of all individual car mileages, with mean $\mu$, and three samples of n = 50 car mileages. The probability is .95 that $\bar{x}$ will be within plus or minus $1.96\sigma_{\bar{x}} = .22$ of $\mu$; the sampling distribution is marked at 31.6 − .22, 31.6, and 31.6 + .22.]

Sample (n = 50)        Interval
$\bar{x} = 31.56$      [31.34, 31.78]
$\bar{x} = 31.68$      [31.46, 31.90]
$\bar{x} = 31.2$       [30.98, 31.42]

3 In statement 1 we showed that the probability is .95 that the sample mean $\bar{x}$ will be within plus or minus $1.96\sigma_{\bar{x}} = .22$ of the population mean $\mu$. In statement 2 we showed that $\bar{x}$ being within plus or minus .22 of $\mu$ is the same as the interval $[\bar{x} \pm .22]$ containing $\mu$.
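The numbers in Tables 7.1 and 7.2 and in Figures 7.3 and 8.2 can be reproduced with a few lines of code. The sketch below is illustrative only: it enumerates the 15 samples of n = 2 from the six mileages, then evaluates $\sigma/\sqrt{n}$ and the interval $\bar{x} \pm 1.96\sigma/\sqrt{n}$ using $\sigma = .8$ as assumed in the text. The helper names `standard_error` and `z_interval` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Population of six individual car mileages (Table 7.1).
mileages = [29, 30, 31, 32, 33, 34]

# Every possible sample of n = 2 cars, drawn without replacement (Table 7.2a).
samples = list(combinations(mileages, 2))
sample_means = [sum(s) / len(s) for s in samples]

# Frequency of each distinct sample mean (Table 7.2b).
frequencies = Counter(sample_means)
for mean in sorted(frequencies):
    print(f"{mean:5.1f}  {frequencies[mean]}/{len(samples)}")


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def z_interval(xbar, sigma, n, z=1.96):
    """Interval xbar +/- z * sigma / sqrt(n), assuming sigma is known."""
    half_width = z * standard_error(sigma, n)
    return xbar - half_width, xbar + half_width


sigma = 0.8  # assumed known, as in the text
print(round(standard_error(sigma, 5), 3))   # n = 5, Figure 7.3(b)
print(round(standard_error(sigma, 50), 3))  # n = 50, Figure 7.3(c)

# The three sample means shown in Figure 8.2, each from a sample of n = 50.
for xbar in (31.56, 31.68, 31.2):
    low, high = z_interval(xbar, sigma, 50)
    print(f"xbar = {xbar}: [{low:.2f}, {high:.2f}]")
```

The average of the 15 sample means equals the population mean of 31.5 mpg, and the standard error shrinks as n grows, which is why the curve in Figure 7.3(c) is narrower than the one in Figure 7.3(b).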
Combining these results, we see that the probability is .95 that the sample mean $\bar{x}$ will be such that the interval

$[\bar{x} \pm 1.96\sigma_{\bar{x}}] = [\bar{x} \pm .22]$

contains the population mean $\mu$.

A 95 percent confidence interval for $\mu$ Statement 3 says that, before we randomly select the sample, there is a .95 probability that we will obtain an interval $[\bar{x} \pm .22]$ that contains the population mean $\mu$. In other words, 95 percent of all intervals that we might obtain contain $\mu$, and 5 percent of these intervals do not contain $\mu$. For this reason, we call the interval $[\bar{x} \pm .22]$ a 95 percent confidence interval for $\mu$. To better understand this interval, we must realize that, when we actually select the sample, we will observe one particular sample from the extremely large number of possible samples. Therefore, we will obtain one particular confidence interval from the extremely large number of possible confidence intervals. For example, recall that when the automaker randomly selected the sample of n = 50 cars and tested them as prescribed by the EPA, the automaker obtained the sample of 50 mileages given in Table 1.7. The mean of this sample is $\bar{x} = 31.56$ mpg, and a histogram constructed using this sample (see Figure 2.9 on page 66) indicates that the population of all individual car mileages is normally distributed. It follows that a 95 percent confidence interval for the population mean mileage $\mu$ of the new midsize model is

$[\bar{x} \pm .22] = [31.56 \pm .22] = [31.34, 31.78]$

Because we do not know the true value of $\mu$, we do not know for sure whether this interval contains $\mu$. However, we are 95 percent confident that this interval contains $\mu$. That is, we are 95 percent confident that $\mu$ is between 31.34 mpg and 31.78 mpg. What we mean by "95 percent confident" is that we hope that the confidence interval [31.34, 31.78] is one of the 95 percent of all confidence intervals that contain $\mu$ and not one of the 5 percent of all confidence intervals that do not contain $\mu$. Here, we say that 95 percent is the confidence level associated with the confidence interval.

How large must the sample size be for the sampling distribution of $\bar{x}$ to be approximately normal? In general, the more skewed the probability distribution of the sampled population, the larger the sample size must be for the population of all possible sample means to be approximately normally distributed. For some sampled populations, particularly those described by symmetric distributions, the population of all possible sample means is approximately normally distributed for a fairly small sample size. In addition, studies indicate that, if the sample size is at least 30, then for most sampled populations the population of all possible sample means is approximately normally distributed. In this book, whenever the sample size n is at least 30, we will assume that the sampling distribution of $\bar{x}$ is approximately a normal distribution. Of course, if the sampled population is exactly normally distributed, the sampling distribution of $\bar{x}$ is exactly normal for any sample size.

EXAMPLE 7.3 The e-billing Case: Reducing Mean Bill Payment Time

Recall that a management consulting firm has installed a new computer-based electronic billing system in a Hamilton, Ohio, trucking company. Because of the previously discussed advantages of the new billing system, and because the trucking company's clients are receptive to using this system, the management consulting firm believes that the new system will reduce the mean bill payment time by more than 50 percent. The mean payment time using the old billing system was approximately equal to, but no less than, 39 days. Therefore, if $\mu$ denotes the new mean payment time, the consulting firm believes that $\mu$ will be less than 19.5 days. To assess whether $\mu$ is less than 19.5 days, the consulting firm has randomly selected a sample of n = 65 invoices processed using the new billing system and has determined the payment times for these invoices. The mean of the 65 payment times is $\bar{x} = 18.1077$ days, which is less than 19.5 days. Therefore, we ask the following question: If

Chapters 9–12: Hypothesis testing. Two-sample procedures. Experimental design and analysis of variance. Chi-square tests. Chapter 9 discusses hypothesis testing and begins with a new section on formulating statistical hypotheses. Three cases (The Trash Bag Case, The e-billing Case, and The Valentine's Day Chocolate Case) are then used in a new section that explains the critical value and p-value approaches to testing a hypothesis about a population mean. A summary box visually illustrating these approaches is presented in the middle of this section (rather than at the end, as in previous editions) so that more of the section can be devoted to developing the summary box and showing how to use it. In addition, a five-step hypothesis testing procedure emphasizes that successfully using any of the book's hypothesis testing summary boxes requires simply identifying the alternative hypothesis being tested and then looking in the summary box for the corresponding critical value rule and/or p-value (see the next page).

Chapter 9 Hypothesis Testing

LO9-4 Use critical values and p-values to perform a t test about a population mean when $\sigma$ is unknown.

9.3 t Tests about a Population Mean: $\sigma$ Unknown

If we do not know $\sigma$ (which is usually the case), we can base a hypothesis test about $\mu$ on the sampling distribution of

$\frac{\bar{x} - \mu}{s/\sqrt{n}}$

If the sampled population is normally distributed (or if the sample size is large, at least 30), then this sampling distribution is exactly (or approximately) a t distribution having n − 1 degrees of freedom. This leads to the following results:

A t Test about a Population Mean: $\sigma$ Unknown

p-value is a right-tailed p-value. This p-value, which we have previously computed, is the area under the standard normal curve to the right of the computed test statistic value z. In the next two subsections we will discuss using the critical value rules and p-values in the summary box to test a "less than" alternative hypothesis ($H_a: \mu < \mu_0$) and a "not equal to" alternative hypothesis ($H_a: \mu \neq \mu_0$). Moreover, throughout this book we will (formally or informally) use the five steps below to implement the critical value and p-value approaches to hypothesis testing.

The Five Steps of Hypothesis Testing
1 State the null hypothesis $H_0$ and the alternative hypothesis $H_a$.
2 Specify the level of significance $\alpha$.
3 Plan the sampling procedure and select the test statistic.
Using a critical value rule:
4 Use the summary box to find the critical value rule corresponding to the alternative hypothesis.
5 Collect the sample data, compute the value of the test statistic, and decide whether to reject $H_0$ by using the critical value rule. Interpret the statistical results.
Using a p-value rule:
4 Use the summary box to find the p-value corresponding to the alternative hypothesis. Collect the sample data, compute the value of the test statistic, and compute the p-value.
5 Reject $H_0$ at level of significance $\alpha$ if the p-value is less than $\alpha$. Interpret the statistical results.

Testing a "less than" alternative hypothesis We have seen in the e-billing case that to study whether the new electronic billing system reduces the mean bill payment time by more than 50 percent, the management consulting firm will test $H_0: \mu = 19.5$ versus $H_a: \mu < 19.5$ (step 1).
A Type I error (concluding that H: 0 a a m , 19.5 is true when H: m 5 19.5 is true) would result in the consulting firm overstating Null Test 0 _ Normal population Hypothesis H0: m 5tt mhoe 0o btheenre cfiotsmS otpafa ttnihsiteei scn tehw a tb ta i5rllei n _ cx _gso _y 2ns_Ï sy_ _mni_s_d _t 0ee rmin , gbd oinft hs5t at onll ti2hneg 1 csoumchp aa nsAyyss istneu mwm.hp Bitcieohcn iatsu hsaes t Lhbaeer egcnoe n isnsasoumtlrat plilnleegd sfi aizrnmed desires to have only a 1 percent chance of doing this, the firm will set a equal to .01 (step 2). To perform the hypothesis test, we will randomly select a sample of n 5 65 invoices Crpitaicidal uVsailnueg Rthuele new billing system and calculate tph-Vea mlueea (nR e _xje ocft Hth0 eif ppa-Vyamlueen (cid:31)t t(cid:31)im) es of these Ha: (cid:31) (cid:31) (cid:31)0 invHoa:i c(cid:31)e (cid:29)s. (cid:31)T0hen, becaHua:s e(cid:31) (cid:31)th (cid:31)e 0sample sizeH ai:s (cid:31) l a(cid:31)r g(cid:31)e0, we willH au: t(cid:31)il i(cid:29)z e(cid:31) 0the test sHtaa: t(cid:31)i s(cid:31)ti (cid:31)c 0in the Do not Reject RejescutmmDoa nroyt boxR (esjetcetp 3D)o: not Reject _ 9.4 reject H0 H0 H0 reject H0 H0 reject H0 Hz 0Tests azb 5ou _ xt _ sa2_ y_Pp Ï _1o-v_9p_na__.u l 5u_le atipo-nv aPlureoportion 407 _ In order to (cid:31)see howA(cid:31) t ov ateluset tohfi st hkein tde(cid:31) so(cid:26)t2 fs htaytpisotitch ezs (cid:31)its(cid:26)h2, arte ims elmesbse trh tahna tz ewrhoe rne snu litss lwarhgeen, t xh ei s less than 19.5. 
This _ sampRleijnte g(cid:31)c0t dtH(cid:31)i0st (cid:31)t irfibution oHcptefar(cid:29)asR ot:ttee v(cid:31) tms jsie(cid:31) dt cta,0 het(cid:29) tsaHi t st(cid:31)10 e t 9mi vicf.i 5 dmm eisniug csohettf (cid:28) t (cid:27)t (cid:29)bt(cid:28)t boh _ (cid:27)R Ïette(cid:31) (cid:31)e_ e s(cid:26)(cid:26)_ _p t2t2_ju(cid:31)lpfo e__ˆo(e(cid:26)o_p2 c1 __rs20—rt_ rp__ s tmne2H _o_ _(cid:31)t j htp0__ r_eh a Ht (cid:29)__pt_ci at(cid:31)f _r__t)a(cid:26)tin2 es(cid:31) : H(cid:26),j2m1e0c9 ,t.ii5nn .m g fT pt0aoH,o-v vt a0ohad nerlieud n erco 0 i gwiff(cid:25)dah ee tvHa oolrthafeor ota oaowktf liHmenv aptutoe h-cb lvteh theao eclsulf aueel eums s0f(cid:25)ssti gme o atn fh rtatieharfiaeync apbznooecirxnoeptr(cid:29) hitug -e v(cid:28)athnteh a(cid:28)tasd, le uortee wiefv mra (cid:28)0 (cid:25)te at(cid:28)ta ohl uttn eweteho (cid:28) itce c xt(cid:28)oree iif tn tihtcdhaaie-tl value rule heading H: m , m. The critical value rule that we find is a left-tailed critical is approximately a stavnadlaured rnuolerm aanld dsiasytrsi btoau tdioon t.h eL ef0ot lplo0 wdienngo:te a specified value between 0 and 1 (its exact value will depend on the problem), and consider testing the null hypothesis H: p 5 p. We then have the following result: 0 0 Place the probability of a Type I error, (cid:31), in the left-hand tail of the standard normal curve and useC th eE nXoArmMal PtaLbEle 9to. 4fAin T dLh taehr eCg ocermi tSimcaaeml rvcapiallule eL oT2eazns(cid:31)t .C Haaesberoe: uM2tez aa(cid:31) n isP D othepeb utn-letaogt-aiEtoiqvnue i toPyfr toRhapetoiortion normaOl pnoei nmt eza(cid:31)s.u Treh aotf i sa, c2ozm(cid:31)p iasn tyh’es pfioniannt coina lt hhee ahltohr iziso nittsa dl aebxtis-t ou-nedqeuri ttyh er asttiaon. 
dTahrids nqouramntaitly is NHuypll othesis H0:tcR epues r5jvte e scp ttdrot0ah aefHttatfiiie0stonn : tg e ii(cid:31)micdsv e(cid:31)otztoosn io i1sabt 9o h ell.reei 5 gtstf hhthis-ene,h tairfifhtaanTS nativetdiasnaoso nttt oracit osn hiioftlaee i facl t hicrHhneerezd(cid:30) ai a:tci5 lcioe(cid:31)cta hmq _ Ïa t _(cid:30)iul_o _ppo_ _a_vfp_0 anˆ__1 ( l_anc 2 1___9tolyo__ o_ nu .2f__’mp_5 e_s_(cid:31)_ 0fi _ _pp_ i.c2_fn__0a o__)_aan z rnnip(cid:30)ecd osi(ar sotalto tneie nwpl ysd h 4teiaif)bcb .htt iB hl titoeethy cect.ayhA oFu eshmso saceurpvo m ou(cid:31)emb pt eepevtxaqidiotonu enuvyanssa’ld3ss ler u e.ed0eanq 1 cs(uo1oo,in f tnmt2 yphtsa.hm0e ,n p e$Ibde0f)ar 5tcn$hikai 5ssl criticall ovaanluse. S2uzp(cid:31)p oiss e2 tzh.a0t1, 5in 2or2d.e3r3 t[os er eed Tuaceb lrei sAk.,3 a a lnadrg Fe igbuanrek 9h.a3s( ad)e].cided to initiate a policy limiting the mean debt-to-equity ratio for its portfolio of commercial loans to being less Ctrhitaicna l1 V.5a.l uIen R ourldeer to assess whether the mean debpt--Vtoal-ueeq (uRietyje crat tHio0 ifm p -oVfa liutse ((cid:28)c u(cid:31)r) rent) com- Ha: p (cid:30) p0 merHcai: apl (cid:28)lo pa0n portfolioH ais: pl e(cid:31)s sp 0than 1.5, theH ab: apn (cid:30)k pw0ill test theH an: up l(cid:28)l hpy0pothesisH aH: p: (cid:31) m p 05 1.5 0 Do not Reject RevjeecrtsusD ot hnoet alterRnejeacttiveD ho ynoptothReejseicst H: m , 1.5. In this situation, a Type I error (reject- reject H0 H0 Hin0g Hr0e:je mct H50 1.5 Hw0henre Hjec0t: H m0 5H 10.5 isa true) wopu-lvda lureesupl-tv ainlu ethe bank concluding that the mean debt-to-equity ratio of its commercial loan portfolio is less than 1.5 when it is not. 1.0 5 (cid:31) B(cid:31)ecause the bank (cid:31)w(cid:25)2ishes to be(cid:31) v(cid:25)2ery sure that it does not commit this Type I error, it will Reje111c...0123t H 1110z (cid:31)922if 93 7 bteas(cid:27)nRt kzeH(cid:31) jer0ca 0tvn Hed0ro s imufsl yH sa ebley(cid:27)c Rzut(cid:31)ses(cid:25) 2jieanc g0ts Haam0 . 
z0ipf(cid:31)1l(cid:25)2e l eovfe 1l 5op fo- vsfai gliutnes0 i(cid:24)fico camarzenmacee.r cTiapol- vpzlaeolurafeno 0(cid:24) ram acrc etohauen htsy.pp(cid:27) Ao-(cid:26)vztauh(cid:26)ldueeist i0s(cid:24)s ottwef(cid:26)s izctt(cid:26)eh, ethsee z11 (cid:30)..45 z (cid:31)1 5 6 c1o.0m5zp, (cid:28)a1n. (cid:27)1ie1zs(cid:31), 1 re.1s9u,l tz1 (cid:26) i.z(cid:30)n2(cid:26) 1(cid:30)zt(cid:31),h (cid:25)z21e(cid:31) o.(cid:25) 22rf —z2o ,l(cid:28)t lh1o a(cid:27).wt2 zis9i(cid:31)n,(cid:25),2 g1 .d3e1b,to t1 -tt.ho3e-2 re,igq 1hu.ti 3ot3yf z, r1a.t3io7s,to 1 (t.ha4er1 rlae, nf1t go.4ef 5dz, i1n. 4i6ntr,chi g1rehe .ta6a ors5efi a,(cid:26)n z atg(cid:26)on tdoh er1d.e7r8):. 1.6 5 The mound-shaped stem-and-leaf display of these ratios is given in the page margin and 1.7 8 indicates that the population of all debt-to-equity ratios is (approximately) normally dis- tributed. It follows that it is appropriate to calculate the value of the test statistic t in the DS DebtEq summary box. Furthermore, because the alternative hypothesis H: m , 1.5 says to use C EXAMPLE 9.6 The Cheese Spread Case: Improving Profitability a Hypothesis testing summary boxes are featured theory. Chapters 13–15 present predictive analyt- We have seen that the cheese spread producer wishes to test H: p 5 .10 versus H: p , .10, throughout Chapter 9, Chapter 10 (two-sample proc0e- ics meathods that are based on parametric regression where p is the proportion of all current purchasers who would stop buying the cheese spread dures), Chapif tteher n1ew1 s p(ooutn wee-rwe uasyed,. Trahen pdroodmuceirz weidll ubsel othcek n,e wa snpdou t if Ha cnand bte irmejeect esde inr ies models. Specifically, Chapter 13 and 0 favor of H at the .01 level of significance. 
To perform the hypothesis test, we will ran- two-way analysis oaf variance), Chapter 12 (chi-square the first seven sections of Chapter 14 discuss simple domly select n 5 1,000 current purchasers of the cheese spread, find the proportion (pˆ) of tests of goodthnesee spsu rcohfa sefirst wahno dw oiunldd setoppe bnuydinegn tchee c)h,e easne dsp rtehade i fr tehe- new spaonutd w ebrea ussiecd , manud ltiple regression analysis by using a more calculate the value of the test statistic z in the summary box. Then, because the alternative mainder of the book. In addition, emphasis is placed streamlined organization and The Tasty Sub Shop (rev- hypothesis Ha: p , .10 says to use the left-tailed critical value rule in the summary box, we (cid:31) 5 .01 throughout woinll reejesctti Hm0:a pt 5in .g10 ipf trhae cvtailucea olf zi ims lepsso trhtaann 2czea 5a 2ftze.01r 5 22.3e3n. (uNeo tep trheatd uisicntgi on) Case (see Figure 14.4). The next five testing for stthaitsi psrtoiccedaulr es iisg vnaliifid bceacanucsee n.p0 5 1,000(.10) 5 100 and n(1 2 p0) 5s e1,c00ti0o(1n 2s .o10f) C5 hapte2rz .01140 present five advanced modeling 900 are both at least 5.) Suppose that when the sample is randomly selected, we find that 22.33 63 of the 1,000 current purchasers say they would stop buying the cheesteo sppriecads itf hthae tn ecwa n bp-vealu ecovered in any order without loss of spout were used. Because pˆ 5 63y1,000 5 .063, the value of the test statistic is 5 .00005 Chapters 13–18: Simple za 5n _d___ _pˆ_m_ _2____ up__0__ l__t__i p5l _e__. _0__r6__3e__ _2g_ ___ r.__1e__0__ s__s 5io 2n3. 90 continuity: dummyz v0ariables ( including a discussion anneaulryasli sn. eMtwodoerkl sb.u Tilidmineg s. eLr Ïoi _peg_0(_si1_s n_2tf_ i _opc_0_r ) r eecg Ïa _r.s1_e0t_(_1si1_n,s 0_20i_go0 _. 
._1n _0_)C aonnd- oinft eirnatcetriaocnt iovna)r;i aqb2ul3.e9a0sd; rmatiocd evla briuaibldleisn ga nadn dq tuhaen teiftafeticvtes Because z 5 23.90 is less than 2z 5 22.33, we reject H: p 5 .10 in favor of H: p , .10. trol chartsT.h atN is,o wne pcoancrluadme (aet atnr aic o f .s01t.0)1a thtaits thtei cprso.p orDtioen cofi sa0lil ocunrr ent purochfa smersu wlahtioc woolulldin earity; residual analysis and diagnosing stop buying the cheese spread if the new spout were used is less than .10. It follows that the viii company will use the new spout. Furthermore, the point estimate pˆ 5 .063 says we estimate BI that 6.3 percent of all current customers would stop buying the cheese spread if the new spout were used. 3Some statisticians suggest using the more conservative rule that both np0 and n(1 2 p0) must be at least 10. bow49461_fm_i–xxi.indd 8 20/11/15 4:06 pm outlying and influential observations; and logistic re- ing Holt– Winters’ exponential smoothing models, and gression (see Figure 14.36). The last section of Chapter refers readers to Appendix B (at the end of the book), 14 discusses neural networks and has logistic regres- which succinctly discusses the Box–Jenkins method- sion as a prerequisite. This section shows why neural ology. The book concludes with Chapter 16 (a clear network modeling is particularly useful when analyz- discussion of control charts and process capability), ing big data and how neural network models are used Chapter 17 ( nonparametric statistics), and Chapter 18 to make predictions (see Figures 14.37 and 14.38). (decision theory, another useful predictive analytics Chapter 15 discusses time series forecasting, includ- topic). 
Chapter 14 Multiple Regression and Model Building

Figure 14.4 Excel and Minitab Outputs of a Regression Analysis of the Tasty Sub Shop Revenue Data in Table 14.1 Using the Model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

(a) The Excel output

Regression Statistics
Multiple R            0.9905
R Square              0.9810
Adjusted R Square     0.9756
Standard Error       36.6856
Observations         10

ANOVA         df    SS          MS          F          Significance F
Regression     2    486355.7    243177.8    180.689    9.46E-07
Residual       7      9420.8      1345.835
Total          9    495776.5

              Coefficients   Standard Error   t Stat   P-value    Lower 95%   Upper 95%
Intercept     125.289        40.9333           3.06    0.0183     28.4969     222.0807
population     14.1996        0.9100          15.60    1.07E-06   12.0478      16.3517
bus_rating     22.8107        5.7692           3.95    0.0055      9.1686      36.4527

(b) The Minitab output

Analysis of Variance
Source        DF   Adj SS    Adj MS    F-Value   P-Value
Regression     2   486356    243178    180.69    0.000
Population     1   327678    327678    243.48    0.000
Bus_Rating     1    21039     21039     15.63    0.006
Error          7     9421      1346
Total          9   495777

Model Summary
S          R-sq      R-sq(adj)   R-sq(pred)
36.6856    98.10%    97.56%      96.31%

Coefficients
Term         Coef     SE Coef   T-Value   P-Value   VIF
Constant     125.3    40.9       3.06     0.018
Population   14.200    0.910    15.60     0.000     1.18
Bus_Rating   22.81     5.77      3.95     0.006     1.18

Regression Equation
Revenue = 125.3 + 14.200 Population + 22.81 Bus_Rating

Variable     Setting
Population   47.3
Bus_Rating   7

Fit        SE Fit     95% CI                95% PI
956.606    15.0476    (921.024, 992.188)    (862.844, 1050.37)

Key to the outputs: $b_0$, $b_1$, and $b_2$ (the coefficients); $s_{b_j}$ = standard error of the estimate $b_j$; t statistics; p-values for t statistics; $s$ = standard error; $R^2$; adjusted $R^2$; explained variation; SSE = unexplained variation; total variation; F(model) statistic; p-value for F(model); $\hat{y}$ = point prediction when $x_1 = 47.3$ and $x_2 = 7$; $s_{\hat{y}}$ = standard error of the estimate $\hat{y}$; 95% confidence interval when $x_1 = 47.3$ and $x_2 = 7$; 95% prediction interval when $x_1 = 47.3$ and $x_2 = 7$; 95% confidence interval for $\beta_j$.

… residual (the difference between the restaurant's observed and predicted yearly revenues) fairly small (in magnitude). We define the least squares point estimates to be the values of …

Figure 14.36 Minitab Output of a Logistic Regression of the Credit Card Upgrade Data

Deviance Table
Source        DF   Seq Dev   Contribution   Adj Dev   Adj Mean   Chi-Square   P-Value
Regression     2   35.84      65.10%        35.84     17.9197    35.84        0.000
Purchases      1   25.46      46.26%        13.81     13.8078    13.81        0.000
PlatProfile    1   10.37      18.85%        10.37     10.3748    10.37        0.001
Error         37   19.21      34.90%        19.21      0.5192
Total         39   55.05     100.00%

Model Summary
Deviance R-Sq   Deviance R-Sq(adj)   AIC
65.10%          61.47%               25.21

Coefficients
Term          Coef      SE Coef   95% CI               Z-Value   P-Value   VIF
Constant      -10.68    4.19      (-18.89, -2.46)      -2.55     0.011
Purchases       0.2264  0.0921    (0.0458, 0.4070)      2.46     0.014     1.59
PlatProfile     3.84    1.62      (0.68, 7.01)          2.38     0.017     1.59

Odds Ratios for Continuous Predictors
            Odds Ratio   95% CI
Purchases   1.2541       (1.0469, 1.5024)

Odds Ratios for Categorical Predictors
              Level A   Level B   Odds Ratio   95% CI
PlatProfile   1         0         46.7564      (1.9693, 1110.1076)
Odds ratio for level A relative to level B

Goodness-of-Fit Tests
Test              DF   Chi-Square   P-Value
Deviance          37   19.21        0.993
Pearson           37   17.14        0.998
Hosmer-Lemeshow    8    3.23        0.919

Variable      Setting   Fitted Probability   SE Fit      95% CI
Purchases     42.571    0.943012             0.0587319   (0.660211, 0.992954)
PlatProfile   1

Variable      Setting   Fitted Probability   SE Fit      95% CI
Purchases     51.835    0.742486             0.250558    (0.181013, 0.974102)
PlatProfile   0

… estimate of 1.25 for Purchases says that for each increase of $1,000 in last year's purchases …

14.13 Neural Networks (Optional)

The idea behind neural network modeling is to represent the response variable as a nonlinear function of linear combinations of the predictor variables. The simplest but most widely used neural network model is called the single-hidden-layer, feedforward neural network. This model, which is also sometimes called the single-layer perceptron, is motivated (like all neural network models) by the connections of the neurons in the human brain. As illustrated in Figure 14.37, this model involves:

1 An input layer consisting of the predictor variables $x_1, x_2, \ldots, x_k$ under consideration.
2 A single hidden layer consisting of $m$ hidden nodes. At the $v$th hidden node, for $v = 1, 2, \ldots, m$, we form a linear combination $\ell_v$ of the $k$ predictor variables:

$\ell_v = h_{v0} + h_{v1}x_1 + h_{v2}x_2 + \cdots + h_{vk}x_k$

Here, $h_{v0}, h_{v1}, \ldots, h_{vk}$ are unknown parameters that must be estimated from the sample data. Having formed $\ell_v$, we then specify a hidden node function $H_v(\ell_v)$ of $\ell_v$. This hidden node function, which is also called an activation function, is usually nonlinear. The activation function used by JMP is

$H_v(\ell_v) = \dfrac{e^{\ell_v} - 1}{e^{\ell_v} + 1}$

[Noting that $(e^{2x} - 1)/(e^{2x} + 1)$ is the hyperbolic tangent function of the variable $x$, it follows that $H_v(\ell_v)$ is the hyperbolic tangent function of $x = .5\ell_v$.] For example, at nodes $1, 2, \ldots, m$, we specify …

Figure 14.37 The Single-Hidden-Layer, Feedforward Neural Network
[Diagram: an input layer $x_1, x_2, \ldots, x_k$; a hidden layer with nodes $\ell_v = h_{v0} + h_{v1}x_1 + \cdots + h_{vk}x_k$ and $H_v(\ell_v) = (e^{\ell_v} - 1)/(e^{\ell_v} + 1)$ for $v = 1, \ldots, m$; and an output layer with $L = \beta_0 + \beta_1 H_1(\ell_1) + \beta_2 H_2(\ell_2) + \cdots + \beta_m H_m(\ell_m)$, where $g(L) = 1/(1 + e^{-L})$ if the response variable is qualitative and $g(L) = L$ if the response variable is quantitative.]

… card holders who have not yet been sent an upgrade offer and for whom we wish to estimate the probability of upgrading. Silver card holder 42 had purchases last year of $51,835 (Purchases = 51.835) and did not conform to the bank's Platinum profile (PlatProfile = 0). Because PlatProfile = 0, we have JDPlatProfile = 1. Figure 14.38 shows the parameter estimates for the neural network model based on the training data set and how they are used to estimate the probability that Silver card holder 42 would upgrade. Note that because the response variable Upgrade is qualitative, the output layer function is $g(L) = 1/(1 + e^{-L})$. The final result obtained in the calculations, $g(\hat{L}) = .1877344817$, is an estimate of the probability that Silver card holder 42 would not upgrade (Upgrade = 0). This implies that the estimate of the probability that Silver card holder 42 would upgrade is $1 - .1877344817 = .8122655183$. If we predict a Silver card holder would upgrade if and only if his or her upgrade probability is at least .5, then Silver card holder 42 is predicted to upgrade (as is Silver card holder 41). JMP uses the model fit to the training data set to calculate an upgrade probability estimate for each of the 67 percent of the Silver card holders in the training data set and for each of the 33 percent of the Silver card holders in the validation data set. If a particular Silver card holder's upgrade probability estimate is at least .5, JMP predicts an upgrade for the card holder and assigns a "most likely" qualitative value of 1 to the card holder. Otherwise, JMP assigns a "most likely" qualitative value of 0 to the card holder. At the bottom of Figure 14.38, we show the results of JMP doing this for Silver card holders 1, 17, 33, and 40. Specifically, JMP predicts an upgrade (1) for card holders 17 and 33, but only card holder 33 did upgrade. JMP predicts a nonupgrade (0) for card holders 1 and 40, and neither of these card holders upgraded. The "confusion matrices" in Figure 14.39 summarize …

Figure 14.38 JMP Output of Neural Network Estimation for the Credit Card Upgrade Data
[JMP parameter estimates for the hidden nodes H1_1, H1_2, and H1_3 and for the output layer, with the worked calculation for Silver card holder 42. Legible rows of the accompanying table:]

Card holder   Upgrade   Purchases   PlatProfile   Prob(Upgrade=0)   Prob(Upgrade=1)   H1_1           H1_2           H1_3           Most Likely
1             0         31.95       0             1                 2.001752e-11      -0.108826172   -0.056174265   0.2837494642   0
42                      51.835      0             0.1877344817      0.8122655183      0.7698664933   0.5126296505   0.4845460029   1
