
What is a P-Value Anyway? PDF

223 Pages·2010·9.16 MB·English
by Andrew Vickers

Preview What is a P-Value Anyway?

Portrait of the English statistician and geneticist Sir Ronald Aylmer Fisher (1890-1962): Photo Researchers, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Pearson was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data
Vickers, Andrew.
What is a P-value anyway? : 34 stories to help you actually understand statistics / Andrew Vickers. 1st ed.
ISBN 0-321-62930-2
1. Mathematical statistics. I. Title.

Pearson Education, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Addison-Wesley is an imprint of Pearson
www.pearsonhighered.com

Dedication

This book is dedicated to four mentors who taught me not only how to do statistics but, more importantly, why.

Iain Chalmers, David Sackett, Doug Altman, Colin Begg

About the Author

Andrew Vickers is a research methodologist in the Department of Epidemiology and Biostatistics at Memorial Sloan-Kettering Cancer Center in New York. He works in a variety of fields of cancer research, including surgical outcomes, molecular markers, and clinical trials. He also conducts original research in statistical methods, particularly with respect to the evaluation of prediction models. At the time of writing, he has been principal author or coauthor on over 200 peer-reviewed scientific papers. Dr. Vickers has a strong interest in teaching statistics. He is the course leader for the Memorial Sloan-Kettering Cancer Center biostatistics course and teaches biostatistics to medical students at Cornell Medical School. Dr. Vickers lives with his wife and children in Brooklyn, New York.

Contents

Introduction
How to Read this Book
1 I tell a friend that my job is more fun than you'd think: What is statistics?

Describing Data
2 So Bill Gates walks into a diner: On means and medians
3 Bill Gates goes back to the diner: Standard deviation and interquartile range
4 A skewed shot, a biased referee
5 You can't have 2.6 children: On different types of data
6 Why your high school math teacher was right: How to draw a graph

Data Distributions
7 Chutes-and-ladders and serum hemoglobin levels: Thoughts on the normal distribution
8 If the normal distribution is so normal, how come my data never are?
9 But I like that sweater: What amount of fit is a "good enough" fit?

Variation of Study Results: Confidence Intervals
10 ...
11 How to avoid a rainy wedding: Variation and confidence intervals
12 Statistical ties, and why you shouldn't wear one: More on confidence intervals

Hypothesis Testing
13 Choosing a route to cycle home: What p-values do for us
14 The probability of a dry toothbrush: What is a p-value anyway?
15 Michael Jordan won't accept the null hypothesis: How to interpret high p-values
16 The difference between sports and business: Thoughts on the t test and the Wilcoxon test
17 Meeting up with friends: On sample size, precision and statistical power

Regression and Decision Making
18 When to visit Chicago: About linear and logistic regression
19 My assistant turns up for work with shorter hair: About regression and confounding
20 I ignore my child's cough, my wife panics: About specificity and sensitivity
21 Avoid the sales: Statistics to help make decisions

Some Common Statistical Errors, and What They Teach Us
22 One better than Tommy John: Four statistical errors, some of which are totally trivial, but all of which matter a great deal
23 Weed control for p-values: A single scientific question should be addressed by a single statistical test
24 How to shoot a TV episode: Statistical analyses that don't provide meaningful numbers
25 ... errors in regression
26 Regression to the Mike: A statistical explanation of why an eligible friend of mine is still single
27 OJ Simpson, Sally Clark, George and me: About conditional probability
28 Boy meets girl, girl rejects boy, boy starts multiple testing
29 Some things that have never happened to me: Why you shouldn't compare p-values
30 How to win the marathon: Avoiding errors when measuring things that happen over time
31 The difference between bad statistics and a bacon sandwich: Are there "rules" in statistics?
32 Look at your garbage bin: It may be the only thing you need to know about statistics
33 Numbers that mean something: Linking math and science
34 Statistics is about people, even if you can't see the tears

Discussion Section Answers
Credits and References
Index

How to read this book

It is an odd feeling when you love what you do and everyone else seems to hate it.
I get to peer into lists of numbers and tease out knowledge that can help people live longer, healthier lives. But if I tell friends I get a kick out of statistics, they inch away as if I have a communicable disease. I have started to think that most folks' views of statistics as a refined type of torture go back to how it is taught, and to textbooks in particular. Statistics textbooks can be long, boring and expensive. With this in mind, I proposed to my editor that I write a book that was short, boring and expensive. He considered it but eventually decided I needed to come up with something better. So I thought about it this way: the typical statistics textbook (a) tells you how to do statistics, not how to understand it, (b) is full of formulas and (c) is no fun at all. I wondered whether I could write something that (a) focused on how to understand statistics, (b) avoided formulas and (c) was fun, at least in places. This is how I came up with the idea of stories.

The 10th commandment is "You shall not covet your neighbor's house, wife, donkey, or ox" but no one says this in conversation. Instead, they say "the grass is greener." In case you didn't know, "the grass is greener" comes from an old story about some goats that were happily eating grass in a field until they looked up and noticed the grass on the other side of a small stream. The grass looked so much greener there that they crossed over a little bridge. But after feeding for a while, they looked up again and thought that actually, the grass on the other side of the stream, back where they had started, looked a lot greener than in the field where they were standing. And so they spent the day crossing the bridge back and forth, always thinking that the grass was looking greener on the other side. I think the last time I heard this story was in kindergarten, but I still remember it and what it means.
The 10th commandment is spot on but is hard to remember because it tells you what to do; you hear a story to help you understand something and you'll remember it for life.

Like stories, the chapters in the book are intended to be short and fun to read. The second half of the book, the discussion section, is a little weightier. The discussion questions vary: there is usually one question, the first, that is pretty essential and is something that you should really try to think about. Most of the others could be considered optional; some are there only for the really enthusiastic types (I flagged these). For example, there is a discussion on the derivation of a mathematical constant called e and an introduction to statistical programming.

If you have some experience with statistics, feel free to dip in and out of the book. Otherwise, you should probably read the chapters through from beginning to end. The first 12 chapters cover describing data, data distributions and confidence intervals; the chapters that follow cover hypothesis testing and p-values, before discussing regression, the statistical method I use most in my work, and decision making, which generally should be, but often isn't, what statistics is about.

The last third of the book, starting from the chapter "One better than Tommy John", is devoted to discussing a wide variety of statistical errors. If it seems odd to devote so much of a book to slip-ups, it is because I have a little theory that "science" is just a special name for "learning from our mistakes." When I teach, I give bonus points for any student giving a particularly dumb answer because those are the ones we really learn from. In fact, I don't think you can really understand statistics without seeing some of the ways it has been misused and thinking through why these constitute mistakes. So please don't skip these chapters thinking you've read the [...].

What this book can and can't teach you

Hopefully, after you have read this textbook, you'll have a good understanding of many of the key ideas of good statistics. I also hope that you'll be able to avoid some of the most common statistical mistakes and errors. What you won't know how to do is actually do any statistical analyses, in short, because I haven't provided any of the appropriate formulas. If you want to conduct analyses for your research or for your coursework, you'll have to look it up in a conventional statistical textbook with formulas and step-by-step instructions. Also, the book won't be particularly useful as a reference textbook to look up things that you've forgotten. So if you want to run statistical analyses, this should not be the only book you buy. (Although it should be the only book you buy multiple copies of, to give to your friends, family, colleagues, neighbors and random people you meet.) On the other hand, if you are the sort of person who doesn't want to do any statistics yourself (which is, I guess, most of the world) but have to understand and interpret statistics that you read (which is more of us than you might think), then this book might well be all that you require.

Where is the section on design?

I am a very design oriented statistician. As a quick example, missing data is a big problem in medical research. Statisticians have written hundreds of research papers proposing complex statistical techniques that predict what the data would have been, had it not been missing. My own contribution was to propose a very simple technique to reduce the rate of missing data in the first place, which is to telephone patients at home and ask them just two questions in place of a long questionnaire. In this way, we reduced the rate of missing data in a trial from 25% to 6%, which made the use of complex missing data analyses rather redundant. As such, you might be surprised that there is no section on design in this book. In short, this is because I don't think you can separate out design from the rest of statistics and have a special chapter on it.
I have two different chapters on regression analysis and the Wilcoxon test because, in theory, you could do one without the other; you can't think about either the Wilcoxon or regression analysis without considering the design of the study you are analyzing. Accordingly, I don't have a chapter on design. Instead, comments on design are woven throughout the text.

About the stories and data in this book

When I started writing, my editor said to me, "Andrew, I want you to write the funniest statistics textbook ever!" So I thought, "Great, I'll write one joke and then I'll be done." Actually, it didn't quite happen exactly like that, but it isn't far off. On which point, the stories and data in this book were developed to help you learn statistics. This has sometimes meant simplifying or altering something to make it easier to understand. In some cases I simulated data ("simulation" is statistics speak for making stuff up). I did so on the grounds that the data I had to hand were much too complicated and would take far too much explaining and, as such, would detract from the reason I wanted to use the data in the first place, which was to help you to understand something about statistics. Also, you'd get sick of hearing about prostate cancer, which is the main thing I study.

Accordingly, the stories and data that follow are not all 100% factually accurate. I don't think I have said anything misleading, but please don't use the book to come to conclusions about blood counts in Swedish men (see Chutes-and-ladders and serum hemoglobin levels: Thoughts on the normal distribution), prostate cancer (see When to visit Chicago: About linear and logistic regression), how long it takes for African-Americans to hail a cab (see Some things that have never happened to me: Why you shouldn't compare p-values) or, for that matter, my friend Mike (see Regression to the Mike: A statistical explanation of why an eligible friend of mine is still single).
Or even whether "scared straight" helps juvenile delinquents avoid a life of crime (see The probability of a dry toothbrush: What is a p-value anyway?): it doesn't, and I say it doesn't, but don't take my word for it, look it up for yourself (see www.cochrane.org). This is, after all, a book about statistics, not crime policy.

I did analyze data sets for this book and present, without fudging, the results I found. You should be able to replicate my analyses. Much of the raw data is available on the web, but if you can't find it and want to replicate something, please let me know and I'll see how I can help. Incidentally, for most categorical data analyses in this book, I used Fisher's exact test.

I would like to acknowledge the Pew Research Center (www.pewresearch.org), which publishes raw data from its fascinating surveys of the American public. The data on attitudes to marriage between religions were adapted from the Northern Ireland Life & Times Survey 2006 (www.ark.ac.uk). The US 1996 crime statistics are available from www.statcrunch.com, an excellent resource for data sets for teaching (although, unlike the other data sets mentioned here, this is available only by subscription). The acupuncture and headache data set can be downloaded from www.trialsjournal.com/content/7/1/15 (where you can also read some of my thoughts about data sharing). The data on prostate cancer (and blood counts in Swedish men) come from a series of studies I have been conducting with my colleague, Hans Lilja. You can find out more by searching the medical database "PubMed" (http://www.ncbi.nlm.nih.gov/sites/entrez) for "Vickers Lilja". The data on maternity leave come from the work of Janet Gornick (see, for example, Families That Work: Policies for Reconciling Parenthood and Employment. New York: Russell Sage Foundation, 2003).

Description:
14 The probability of a dry toothbrush: What is a p-value anyway? ... me a mean and a standard deviation and I can immediately answer questions such