SHIN TAKAHASHI
TREND-PRO, CO., LTD.

TABLE OF CONTENTS

PREFACE

OUR PROLOGUE: STATISTICS WITH HEART-POUNDING EXCITEMENT

1 DETERMINING DATA TYPES
  1. Categorical Data and Numerical Data
  2. An Example of Tricky Categorical Data
  3. How Multiple-Choice Answers Are Handled in Practice
  Exercise and Answer
  Summary

2 GETTING THE BIG PICTURE: UNDERSTANDING NUMERICAL DATA
  1. Frequency Distribution Tables and Histograms
  2. Mean (Average)
  3. Median
  4. Standard Deviation
  5. The Range of Class of a Frequency Table
  6. Estimation Theory and Descriptive Statistics
  Exercise and Answer
  Summary

3 GETTING THE BIG PICTURE: UNDERSTANDING CATEGORICAL DATA
  1. Cross Tabulations
  Exercise and Answer
  Summary

4 STANDARD SCORE AND DEVIATION SCORE
  1. Normalization and Standard Score
  2. Characteristics of Standard Score
  3. Deviation Score

5 LET'S OBTAIN THE PROBABILITY
  1. Probability Density Function
  2. Normal Distribution
  3. Standard Normal Distribution
    Example I
    Example II
  4. Chi-Square Distribution
  5. t Distribution
  6. F Distribution
  7. Distributions and Excel
  Exercise and Answer
  Summary

6 LET'S LOOK AT THE RELATIONSHIP BETWEEN TWO VARIABLES
  1. Correlation Coefficient
  2. Correlation Ratio
  3. Cramer's Coefficient
  Exercise and Answer
  Summary

7 LET'S EXPLORE THE HYPOTHESIS TESTS
  1. Hypothesis Tests
  2. The Chi-Square Test of Independence
    Explanation
    Exercise
    Answer
  3. Null Hypotheses and Alternative Hypotheses
  4. P-value and Procedure for Hypothesis Tests
  5. Tests of Independence and Tests of Homogeneity
    Example
    Procedure
  6. Hypothesis Test Conclusions
  Exercise and Answer
  Summary

APPENDIX: LET'S CALCULATE USING EXCEL
  1. Making a Frequency Table
  2. Calculating Arithmetic Mean, Median, and Standard Deviation
  3. Making a Cross Tabulation
  4. Calculating the Standard Score and the Deviation Score
  5. Calculating the Probability of the Standard Normal Distribution
  6. Calculating the Point on the Horizontal Axis of the Chi-Square Distribution
  7. Calculating the Correlation Coefficient
  8. Performing Tests of Independence

INDEX

PREFACE

This is an introductory book on statistics. The intended readers are:
* Those who need to conduct data analysis for research or business
* Those who do not necessarily need to conduct data analysis now but are interested in getting an idea of what the world of statistics is like
* Those who have already acquired general knowledge of statistics and want to learn more

Statistics is one of the areas of mathematics most closely related to everyday life and business. Familiarizing yourself with statistics may come in handy in situations like:
* Estimating how many servings of fried noodles you can sell at a food stand you are planning to set up during a school festival
* Estimating whether you will be able to pass a certification exam
* Comparing the probability that a sick person will get better between a case in which medicine X is used and a case in which it is not used

This book consists of seven chapters.
Basically, each chapter is organized in the following sections:
* Cartoon
* Text explanation to supplement the cartoon
* Exercise and answer
* Summary

You can learn a lot by just reading the cartoon section, but you will gain deeper understanding and knowledge if you read the other sections as well.

I would be very pleased if you start feeling that statistics is fun and useful after reading this book.

I would like to thank the staff in the development department of Ohmsha, Ltd., who offered me the opportunity to write this book. I would also like to thank TREND-PRO, Co., Ltd., for making my manuscript into a cartoon, as well as the scenario writer, re_akino, and the illustrator, Iroha Inoue. Last but not least, I would like to thank Dr. Sakaori Fumitake of the College of Social Relations at Rikkyo University. He provided me with invaluable advice while I was preparing the manuscript for this book.

SHIN TAKAHASHI

OUR PROLOGUE: STATISTICS WITH HEART-POUNDING EXCITEMENT

THIS IS MR. IGARASHI. HE WORKS FOR ME. I ASKED HIM TO STOP BY BECAUSE WE WERE HAVING A DRINK TOGETHER NEAR HERE.
WELL, WELL. WELCOME TO OUR HOME SWEET HOME.
I'M HOME, RUI! SAY HELLO TO MR. IGARASHI. HE WORKS FOR ME.
YOU HAVE A PRETTY DAUGHTER.
MR. IGARASHI, WHAT KIND OF WORK DO YOU DO?
YOU FLATTER ME! BUT I DO NOT DENY IT!
IN SHORT, I DO MARKETING.
WELL, AS I WORK FOR THE SAME COMPANY AS YOUR FATHER...
TO EXPLAIN IT FULLY, I DO MARKET RESEARCH USING STATISTICS...BUT I GUESS MARKETING IS AN UNFAMILIAR WORD FOR A HIGH SCHOOL GIRL LIKE YOU.
SORRY, I HAVE NEVER HEARD OF IT.
YOU ARE HONEST! DO YOU KNOW WHAT STATISTICS IS? MAYBE YOU DON'T KNOW THAT EITHER.
ROUGHLY SPEAKING, STATISTICS IS A STUDY THAT ESTIMATES THE STATUS OF A POPULATION BY USING INFORMATION GAINED FROM SAMPLES.
HMMMM.
DID I MAKE IT TOO DIFFICULT? OH, HERE IS A GOOD EXAMPLE! LOOK AT TODAY'S PAPER.
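
The idea that statistics estimates the status of a population from samples can be tried out in a few lines of code. The following is a minimal Python sketch, not taken from the book: the population size, the 40% share of "yes" answers, the sample size, and the function name `estimate_proportion` are all hypothetical values chosen only for illustration.

```python
import random


def estimate_proportion(population, sample_size, seed=0):
    """Estimate the share of 1s in `population` from a simple random sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = rng.sample(population, sample_size)  # draw without replacement
    return sum(sample) / sample_size


# Hypothetical population: 100,000 people, 40% of whom answer "yes" (1).
population = [1] * 40_000 + [0] * 60_000
true_share = sum(population) / len(population)

# Ask only 1,000 of them instead of everyone.
estimate = estimate_proportion(population, sample_size=1_000)
print(f"true share: {true_share:.3f}, sample estimate: {estimate:.3f}")
```

The estimate will usually land close to the true share but not exactly on it, and changing `seed` or `sample_size` shows how much it moves from one sample to another.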
IT SAYS THAT "ACCORDING TO A CHOMAI TIMES SURVEY, THE CABINET APPROVAL RATING IS 39%."
YOU TWO BOTH HAVE THE RIGHT TO VOTE, DON'T YOU?
I'VE NEVER BEEN CONTACTED BY THE CHOMAI TIMES ABOUT THE CABINET. HOW ABOUT YOU, MR. TAKATSU?
NEITHER OF YOU WERE SURVEYED, BUT THE CABINET APPROVAL RATING IS IN THE PAPER.
THAT'S WEIRD.
THAT'S MY POINT. THAT'S WHERE STATISTICS COMES IN.
RUI, DO YOU KNOW HOW MANY VOTERS THERE ARE IN JAPAN?
LET ME SEE... A LOT!
I KNOW!
RIGHT. YOU CAN GET THE PRECISE CABINET APPROVAL RATING IF YOU SURVEY EVERY SINGLE VOTER.
HOWEVER, IT IS UNREALISTIC TO SURVEY EVERYONE. THERE ARE TOO MANY PEOPLE.
THAT IS WHY ONLY A LIMITED NUMBER OF PEOPLE ARE SURVEYED.
SEE, RUI? THE REAL GROUP THAT SHOULD BE SURVEYED IS CALLED A POPULATION. A GROUP OF PEOPLE SELECTED FROM THE POPULATION IS CALLED A SAMPLE. THOSE ARE STATISTICAL
DAD IS TORMENTING ME BY TALKING ABOUT HARD