Human–Computer Interaction Series

Judy Robertson, Maurits Kaptein
Editors

Modern Statistical Methods for HCI

Human–Computer Interaction Series

Editors-in-chief
Desney Tan, Microsoft Research, USA
Jean Vanderdonckt, Université Catholique de Louvain, Belgium

HCI is a multidisciplinary field focused on human aspects of the development of computer technology. As computer-based technology becomes increasingly pervasive, not just in developed countries but worldwide, the need to take a human-centered approach in the design and development of this technology becomes ever more important. For roughly 30 years now, researchers and practitioners in computational and behavioral sciences have worked to identify theory and practice that influences the direction of these technologies, and this diverse work makes up the field of human-computer interaction. Broadly speaking it includes the study of what technology might be able to do for people and how people might interact with the technology.

The HCI series publishes books that advance the science and technology of developing systems which are both effective and satisfying for people in a wide variety of contexts. Titles focus on theoretical perspectives (such as formal approaches drawn from a variety of behavioral sciences), practical approaches (such as the techniques for effectively integrating user needs in system development), and social issues (such as the determinants of utility, usability and acceptability).

Titles published within the Human–Computer Interaction Series are included in Thomson Reuters’ Book Citation Index, The DBLP Computer Science Bibliography and The HCI Bibliography.
More information about this series at http://www.springer.com/series/6033

Judy Robertson, Maurits Kaptein
Editors

Modern Statistical Methods for HCI

Editors
Judy Robertson, Moray House School of Education, Edinburgh University, Edinburgh, UK
Maurits Kaptein, Donders Centre for Cognition, Radboud University Nijmegen, Nijmegen, The Netherlands

Additional material to this book can be downloaded from http://extras.springer.com.

ISSN 1571-5035
Human–Computer Interaction Series
ISBN 978-3-319-26631-2
ISBN 978-3-319-26633-6 (eBook)
DOI 10.1007/978-3-319-26633-6
Library of Congress Control Number: 2015958319

© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Foreword

Lies, damned lies, and statistics.
Attributed to Benjamin Disraeli by Mark Twain

It is easy to lie with statistics, it is even easier to lie without them.
Attributed to Frederick Mosteller

These two popular quotes vividly capture the necessity and misuse of statistical methods in research and argumentation in general. HCI is no exception.

The goal of the human–computer interaction field is to invent, design, develop, and understand effective means to delight users of computing devices, apps, and services. But unlike typical engineering and computer science problem solving, the solutions we devise in HCI rarely have a simple and deterministic measure of efficacy. Any measure we come up with, such as the usual time to completion, error rate, learning speed, subjective preference and ratings, tends to vary from trial to trial, task to task, and one individual user to another. Statistics, the art of making sense of fluctuating data, is a common method, among others, of moving our research, design or invention beyond a personal belief.

Most HCI researchers, particularly those coming from computer science and engineering backgrounds, did not usually have formal training in statistics. Even those who do often also struggle with deciding on the most appropriate statistical models, tests, data processing techniques, and software tools for each project because the underlying logic, assumptions and exceptions of each statistical method are complex and often debated by specialists. Statistical issues are often contentious in HCI publications. Paper reviewers often take issue with sample size, power, model assumptions such as normality, and the statistical tests used. The reviewers’ criticism of statistical methods often frustrates authors who follow other published papers on the same subject. Often neither the reviewers nor the authors have enough training in statistical methods to debate the chosen method’s validity to a convincing level. Even if they do, research results, paper writing, and reviewing are just not the right forum for statistical method discussion.
Furthermore, even if one rigorously followed all the classical inferential statistics, the research conclusion and its reliability may still not mean the same to everyone. Ever since null hypothesis significance testing (NHST) became the dominant quantitative research method, its validity has been regularly questioned and challenged in empirical sciences such as experimental psychology. In recent years, such criticisms, and the advocacy toward alternative methods, particularly Bayesian methods, intensified in many fields. Resorting to simple descriptive statistics from larger samples, one psychology journal recently banned inferential statistics altogether. However, no other journals have taken such an extreme position.

Unhappy with how statistics were interpreted and practiced in HCI, Maurits Kaptein and Judy Robertson formalized their discussion by publishing their CHI 2012 paper “Rethinking statistical analysis methods for CHI.” The paper drew great interest from researchers like me who are interested in and frequently apply statistical methods in our research, invention, product design, and development work. Knowing that one conference paper was not enough, Judy and Maurits decided to take the subject to the next level, first by writing a journal paper. As the editor-in-chief of ACM Transactions on Computer-Human Interaction at the time, I ran an extensive board discussion of their journal submission. Many of my esteemed associate editors are enthusiastic and knowledgeable about the topic. Their views, however, are as varied as those in the field. Some think most of what Judy and Maurits recommended was what they already taught their students. Some think science by nature is very much driven by community culture or prevailing norms.
Some think criticisms of NHST are cyclic and may fade away once again. Most agree with the spirit of the Bayesian approach because it allows knowledge and hypothesis to be updated and improved with each new experiment, but they also acknowledge that, in practice, the first novel study of a specific idea, design, or UI method is always most valued. Subsequent studies which can give stronger Bayesian analysis benefiting from a prior estimated from the previous studies are less valued by academics and hence rare. There are also HCI researchers who do not think statistics are worthwhile and believe their best work did not rely on or use statistics at all. But none denied the need to educate the field to a much more comprehensive level of understanding and practice of statistics in HCI. Several associate editors spontaneously suggested a book presenting the best practices of statistical methods from more than one school of thought. Everyone loved the idea of having such a book for their teaching and research. Having also served on the editorial board of Springer’s HCI book series since its beginning, I knew academic publishers like Springer would welcome such a book proposal.

But having a wish or desire is easy. Many others in HCI have wished for such a book, but they could not devote the large amount of time required to prepare, write, or edit such a book. Fortunately, Judy and Maurits did it, and they did a remarkable job. They reached out to an impressive set of knowledgeable authors in and outside the HCI field with different approaches and backgrounds in researching and practicing statistics. Judy and Maurits gave them a common set of hypothetical HCI data that researchers could easily relate to. The common dataset also allowed different statistical approaches to be compared and contrasted. Judy and Maurits also wrote introductions to each section of the book, and multiple chapters of their own.

So finally, the HCI field has a comprehensive statistical methods book of its own for researchers and students.
It may not resolve many of the explicit or implicit debates in statistical methods. Instead, the book supports a less rigid, procedural view of statistics in favor of “fair statistical communication.” The book should become a common reference for empirical HCI research. Those who are interested in even deeper understanding of a particular statistical method can follow many of the references at the end of each chapter. I will certainly keep the book among the most reachable books on my shelf.

December 2015
Shumin Zhai

Preface

For those of us who are passionately interested in research, the methods we use are at least as important as our findings. We need to have confidence that our quantitative methods give us more than just an illusion of rigor; they should provide genuine insight into the problems which interest us. This book is a tool to assist researchers in making sense of their quantitative data without confusion, bias, or arbitrary conventions.

A few years ago we wrote a critical examination of statistical practice in Human–Computer Interaction which was presented at CHI 2012. Due to the lively discussions that surfaced at the conference, we thought there was more to “statistical methods in HCI” than could be conveyed through a conference paper. Initially, we set out to work on a journal paper which would both examine current practice, as well as introduce a number of statistical methods that are not covered in introductory research methods and statistics lectures but could, in our view, strengthen the field. The article then became lengthy, and we really wanted it to be hands-on. It spiraled out of control: we started involving both experts in different methods, as well as users of “a-typical” methods in HCI, and discussed their possible contributions to the article. Hence, in mid-2014, we decided, in collaboration with Springer, to turn our article into a book. And here it is.
About the Authors

We confess to the reader at the outset that our own statistical practices are not perfect. In fact, over the years, both of us have committed about all of the potential errors identified in this volume (and they are there in the literature for you to sorrow over). These less than perfect analyses stem from ignorance of the pitfalls of null hypothesis significance testing (honestly!) rather than an intention to deceive the reading public. On occasion, however, we have consciously used “traditional” methods rather than their more recent counterparts in order to tailor the analyses to the reviewers’ expectations. It would be better for our field if authors didn’t have to do this. You can help by leaving your copy of this book on coffee tables in HCI conference venues, with Part V of Chap. 14 helpfully highlighted.

Maurits

I am a social scientist and researcher primarily interested in persuasion, quantitative research methods, and optimal design. After doing my master’s in (economic-) Psychology at the University of Tilburg, and doing a post-master program in User–System Interaction at the Eindhoven University of Technology, I received my Ph.D. with honors from the Eindhoven University of Technology, Eindhoven, the Netherlands. Next, I worked as a postdoctoral researcher at the Aalto School of Economics, Aalto, Finland. Afterwards, I worked for 2 years as assistant professor of Statistics and Research Methods at the University of Tilburg. During my Ph.D. I also worked as a research scientist at Philips Research, Eindhoven, the Netherlands and as a distinguished visiting scholar at the CHIMe lab of Stanford University, Stanford CA, USA.

Currently I am assistant professor in Artificial Intelligence (AI) at the Radboud University Nijmegen. Also, I am the track leader of a master track called “Web and Language.” I (amongst other courses) teach a course on AI techniques on the web called “AI at the Web scale.” You can find the website of the research lab that I run right here: www.nth-iteration.com.
At the end of 2012 my first “popular” (Dutch) book, “Digitale Verleiding,” which since 2015 is also available in English under the name “Persuasion Profiling,” was released. I am also a founder of PersuasionAPI, “pioneers in persuasion profiling” (see www.sciencerockstars.com). The company is now owned by Webpower b.v.

My prime research interests are:

• Persuasive technologies. I focus on the real-time adaptation of the use of distinct persuasive principles in interactive technologies.
• Research methods. I study both parametric and non-parametric statistical methods, hierarchical models, and time-series.
• Online/streaming learning. I work quite a bit on how to fit hierarchical models online.
• Bandit problems. I have worked on policies for multi-armed bandit problems.
• Dynamic Adaptation. I have been involved in several attempts to model, in real-time, consumer behavior and adapt e-commerce attempts accordingly.

Obviously, my interest in research methods drove me to start editing this book. I have, throughout my studies and work, been interested in quantitative methods in diverse fields, ranging from social science, to computer science, to physics and