ebook img

2014-davisstober.pdf (PDFy mirror) PDF

0.97 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview 2014-davisstober.pdf (PDFy mirror)

Decision ©2014AmericanPsychologicalAssociation 2014,Vol.1,No.1,000 2325-9965/14/$12.00 DOI:10.1037/dec0000004 When Is a Crowd Wise? Clintin P. Davis-Stober David V. Budescu UniversityofMissouri FordhamUniversity Jason Dana Stephen B. Broomell YaleUniversity CarnegieMellonUniversity y. s.adl herbro Numerousstudiesandanecdotesdemonstratethe“wisdomofthecrowd,”thesurprising blised accuracyofagroup’saggregatedjudgments.Lessisknown,however,aboutthegenerality punat ofcrowdwisdom.Forexample,arecrowdswiseeveniftheirmembershavesystematic dmi judgmentalbiases,orcaninfluenceeachotherbeforemembersrendertheirjudgments?If salliedisse sinod,iavriedutahlesr?eWsietupartoiovnidseinapwrhecicishewbuetcgaenneerxapledcetfianictiroonwodftcorobwedlewsissdaocmcu:rAatecrtohwandisskwilliesde ite ofob ifalinearaggregate,forexampleamean,ofitsmembers’judgmentsisclosertothetarget et value than a randomly, but not necessarily uniformly, sampled member of the crowd. onnot Building on this definition, we develop a theoretical framework for examining, a priori, oris whenandtowhatdegreeacrowdwillbewise.Wesystematicallyinvestigatetheboundary ationand conditionsforcrowdwisdomwithinthisframeworkanddetermineconditionsunderwhich cier the accuracy advantage for crowds is maximized. Our results demonstrate that crowd os su wisdomishighlyrobust:Evenifjudgmentsarebiasedandcorrelated,onewouldneedto s alAdual nearlydeterministicallyselectonlyahighlyskilledjudgebeforeanindividual’sjudgment ologicindivi caolsuoldprboeviedxepaenctaecdcutorabceymraotiroenaaclecubreahtiendthtahneaneseimdpfolerdaviveerrasgitiyngofojfutdhgemceronwtsda.mOounrgrgersouultps he members. Contrary to folk explanations of crowd wisdom which hold that judgments ch Psyoft should ideally be independent so that errors cancel out, we find that crowd wisdom is ne maximizedwhenjudgmentssystematicallydifferasmuchaspossible.Wereanalyzedata as cu from2publishedstudiesthatconfirmourtheoreticalresults. erial mn eAerso Keywords:wisdomofthecrowd,judgment,forecasting,aggregation,prediction hp ythe Supplementalmaterials:http://dx.doi.org/10.1037/dec0000004.supp bt yrightedolelyfor Galton(1907)providedperhapsthefirstdoc- (Krause, Ruxton, & Krause, 2009) effect when ps coed umentationofthe“WisdomoftheCrowd”(Sur- he analyzed 787 individuals’ guesses of the isnd owiecki, 2004) or “swarm intelligence” weight of a slaughtered and dressed ox. The mentinte individualswereenteredinacontestatalivestock Thisdocusarticleis stchhouomswpmefotoetrdivwafthoinircghpthrtihezmeeys,towtehgrueeierscshdwaisregcleul.dssBiaoecnsamsuasoelfltfhteheeye, Thi ClintinP.Davis-Stober,DepartmentofPsychologicalSciences, guesseswerelikelylimited,resultinginaseriesof UniversityofMissouri;DavidV.Budescu,DepartmentofPsy- individualjudgmentsuninfluencedbytheguesses chology,FordhamUniversity;JasonDana,SchoolofManage- ment, Yale University; Stephen B. Broomell, Department of ofothers.Somecompetitorswerehighlyskilledin SocialandDecisionSciences,CarnegieMellonUniversity. thisjudgment,suchasbutchersandfarmers,while This work was supported, in part, by the Intelligence others were novices. The guesses, which Galton AdvancedResearchProjectsActivity(IARPA)viaDepart- mentofInteriorNationalBusinessCentercontractnumber called “unbiased by passion and oratory,” were D11PC20059. nearly symmetric about the correct answer of Correspondence concerning this article should be ad- 1,198 pounds, and the wisdom of the crowd, as dressedtoClintinP.Davis-Stober,DepartmentofPsycho- captured by the median guess of 1,207 pounds, logical Sciences, University of Missouri, Columbia, MO 65211.E-mail:[email protected] wasaccuratewithin0.8%. 1 2 DAVIS-STOBER,BUDESCU,DANA,ANDBROOMELL Ithassincebeenwellestablishedthataggre- or other attributes. On the basis of our defini- gating judgments or predictions across individ- tion, we develop a framework for analyzing ualscanbesurprisinglyaccurateinavarietyof crowd wisdom that includes various aggrega- domains,includingpredictionmarkets,political tion and sampling rules. These rules include polls, game shows, and forecasting (see Sur- both weighting the aggregate and sampling the owiecki, 2004). Under Galton’s conditions of individual according to skill, where skill is op- individuals having largely unbiased and inde- erationalized as predictive validity; that is, the pendentjudgments,theaggregatedjudgmentof correlationbetweenajudge’spredictionandthe a group of individuals is uncontroversially bet- criterion. Although the amount of the crowd’s ter, on average, than the individual judgments wisdom—theexpecteddifferencebetweenindi- publishers.natedbroadly. t1kwh9liee8srmd9,o;s1me9Gl7v,a1ehls)to.ownT(e,eh.vge1e.9,rb,0oA7au;rnremdSansurtoryrotowcnaogsien,cwdk2ieti0il,ol0n21us0;n0doC4ef;rlcesrWtmoooweinndd-,. vtjrhuuidedleugasammlsepoeneurtcrnsio,tfryooiufanrngbdrieawsschuraeolntnwsddayniseoeilnrmdrionpsrdli—emepapiesvlneendraaoengnndecliegnweeoinaflrletrhbianeel dmi Forexample,whengroupmembersareallowed wise. While a simple average of the crowd is itsallieedisse apcocseesdstotomoatkhienrgmtheemmbienrds’epperneddeincttiloyn,sth,eairsporpe-- nuontifoalrwmalyysatwriasnediofmin,dwiveisdhuoawlstahraettnhoetresaamlwpaleyds ofob dictionsbecomemorepositivelycorrelated,and existssomeaprioriaggregationrulethatmakes oneott thecrowd’sperformancecandiminish(Lorenz, the crowd wise. orsn Rauhut, Schweitzer, & Helbing, 2011). In the Our results suggest that crowd wisdom is i ationand cuoanlstehxatvoef hbaenendicfaopupnidngtospmoratkseressyuslttes,miantdicivaildly- robust to different choices of aggregation and cier samplingrules.Thatis,howoneaggregatesthe ssous biased predictions, so that their aggregated judgmentsorchoosesanindividualjudgerarely Aal judgments may not be wise (Simmons, Nelson, ologicalindividu Gcraolwakd,w&isdFormedteorifcakc,to2rs01s1u)c.hHasownonroinbduesptenis- acthfrfoaenwctdtshtehthaietnidsqiauvaisdliiumtaaptli.lveBeayvceoirdnaecgnleutisofifyojinundgtghceaostniesdviwteiinosneasr he dence and bias of crowd members’ judgments? Psycofth Iftheconditionsforcrowdwisdomarelessthan fgouridcarnocwedfowriscdoonmst,ruoctuirngreasnultospatilmsoallpyrowviidsee ne ideal, is it better to aggregate judgments or, for caus group—a group whose accuracy most exceeds eAmeriersonal iWmnseotmuanlbdceerit,obrreelabyelteotsensrtsaokiasldlkedidllaeodhnieginhwdlyihvosikdmiullaaelkdejcsurdosgwyesd?- tphraistinogf ictsoninclduisviiodnusalemmeemrgbinegrs.—Fwirsitth, atwcoroswurd- ythhep tematically different predictions than other becomes wisest when it is maximally informa- bt tive,whichentailsthatitsmembers’judgments yrightedolelyfor mwieWsmdoebmeprr-soo,vfi-intdhceere-acasrsioimnwgpdldeie,vfpefreresccittiysae?nddefianistyiosnteomfatthiec aproessaisblnee,gaastivoelpypocoserrdelatotedbewinitgh eiancdhepoetnhderenats. iscopndeds wdeafiynetoaecxraomwidneasitwsisbeouifndaalriynecaornadgigtiroengsa.teWoef Tthhauts,isthmeabxeimstajlulydgdeiftfoeraedndt tforoamcrootwhedrsi.sOonnee entnte itsmembers’judgmentsofacriterionvaluehas intuitiveanalogyofthisresultistothinkofthe documcleisi loefsasneixnpdeicvtiedduaslqsuaamrepdleedrrroarndthoamnlyth,ebujutdngomtneencts- gbreottuepr atos adifivenrasnifcyialpeprofrotrfmolaion:ceSobmyet“ihmeedsgiintgi”s hisarti essarily uniformly, from the crowd. Previous andincludinganassetthatperformswellwhen This definitions of the wisdom of the crowd effect other assets perform poorly. This result pro- T havelargelyfocusedoncomparingthecrowd’s vides mathematical support for the idea that accuracy with that of the average individual crowds with more diversity are wiser (Hong & member (Larrick, Mannes, & Soll, 2012). Our Page, 2004). Furthermore, our theoretical definitiongeneralizespriorapproachesinacou- frameworkprovidesamechanismfordetermin- ple of ways. First, we consider crowds created ingwhenitwouldbebetterfortheoverallgroup byanylinearaggregate,notjustsimpleaverag- predictiontoaddagroupmemberwho,perhaps, ing. Second, our definition allows the compari- islessskilledthanthealternativemembers,but son of the crowd to an individual selected ac- provides diverse predictions. In other words, cording to a distribution that could reflect past our framework provides a quantification of the individualperformance;forexample,theirskill, accuracy–diversity trade-off. WHENISACROWDWISE? 3 A second surprising conclusion is that while ities for evaluating the wisdom of the crowd ef- the absolute accuracy of the crowd depends on fect. We then analyze several special cases, in- the direction and magnitude of members’ bias, cluding comparing an equally weighted linear itisalmostalwayspreferabletouseaweighted aggregateofthejudgestoprobabilisticallyselect- aggregate of judgments rather than select the ing an individual judge according to his or her single best group member, even if the crowd skill.Wethenapplyourframeworktoareanalysis members are biased. Unless the best group of two data sets. We conclude with a discussion membercanbeselecteddeterministically,asin andpresentfuturedirectionsforthiswork. certain intellective tasks (Laughlin, 1996), the decrease in variance of predictions caused by The General Model y. aggregating judgments will offset the bias, a shers.broadl mtraadnei-foesffta(tGioingoonfeth&eHwaesltli-ek,n1o9w9n7)b.ias/variance The Crowd Prediction publinated We define accuracy as the average squared Consider a set of N-many decision makers dmi errorofprediction(whetheragrouporindivid- (DMs), where each DM makes a judgment itsallieedisse uraacly). mTheitsriciswaitchoinmmthoenfi“egldolodfsstatantdisatridcs” (aLcceuh-- ambooduetlththeeucnrkitneoriwonn vbaeliuneg opfreadicctreitder(ioonr.eWstie- ofob mann & Casella, 1998). This accuracy metric mated) by the group members as a random oneott allows us to derive distribution-free results on variable with finite mean and variance. In this orsn crowd wisdom. In other words, we make no way,weconceptualizeourframeworkasapply- i ciationerand atisosnuamlpfotiromnsorfetghaerdinindgivtihdeuaulnodregrlryoiunpg’sdipsrteridbiuc-- iansgtototrhaendsopmecicarlitecraisae,sasofinepstriemdaictitniogn,aassiwngellel sous tions,suchasnormality,nordoweimposeany fixed quantity (which we accommodate by set- s chologicalAheindividual csdcyohemnafinsnmtgirteaeioitnrtnyhtsseoo(rcenou.gnnt.hci,mleuaosdvdiioeasrntlaristigybeo.uftAaioobltnuse’orrsnlumasttheoiavdpeeeerla,rocstucrh)ucohrucagacanhys ttwihniiStgshicmtmhriieteleaavrrnailoyr(cid:1)in,aywnvcaaeenludaoesfvstutaohmreibaeecnrctaihetaer(cid:1)train2yoe.danoctmhoD0v)aM.rWi’asbeljeut,adkgYe-, Psyoft our approach is one that could, in theory, be ment is a random variable. This assumption ne extended to any accuracy metric. as represents the variability of a DM who gives cu erial We present an application of our framework variable responses to the same task. With this Amson to experimental studies by reanalyzing the data assumption,wecanmodelhowaDM’spredic- heper collected, analyzed and published by Vul and tionscorrelatewiththecriterionaswellasother bytthe Pashler (2008) and Simmons et al. (2011). Our DMs in the crowd. Let the prediction distribu- pyrightedsolelyfor awVnhuaellynasnaidsppfiPlnaiesddhsletaor“(twh2e0is0gd8ro)om,uepxotofefnthdineindcgivroitdhwuedao”lsriegffrifonemaclt twioiAtnhocmrfoetawhnde(cid:1)pitxrhieadDniMcdtivobaner,idathneenceorta(cid:1)endx2di.Com,isvdaerifianbeledaXsi coed analysiswhichexaminedtheaccuracyofpooled therandomvariableformedbylinearlycombin- mentisintend arenpaelyatseisdojfudthgemSeinmtsmwointhsientianld.i(v2i0d1u1a)lsd.aOtaursurpe-- iwnegighthtsewD,MCs(cid:2)ac(cid:1)coNrdiwngX,twoitphrtehdeerteesrtmricitnioedn docucleis pinodritvsidthuealobviearsalblytrmeaatmniepnutlaetfinfegctthoefsipnocrrtesasbientg- that all wii (cid:1)are noni(cid:2)-n1egiatiive and, to ensure hisarti ting information available to them. In contrast uniqueness, iN(cid:2)1wi (cid:2) 1.Theweights,wi,are Ts notrandomvariables,butratherfixedchoicesof hi totheoriginalfindingsreportedbySimmonset T how to combine crowd member judgments. al. (2011), our reanalysis, guided by our new At this point, note that we place no a priori formulation, finds an overall improvement of the crowd’s predictions relative to individuals restrictions on the (cid:1)xi and (cid:1)x2i values. This al- across all treatments in the study. In other lows for the possibility that DMs are biased, words, while the members are individually bi- meaningthattheiraveragejudgmentwouldnot asedandthecrowdnotparticularlyaccurate,the equal the average criterion value, E(cid:2)Xi(cid:3) (cid:2) crowd is still wise relative to the individual. (cid:3) (cid:4) (cid:3) .AlsonotethatweallowDMstohave xi y Inthenextsection,wepresentthegeneraldef- differentpredictionvariances,(cid:1)2,andarbitrary xi inition of crowd wisdom and our basic sampling covariances with other DMs where (cid:2) de- xi,xj assumptions.Wethenderiveafamilyofinequal- notes the covariance of X and X. In other i j 4 DAVIS-STOBER,BUDESCU,DANA,ANDBROOMELL words, the judgments of the crowd members merelyamechanicallinearaggregationofindi- may be correlated with each other. In this way, vidual judgments, as opposed to, for example, we can model the effects of crowd members freelyinteractingdeliberativegroupslikejuries influencing each others’ judgments. This ap- or structured group interactions (e.g., Delphi proach builds upon the seminal works of Hog- method; Linstone & Turoff, 1975). While this arth (1978) and Winkler (1981), and our anal- focusisperhapssomewhatlimited,itisconsis- yses generalize those of Einhorn, Hogarth, and tent with much of the literature on crowd wis- Klempner (1977). dom(althoughseeMerkle&Steyvers,2011,for Finally,notethatweplacenoapriorirestric- a Bayesian aggregation model using nonlinear tions on the possible ranges of the covariance weights).Ourdefinitionofa“crowd”prediction y. between X and Y, denoted (cid:1) , other than the as a linear combination of group member pre- shers.broadl urisaunaclepomsaittjirviceesse.mInideofithneitrewreosxrtir,dyisc,tiosonmsoenccroovwad- dspicetciioanlsccaosnet.aiOnsurthaepspirmoapclehgcraonupbaeveseraegneaassaa publinated membersmayhavemoreskillthanothersinthat generalization of several previous approaches, dmi their judgments are better related to the crite- such as comparing the group average with an salliedisse rioTn.o fix these ideas, our framework could be ientdailv.,id1u9a7l7s;elWecatelldsteunni&forDmileydrearnicdho,m20(0E1in).hWorne ite ofob used to evaluate the following types of tasks: extend these approaches by considering other oneott 1. A group of N-many financial analysts that special cases, such as the one where the proba- orsn predict the weekly changes (or absolute bility of selecting an individual is proportional i ationand cthheanDgOesW)iJnI)t,heorvathlueeeoxfcahamnagrekertatiendoefxt(hseucUh.aSs. tsourtehdatbiyndtihveidcuoarlr’eslaetxiopnecwteidthptehrefocrmritaenricoen(mvaerai-- cier sous dollar and the Euro. able). s Aal 2.AgroupofN-manysportsprognosticators ologicalindividu wwheeokpirnedailcltNthFeLnguammbeesr,oorftphoeinntusmsbcoerreodfegvoearlys PRraenddicotmiolny of an Individual Selected he scored in the Bundesliga. ch nPsyeoft pre3d.icAt gthroeutpotoaflNam-moaunnytwofeamthoenrthfolyrecraaisnte,rosrwthhoe iseWxpeecctoendsitdoebrewbhetettehretrhathneancrionwdidv’isdujuadlgcrmowendt as cu member’s.LetPbetherandomvariableformed ytheAmerihepersonal atmhveoe4nrp.atrAhgoebwgmariblooliunlbipttheyolybtfhetNaleotm-wmthpaee8nr%uaytnu.eercmeopinnlooamymgisitvesnenptrrleaodtceiactntiieonxngt. bpitryyobsoaefbleislcietsliteniccgtailnalyg,saitnhngedleliethtmpceirmdoewbnedor(cid:1)temoftehmethbpeerro,cbrawobwiitldh- pyrightedbsolelyfort vtiaornIinasbafllrelomo(cfrmittheuerslitoeinpc)laesaejnusdd,gwreeesp.ehaTatvheedeairnardnaidnvodidmoumaplrteafodrgricee--t appis(cid:5)(cid:2)pe0c1i,a∀,l∀ic(cid:2)ais(cid:2)e(cid:4),1(cid:4)i,1f2,a,2l.l,..p..i,.vN,alN(cid:5)ua(cid:5)e,nsdtahreeneiN(cid:2)Pq1upraeild,(cid:2)uthc1ea.stAitsos, coed castsandobservedrealizationsofthecorrespond- i N isnd ing random variables allow straightforward selectinganyindividualDMwithequalprobabil- Thisdocumentsarticleisinte earesnstdWiumlcetaostvicoaalrnraeiraoindffcyeaefilsltn)htehathdteapotoauvprrelaarmydaeeatfiesrnrionsiltge(iomleinne,aaoanbnussdr,tvrmaaaocnritdaaelnpylc.rteeics- 0shitei,ygl∀.ehciAet(cid:2)settdpt(cid:4)h1ewer,if2tooh,rt.mhpe.irr.no,gbeNxagbt(cid:5)rr,ioeliiumt(cid:4)pye,mokine,femt,phbkefeon(cid:2)rritshe1exkanwkmotihwtphDlne.pM,Iinth(cid:2)ieas Thi ddiocmtioonfttahseks.aOmfetecnr,oowndeaiscrionstsermesuteltdipilne,thdeistwinics-t plartoeproerxtiaomnaplletowtheeciothnDsidMe’rstchoerrcealsaetiownhweriethpYi.is prediction tasks. In Section 5, we demonstrate how our theory can be extended to such cases A Wisdom of the Crowd Criterion by adding some additional assumptions on crowd behavior for our reanalysis of the Vul We consider the expected squared loss be- and Pashler (2008) data set. This allows for an tween each prediction distribution and the cri- application of the theory to a wide range of terion distribution Y throughout. We compare empirical data sets. the values E[(C (cid:3) Y)2] and E[(P (cid:3) Y)2] to one Ourmodelandresultsarelimitedtoso-called another, where “E(cid:2)·(cid:3)” is the expectation opera- “statisticized” groups, where the crowd is tor. In other words, the prediction model that WHENISACROWDWISE? 5 comesclosest,onaverage,toYisconsideredto cording to the probability measure, p,i(cid:2) i be more accurate. This accuracy criterion, ex- (cid:4)1,2,...,N(cid:5), if, and only if, the following in- pected squared-error, is only appropriate for equality holds: tasks in which “close-ness” of a prediction or judgment can be evaluated on a continuous ((cid:3)(cid:1)w(cid:6)(cid:3) )2(cid:8)w(cid:1)(cid:9) w(cid:6)2w(cid:1)(cid:1) (cid:8)(cid:1)2 X y XX xy y scale(seeLee,Steyvers,deYoung,andMiller, 2012;Yi,Steyvers,Lee,&Dry,2012,forrecent (cid:7)(cid:1)N p(cid:2)((cid:3) (cid:6)(cid:3) )2(cid:8)(cid:1)2 (cid:6)2(cid:1) (cid:8)(cid:1)2(cid:3). (2) approaches to modeling crowd wisdom for i xi y xi y,xi y i(cid:2)1 combinatorial and ranking tasks, which would not meet our modeling assumptions). It is possible to rearrange Inequality (2) in a y. We define a wisdom of the crowd effect to way that separates clearly the various factors shers.broadl hold if, and only if, tchaantsdirmivpelifthyetheifsfeecxtp.rBesysiroenararasnfgoilnlogwtesr,ms, we publinated E[(C(cid:6)Y)2](cid:7)E[(P(cid:6)Y)2], (1) dmi (cid:1)N (cid:1)N oroneofitsallieisnottobedisse f(cid:4)tpbho1eielr,ci2ttrye,sidgo.whm.at.ec-eih,cgaNuhcnr(cid:5)trad,socawyspniddido,efisa(cid:2)eosglefgel(cid:4)creI1entc,igeto2iaqnn,tuge.ad.liaisw.tnty,reiNiib(ng1(cid:5)udh.)titivsNoi,sindotutwpheareil,otiebha(cid:2)xcaa---t i,ij(cid:4)(cid:2)j1 wiw(cid:8)j((cid:1)(cid:1)xxi,ix,yj)(cid:8)(cid:6)(cid:3)i(cid:1)(cid:2)xNi(cid:3)1(cid:6)xwj)i2(cid:7)(cid:6)2pi(cid:2)i(cid:7)1(cid:6)((cid:3)wx2ii(cid:6)(cid:8)p(cid:1)ix2)i((cid:7)(cid:3).y(cid:3)(x3i) ationand cording to an arbitrary, prespecified probability If we additionally assume that (cid:1) (cid:6) 0 we cier distribution, in contrast to previous formula- y ssous tions such as evaluating the arithmetic mean obtain: Aal hologicaleindividu aacl.c,Lu2er0ta1(cid:1)cy2X).boeftihnedNivi(cid:4)du1alvepcrteodricotfiothnesD(LMasrr’imckeaent i,(cid:1)j(cid:2)N1 wiwj((cid:1)xi,xj(cid:8)(cid:3)xi(cid:3)xj)(cid:7)2i(cid:1)(cid:2)N1(wi(cid:6)pi)(cid:1)xi,y canPsycuseofth pt(cid:1)hreedXdicei,tniioo(cid:2)ntes.(cid:4)t1hL,ee2tN,(cid:5).(cid:4)x.x.1b,evNet(cid:5)hc,teroacrnoodvfoacmroiavvnaacrreiiaamnbclaeetssr.ioxLfoeYft i(cid:4)j (cid:6)(cid:1)N (cid:6)w2(cid:6)p(cid:7)MSE . merinal wxityheachX,i(cid:2)(cid:4)1,2,...,N(cid:5).Itisstraightfor- i(cid:2)1 i i xi eAerso ward to shoiw that E[(C (cid:3) Y)2] is equal to the Theright-handsideofthisinequalityfocuses hp ythe following: ontheNindividualsinthecrowd.Inparticular, bt copyrightededsolelyfor E[(C(cid:6)Y)2](cid:2)(cid:6)(cid:3)X(cid:1)w(cid:6)(cid:3)y(cid:7)2(cid:8)w(cid:1)(cid:6)(cid:9)X2Xww(cid:1)(cid:1)xy(cid:8)(cid:1)2y, tfvsheeiridlseeuncaetclxiesnpgirbneetsththsweieoseecnerniohnwidtghdihvel(iiwdwguhie)atiaslgsnhtdhftrsetohameesfsfpteirhgcoentbecaodrbfoittlwohitediiensd(dpiofii)f-- isnd on the individual mean squared errors of the entnte where w is the N (cid:4) 1 vector of weights, individualjudgesandtheindividualjudges’co- documcleisi wi,Nie(cid:2)xt,(cid:4)1w,e2,c.o.n.s,idNe(cid:5)r,tdheefirnainndgomC.variable (P (cid:3) vmaarxiainmciezsedwiwthhethneicnrdiitveridiouna.lTjuhdisgeesxpwreisthsiohnigihs Thishisarti tYh)e2o.rAemn a(ep.pgl.i,cBatiicoknelo&f thDeokitseuramte,d20e0x1p)ecytiaetlidosn: cinodviavriidaunaclesmweiathn thsequcarrietedrioenrro((cid:2)rsy,xi)araendovloewr- T weighted in the crowd, relative to their proba- E[(P(cid:6)Y)2](cid:2)(cid:1)N pi(cid:2)((cid:3)xi(cid:6)(cid:3)y)2(cid:8)(cid:1)x2i(cid:6)2(cid:1)y,xi bhialnitdysoifdeseolefcttihoen.inOenqutahleityothiserinhdaenpde,ntdheenltefot-f i(cid:2)1 the criterion, Y, and reflects only the interrela- (cid:8)(cid:1)2(cid:3). tion between the various judges in the crowd y andtheirrelativeweights.Itisminimizedwhen Proposition 1. (Wisdom of the Crowd there is an inverse relationship between Effect). The aggregate crowd prediction dis- E(xx)(cid:6)(cid:1) (cid:7)(cid:1) (cid:1) )andww,thatis,when i j xi,xj xi xj i j tribution, C, defined by w has lower expected pairs of judges with high (low) E(xx) are as- i j loss than an individual judgment selected ac- signed relatively low (high) weights. 6 DAVIS-STOBER,BUDESCU,DANA,ANDBROOMELL The above proposition provides an explicit, 1 testableconditiontodeterminewhetheragiven average, wi (cid:2) N,i(cid:2)(cid:4)1,2,...,N(cid:5), and the crowd is wise. In the following special cases, competitor model P is defined by the uniform we demonstrate how this result can be used to 1 evaluatetherelativetrade-offsofgroupmember distribution p (cid:2) ,i(cid:2)(cid:4)1,2,...,N(cid:5); that is, interdependence versus bias. First, we prove a i N basic result within our framework. the competitor individual is selected uniformly Result 1. Consider the case when w (cid:2) atrandom.Thismodelsthecasewhereonehas i p,∀i(cid:2)(cid:4)1,2,...,N(cid:5), that is, the aggregation no prior information to suggest or reason to i believe that any member of the group is any weights providing the crowd prediction are hers.broadly. imdeinneticthael tiondtihveidsuealelcDtiMonpwreedigichttisonusdeidstrtoibduetitoenr-. bIneettqeuralaittyth(2e) pcarendbicetiroenarrtaasnkgetdhaanndawnyrittoetnh:er. s Then a wisdom of the crowd effect always eofitsalliedpublitobedisseminated hmeroaelPRgmdreesbos.uecorrlfto.aw1vdeSerexmaetgeeeAnmdipbsspeemrtnhotdeoriexfia.nnadcycinsuigrtauttaehtaitothnathnienthcwerhoaiwcvhd- N12i(cid:1)(cid:2)N1(cid:1)j(cid:2)N1(cid:1)xi,xj(cid:6)N1i(cid:1)(cid:2)N1(cid:6)(cid:1)x2(cid:8)i(cid:7)N1i(cid:1)(cid:2)N1N1i(cid:1)(cid:2)N(cid:3)1x(i(cid:3)(cid:6)xi(cid:3)(cid:6)y(cid:9)(cid:3)2.y)2(4) onot the aggregation weights are identical to the n ors probability weights used to select the individ- Recall that this inequality is simply an alge- i ciationerand u1a9l7.0A;sHiongaprrtehv,io1u9s7,8r)e,lRateesdulatr1guims aensttsra(iDghawtfoers-, bYr)a2]ic(cid:7)reEa[r(rPan(cid:3)gemY)e2n],taonfdtthheusi,nethqeuamliatyg,niEtu[dCe(cid:3)to sous ward application of Jensen’s inequality. which(4)deviatesfromequalityispreciselythe s ologicalAindividual aitgigWsrehignialteetirRoensetsimunlgett1htoogdcuoathnraasntidtmeeeraskpteahsrettihecexuilcsartreonwacgdegwroefigsaaen-, deaxbevpleiesactt(ieoCdn(cid:3)dfirfoYfme)r2eenaqncuedabl(iPetytw(cid:3)ieneYn()42t.)h,Tethhreeangmdroeoamrteervpatrrhoie-- chhe tion rules, such as an unweighted aggregate, nounced the wisdom of the crowd effect. yt Psof against other methods of selecting individuals. Whatdoesthecompositionofthe“crowd”look anse Forexample,apracticalproblemthatonemight like when the inequality 0 (cid:7) E[(P (cid:3) Y)2] (cid:3) cu erial face is to choose between the judgment of an E[C (cid:3) Y)2] is maximized? When it is mini- mn Aso available expert and a crowd. Without exact mized? First, consider the left-hand side of in- heper knowledge of the relevant crowd parameters, it equality (4). Because the covariance matrix of ythe is difficult to decide on aggregation rules other the judges is positive semidefinite, this side of bt yrightedolelyfor tshhcoaewnneaavriesorism, tpionlevsoeaelvvehirnoagwgea.liWkseimelycpailtenwsctorioullwldrudbneatvhteororafiugngedh, tsshitmaenpdilnaifreydqiumzaealditttyesrusci,shasntsheuacmtesteshatehriacltyoavlanlropinarnpedcoeiscittbiivoetenw.seaTernoe ps isconded amnaeyxwpeorntdtheartaibsomutorteheacecxuterantte.oFfucrrtohwerdmwoirsed,owme jtuiodngse,srX1.TanhdelXe2ft-chaannbdesiidneteorpfrienteeqduaaslitcyor(r4e)lais- mentinte andfactorsthatleadtomaximizingthecrowd’s maximixzijed when all the judges are perfectly Thisdocusarticleis pnInreeexdqtiucpatrlioivtveyid(ae1d)vsotaonmtdaeegmceoomonvpsetarrraatttehivefeaicantnodarivslyitdsheuasatlu.msWianxge- (cid:8)coN1r2r(cid:1)elaiN(cid:2)te1d(cid:1)Nj(cid:2)w1irthxij (cid:6)eacN1h(cid:1)oiN(cid:2)t1hrex2ri(cid:9), (cid:2)thaNNt22 i(cid:6)s, hi imize crowd wisdom. In the section following, T N we examine special cases in which the aggre- (cid:2) 0. As the judges become less correlated gation weights do not match the selection N weights. with one another, this value becomes smaller and the wisdom of the crowd effect becomes Unweighted Average Versus Selecting an more pronounced. This result is intuitive be- Individual at Random With Equal cause if all the judges provide the same (or Probability almost the same) predictions, there would be little gained by aggregation. Note that when Considerthesimplecasewherethecrowd,C, crowd members’ judgments are highly corre- is defined by the unweighted (simple) group lated, adding in a new member whose judg- WHENISACROWDWISE? 7 ments are nonredundant improves crowd wis- are unbiased. All aggregation can do in this dom, particularly for a small crowd. As is situation is reduce the variance of the aggregate evident from Inequality (4), this improvement prediction (Wallsten & Diederich, 2001). Maxi- can occur even if the new, nonredundant mem- mizingthistermisfarmoreinteresting.Theright- ber has substantially lower skill than the exist- hand side of (4) becomes arbitrarily large as the ingmembers.Theintuitionbehindthisresultis averagesquaredbiasofthejudgesbecomeslarge clearer when one considers other linear aggre- withthesquaredbiasofthejudgeaverageremain- gation problems: If one has highly redundant ing small or zero. In this case, all judges are predictor variables in a multiple regression, (possibly greatly) biased in their individual pre- adding a predictor that gives new information dictions, yet the average of their predictions is shers.broadly. wpthoielolrclhoyenlctpeoxtrhrteeolafmtemodduewlltiiepthvleetnhreeigforteuhstecsionomenwethvpiarserideafibcfelteoc.rtIiinss vnjueodrlyoggecyloposrfeedLtioacrtrtihiocenkstarun“edbrcSarcoitkleler(ti”o2n0t.0h6Pe)u,tttrhiuneeitnhcderiivtteeirdrmiuoain-l itsalliedpubliedisseminated scrreiiaomlnanitilimpaoernprstrohoseivp“es(wsutehpietephrTceorzsoteshwileogdrno’”jvsuide&ngstetiHhsmaeaatnntiaedk,vnni1eoaw9th9thi1mse)e.ocmrrihbteeerr- m(cid:1)fthaylie.lsaHenii,nethrdfeaei,lvrltiiahdnbeugoa,ivnlmedboioivrareisdboueiraslollewcpsarsen(cid:1),dcbyie,ocltbtliheuodtnabswaonshvydeesnttaehanmevdaectrbiarecoglawoelldwdy, ofob Itisworthnotingthatwhiletheleft-handside predictioniswise. et onot issmallerforperfectlyuncorrelatedjudges,itis n ors not minimized in this case. This term is mini- Unweighted Average Versus Selecting an i ciationerand mmiazlelyd nwehgeantivaelllyjucdogrreeslaateredewqiutahlloynaendanmotahxeir-. Individual According to Their Skill sous This result is distinct from the folk explanation In the previous section, we considered the s alAdual of crowd wisdom which holds that indepen- simple case where one does not discriminate ologicindivi dcaenncceelaomuotn(gvajruiadngcees risednueccteiossnardyoessoftahcattoerrirnotros bweotwrkeehnasthedeimndoinvsitdruataeldjutdhgaetsfoarpirnioteril.lecPtriivoer chhe the right-hand side of this inequality, as shown tasks without demonstrable solutions, groups yt Psof below). This result is in line, however, with are often poor at identifying the highest per- anse other mathematical models demonstrating forming member (Henry, 1995). While infor- cu erial how group diversity can improve overall mative,thiscasemaynotbegeneral.Often,we mn Aso group accuracy (Hong & Page, 2004). To dohaveinformationonthepriorperformanceof heper clarify, this maximal negative correlation is judges; that is, their skill at a particular predic- bytthe subjecttotheusualpositivesemidefinitecon- tiontask,forexampleCooke’smethod(Cooke, pyrightedsolelyfor swvterhraiyicnhst,msfaoolnrl.ltAahresgeitnhcteerronjwuumddgsbe,erwcooilrflrejbuleadtginoeenscegmsosaeatsrriitlxoy, i1ang9dg9ir1ve)ig.dauItainolnstuhwcisheigsthehacttsti,tohwne,i,ptwrooeabacrbaoinmlditpoyamroelfysdseielfelfecectrietnendgt coed infinity, the maximal negative correlation of the ith DM is proportional to that judge’s skill, sd mentiinten allWjuhdilgeesthaepplreofta-chhaensdzseirdoe. of Inequality (4) dtieofinnwedithasthtehecrictoerrrioelna,tiYo.nLoeftahlilsjuodrgheesr’spkrieldlibce- hisdocuarticleis djuedsgcreisb,etshethreigehfft-ehcatsndofsiindteerdceosrcrreilbaetisotnhebeetfwfeecetns nleotnp-inebgeadtievfien,erdxiya(cid:5)s fo0l,lo∀wis(cid:2): (cid:4)1,2,...,N(cid:5), and Ts of judge bias, that is, the expected squared de- hi T viation between a judge’s prediction and the r p (cid:2) xi,y . (5) criterion. This side of the inequality must i (cid:1)N necessarily be non-n(cid:8)egative. This t(cid:9)erm, r xi,y 1(cid:1) 1(cid:1) 2 i(cid:2)1 N (cid:6)(cid:3) (cid:6) (cid:3)(cid:7)2 (cid:6) N (cid:3) (cid:6) (cid:3) , is N i(cid:2)1 xi y N i(cid:2)1 xi y Clearly,thehigherthecorrelationofanindi- minimizedwhenthetruemeansofalljudgepre- vidual’spredictionswiththecriterion,themore dictionsareequaltothemeanofthecriterion,that likelythatindividualwillbeselected.IfallDM is,whenalljudgesareunbiased.Inthiscase,there predictions are equally correlated with Y then islittlebenefittoaggregatingthejudgesfromthe this choice of P will reduce to selecting a DM standpointofminimizingbiasbecauseallofthem uniformly at random, which has already been 8 DAVIS-STOBER,BUDESCU,DANA,ANDBROOMELL showninResult1tobeinferior,onaverage,to criterion,Y,andeachotheratthevalue(cid:8).This thecasewhereCistheunweightedaverage.At groupofN–1DMsrepresentsthe“herd.”The theotherextreme,ifonlyasingleDM’spredic- remaining DM is the “dissident” and correlates tions correlate with Y then that DM will be withthecriterionYat(cid:9).Toensurethepositive chosen with probability one, and will, most semidefiniteness of the intercorrelation matrix likely, outperform the unweighted average of between judges and the criterion, we will also the crowd. assume that the dissident correlates with the Let C be defined according to the simple herd judges at (cid:8). For now we assume that all unweighted average and let P be defined ac- N-many DMs are unbiased in their predictions, cording to (5). Applying Proposition 1 and re- that is, (cid:3) (cid:2) (cid:3) ,∀i(cid:2)(cid:4)1,2,...,N(cid:5). We will y. arranging terms gives us the following result. xi y hers.broadl Corollary 1. Let w (cid:2) 1,∀i(cid:2)(cid:4)1,2,..., conCsliedaerrlyb,iawsehdencas(cid:8)es(cid:6)in0suabnsdeq(cid:9)uenist sleacrtgioen,sw. e blised i N have a group of uncorrelated DMs and only salliedpudisseminat Nsthu(cid:5)amtaen(cid:1)dxtih,lyeat(cid:2)tpairlblxie,yvdaaernifiadnbe(cid:1)ldexsi,axsjar(cid:2)ienrsExtaiq,xnuj,da∀atiroid,nijz.e(5Td)h.seAuncsha- stthhueemdcpritsiitoseinrdsieo,nnthtevDadMriisasbhidlaees.n1atnUisynsdseeklreilclttheaidtswpsreietthdoipcfrtoianbsg-- ofitobe wisdomofthecrowdeffectholdsif,andonlyif, ability 1 under P, and will be more accurate, et inexpectation,thanthegroupaggregateC.At sociationoronuserandisnot MSEcrowd(cid:6)i(cid:1)(cid:2)N(cid:8)1i(cid:1)(cid:2)Nr1xri,xyi,yMSExi (cid:9) orttihefoleaneto,qetPudha,elwralysietxdhstkerfiioelnnmleeeded,aDiinnfoM(cid:8)(th5se)(cid:6)w,r.rh(cid:9)eUodwunacdreeeeshreatqtovhueisasealllcyegocrnctoiodunripg-- s ologicalAindividual (cid:7)2 Mean(rxiy)(cid:6)i(cid:1)(cid:2)N1(cid:1)Nr2xri,xyi,y , (6) owunnielwleoaifglwhttaheyedsDadMgogsrwewograistthee,,eoCqnu,aablvyperrRaogbeeas,builtlthita1yn.atTnhode chhe i(cid:2)1 investigate this relationship, we consider yt Psof three different levels of “skill” on the part of ericanaluse iwthhDerMe MpSreEdxiicitsiotnhedimsteriabnutsiqounaraenddeMrrSoErcfroorwdthies the dissident, (cid:9) (cid:6) .95 (high), (cid:9) (cid:6) .70 (me(cid:10)- eAmerson tthioen,mCe.an squared error for the crowd predic- dium)and(cid:9)(cid:6).40(low),andvarytheratio (cid:11) hp ythe Meansquarederrorisequivalenttothesumof from 0 to 1. As an application of Corollary 1, yrightedbolelyfort rtrhiegsehptep-chrteadntiodctsi(cid:1)oidyn)edaoinfsdtIrnibietusqtuivoaalnir’tiysan(sc6qe)u.,awErexedasmebeiinatsihnag(twttihhthee l(oe6ft),ILnseHimqSuialabliretlyyt,h(l6ee)t.leRTfHth-uhSsa,bnedthteshievdaerliugoehftR-Ihna(cid:6)enqduRasHliidStye/ ps coed crowd prediction benefits when skill is evenly LHSistheratioofindividualexpectedlossto isnd distributedamongtheDMs;inotherwords,when crowdexpectedloss.Foreaseofpresentation, entnte allDMsare“equallygood.”Theleft-handsideof our results are in terms of R on a logarithmic mi docucleis Iwneelql,utahlietym(o6s)thinigdhiclaytesskitlhleadt DfoMr tsh(ethcorsoewwditthotdhoe sticnauleo,udsemnoetaesdurleogo(fRt)h.eLowgi(sRdo)mproovfidtheesacrcoownd- hisarti highestcorrelationswiththecriterion)shouldalso effect with positive (negative) numbers indi- Ts hi havethelargestbiases. catingthecrowdisexpectedtobemore(less) T accuratethananindividualchosenatrandom. One DM Doesn’t Follow the “Herd”: When log(R) (cid:6) 0, the expected loss of the Unbiased Case crowdisequaltotheindividualexpectedloss. Figure1plotslog(R)asafunctionoftheratio Consider the case of a defector, “one DM who doesn’t follow the herd” model assuming that C is defined by the unweighted group av- erageandPisdefinedasin(5).Inthiscase,we 1AsNincreaseswithoutbound,(cid:8)equalstheminimalcorre- will assume that there are N-many judges with lationbetweenthedissidentandherdDMsthatguaranteesthatthe N – 1 DMs who positively correlate with the resultingcorrelationmatrixispositivesemidefinite. WHENISACROWDWISE? 9 φ=.95 φ = .70 φ=.40 0.6 0.6 0.6 0.4 0.4 0.4 y. s.adl herbro s blied dpuminat 0.2 0.2 0.2 ofitsallieobedisse log(R) log(R) log(R) et onot 0 0 0 n ors i ationand cier os su Asal −0.2 −0.2 −0.2 aldu N=5 N=5 N=5 hologiceindivi NNN===112050 NNN===112050 NNN===112050 ycth N=25 N=25 N=25 Psof −0.4 −0.4 −0.4 ne as cu erial mn Aso eer hp ythe 0 0.5 1 0 0.5 1 0 0.5 1 bt ψ/φ ψ/φ ψ/φ pyrightedsolelyfor Figure1. Thisfigureplotslog(R)asafunctionoftheratio(cid:11)(cid:10)for(cid:9)(cid:6).95,.70,.40.Foreach coed value of (cid:9), log(R) values are plotting for five samples sizes, N (cid:6) 5, 10, 15, 20, 25. The isnd left-handgraphcorrespondsto(cid:9)(cid:6).95,themiddlegraphcorrespondsto(cid:9)(cid:6).70,andthe entnte right-handgraphcorrespondsto(cid:9)(cid:6).40.Theline,log(R)(cid:6)0,isplottedforreference.All documcleisi pointsabovethislinerepresentcaseswherethecrowd’sperformanceissuperior. Thissarti (cid:10) vidual according to (5) outperforms the crowd Thi (cid:11) for group sizes N (cid:6) 5,10,15,20,25 sepa- in this case only when the herd is very weakly correlated with the criterion, for example, the rately for the different skill levels of the dis- sident. The line denoting equal accuracy of point at which C outperforms P for N (cid:6) 15 the crowd and a randomly selected individual under(cid:9)(cid:6).95occurswhen(cid:8)(cid:6).052.Thesize is plotted for reference. of the groups plays a large role in determining Note that the simple unweighted average, C, the point at which the wisdom of the crowd performs quite favorably compared with an in- effect emerges. As expected, the larger the dividual selected at random according to (5), groupsizethemorepronouncedthewisdomof even in the case where the dissident has a high (cid:10) correlationwiththecriterion.Selectinganindi- thecrowdeffectandthesmallerthevalueof (cid:11) 10 DAVIS-STOBER,BUDESCU,DANA,ANDBROOMELL atwhichitemerges.Thereisalsoastrongeffect One DM Doesn’t Follow the “Herd”: oftheskilllevelofthedissident.As(cid:9)decreases, Biased Case (cid:10) the favorable range of the ratio under P be- (cid:11) Letusreturntothe“oneDMdoesn’tfollow comesquitesmalland,eventually,vanishes((cid:9)(cid:6) theherd”modelanalyzedabovebutallowthe .40). In this case, even if one could select deter- DMs in the herd to be biased in their predic- ministically the best member of the group, their tions. Intuitively, the wisdom of the crowd modest correlation with the criterion would not effect, as defined in Proposition 1, should offsetthereductioninsamplingerrorbyincorpo- depend not just on the magnitude of the DM y. rating an unweighted average of the rest of the biasesbutalsoontheirrelativeconfiguration. s.adl DMs.Tosummarize,unlessonecouldnearlyde- For example, a herd in which the DM biases sherbro terministically select the best member of the are equally likely to be above/below (cid:1)y will dpubliminated gwreoiugph,tewdhgoromupusatvbeerahgieghwliyll,skoinlleadv,eraagsiem,pprleevuanil-. hliekredlyinrewsuhlitchinadllifDfeMrenbtialosges(Ra)revailnuethsethsaanmae salliedisse verIstuissPinutnedreesrttihnegsethaastsutmheptpioenrfsoirsmhaingchelyonfonC- dirWecetiocno.nsider two cases. In the symmetric ite oneofottob linear. Under the extreme case of (cid:11)(cid:10) (cid:2) 1, a cnaosmetohreeplirkeedliycttioonbebaiabsoevseotfhtahnebheelrodwD(cid:1)Myswaitrhe orisn wisdom of the crowd effect is guaranteed to the dissident DM as the sole member whose ationand occur by Result 1, yet this is not the condition p(cid:1)redviacltuioenssfoarrethuinsbmiaosdeedl.Tunadbeler1thdeisspylmaymsetthrye sociuser that yields maximal log(R) values. Across all coXndition for the five group sizes, N (cid:6) 5, 10, Asal three conditions, the largest values of log(R) 15,20,25.Recallthatbecause(cid:1) isdefinedto ologicalindividu oimccaullryw, hceonrrtehlea(cid:10)tehderdwiisthmothdeestclryi,tebruiotnn.otUmndaxer- bbeiasz.eAros,bietfosruef,fiwceesatsosusmpeectihfaytya(cid:1)lXl ptroedmicotidoenl ychthe thesevaluesof ,thedissidenthasareasonable values are standardized so that values of (cid:1)X Psof (cid:11) can be interpreted as bias in units of standard anse chanceofnotbeingselectedwiththeremaining deviations. As shown in Table 1, the number cu erial group members having relatively smaller pre- of DMs in each group with positive or nega- mn Aso dictioncorrelationswiththecriterion.Yet,there tive biases are roughly equal and symmetric eer hp is a large amount of information present in the with respect to bias magnitude. As group size bytthe group as a whole, as measured by small judge increases, the magnitude of the biases also yrightedolelyfor ilinkteelrycoprererflaotrimone,xstroemtheelypwreedlli.ction of C will imnicnrueasstews,owstiatnhdaardmdaexviimatailonbsi.as of plus or ps coed sd in entnte documcleisi TBaiabsleC1onfigurationsfortheFiveHypotheticalGroupsUndertheSymmetryCondition Thishisarti Nuomfber Possiblebiasvalues T decision 0 .5 (cid:3).5 1 (cid:3)1 1.5 (cid:3)1.5 2 (cid:3)2 makers (DMs) Countspergroup N(cid:6)5 1 1 1 1 1 0 0 0 0 N(cid:6)10 1 3 2 2 2 0 0 0 0 N(cid:6)15 1 3 3 3 3 1 1 0 0 N(cid:6)20 1 4 3 3 3 3 3 0 0 N(cid:6)25 1 3 3 3 3 3 3 3 3 Note. Thecolumnsindicatethepossiblebiasvaluesweconsider,aswellasthenumberofDMsineachgroupwiththat biaslevel.Forexample,thegroupwith5DMshasoneunbiasedDM,withtheremaining4DMshavingbiaslevelsof.5, (cid:3).5,1,and(cid:3)1.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.