Dynamic Task Allocation for Crowdsourcing Settings

Angela Zhou  ANGELAZ@PRINCETON.EDU
Princeton University, Princeton, NJ 08544

Irineo Cabreros  CABREROS@PRINCETON.EDU
Princeton University, Princeton, NJ 08544

Karan Singh  KARANS@PRINCETON.EDU
Princeton University, Princeton, NJ 08544

International Conference on Machine Learning, New York, NY, USA, 2016. Appearing in the workshop Data-Efficient Machine Learning. Copyright 2016 by the author(s).
Abstract

We consider the problem of optimal budget allocation for crowdsourcing, allocating users to tasks so as to maximize our final confidence in the crowdsourced answers. Such an optimized worker assignment method allows us to "boost" the efficacy of any popular crowdsourcing estimation algorithm. We consider a mutual information interpretation of the crowdsourcing problem, which leads to a stochastic subset selection problem with a submodular objective function. We present experimental simulation results which demonstrate the effectiveness of our dynamic task allocation method in achieving higher accuracy, possibly with fewer labels, as well as improving upon a previous method which is sensitive to the proportion of users to questions.

1. Introduction

The rationale of crowdsourcing is to leverage the "wisdom of the crowd" in soliciting some kind of response or suggestion. Crowdsourcing allows for the annotation of large research corpora for training data-intensive models such as deep neural networks. Platforms such as Mechanical Turk allow for the systematic and programmatic implementation and assignment of crowdsourcing tasks. The crowdsourcing estimation problem is especially difficult because both the reliabilities of the users and the true answers to the questions are unknown.

1.1. Previous Work

Dawid and Skene studied this estimation problem in 1979 in the context of responses to surveys [4]. In the Dawid-Skene stochastic model, each user has an associated reliability score dictating the probability with which the user answers a binary question correctly. The joint estimation of user reliabilities and correct answers in this model is very well studied in statistics, even for the case of k-ary questions. Popular approaches use Expectation Maximization to find the true labels maximizing the likelihood of the estimated answer labels and reliabilities. Recent work uses spectral methods (based on eigenvalues of the assignment graph) to initialize the iterative expectation maximization algorithm, proving optimal convergence rates for such a scheme [7]. Another approach achieving state-of-the-art performance finds the labeling via a minimax conditional entropy approach [8].

Most analyses of these estimation algorithms assume random sampling. Conceivably, however, intermediate estimates of the workers' reliabilities can inform better worker-task assignments, which in turn lead to higher-quality final estimates. A 2011 paper by Karger, Oh, and Shah on budget-optimal crowdsourcing tracks multiple instances of distinct assignment graphs, but assumes a highly specific "spammer-hammer" model in which every user answers either randomly or correctly. Another approach models the assignment problem as a Markov Decision Process and derives the Optimistic Knowledge Gradient, which computes the conditional expectation of choosing certain workers, assuming a Beta-Bernoulli prior on the reliability of each worker [2]. However, this assignment scheme requires extensive recomputation and updating between each individual question-worker assignment, an unrealistic frequency of model updating.
Classical Dawid-Skene Model.  In the classical Dawid-Skene model, abbreviated "D-S", there are n users and m questions with correct answers in {−1,+1}. (Without loss of generality we may map the answers to {0,1}.) Let G ∈ {0,1}^{n×m} denote a bipartite matrix indicating whether the ith user answered the jth question; this is typically called the assignment matrix. Additionally, let a ∈ {−1,+1}^m be a vector which captures the correct answer to each of the m questions, and let p ∈ [0,1]^n be a vector denoting the reliability of each of the n users. Let A ∈ {−1,0,+1}^{n×m} be a stochastic answer matrix such that

    A_{ij} = \begin{cases} 0 & \text{if } G_{ij} = 0 \\ a_j & \text{with probability } p_i, \text{ if } G_{ij} = 1 \\ -a_j & \text{with probability } 1 - p_i, \text{ if } G_{ij} = 1 \end{cases}
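To make the generative model concrete, the following is a minimal simulation sketch in Python (not from the paper); the Bernoulli assignment of users to questions and the Uniform(0.5, 1) reliability distribution are illustrative assumptions rather than the paper's experimental setup.

import numpy as np

def simulate_dawid_skene(n_users, m_questions, coverage, seed=0):
    """Sample (G, a, p, A) from the classical D-S model described above.

    The Bernoulli(coverage) assignment scheme and the Uniform(0.5, 1)
    reliability distribution are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    a = rng.choice([-1, 1], size=m_questions)            # true answers a_j
    p = rng.uniform(0.5, 1.0, size=n_users)              # user reliabilities p_i
    G = rng.random((n_users, m_questions)) < coverage    # assignment matrix G_ij
    # Each assigned answer equals a_j with probability p_i and -a_j otherwise.
    correct = rng.random((n_users, m_questions)) < p[:, None]
    A = np.where(G, np.where(correct, a, -a), 0)
    return G.astype(int), a, p, A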
2. Method for Quasi-Online Task Allocation

We derive an improved task allocation scheme by modeling the crowdsourcing problem as an optimization problem with an information-theoretic objective, since we want to assign workers to tasks so as to gain the most information about the true label of each question.

Previous work (unpublished) by the same authors (Cabreros, 2015) proposes a two-step estimation method that uses a budget parameter, s ∈ [0,1], in two stages. Running a crowdsourcing estimation algorithm when half of the budget has been allocated yields estimates of the true answers, which may be used with a mixture model on the topics of the questions to estimate the reliability of each user, F̃. Users more likely to provide correct answers to specific questions are then sampled during the remaining portion of the budget to arrive at a final answer matrix A. The final answer matrix A is then used as input to the same black-box estimation algorithm to arrive at the final estimates, as depicted in Figure 1.

Figure 1. Schematic diagram of the dynamic task allocation scheme.

How do we decide which questions require more of the budget? We consider information-theoretic metrics. Mutual information sums over all possible outcomes of the random variables X, Y, and measures the mutual dependence between two variables by quantifying the "amount of information" obtained about one random variable by observing another (Cover, 2006). A variant of the mutual information between realized outcomes of the random variables X = x, Y = y is the pointwise mutual information, which has found applications in statistical natural language processing. We define another variant of mutual information, partial mutual information, between a random variable on the source channel, X, and observed outcomes Y = y, where we integrate only over the randomness of the unknown source channel. Partial mutual information, pMI(X;Y), is defined between the random variable X and outcomes of the random variable Y = y:

    \mathrm{pMI}(X;Y) = \sum_{x \in \mathcal{X}} p(x,y) \log\left( \frac{p(x,y)}{p(x)\,p(y)} \right)

In our notation:

    P[X_m = 1, \{Y_m\}] = \prod_{i \mid y_i = 1} F_{im} \prod_{j \mid y_j = 0} (1 - F_{jm})
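As an illustration of how these quantities can be computed from estimated reliabilities, the sketch below evaluates the product formula and the resulting pMI for a single question, assuming a uniform prior over the two labels and a {0, 1} label encoding; the function names and the uniform prior are my own choices, not the paper's implementation.

import numpy as np

def label_posterior(responses, F_q):
    """P[X_q = 1 | observed responses to question q], via the product formula above.

    responses: dict mapping user index i -> observed label y_i in {0, 1}
    F_q:       F_q[i] = estimated probability that user i answers question q correctly
    A uniform prior over {0, 1} is assumed (my assumption, not stated in the paper).
    """
    joint = {1: 0.5, 0: 0.5}
    for i, y in responses.items():
        joint[1] *= F_q[i] if y == 1 else 1.0 - F_q[i]
        joint[0] *= F_q[i] if y == 0 else 1.0 - F_q[i]
    return joint[1] / (joint[0] + joint[1])

def partial_mutual_information(responses, F_q):
    """pMI(X_q; Y = y): sum over x in {0, 1} of p(x, y) log( p(x, y) / (p(x) p(y)) ),
    integrating only over the randomness of the unknown label X_q."""
    joint = {1: 0.5, 0: 0.5}
    for i, y in responses.items():
        joint[1] *= F_q[i] if y == 1 else 1.0 - F_q[i]
        joint[0] *= F_q[i] if y == 0 else 1.0 - F_q[i]
    p_y = joint[0] + joint[1]
    return sum(p_xy * np.log(p_xy / (0.5 * p_y)) for p_xy in joint.values() if p_xy > 0)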
However, solving the cardinality-constrained integer program to assign users to questions is most likely NP-hard, as Krause and Golovin (2012) reduce a similar formulation of an entropy minimization problem to the independent set problem. The best we can do in a 'one-shot' setting, if our remaining budget s is less than the number of questions m, is to choose the s best relative improvements in the partial mutual information, which we denote by ΔpMI_rel(v; X_m; Y_m) for the improvement with respect to querying user v. Estimating F_{ij} has already performed a "nested optimization" and found which user is best to assign to a certain question.
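To illustrate the one-shot rule, the sketch below scores each candidate (user, question) pair by its expected one-step gain in pMI and selects the s highest-scoring queries. The paper's ΔpMI_rel may be defined differently (e.g., as a relative rather than an absolute improvement); this greedy absolute-gain variant is only my illustrative reading, and it reuses the hypothetical label_posterior and partial_mutual_information helpers from the previous sketch.

def expected_pmi_gain(u, q, responses, F):
    """Expected one-step improvement in pMI(X_q; Y) from querying user u on
    question q, averaging over u's two possible answers under the current
    posterior. An absolute-gain stand-in for the paper's Delta pMI_rel."""
    base = partial_mutual_information(responses[q], F[:, q])
    p1 = label_posterior(responses[q], F[:, q])
    # Predictive probability that user u reports label 1.
    p_y1 = p1 * F[u, q] + (1.0 - p1) * (1.0 - F[u, q])
    gain = 0.0
    for y, p_y in ((1, p_y1), (0, 1.0 - p_y1)):
        gain += p_y * partial_mutual_information({**responses[q], u: y}, F[:, q])
    return gain - base

def one_shot_allocation(s, candidates, responses, F):
    """Spend a remaining budget of s queries on the s best-scoring
    (user, question) pairs, as in the one-shot setting described above."""
    ranked = sorted(candidates,
                    key=lambda uq: expected_pmi_gain(uq[0], uq[1], responses, F),
                    reverse=True)
    return ranked[:s]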
3. Results

We evaluate the error rates of three methods: (1) running Dawid-Skene estimation on a budget s allocated by random sampling; (2) running Dawid-Skene estimation on a budget s/2 allocated by random sampling and assigning the remaining budget via the one-shot allocation; and (3) method (2), but assigning the remaining budget via the dynamic task allocation. We evaluate the error rates over 20 random samples of questions and user reliabilities, with 1000 users and 100 questions, assuming a mixture of 2 topics, over 10 different budgets ranging from assigning 0.5% of users to each question up to 2% coverage, in Figure 2.

We also consider how the dimensions of the estimation problem impact the performance of these sampling policies in Figure 3: when users outnumber questions, the one-shot allocation does poorly while the new dynamic user allocation is still robust.
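For concreteness, the dynamic policy of method (3) can be read as the batch loop described in the conclusions: exhaust the budget by repeatedly querying one additional label per question, re-estimating the reliabilities F̃, and recomputing the label estimates. The sketch below is only my paraphrase of that loop; run_dawid_skene, estimate_reliabilities, and query_label are hypothetical placeholders for the black-box estimator, the mixture-model reliability step, and the crowd query, and expected_pmi_gain comes from the earlier sketch.

def dynamic_task_allocation(responses, F_init, budget, questions, users):
    """Batch dynamic allocation sketch: while budget remains, re-estimate labels
    and reliabilities, then query one extra label per question, choosing for
    each question the unqueried user with the largest expected pMI gain."""
    F = F_init
    while budget > 0:
        labels = run_dawid_skene(responses)             # hypothetical black-box estimator
        F = estimate_reliabilities(responses, labels)   # hypothetical mixture-model step
        for q in questions:
            if budget == 0:
                break
            unqueried = [u for u in users if u not in responses[q]]
            if not unqueried:
                continue
            best_u = max(unqueried, key=lambda u: expected_pmi_gain(u, q, responses, F))
            responses[q][best_u] = query_label(best_u, q)  # hypothetical crowd query
            budget -= 1
    return run_dawid_skene(responses)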
Figure 2. Comparison of error vs. budget percentage for the dynamic allocation method (blue) against the fully randomly sampled Dawid-Skene estimator (dashed red). Note that for small budgets the dynamic allocation scheme significantly improves estimation error; for large budgets both methods are highly accurate. Error bars are 1.96 standard errors over 25 trials (95% confidence level).

Figure 3. Comparing sampling policy performance when varying the number of questions, 10 sample trials each. Error bars are one standard deviation. Dynamic task allocation (solid red) is competitive with full sampling (dashed blue), while one-shot learning (dashed magenta) performs poorly as the number of questions increases.
However, examining the intermediate accuracies yielded by the dynamic task allocation method indicates that the strength of this method lies in its ability to achieve better estimation accuracy with fewer samples, and to terminate the estimation process early. Can we use the estimates of mutual information in an optimal stopping framework to develop more data-efficient crowdsourcing estimation algorithms? "Efficiency" will be measured with regard to random sampling, where we want to show our method is (1+ε)-efficient using (1−δ) of the samples needed by random sampling, for any chosen δ. We will consider this question in the full paper, and conjecture that one point of connection is between the channel capacity and the Fisher information. The analysis would proceed by considering the minimax convergence rates of the D-S estimator, proved in (Gao, 2013), to analyze the disadvantage of using fewer samples.

4. Conclusions

We model task assignment in the crowdsourcing problem and develop a probabilistic model for a "partial mutual information" criterion which yields a one-step lookahead policy for which questions to ask next. We implement a batch estimation policy which exhausts the budget by querying an additional label for each question, re-estimating F̃_est, and using these updated estimates to recompute the label estimates. We are able to show significant improvement in estimation accuracy for small budgets. Our dynamic task allocation scheme is also robust for estimation problems with higher ratios of questions to users, where the one-shot learning policy suffers higher error than random sampling. A further advantage of such a scheme is that the mutual information heuristic we develop may be extended to evaluate early termination of the estimation process.

References

Cabreros, Irineo; Singh, Karan; Zhou, Angela. Mixture model for crowdsourcing. 2015.

Chen, Xi; Lin, Qihang; Zhou, Dengyong. Optimistic knowledge gradient policy for optimal allocation in crowdsourcing. Journal of Machine Learning Research, 2013.

Cover, Thomas; Thomas, Joy. Elements of Information Theory. Wiley, 2006.

Dawid, A. P.; Skene, A. M. Maximum likelihood estimation of observer error-rates using the EM algorithm. Journal of the Royal Statistical Society, 1979.

Gao, Chao; Zhou, Dengyong. Minimax optimal convergence rates for estimating ground truth from crowdsourced labels. MSR Technical Report, 2013.

Krause, Andreas; Golovin, Daniel. Submodular function maximization. 2012.
Zhang, Yuchen; Chen, Xi; Zhou, Dengyong; Jordan, Michael. Spectral methods meet EM: A provably optimal algorithm for crowdsourcing. NIPS, 2014.

Zhou, D.; Liu, Q.; Platt, J. C.; Meek, C. Aggregating ordinal labels from crowds by minimax conditional entropy. Proceedings of ICML, 2014.