Table Of ContentProperties of the Nonparametric Maximum Likelihood ROC Model with a Monotonic
Likelihood Ratio
LucasTcheuko
UniversityofMarylandCollegePark
FrankSamuelson
U.S.FoodandDrugAdministration
10903NewHampshireAvenue
Building62,Room3102
SilverSpring,MD20993-0002
3
1
0
2
n
a
J
1
3
]
P
A
.
t
a
t
s
[
1
v
0
2
7
7
.
1
0
3
1
:
v
i
X
r
a
Emailaddress:lucast@umd.edu(LucasTcheuko)
PreprintsubmittedtoJournalofMathematicalPsychology February1,2013
Properties of the Nonparametric Maximum Likelihood ROC Model with a Monotonic
Likelihood Ratio
LucasTcheuko
UniversityofMarylandCollegePark
FrankSamuelson
U.S.FoodandDrugAdministration
10903NewHampshireAvenue
Building62,Room3102
SilverSpring,MD20993-0002
Abstract
Weexpectthatsomeobserversinperceptualsignaldetectionexperiments, suchasradiologists, willmakerationaldecisions,and
therefore ratings from those observers are expected to form a convex ROC curve. However, measured and published curves are
oftennotconvex. Thisarticleexaminestheconvexity-constrainednonparametricmaximumlikelihoodestimatoroftheROCcurve
givenbyLloyd(2002). LikeLloydweusethePoolAdjacentViolatorAlgorithm(PAVA)toconstructtheestimateoftheconvex
curve. WepresentadirectproofthatthisestimateisaconvexhulloftheempiricalROCcurve. Theestimateissimpletoconstruct
byhand,andfollowsthesuggestionsbyPesce,etal.(2010).
We examine the properties of this constrained nonparametric maximum likelihood estimator (NPMLE) under a large number
of experimental conditions. In particular we examine the behavior of the area under the curve which is often used as summary
metric of diagnostic performance. This constrained ROC estimator gives an area under the curve (AUC) estimate that is biased
highwithrespecttotheusualempiricalAUCestimate,butmaybelessbiasedwithrespecttotheunderlyingcontinuoustrueAUC
value. TheconstrainedROCestimatorhaslowervariancethantheusualempiricalone. Unlikepreviousauthorswhousedcomplex
bootstrapping to estimate the variance of the constrained NPMLE we demonstrate that standard unbiased estimators of variance
workwelltoestimatethevarianceoftheNPMLEAUC.
Keywords: constrainedestimates,empiricallikelihood,PAVAalgorithm,convexROCcurves
1. Introduction OperatingCharacteristic(ROC)curve(Green&Swets,1966).
An example of such a curve is shown in Figure 3. The area
In signal detection experiments, such as diagnostic medical
underanROCcurve(AUC)istheprobabilitythattheradiolo-
imagingstudies,animageispresentedtoanobserverwhogives
gistwillgiveahigherratingtoadiseasedpatientthanapatient
an ordered categorical rating or score that indicates his confi-
withoutdiseaseandisusedtoratetheperformanceofthediag-
denceinthepresenceofasignalordiseaseinthatimage. For
noses.
example, asubjectmaygivearatingofonethroughfive, with
fiveindicatingtheobserverisveryconfidentthattheimagecon-
An ROC curve that is not convex implies that a human ob-
tains a signal. The probability distribution of ratings given to
server, such as a radiologist, will give high ratings to some of
thesubjectswhohavediseasewecallF,andthedistributionof
the images with signal or disease, but also give such images
ratings given to the subjects who do not have disease we call
very low ratings, even lower than the ratings given to com-
H. ExampledensitiesareshowninFigure1. Typicallyweas-
pletelyimageswithnosignal. Thehigherratingsareexpected,
sumethattheseratingsarisefromsomecontinuousunderlying
butthelowerratingsareirrational. Thereisnoreasoningen-
perceptual distribution and are discretized by the observer for
eralthatahumanobserverwouldgivethelowestofhisratings
convenience.
toimagesthatreallycontainasignal. Logically, scientistsbe-
Fromthistaskwecancreatemeasuresofdiagnosticperfor-
lievethatinthelimitofbarelyperceptiblesignalstheworstthat
mance. Ateachratingcategorywecancalculatethetruepos-
human observers could do is guess, in other words, assign a
tiverate(TPR),alsocalledsensitivity,andthefalsepositiverate
lowratingtoasignalpresentorsignalabsentimagewithequal
(FPR) or 1-specificity. The plot of TPR versus FPR across all
probability. Theoreticallythelikelihoodratioofapairofdistri-
orderedcategoriesyieldsanempiricalestimateoftheReceiver
butions,dF/dH,thatmodelhumandetectionresponsesshould
be a monotonically increasing function of the rating given by
Emailaddress:lucast@umd.edu(LucasTcheuko) theobserver. ModelsofROCdatathathavemonotoniclikeli-
PreprintsubmittedtoJournalofMathematicalPsychology February1,2013
hood ratios are called “proper.” (Green & Swets, 1966; Dorf-
manetal.,1997;Metz&Pan,1999;Lloyd,2002)Pesceetal.
(2010)detailwhyweshouldexpecttoseeconvexROCcurves
undertheassumptionthatradiologistswouldperformtheirtask
at least as accurately as guessing, and they propose a method
forconstructingaconvexROCcurveestimate.
Inthispaperweexaminethenonparametricmaximumlikeli-
hoodestimate(NPMLE)oftheROCcurveundertheconstraint
thatitslikelihoodratiobemonotonicthatwasfirstdevelopedby
Lloyd(2002). LikeLim&Won(2012)weprovethatthisesti-
mateisequivalenttoasimpleconvexhullaroundtheusualem-
pirical or unconstrained nonparametric ROC curve MLE. Un-
likeLim&Won(2012)weprovethisdirectlyfromthelikeli-
hoodandconstraintsthereon.
This paper examines the properties of the constrained ROC
NPMLE under a large number of population parameters. In
particularweevaluatethepropertiesoftheassociatedAUCes-
timate, the area under the the constrained ROC NPMLE. We 4
0.
demonstratethatthisestimatehaslessvariancethananuncon-
strainedestimator,anditmayhavelessbiasfordiscretizeddata.
UnlikeLloyd(2002)andLim&Won(2012),whousednu-
merical bootstrap techniques to estimate the variance of pa- es 0.3
g
rameters of the constrained ROC curve MLE, we have found a
m
that standard estimators of AUC variance may be used to ob- of I No Disease, h
tlaikinelvihaorioadncAeUeCstiemstaitmesatoef.cOovnesrtraaiwneiddenroannpgaeroamfseitmriuclmataioxnimcounm- ction 0.2 Diseased, f
a
figurationswehavefoundthattheseestimatesarenearlyunbi- Fr
ased and very robust. In summary this constrained estimator
1
hasarationalconvexROCcurve,issimpletoconstruct,itmay 0.
havesmallbias,itsvarianceiseasilyestimated,andthereforeit
maybeusefulinobserverperformanceexperiments.
0
0.
2. Data 1 2 3 4 5 6 7 8 9 10
Rating Given to Image
2.1. Notation
Ourdataofinterestistypicallyorderedcategoricaldatacol-
Figure 1: This figure shows two discrete density functions of
lectedfromhumansobservingtwoclassesofobjects.
ordinal observations from two different classes of objects, im-
Let
ageswithdiseaseandimageswithnodisease.Thisdataisfrom
x ,...,x beanIIDrandomsampleofratingsorscoresfrom
1 N Chanetal.(2001)andisalsogiveninTable2. Theempirical
thesignal-absentornon-diseasedpopulationwithdensityh,let
cumulativedistributionsofthesedatasetscanbeplottedagainst
y ,...,y bearandomsamplefromthesignal-presentordis-
1 M eachothertoformtheROCcurveshowninFigure3.
eased population with density f. All ratings are assumed to
be discrete in k ordered categories. Each signal-present score
y , s = 1...M corresponds to a category index d , and each
s s
signal-absentscorex , r=1...Ncorrespondstoacategoryc .
r r
Therearem signalobservationsandn signal-absentobserva-
i i
tionsintheithcategory.Forsimplicityweassumedthescoresin
eachsignal-absentcategoryitobeallidenticallyequaltox¯ and
i
thescoresineachsignal-presentcategory jtobeallidentically
equaltoy¯ .
j
(cid:88)k (cid:88)k
LetN = n,andM = m.
i i
i=1 i=1
2.2. ExampleDataSets
AsexampleswehaveselecteddatasetsfromastudybyChan
etal.(2001). Inthestudy,radiologistsreviewedX-rayimages
3
ofmammographicmicrocalcifications(lesions)thatweredigi- By first minimizing −log(L) as a function of q ...,q we
i k
tizedatdifferentresolutions.Everyradiologistratedeachlesion obtain q = mi+ni . Plugging mi+ni forq in Equation 3 and
i N+Mwi N+Mwi i
fromonetotenateveryresolution. Higherratingsindicateda ignoringtheconstanttermswearethenledtominimize
radiologist’sgreaterconfidenceinthemalignancyofthelesion.
(cid:88)k
Radiologistsgavelowerratingstolesionsthattheybelievedto (cid:96)= (m +n)log(N+Mw)−m log(w)
i i i i i
be benign. The study measured the ability of radiologists to
i=1
differentiate malignant lesions from benign lesions as a func-
asafunctionofW =(w ...,w ).
tionoftheimageresolutionasmeasuredbytheareaunderthe 1 k
ROC curve (AUC). In the study the authors used a bi-normal
parametricmethodasinDorfman&Alf(1969)toestimatethe 4. Estimates
ROCcurvefromtheten-categorydata.
4.1. UnconstrainedEstimates
The distribution of scores from radiologist 5 evaluating the
As a function of w ,...,w , (cid:96) is minimized for w¯ = mi/M.
lesionsataresolution140micronsisgiveninFigure1. We plug w¯ back to q1 = mik+ni and p = qw to oibtainni/tNhe
Figure3ashowstheempiricalROCcurvederivedfromthis knownempiiricallikeliihoodNe+sMtiwmiatesq¯ i= ni aindi p¯ = mi.
data. Easilyvisibleinthefigurearethe10solidlinesegments i N i M
derivedfromthe10ratingcategories. Weobservethattheem- In what follows we will refer to w¯i = mnii//NM as the uncon-
pirical ROC curve is not convex for this given data set. Later strained maximum likelihood ratio (UMLR). We should point
inSection7weusethisdataandratingsfromradiologist4ata outthattheseestimatesofq¯i,p¯i,andw¯igivetheusualempirical
resolutionof35micronstodemonstratethemethodsdescribed ROCcurve(Green&Swets,1966;Bamber,1975). Theempir-
inthispaper. ical likelihood ratios wi are the slopes of the segments of the
In this experiment we have confidence that doctors will not linesoftheempiricalROCcurve,i.e. theslopesofthelinesin
systematicallygivelowerratingstoimagesthatcontainlesions, Figure3a.
norwillthey
4.2. ConstrainedEstimates
systematically give higher ratings to images with no le-
sions. Therefore we believe the true underlying population Believing that the true ROC curve is convex, we are led to
ROCcurveswillbeconvex,andanynonconvexityintheseem- (cid:88)k
minimize(cid:96)= (m +n)log(N+Mw)−m log(w)subjectto
pirical ROC curve estimates is merely due to statistical uncer- i i i i i
i=1
tainty from finite sample size. The average ROC curve across themonotoniclikelihoodratioconstraintw ≤ w ≤ ··· ≤ w .
1 2 k
all radiologists and experiments in Figure 7 of Chan et al. This is equivalent to requiring that the slopes of the segments
(2001)supportsthisassertion. oftheROCcurvebeincreasingfromrighttoleft. Let p¯ andq¯
i i
be defined as in Section 4.1. Let P = Q = 1, and P =
0 0 i
3. LikelihoodFunction (cid:88)k−i (cid:88)k−i
p¯ , Q = q¯ ,fori=1...k.
j i j
The usual likelihood (Owen, 2001, 1988) of obtaining the j=1 j=1
Asafunctionofw ,...,w ,(cid:96)isasumofkindependentfunc-
categorizedratingsforbothsignalpresentandabsentobserva- 1 k
tions(cid:96) ,...,(cid:96) ;where(cid:96) =(m +n)log(N+Mw)−m log(w)
tionsfromourexperimentsgivenourestimatesofF andHis 1 k i i i i i i
isafunctionwithasingleminimumw¯ . AsshowninFigure2,
i
L ∝(cid:89)k pmi iqnii (1) (cid:96)idWecitrheoausetslouspsotofwg¯einaenrdaliintycrweaesceosnasfitdeerwrtawrdo.consecutivefunc-
i=1 tions(cid:96) (w)and(cid:96) (w).InFigure2weplot(cid:96) and(cid:96) asfunctions
1 2 1 2
where pi = dF(y¯i) = F(y¯i)− F(y¯i−);qi = dH(x¯i) = H(x¯i)− of w1andw2 respectively. The graph in Figure 2a shows the
H(x¯i−),andy¯i−isthelargestpossiblevalueofylessthany¯i.As UMLRw¯1of(cid:96)1lessthantheUMLRw¯2of(cid:96)2,leadingtoacon-
inthepowerbiasedmodelQin&Lawless(1994),Vardi(1985), vexROCcurve. Onthecontrary,inFigure2bwehavew¯1 >w¯2
and Vardi (1982), we set w = p/q. The likelihood function leadingtoanonconvexROC.
becomes Let w¯1 > w¯2 as in Figure 2b. Our goal is to find (w˜1,w˜2)
(cid:89)k suchthat(cid:96) (w˜ )+(cid:96) (w˜ ) = min (cid:96) (w )+(cid:96) (w ). Inthiscase,
L ∝ qni+miwmi. (2) 1 1 2 2 w1≤w2 1 1 2 2
i i we need to change the ordering of w and w while keeping
i=1 1 2
(cid:96) (w )+(cid:96) (w )assmallaspossible.
We use the Lagrange multipliers as in Anderson (1972) to 1 1 2 2
maximize the likelihood function. Maximizing L subject to
Let A(w ,(cid:96) (w ))andB(w ,(cid:96) (w )) be two arbitrary points
(cid:88)k (cid:88)k 1 1 1 2 2 2
qi =1and qiwi =1isequivalenttominimizing asinFigure2bsuchthatw1 <w2. Keepinginmindthatw¯1and
w¯ aretherespectiveglobalminimaof(cid:96) and(cid:96) weobtainthe
i=1 i=1 2 1 2
followinginequalities.
(cid:88)k
−log(L)= −(ni+mi)log(qi)−milog(wi)+C (3) Ifw1 <w¯2 then (cid:96)1(w¯2)+(cid:96)2(w¯2)<(cid:96)1(w1)+(cid:96)2(w2)
i=1 Ifw¯ <w then (cid:96) (w¯ )+(cid:96) (w¯ )<(cid:96) (w )+(cid:96) (w )
1 2 1 1 2 1 1 1 2 2
subjecttothesameconstraints,whereCisaconstant.
4
If w¯ < w < w < w¯ , (cid:96) decreases while (cid:96) increases on 2. Stop if ROC curve is convex; meaning, w˜ ≤ ... ≤ w˜ .
2 1 2 1 1 2 1 k˜
theinterval[w¯1,w¯2],leadingtothefollowinginequalities. Otherwisethereexistsiin1:k˜−1suchthatw˜i >w˜i+1.If
so,continue.
3. Thetwocategoriesiandi+1arecombinedintocategory
(cid:96) (w )<(cid:96) (w ) ⇒ (cid:96) (w )+(cid:96) (w ) < (cid:96) (w )+(cid:96) (w )
2 1 2 2 and 1 1 2 1 1 1 2 2 4. iA.fm˜teir←comm˜ib+inmi˜nig+1t,hne˜it←won˜cia+ten˜gio+r1i;ews˜ii←andmn˜˜iiMiN+. 1 into cate-
(cid:96)1(w2)<(cid:96)1(w1) ⇒ (cid:96)1(w2)+(cid:96)2(w2) < (cid:96)1(w1)+(cid:96)2(w2). gory i, the following commands describe how we elimi-
nate category i+1, reorganize the remaining categories,
andreassignthescorecategoryindexes.
The above inequalities show that the function (cid:96)1 +(cid:96)2 is never Fors=1...M, ifd˜s >i thend˜s ←d˜s−1;
at its minimum for two distinct values of w1andw2. We de- forr=1...N, ifc˜r >ithenc˜r ←c˜r−1;
ducethat(w˜1,w˜2)satisfiesw˜1 = w˜2.Settingw1 = w2 = wwe for j=i+1...k˜−1, n˜j ←n˜j+1; m˜j ←m˜j+1; w˜j ←w˜j+1;
minimizethefollowingexpression: k˜ ←k˜−1.
(cid:96) (w)+(cid:96) (w) = (m +n )log(N+Mw)−m log(w)+
1 2 1 1 1 5. Gotostep2. ProceeduntilROCcurveisconvex.
(m +n )log(N+Mw)−m log(w)
2 2 2
= (m +m +n +n )log(N+Mw)− Appliedtothedata,thePAVAcreatesanewdatasetwithtotal
1 2 1 2 number of categories k˜ which is less than or equal to k. We
(m +m )log(w).
1 2 respectivelydenotethenumberofsignal-absentscoresineach
newi-category,thenumberofsignal-presentscoresineachnew
Taking the derivative and setting equal to zero we obtain w˜ = i-category,thenewsignal-presentcategoryindex,andthenew
w˜ = w˜ = (m1+m2)N. Note that for w = w we obtain signal-absent category index of the new data set by n˜i, m˜i, d˜s,
(cid:96)∗1(w) = (cid:96)2 (w)+(n1(cid:96)+n(2w)M) whose graph is sim1ilar to t2hat of either andc˜r.
1 2
The dashed line in Figure 3b is obtained from applying the
(cid:96) or(cid:96) from Figure 2. The above proof and conclusions ap-
1 2
PAVA to the ROC curve in Figure 3a. In Figure 3a we have
ply to any two adjacent categories in a ROC curve where the
k=10. Let p, q, P and Q be defined as in Section 4.2. In
unconstrainedlikelihoodratiosarenotincreasing. i i i i
the first step there is a violation for i = 3, i.e w¯ is greater
The above demonstrates analytically that the obtained esti- 3
thanw¯ . ThetwosegmentsS andS arereplacedbyasingle
mate is the constrained maximum likelihood estimate of the 4 3 4
segmentS(cid:48) from(Q ,P )to(Q ,P )andw˜ issetsetequalto
nonN-optaeratmhaettrtihcemcoodnestlr.ained estimators w˜ = w˜1 = w˜2 yield a (m3M+m4/n3N+3n4). Thisl2ead2stoano4ther4violatio3nastheneww˜3 is
lessthanw¯ .WethenreplaceS andS(cid:48) byasinglesegmentS(cid:48)
solutionthatisequivalenttoaddingthetwoadjacentcategories 2 2 3 2
intoasinglecategorywithcountsm˜ =m1+m2andn˜ =n1+n2. from(Q1,P1)to(Q4,P4)andw˜2issetequalto ((mn22++mn33++nm44)M)N.
Tofindallconstrainedestimatesw˜ inasequenceof(cid:96),weuse In the next step the PAVA encounters another violation be-
i i
analgorithmsimilartoLloyd(2002). tweenthesegmentsS6andS7;thesegmentsS6andS7 arere-
placed by a single segment S(cid:48) . In the final step, the PAVA
6
encounters another violation as the slope of S(cid:48) is greater than
5. PAVA:PoolAdjacentViolatorAlgorithm 6
theslopeofS ;wethenreplaceS(cid:48) andS byasinglesegment
8 6 8
The PAVA, a simple iterative algorithm, is a standard non- andtheROCisconvex.
parametricmethodusedtoproducepointestimatesforafunc- An intuitive reason for implementing the algorithm can be
tion known to be monotone (Barlow et al., 1972; Robertson summarizedasthis:
etal.,1988;Best&Chakravarti,1990). Given that observers will not perform worse than guessing
It replaces each stretch of monotonicity-violating observa- we assume that any non-convexity in a measured empirical
tionswiththeweightedaverageoftheoriginalfunctionvalues ROC curve is due to a statistical variation from insufficient
inthatstretch. counts in a particular rating category. We therefore combine
Inordertoapplythetheoreticalapproacheddescribedabove, adjacent ordinal categories until the number of counts in the
wedevelopedanR(RDevelopmentCoreTeam,2009)version combinedcategoriesaresufficienttogenerateaplausiblecon-
ofthePAVAalgoritmsimilartothatofLloyd(2002).AsinSec- vexROCcurve.
tion2,weassumethedatahasbeendiscretizedintokdifferent This PAVA algorithm is equivalent to the algorithm given
categories. Keeping in mind that each signal-present score y in Section 1.2 of Robertson et al. (1988). Specifically using
s
corresponds to a category d and each signal-absent score x Robertson’s w, g, W, and G notation let w(x¯) = ni, g(x¯) =
s r i N i
corresponds to a category cr as assumed in Section 2, as the (cid:88)j (cid:88)j
PAVAcombinescategoriesitalsoreassignseachdatascoretoa mMi/nNi, Wj = wj, Gj = g(x¯i)w(x¯i).ThentheROCcurve,
newcategory. i=1 i=1
aplot of P = (W ,G ), j = 0,1,...,kwithP = (0,0)is the
WesequentiallydescribethePAVAasfollows. j j j 0
cumulativesumdiagram(CSD)forthefunctiongwithweight
1. k˜ ←k; foralli,w˜i ←w¯i = mniiMN;m˜i ←mi;n˜i ←ni;forall w.AdaptingRobertson’sapproach,findingthebestconvexplot
s,andforallr,d˜ ←d ; c˜ ←c fromthenonconvexonereducestofindingthe greatestconvex
s s r r
5
minorant(GCM)oftheCSDintheinterval[0,Z ]forthefunc- 6.1. SimulationParameters
k
tiongwithweightz.Thismethodwillleadtothesameestimate
We simulated data from normal distributions, D=N(µ,1),
obtainedinSection4.2.
andfromuniformdistributions,D=U(0,m),whichareuniform
Toverifythattheabovesolutionreallyisthemaximumlike-
in probability between 0 and m. For the uniform distributions
lihoodestimateundertheconstraintofamonotoniclikelihood
U, the non-diseased or signal-absent sample is drawn from
ratio,wewroteasoftwarefunctionwhoseinputisacombined
U(0,1), and the signal-present or diseased sample is drawn
vectorof p andq andwhoseoutputisthelikelihoodfunction
i i from U(0,m) where m ≥ 1. Both samples together form a
giveninExpression1iftheROCcurveisconvex. IftheROC
combinedsample. Inordertohavethedesired AUC valuefor
curveisnotconvex,theoutputisavaluelessthanzero.Wethen
theuniformdistribution,itfollowsfromAUC = P(Y > X)that
usedanumericaloptimizer(“optim”fromthepackageR(RDe-
AUC = 1− 1 fromwhichwederivem = 1 .Thispair
velopmentCoreTeam,2009))tomaximizethefunctionoverall 2m 2(1−AUC)
ofuniformdistributionsyieldsthemostasymmetricROCcurve
inputs p,q onmanyrandomsimulateddatasetswithrandom
i i
thatstillhasamonotonicallyincreasinglikelihoodratio.
initial values. The maximum likelihood function given by the
ForthenormaldistributionN,thediseasedsampleisdrawn
optimizer was usually equal to that obtained from the PAVA.
from N(µ,1) and the non-diseased sample is drawn from
Allothertimestheoptimizerfailedtoconvergeorconvergedto
N(0,1). To achieve the desired AUC, we used P(Y > X) =
a local maximum with likelihood less than that obtained from (cid:32) (cid:33)
the PAVA. This verified numerically that the PAVA gives the Φ √µY−µX , where Φ is the cumulative normal distribution
constrainedmaximumlikelihood. σ2X+σ2Y √
function, µ = µ, and µ = 0. We set µ = 2Φ−1(AUC).
Y X
ThismodelgivesacontinoussymmetricconvexROCcurve.
6. BiasandVarianceEstimationofAUC
6.2. BiasStudy
The Wilcoxon-Mann-Whitney statistic is equivalent to the
area under the ROC curve (AUC) and is frequently used as a We first considered a vector of AUC values with elements
summarymeasureofseparabilityoftheobservationsthatcon- ranging from .6 to .99 with step equal to .01. For each AUC
tain a signal from observations that do not, or separability of valuefromtheabovevector,wesampleddatafromeithernor-
imagesthathavediseasefromthosewhodonot. Theareaun- mal distributions, D=N, or uniform distributions, D=U. We
derthepopulationROCcurveisequaltotheprobabilitythata simulated10,000independentrandomcombinedsamplesfrom
rating of an image with a signal is higher than a signal-absent D.Eachcombinedrandomsampleconsistedofasampleofsize
rating, AUC= P(Y > X) (Bamber, 1975). The area under the 55 drawn from the non-diseased distribution and a sample of
unconstrainedempiricalmaximum-likelihoodROCcurveesti- size110drawnfromthenon-diseasedpopulation. Thesesam-
mateis ple size are typical in many studies in medical literature. We
A(cid:91)UC = 1 (cid:88)N (cid:88)M I (4) then discretize each sample into seven categories. For each
NM rs discretized combined sample we computed the usual uncon-
r=1 s=1
strained AUC estimate, then applied the PAVA algorithm and
Weassumethateachpair(r,s)correspondstoapair(cr,ds)such computedtheconstrainedAUCestimate.
thatxrbelongstothecategorycrandysbelongstothecategory In Figure 4 we plotted the average constrained and uncon-
ds. strained AUC estimates from the normally distributed simula-
Irs = 0121 iiifffcccrrr =><dddsss (5) tstirtouraneinvueendrdsiAusscUrteChteieztsertduimepaoAtpUeuCilsa.taiWolwneaAfiyrUssbtCin.aosTteheditslhorawetdtwuhceittiahovnreerosapfgeiecntufontorcmothnae--
tionandAUCestimatesduetodiscretizationiswellknown.We
The area under the constrained empirical maximum-
alsonotethattheaverageconstrainedAUCestimateisalways
likelihoodROCcurveestimatehasthesameformandis
greaterthantheaverageunconstrainedAUCestimate. Thisre-
A(cid:93)UC = 1 (cid:88)N (cid:88)M I(cid:48) (6) scuulrtviesicnonSseicsttieonntw4.i2th. tThoeccoonnssttrruuccttiothneocfothnestrcaoinnsetdraRinOedCReOstCi-
NM rs
r=1 s=1 matewecombinecategoriesinawaythatalwaysincreasesthe
areaunderthecurve. WithrespecttothetruepopulationAUC,
where
I(cid:48) = 012 iiffcc˜˜r =>dd˜˜s (7) tthimeeasvegrraegaetecro,ndsetpraeinndeidngAUonCtehsetivmaaluteeisofsoAmUeCti.mDesepleesnsd,isnogmoen-
rs 1 ifc˜r <d˜s AUC,samplesize,anddiscretenessofdata,theconstrainedes-
r s timatemaybelessbiasedthantheunconstrainedestimate.
In this Section we use simulated data to measure how the
constrainedAUCestimatediffersfromtheunconstrainedAUC
6.3. VarianceEstimationoftheAUC
estimateandatruecontinuousAUCpopulationvalue. Wealso
examinewaystoestimatevarianceoftheconstrainedAUCes- In this Section we demonstrate empirically that a standard
timate. varianceestimatoroftheempiricalareaunderthecurve(AUC)
6
ℓ2 ℓ2
ℓ1 ℓ1
d (ℓ)
o
o
h
eli BB
k
Li
g
o
L
-
AA
w1=w~1 w2=w~2 w2 w~ w1
Likelihood Ratio Likelihood Ratio
(a)LikelihoodsofaConvexEstimate (b)LikelihoodsofaNon-convexEstimate
Figure2: Thesearegraphsofthenegativeloglikelihoodcomponents(cid:96)asafunctionofthelikelihoodratiosw.Theunconstrained
minimumof(cid:96) isw¯ . InFigure2atheunconstrainedminimumw¯ ofthe−loglikelihoodfunction(cid:96) islessthantheunconstrained
i i 1 1
minimum w¯ of (cid:96) ; therefore, the likelihood function (cid:96) = (cid:96) + (cid:96) yields a convex ROC. In Figure 2b w¯ > w¯ meaning the
2 2 1 2 1 2
corresponding-loglikelihoodratioyieldsanon-convexROCcurve.
alsoprovidesagoodestimatetheAUCoftheaboveconvexity-
restricted ROC curve estimate. Our method of variance esti-
mation can be considered as a two-way analysis of variance
Figure4: Thisfigureshowsthefractionalbias(themeanesti- (ANOVA) with one observation per cell as in Chapter 7 of
matedvaluedividedbythetruevalue)ofthetwononparamet- Sheffe´ (2009). We model I where each cell consists of 1 if
rs
ricestimatorsofAUCinthispaperasafunctionofAUC.The thediseasedscore sisgreaterthanthenon-diseasedscorer,of
meanoftheusualunconstrainednonparametricAUCestimate 1 ifbothscoresareequal,and0otherwise.Seeequation5.The
2
(equation4)isalwayslessthanthetrueunderlyingsimulation model is I = µ+a +b +(cid:15) for r = 1...N;s = 1...M, or
rs r s rs
AUC value because the observed values are discretized. The (cid:88)N (cid:88)M
constrained AUC estimate (equation 6) is always greater than equivalentlyforAUC,Aˆ = N1M (µ+ar+bs+εrs)where
the unconstrained AUC estimate. The constrained AUC esti- r=1 s=1
µrepresentstheoverallmeanAUC.Therandomvariablea has
matecanbeconsideredlessbiasedthantheusualunconstrained r
meanzeroandvarianceσ reflectingthevariabilitybetweenthe
empriricalAUCestimateformostvaluesoftheAUC. a
diseasetests. b hasmean0andvarianceσ reflectingthevari-
s b
ability between the non-disease tests. ε has mean zero and
rs
e AUC)1.02 UCnocnosntrsatirnaeinded nvaorni-adnicseeaσseεtreesptsr.esTehnetitnogtatlhveariinatnecraecotifoAnˆibsetween disease and
u
ns/(Tr01 Var(Aˆ)= σ2a + σbb + σ2ε (8)
atio1. N M NM
ul
m Therespectiveestimatesσˆ , σˆ , andσˆ ofσ , σ , andσ are
Si a b ε a b ε
s 0 computed as in equation 7.4.10 of Sheffe´ (2009). For simula-
s0
cro1. tion purposes, we chose the distribution D=U or D=N as in
A
e Section 6.2 and then chose the parameters of the distribution
mat sothatAUC=.6,AUC=.84,andAUC=.94whichrespectively
Esti0.99 correspond to m = 1.25,m = 3.125, andm = 25/3 for D=U
C andtoµ = .358, µ = 1.406, andµ = 2.199forD=N. There
U
n A are often more non-disease patients available than disease pa-
ea98 tients. With that in mind, we assumed in the simulation the
M0.
0.6 0.7 0.8 0.9 1.0 ratioofnon-diseasesamplesizetodiseasedsamplesizetotake
True AUC values r = 1,r = 3, and r = 9. This means for each com-
a a a
binedsamplesetconsistsofMobservationstreatedasdiseased
and N = r × M treatedasnon-diseased. Wediscretizedeach
a
sample into k categories, where k takes values k=3, k=7, and
7
l
l(Q,P)
1 1
S2
y
ensitivit (Q4,P4)lS4 l(SQ33,lP(Q3)2,P2)
S
l
l
l(Q7,P7)
l
l
l
1−Specificity
(a)NPMLEROCCurve
l
l
l
l l
l
l
l
l
l
l
1−Specificity
(b)ConstrainedNPMLEROCCurve
Figure3: InFigure3aweplottedtheROCcurveofthedatafromChanetal.(2001)describedinSection2.2andgiveninTable2.
TheROCcurveisthesetallpossiblemeasurablespecificityandsensitivitypairs.Weobtainedtheconstrainedestimate(dottedline)
inFigure3bbyapplyingthePAVAtothedatausedinFigure3a. InFigure3awehavek=10linesegments. P andQ aredefined
i i
inSection4.2. SegmentsS ,S ,andS formaviolationbecausetheslopew¯ ofsegmentS issolow. Thereforewereplacethose
2 3 4 4 4
3segmentswithasinglesegmentS(cid:48) withslopew˜ = (m +m +m )N/(n +n +n )M AlsoseeTable2. CategoriesS ,S ,and
2 2 2 3 4 2 3 4 6 7
S alsoformaviolationandarereplacedwithasinglecategory.
8
8
k=16. For each quadruple (D,M,r ,k), we generated 10,000
a
independentsetsofcombinedrandomsamplesfromthechosen
distribution.
For each quadruple (D,M,r ,k), we computed the uncon-
a
strained AUC Aˆj and the estimate of variance (cid:98)Vj using Ex-
u u
pression 8. We then applied the PAVA algorithm, computed
the constrained AUC A(cid:98)j, computed the estimate of the con-
c
strained AUC variance (cid:98)Vj using Expression 8. We then com-
c
puted an approximate confidence interval of the AUC, Icj =
(cid:34) (cid:113) (cid:113) (cid:35) 1−α
A(cid:98)j−z (cid:98)Vj,A(cid:98)j+z (cid:98)Vj , forα = .05, where z is the
c α/2 c c α/2 c α
standardnormaldeviatewithprobabilityα. Figure 5: This figure plots the square root of the mean of the
variance estimates of AUC as a function of the standard devi-
Foreachcombinedsamplesetof104simulations,weplotted
ationsoftheAUCestimatesofoursimulations. Dataforboth
the square root of the mean of all the unconstrained variance
the usual unconstrained empirical AUC estimate and the con-
estimates(cid:98)Vj versusthestandarddeviationacrossallthesimu-
u strained AUC estimate are displayed. Each point represents
latedunconstrainedAUC A(cid:98)uj inFigure5. Thesedatapointslie 104 simulated samples from one of our simulation configura-
on the diagonal of the plot indicating that our estimate of the
tions. The true standard deviation of AUC of the constrained
variance of the empirical AUC estimate is unbiased (Gallas &
ROC curves are lower on average than the standard deviation
etal.,2009).
of the AUC of the unconstrained ROC curves, and the mean
estimateofstandarddeviationisreducedbyalmostexactlythe
Also in Figure 5 we plotted the square root of the mean of
the constrained variance estimates (cid:98)Vj versus the standard de- sameamount.
c
viationacrossallthesimulatedconstrainedAUC A(cid:98)j. Wenote
c
fromtheseplotsthatthetruestandarddeviationofAUCofthe l Unconstrained lll
constrainedROCcurvesareloweronaveragethanthestandard 05 Constrained ll l
deviationoftheAUCoftheunconstrainedROCcurves,andthe 0.
mean estimate of the standard deviation is reduced by almost ll
exactlythesameamount. ate lllll
m4 l
eomfFtphrioerimccaotlnhAsetUrfiaCginuevrdeareiitamnispciearpiacplasaloreevnstetirmtyhaawtteeoolulfreAssttUaimnCda,ataerdnsdtehsiettimivsaarrtoioabrnucoseft on Esti0.0 llllllllllllllllllllllll
asasiigbmdaiTeliuionttlihsateteetsist,oahtwbnetho.hlivaisecrgahesesastreriaemrntitagohteneedofwrfcaeopcrnateifiropadnomesrnetoctiefnertsiThnaoetbevsrleievmra1wlutIlhhacitejcihoc.nowFsvreefoarvmalalgirTneiegapdbroloteuhbt1e-- Standard Deviati0.020.03 llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllll
weobservethattheoverallperformanceofthe1−eαstimateofvari- lllllllllll
anceisaccurateexceptinafewcaseswherethefollowingcon- lllllllllllllll
danitdioanslaergxeisntusmimbuelrtaonfecoauteslgyo:riaesl.arge AUC, small sample size 0.01 lllllllll
0.01 0.02 0.03 0.04 0.05
Note that there is nothing special about the AUC variance Standard Deviation Across Simulated AUC
estimator that we used. Other variance estimates (Bamber,
1975; Campbell, 1994) also make useful measures the vari-
ance of the constrained AUC estimate. These simple analytic
variance estimators agree well with bootstrap estimation tech-
niques (Efron & Tibshirani, 1993; Lloyd, 2002; Lim & Won,
2012). This includes the simple AUC variance approximation
A(cid:91)UC(1−A(cid:91)UC)(N−1 + M−1)/4 (Wagner & et al., 1997). This
form makes clear the binomial-like behavior of AUC and the
expectedreductioninvariancecorrespondingtotheincreasein
AUC.
9
7. Application likelihoodROCcurveestimate.Holdingthetotalnumberofob-
servationsconstant,werecalculatethenumberofobservations
InSection5weshowedhowtoobtaintheconstrainedROC in each category implied by the constrained estimate. This is
curve estimate using data from Chan et al. (2001). Figure 3a equivalenttocombiningadjacentcategoriesthatdonothavein-
shows the application of the PAVA to the ROC curve of radi- creasinglikelihoodratios,asinTable2. Nextweusethesenew
ologist5describedinSection2.2. Tenordinalcategorieswere combinedobservationnumberstocalculatetheempiricalAUC
reduced to 6 categories to construct a convex ROC curve. In estimate in the usual manner. This is the constrained NPMLE
this section we use the same data to compute the constrained ofAUC.Wealsocanusethosenewcombinedobservationsin
and unconstrained estimates of AUC as well as the estimates standardnonparametricestimatorsofAUCvariancetoestimate
ofvarianceofthoseAUCvalues. Alltheseestimatesaregiven thevarianceoftheconstrainedNPMLEofAUC.Thisestima-
inTable3. Alsogivenaretheresultsforradiologist4reading tor is a useful option for nonparametric analysis of data from
imagesatadifferentresolution. studiesofobserversignal-detectionperformancewheretheob-
serversareassumedtogenerateconvexROCcurvesrationally.
Unconstrained Constrained
AUC Variance AUC Variance
9. Acknowledgements
Radiologist4 .7386 .002605 .7618 .002471
Radiologist5 .6622 .002886 .68472 .002782
TheauthorsaregratefultoDr. Chanandtheotherauthorsof
Chanetal. 2001fortheuseoftheirdatainthispaper.
Table3: Thistablegivestheconstrainedandunconstrainedes-
timates of AUC and the variance AUC given given the ratings References
of radiologist 5 evaluating lesions at a resolution of 140 mi-
crons. The same estimates are given for radiologist number 4 Anderson,J.A.(1972). Separatesamplelogisticdiscrimination. Biometrika,
59,19–35.
evaluating the lesions at a resolution of 35 microns. The esti-
Bamber,D.(1975). Theareaabovetheordinaldominancegraphandthearea
matesofAUCwerecalculatedusingExpressions4and6,and belowthereceiveroperatingcharacteristicgraph. Journalofmathematical
thevariancewasestimatedusingExpression8. psychology,12,387–415.
Barlow, R.E., Bartholomew, D.J., Bremner, J.M., &Brunk, H.D.(1972).
StatisticalInferenceUnderOrderRestrictions:TheTheoryandApplication
AsexpectedtheconstrainedAUCestimatesaregreaterthan ofIsotonicRegression.NewYork:JohnWiley&Sons.
the unconstrained AUC estimates. Correspondingly the vari- Best,M.J.,&Chakravarti,N.(1990).Activesetalgorithmsforisotonicregres-
ance estimates of the constrained AUC estimates are smaller sion;aunifyingframework.MathematicalProgramming,47,425–439.
Campbell,G.(1994).Advancesinstatisticalmethodologyfortheevaluationof
thantheunconstrainedvarianceestimates.
diagnosticandlaboratorytests.StatisticsinMedicine,13,499–508.
Chan,H.P.,Helvie,M.A.,Petrick,N.,Sahiner,B.,Adler,D.D.,Paramagul,C.,
Roubidoux,M.A.,Blane,C.E.,Joynt,L.K.,Wilson,T.E.,Hadjiiski,L.M.,
8. Summary &Goodsitt,M.M.(2001). Digitalmammography: Observerperformance
studyoftheeffectsofpixelsizeonthecharacterizationofmalignantand
Insomestudiesofobserversignal-detectionperformancewe benignmicrocalcifications.AcademicRadiololgy,8,454–466.
mayassumethatdecisionsmadebyhumanobserversareratio- Dorfman,D.D.,&Alf,E.(1969). Maximumlikelihoodestimationofparam-
etersofsignaldetectiontheoryanddeterminationofconfidenceintervals–
nal,implyingtheconvexityofapopulationROCcurve. Inthat
ratingmethoddata.JournalofMathematicalPsychology,6,487.
scenario we may want to analyze the data under that assump- Dorfman,D.D.,Berbaum,K.S.,Metz,C.E.,Lenth,R.V.,Hanley,J.A.,&
tion. Dagga,H.A.(1997). Properreceiveroperatingcharacteristicanalysis:The
bigammamodel.AcademicRadiology,4,138–149.
Inthispaperweexaminedonesuchmethod,theconstrained
Efron, B., & Tibshirani, R. J. (1993). Introduction to the Bootstrap. Boca
nonparametricmaximumlikelihoodROCcurveestimategiven Raton:Chapman&Hall/CRC.
by Lloyd (2002). We proved directly that this this estimate is Gallas,B.D.,&etal.(2009). Aframeworkforrandom-effectsROCanalysis:
a simple convex hull around the usual empirical ROC curve. Biaseswiththebootstrapandothervarianceestimators. Comm.inStat.-
TheoryandMethods,38,2586–2603.
We examined the properties of the area under this estimate
Green, D. M., & Swets, J. A. (1966). Signal Detection Theory and Psy-
and found that it has lower variance than the usual empirical
chophysics.NewYork:JohnWiley&Sons.
AUCestimate,anditmayalsohavelowerbiaswithrespectto Lim,J.,&Won,J.(2012).Rocconvexhullandnonparametricmaximumlike-
a continous nondiscretized ROC curve. Our simulations indi- lihoodestimation.Mach.Learn,88.
Lloyd,C.J.(2002). EstimationofaconvexROCcurve. StatisticsandProba-
catethatstandardestimatorsofvarianceoftheMann-Whitney-
bilityLetters,59,99–111.
Wilcoxonstatistic(AUC)variancemaybeusedtoobtainnearly Metz, C. E., & Pan, X. (1999). Proper binormal ROC curves: theory and
unbiasedandveryrobustvarianceestimatesofconstrainednon- maximum-likelihoodestimation. JournalofMathematicalPsychology,43,
parametricmaximumlikelihoodAUCestimate. Bootstrapping 1–33.
Owen,A.B.(1988).Empiricallikelihoodratioconfidenceintervalsforasingle
oftheconvexityalgorithm(PAVA)isnotrequired.
functional.Biomaker,75,237–249.
TheconstrainedROCcurveestimateissimpletoconstructin Owen, A. B. (2001). Empirical Likelihood: Monographs on Statistics and
amannersimilartothatsuggestedinPesceetal.(2010). First AppliedProbability82.BocaRaton:CRC.
a regular empirical ROC curve is constructed. Then the low- Pesce,L.L.,Metz,C.E.,&Berbaum,K.S.(2010). OntheconvexityofROC
curvesestimatedfromradiologicaltestresults.Acad.Rad.,17,960–968.
estconvexenvelopethatwillenclosethisROCcurveisdrawn.
Qin, J., & Lawless, J. (1994). Empirical likelihood and general estimating
Thisenvelopeisthenewconstrainednonparametricmaximum equations.Ann.Statist.,22,300–325.
10