ebook img

Multiclass MinMax Rank Aggregation PDF

0.13 MB·
by  Pan Li
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Multiclass MinMax Rank Aggregation

Multiclass MinMax Rank Aggregation Pan Li and Olgica Milenkovic ECE Department, University of Illinois at Urbana-Champaign Email: [email protected], [email protected] Abstract—We introduce a new family of minmax rank aggre- distances between π and the elements of Σk) or a minimum gationproblemsundertwodistancemeasures,theKendallτ and distance (which equals the smallest distance between π and theSpearmanfootrule.AstheproblemsareNP-hard,weproceed an element in Σk). The above described MinMax problem to describe a number of constant-approximation algorithms for is motivated by a number of applications in which classes of solving them. We conclude with illustrative applications of the aggregation methods on the Mallows model and genomic data. rankings arise due to different ranking criteria or properties 7 of the ranking entities (social platforms) or due to prior 1 I. INTRODUCTION 0 knowledgeofdifferentsimilaritydegreesingroupsofrankings Rankings, a special form of ordinal data, have received 2 (genome evolution). The minmax criteria is typically used significant attention in the machine learning community as when trying to ensure that the aggregate violates each vote n theyariseina numberofimportantapplicationdomains,such a (class of votes) to roughly the same extent. asrecommendersystems,socialvotingandproductplacement J WestartouranalysiswiththeMinMaxproblemwithC =1 platforms. Of particular importance are rankings of the form 8 andunderthemedianandminimumdistance,andthenproceed 2 of linear orders (permutations) and partial rankings (weak to study the problemfor the case of arbitraryvaluesof C and orders), which are frequently obtained through conversion m , k = 1,...,C. For both the case of the Kendall τ as ] fromratings.Oneof themainprocessingtasksforrankingsis k G well as the Spearman footrule in the median and minimum rank aggregation,which often involvesevaluating the median L distance setting, the MinMax problems may be shown to of a set of permutations or partial rankings under a suitably . be NP-hard by using the corresponding results of [3]. In s chosen distance function [2], [4], [7], [9], [11], [12], [16]. particular, the work in [3] outlines a general framework for c The median rank aggregation problem under the Kendall τ [ provingNP-hardnessresults for the median, single class min- distance was introduced by Kemeny [11], and was proved to max-aggregation problem under different ranking distances. 1 beNP-hardbyBartholdietal.[4].Anumberofapproximation v Nevertheless,onlyahandfulofapproximationalgorithmswere algorithmsfortheproblemhavebeendescribedin [2],mostly 5 proposedevenforthisbasicmin-max-aggregationform:Tothe pertainingtopermutations;acorrespondingPTAS(polynomial 0 best of our knowledge, the only provable algorithm for the 3 timeapproximationscheme)wasproposedin[12].Inthecon- single class MinMax under the minimum distance measure 8 textofpartialrankingaggregation,knownsolutionsincludethe was providedin [3]. The algorithmtakes the form of the well 0 results of [1], [10]. Median aggregation under other distance studied ”pick-a-permutation” method, and tends to perform . 1 functions has received less attention, one notable exception poorly in practice. 0 being the Spearman rank aggregation problem [7], which The main results of our work include families of con- 7 is known to provide a constant approximation for Kendall 1 stant approximation algorithm for the new, general family τ aggregation using a polynomial time algorithm based on : of multiclass MinMax problems, both under the median and v weighted bipartite matching [9]. minimum class distance, evaluated using the Kendall τ and Xi We propose to investigate a broad new family of rank Spearman footrule. Furthermore, we illustrate the use of the aggregation problems in which the median is replaced by a r new aggregation paradigm on the problem of finding an a minmax type of function and where the rankingsare grouped ancestral genome arrangement for mitochondrial DNA under in classes. More precisely, assume that there are C > 1 the tandem duplication model for genomes [6]. differentclassesofrankingsandletΣk ={σk,σk,...,σk }be 1 2 mk the set of mk =|Σk| rankings belonging to the class labeled II. MATHEMATICALPRELIMINARIES by k ∈ [C]. Our minmax rank aggregation problem may Let S denote a set of n elements, which without loss of be succinctly described as follows: Output a ranking π that generality we set to [n] ≡ {1,2,...,n}. A ranking is an agreesintheminmaxsensewiththerankingsbelongingtothe ordering of a subset of elements Q of [n] according to a different classes. Rigorously, we seek to solve the following predefinedrule. When Q=[n], the resulting order is referred optimization problem: to as a permutation. When the rankings include ties, they are MinMax: minmaxλ d(π,Σk), referred to as partial rankings [10]. k π k More precisely, a permutation is a bijection σ : [n]→[n], where λ > 0 represent the costs of violating the agreement andthesetofpermutationsover[n]formsthesymmetricgroup k with rankings in class k. In the above formulation, d(π,Σ) ofordern!,denotedbyS .Foranyσ ∈S andx∈[n],σ(x) n n stands for a distance between a ranking or partial ranking π denotestherank(position)oftheelementxinσ.Wesaythatx and a set of rankings Σk, and it may be chosen to be of the isrankedhigherthany (rankedlowerthany) iffσ(x)<σ(y) form of a median distance (which equals the total sum of (σ(x)>σ(y)). The inverse of a permutation σ is denoted by σ−1 :[n]→[n].Clearly,σ−1(t)representstheelementranked DefinitionII.1. SupposethatπisarankingandthatΣisaset at position t in σ. Similarly, partial rankings [10] represent a of rankings. Given a distance between two rankings d (·,·), ⋆ mappingover[n]inwhichtheremayexisttwoelementsx6=y the median-⋆ distance (med−⋆) between π and Σ equals such that σ(x) = σ(y). It is common to use σ(x) to denote 1 the position of the element x in the partial ranking σ, and to dmed−⋆(π,Σ)= Xd⋆(π,σ). |Σ| define it as σ∈Σ DefinitionII.2. SupposethatπisarankingandthatΣisaset σ(x),|{y ∈[n]:y is ranked higher thanx}| of rankings. Given a distance between two rankings d (·,·), ⋆ 1 the min-⋆ distance (min−⋆) between π and Σ is defined as + (|{y ∈[n]:y is tied withx}|+1). (1) 2 d (π,Σ)=mind (π,σ). min−⋆ ⋆ A number of distance functions between rankings were σ∈Σ proposedintheliterature[7],[10],[14].Onedistancefunction We recall that the focal problem of this work is to find countsthenumberofadjacenttranspositionsneededtoconvert constant approximation algorithms for the MinMax rank ag- a permutation into another. Adjacent transpositions generate gregation problem, which reads as S ,i.e.,anypermutationπ ∈S canbeconvertedintoanother n n MinMax: minmaxλ d(π,Σk), permutation σ ∈ Sn through a sequence of adjacent transpo- π k k sitions [14]. The smallest number of adjacent transpositions where d(π,Σk) is a med−⋆ or min−⋆ distance, with ⋆ ∈ needed to convert a permutation π into another permutation {τ,S,K,prS}. In our future analysis we use λ∗ , max λ σ is termed the Kendall τ distance, denoted by d (π,σ). The k k τ and M , {k : λ = λ∗}. Furthermore, we let π∗ denote the Kendall τ distance between two permutations π and σ over k argumentoftheoptimalsolutionoftheMinMaxproblemand [n] also equals the numberof pairwise inversionsof elements let W =max λ d(π∗,Σk). of the two permutations: k k III. APPROXIMATE MINMAX AGGREGATION d (σ,π)=|{(x,y):π(x)>π(y),σ(x)<σ(y)}|. (2) τ Aspreviouslypointedout,theMinMaxproblemunderboth the med−⋆ and min−⋆ can be shown to be NP-hard using AnotherpositionaldistancemeasureistheSpearmanfootrule, the results of [3], which established hardness for the special dS(σ,π)= X |σ(x)−π(x)|. case mk = 1 and d(·,·) a pseudometric. We hence focus on devising approximation algorithms for the MinMax problem. x∈[n] A. Permutations It can be shown that d (π,σ)6d (π,σ)62d (π,σ) [7]. τ S τ One may similarly define a generalization of the Kendall We first consider ordinal data of the form of permutations. τ distance for partial rankings π and σ over the set [n]. This We show that a simple algorithm, which we term Pick-Rnd- distance is known as the Kemeny distance, and equals Perm, can achieve a 2-approximation in expectation for the case of the med−⋆ problem whenever d (·,·) is a pseudo- ⋆ d (π,σ)=|{(x,y):π(x)>σ(y),π(x)<σ(y)}| metric.Then,for⋆∈{τ,S},wedescribetwo2-approximation K 1 algorithmsthatuse a combinationof convexoptimizationand + |{(x,y):π(x)=π(y),σ(x)>σ(y) 2 rounding procedures and offer significantly better empirical or π(x)>π(y),σ(x)=σ(y)}|. (3) performance than random selection. Finally, we describe a 2-approximation algorithm for the min−⋆ problems when The Spearman footrule analogue for partial rankings [10] d (·,·) is a pseudometric. The selection algorithm essentially ⋆ equalsthesumoftheabsolutedifferencesbetween“positions” transforms the min − ⋆ problem into a med− ⋆ problem: of elements in the partial rankings, Thus, the algorithms developed for approximating multiclass med−⋆problemsmaybeusedtoapproximatecorresponding dprS(σ,π)= X |σ(x)−π(x)|, instances of the min−⋆ problem. x∈[n] The Pick-Rnd-Perm Algorithm. Pick a permutation π from ∪ Σ uniformly at random. where positions are as defined in (1). The Spearman footrule k∈M k distance for partial rankings is a 2-approximation for the Theorem III.1. For the d (·,·) distance, where d is med−⋆ ⋆ Kemeny distance [10]. a pseudometric, the Pick-Rnd-Perm algorithm produces a 2- The notion of a distance between two rankings has an approximation of the med−⋆ problem. important extension in terms of a distance between a ranking Proof: For a given k, and a set of rankings,which we referto as rank-setdistances. We focus our attention on two types of rank-set distances, λ mk defined below. For compactness, we use ⋆ to denote an λkdmed−⋆(π,Σk)= mk Xd⋆(π,Σk) k arbitrarydistance onpairsofrankings,butfocusourattention i=1 6λ d (π,π∗)+d (π∗,Σk) 6λ∗d (π,π∗)+W. throughoutthe paper on ⋆∈{τ,S,K,prS}. k(cid:2) ⋆ m−⋆ (cid:3) ⋆ By calculating the expectation, we obtain v) whose positions are determined by pivoting on v. Define E[maxλkdmed−⋆(π,Σk)]6λ∗E[d⋆(π,π∗)]+W 62W. Pv(u)={(x,y):x, y ∈Vv, hvxhyv =1}, k Akv(u)= X(hxvwvkx+hvxwxkv)+ X wxky, Clearly, random selection may be improved by picking x∈Vv (x,y)∈Pv the optimal permutation from ∪ Σ instead. We term this approach Pick-Opt-Perm. Altkh∈oMughkthe Pick-Rnd (Opt) Bvk(u)= X(uxvwvkx+uvxwxkv)+ X (uxywykx+uyxwxky). -Perm algorithms are exceptionally simple and offer a 2- x∈Vv (x,y)∈Pv approximation to the optimal solution, they have a number The rounding procedure makes iterative calls to the the fol- of drawbacks, including the fact that the aggregate is a given lowing routine. ranking from the clusters, which violates fairness rules of mmKT-Conv (V,u) aggregates, and that its empirical performance is typically very poor. To mitigate these problems, we propose more 1:Choosethepivotv∈V according tov=argmina maxk BAakak((uu)). sophisticated aggregation algorithms for both the med − τ 2:SetVL=∅,VR=∅. 3:Forallx∈Vv: and med−S problems. Ifhxv=1,VL←VL∪{x}.Otherwise, VR←VR∪{x}. Case I: d = dmed−τ. For C = 1, a well known method 4:Return[mmKT-Conv(VL,u),v,mmKT-Conv(VR,u)]. termedrandompivotingproposedbyAilonetal.[1],[2]offers a 2-approximation in expectation for both the permutation Theorem III.2. The iterative application of the mmKT-Conv and partial rank aggregation problem. In random pivoting, at algorithm outputs a permutation with at most twice the cost each step, one element in the ranking is chosen uniformly at of the optimal solution of the linear program (4). random and the remaining elements are partitioned based on At each iteration of rounding, Ak(u) denotes the cost of v the pairwise comparisonwith the pivotelement. However,for rounding incurred by the class k of rankings, while Bk(u) v thecaseoftheMinMaxproblemwithC >1,randompivoting denotes the associated cost of the linear program for class k. maybeinadequate:Thedifficultyliesinthefactthatrankings Hence,thegoalistoprovethatforthegivenchoiceofthepivot indifferentclassesmayleadtowidelydisparatepairwisepivot v, we have Ak(u) 6 2Bk(u) for all k ∈ [C]. Suppose that v v comparisons. Another problem in this context is that while k′ is the index of the class that maximizes Akv(u) at the first one may achieve a constant approximation in expectation step of mmKT-Conv. Then, it suffices to shoBwvk(tuh)at Ak′(u)6 for each class individually, the largest cost among classes v 2Bk′(u). This result is a corollary of the following lemma. may not be bounded due to the exchange of the expectation v and maximization operators. Therefore, instead of pivoting, Lemma III.3. Ak(u)62 Bk(u), ∀k ∈[C]. Pv∈V v Pv∈V v one must resort to a different approach to the problem. Our Proof: To prove the claimed result, it suffices to prove approach is to deterministically round the fractional solution that for any two distinct elements x,y, one has of a specific convex optimization problem. The deterministic rounding procedure is motivated by ideas in [16]. h w +h w 62(u w +u w ), (5) Let wk , λk mk 1{σk(x) < σk(y)}, where 1 stands xy yx yx xy xy yx yx xy for the ixnydicatomrkfuPnci=ti1on, anid let wki = 0 for all x,k. For and for any triple of distinct elements x,y,z, one has xx a given ranking π, also define the variables u ,1{π(x)< xy Xhxzhzywyx 62Xhxzhzy(uxywyx+uyxwxy), (6) π(y)}. The MinMax problem may be stated as where the summation is circular over all permutations of min q x,y,z. Both summations are taken over all possible permuta- u,q tions of the two (three) elements in the argument. s.t. X wxkyuyx 6q for allk ∈[C] (4) The inequality (5) is easy to prove: Suppose that h =1. xy x,y∈[n] Then the sum on the left hand side equals w 6 2u w yx xy yx uxy ∈{0,1}, which is bounded by the right hand side expression. To u +u =1 for alli,j ∈[n], i6=j prove the inequality (6), consider the six variables associ- xy yx u +u +u >1 for all distinctsx,y,z ∈[n] ated with x,y,z, namely hxy,hyx,hxz,hzx,hyz,hzy. These xy yz zx variablesmay be partitionedinto two classes, {h ,h ,h } xy zx yz Note that if the rankingsare permutations, then wxky+wykx = and {hyz,hxz,hzx}. There are at least three variables that λ , which is a value that only depends on k. are 0’s. Without loss of generality, suppose that the class k The above integer program may be relaxed to a linear pro- {hxy,hzx,hyz} contains at least two 0’s. gram by allowing uxy to take fractionalvalues. Upon solving Case 1: Assume that hxy,hzx,hyz = 0. Then, the difference the linearprogram,oneneedsto roundthe valuesofu . The of the left and right hand side of the inequality under consid- xy next rounding procedure guarantees a 2-approximation. eration equals Leth =1 ,ifx>y,andh =1−h ,ifx<y. Let v bexya pivuoxtyin>g1/e2lement for the roxuynding pryoxcedure and (1−2uxy)wyx+(1−2uyz)wzy +(1−2uzx)wxz− use P (u) to denote the set of pairs of elements (excluding 2u w −2u w −2u w . v xz zx yx xy zy yz The claimed result then follows from observing that (1 − λ d (π∗,σk)6W. As d is pseudometric, we have k ⋆ 1 ⋆ 2u )w 6u (w +w ). xy yx yx yz zx λ λ Case 2: Assume that hxy = 1,hzx,hyz = 0. The left hand λ k+jλ d⋆(σ1k,σ1j)6W. side equals w 6 2u w which is clearly bounded from k j yx xy yx above by the right hand side expression as hxy =1. Next, choose an arbitrary k˜ ∈ M and let k′ = Case II: d = dmed−S. When C = 1, the MinMax argmaxj∈[C]/{k˜}λjd⋆(σ1k˜,σ1j). Then, aggregation problem may be solved in polynomial time via weighted bipartite matching [9]. However, when C > 1, the min max λjd⋆(σ1k,σ1j)6λk′d⋆(σ1k˜,σ1k′) problem is hard even if m =1 for all k [3]. k∈[C]j∈[C]/{k} k Step1:Ifweremovetheintegralconstraintonthepositionof 6 2λk˜λk′ d (σk˜,σk′)62max λkλj d (σk,σj). elements in π, the optimization problem of interest is convex λk˜+λk′ ⋆ 1 1 k,j λk+λj ⋆ 1 1 and may be solved efficiently: Moreover, the output π of min-Pick-Perm satisfies λ mk maxd (π,Σj)= min min max min d (σk,σj) u∗ =um∈iRnnmkaxmkk X||u−σgk||1, (7) j∈[C] min−⋆ k∈[C]σik∈Σkj∈[C]/kσgj∈Σj ⋆ i g g=1 6 min max λ d (σk,σj). j ⋆ 1 1 where ||u−σk|| = |u(h)−σk(h)|. k∈[C]j∈[C]/{k} g 1 Ph∈[n] g Step 2 (mmSP-Conv): We assign positions to elements ac- The result follows by combining the above inequalities. cording to the fractional solution u∗ as follows. If u∗(x) < RemarkIII.1. Let(i∗,k∗)betheoptimalindicesgeneratedby u∗(y),we letπ(x)<π(y)foranytwodistinctelementsx, y, min-Pick-Perm. Define Σ˜k∗ ={σk∗} and let i∗ with ties broken randomly. Σ˜j ={σ ∈Σj :d⋆(σik∗∗,σ)=dmin−⋆(σik∗∗,Σj)} Theorem III.4. mmSP-Conv rounding increases the cost of for j ∈[C]/{k∗}. A c−approximatesolution for the med−⋆ the convex optimization problem (7) at most twice. problem with input {Σ˜ } , denoted by π′, satisfies k k∈C Proof: First, we claim that the output of mmSP-Conv, denotedbyπ ,isinΠ′ ,{π′ ∈Sn :||u∗−π′|| =min||u∗− maxλjdmin−⋆(π′,Σj)6 maxλjdmed−⋆(π′,Σ˜j) S 1 j∈[C] j∈[C] xπ,||1y}∈. T[nhi]ssfaotilslofywsπ(sxin)c>e fπor(ya)nyanrdanuk∗i(nxg)π<, iuf∗t(wyo),ewleemmenatys 6cmπinjm∈a[Cx]λjdmed−⋆(π′,Σj)6cjm∈a[Cx]λjdmed−⋆(σik∗∗,Σj) transposexandy inπ toobtainasmaller||u∗−π||1.Second, 62cW. for an arbitrary permutation σ, we have Hence, π′ is a 2c−approximation for the original min−⋆ ||πS −σ||1 6||πS −u∗||1+||σ−u∗||1 62||σ−u∗||1. problem. Therefore, convex optimization and rounding can be used on the med−⋆ problem. We refer to these adapted The claim follows by setting σ =σk, i∈[m ], k ∈[C]. i k algorithms as min-mmKT-Conv and min-mmSP-Conv. Note thatthe integralitygapofthe problems(4) (7) is 2,as onemayconsidertwoequallyweightedclasses,eachofwhich B. Partial rankings contains one single ranking, (1,2,3,4,...) and (2,1,3,4,...), All the algorithms proposed for permutation aggregation respectively. Hence, the best approximation constant via the generalizetopartialrankingaggregation.Onemayeasilyshow useofucannotbelessthan2,whichimpliesthattheproposed that as long as the distance d defined for partial rankings ⋆ rounding is optimal. One may expect to achieve a smaller is a pseudometric (e.g., ⋆ ∈ {K,prS}), the 2-approximation approximation constant by outputting the better of the two guarantees for all previous methods hold. To get a fractional results produced by Pick-Rnd-Perm and mmKT(SP)-Conv. solution in the program of mmKT-Conv, we have to change Thisapproachwillbediscussedinthefullversionofthepaper. the constraint (4) to We introducenextthemin-Pick-Permalgorithmforsolving 1 the dmin−⋆ problem. 2Tk+ X wxkyuyx 6w for allk ∈[C], x,y∈[n] m1i:nF-Porickea-Pcherkm∈(ΣC1a,nΣd2,ea..c.h,ΣraCn)k,in(gλ1σ,kλ2∈,.Σ..k,λC). 1 mk 32::LeCto(im∗p,ukt∗e)S=coarerkig(=i,km)maxinj∈SCc/o{rekki}.iλOjumtpiuntσπsj∈=Σjσdik∗⋆∗(.σik,σsj). Tk = mk Xi=116xX<y6n1(σik(x)=σik(y)), which does not depend on the type of output ranking. Also, Theorem III.5. If d is pseudometric,then min-Pick-Permis notethatwk forpartialrankingsdoesnotsatisfy theequality ⋆ xy a 2-approximationalgorithm for the min−⋆ problems. wxky+wykx =λk,althoughthetriangleinequalitywxy+wyz > w still holds. As the proof of Theorem III.2 only requires xz Proof: By the definition of the min−⋆ problem, each the later inequality, the same rounding procedure offers a 2- class contains at least one permutation, which without loss approximation.Also, in the optimization problem (7) one has of generality we denote by σk ∈ Σk, k ∈ [C], that satisfies 1 to use the definition σ(x) for partial rankings. TABLEI TABLEII COMPARISONOFRANKAGGREGATIONMETHODS:OBJECTIVEVALUE MITOCHONDRIALDNA(MTDNA)AGGREGATION (STANDARDDEVIATION) d Aggregated Sequences med−τ 1107217123091123192021 A.d med−τ mmKT-Conv 210 1335315142526616322834 φ1 0.5 0.7 0.9 1.0 4242718362931833225 mmKT-Conv 14.5(1.1) 16.3(1.4) 17.8(1.3) 17.9(1.5) 12721736203291011351230 Pick-Rnd-Perm 17.8(1.4) 19.9(2.1) 21.5(1.8) 21.6(2.1) Pick-Opt-Perm 267 21919182833781626143413 Pick-Opt-Perm 15.9(1.8) 18.1(1.8) 20.0(1.7) 20.0(1.6) 24153225422236315 FASLP-Pivot 15.3(1.4) 17.7(2.1) 19.4(2.2) 19.7(2.3) 12177231232030216910 FASLP-Pivot 269 1115192825271832833241334 B.d med−S 144352926163631225 φ1 0.5 0.7 0.9 1.0 mmSP-Conv 23.3(1.7) 26.0(2.1) 28.1(2.3) 28.4(2.2) Pick-Rnd-Perm 27.0(2.4) 29.9(2.7) 32.4(2.7) 32.1(2.6) In our experiment,we used the mtDNA dataset from [5]. The Pick-Opt-Perm 24.5(1.9) 27.5(2.4) 29.9(2.3) 29.9(2.1) dataset contains 11 metazoan genomes with 36 gene-blocks SP-Matching 26.3(3.0) 30.5(3.6) 35.3(3.5) 35.9(3.4) in some arrangement.We removedthe “signs” of gene orders C.dmin−τ and let each genome represent one class, so that C = 11 φ1 0.5 0.7 0.9 1.0 and mk = 1 for all k; we fixed λk = 1. Table II shows the min-mmKT-Conv 6.9(1.9) 8.6(2.3) 9.8(2.6) 10.0(2.4) results. Due to page limitations, we relegate the significantly min-Pick-Perm 8.4(1.7) 10.5(1.9) 11.8(2.2) 12.0(1.8) morespaceconsumingempiricalstudyofweightedmulticlass FASLP-Pivot 9.3(1.9) 11.1(2.2) 12.9(2.7) 13.1(2.2) mtDNA aggregation to the extended version of the paper. D.dmin−S REFERENCES φ1 0.5 0.7 0.9 1.0 min-mmSP-Conv 11.9(2.6) 14.1(2.8) 16.1(3.6) 16.3(3.1) [1] Nir Ailon. Aggregation of partial rankings, p-ratings and top-m lists. min-Pick-Perm 13.9(2.4) 16.7(2.7) 18.9(3.0) 18.9(2.5) Algorithmica, 57(2):284–300, 2010. SP-Matching 17.1(3.5) 22.4(4.3) 26.2(4.2) 27.2(4.0) [2] NirAilon, MosesCharikar, andAlantha Newman. Aggregating incon- sistentinformation:rankingandclustering.JournaloftheACM(JACM), 55(5):23, 2008. IV. SIMULATIONS [3] Christian Bachmaier, Franz J Brandenburg, Andreas Gleißner, and Andreas Hofmeier. On the hardness of maximum rank aggregation We compare the performance of three families of al- problems. Journal ofDiscreteAlgorithms,31:2–13, 2015. gorithms: Convex optimization procedures with rounding [4] JohnBartholdiIII,CraigATovey,andMichaelATrick.Votingschemes forwhichitcanbedifficulttotellwhowontheelection. SocialChoice (mmKT-Conv, mmSP-Conv, min-mmKT-Conv, min-mmSP- andwelfare,6(2):157–165, 1989. Conv), permutation selection (Pick-Rnd-Perm, Pick-Opt- [5] Guillaume Bourque and Pavel A Pevzner. Genome-scale evolution: Perm, min-Pick-Perm) and algorithms used for traditional reconstructing gene orders in the ancestral species. Genome research, 12(1):26–36, 2002. min-median rank aggregation (FASLP-Pivot [2] and SP- [6] KamalikaChaudhuri,KevinChen,RaduMihaescu,andSatishRao. On Matching[9]).Thecomparisonshowsthatalgorithmsbasedon the tandem duplication-random loss model of genome rearrangement. convexoptimizationyieldsignificantlybetterresultsthannaive In Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pages 564–570. Society forIndustrial and Applied selection methods, and that traditionalaggregationalgorithms Mathematics, 2006. are poor candidates for solving MinMax problems. [7] Persi Diaconis and Ronald L Graham. Spearman’s footrule as a First, we evaluate the proposed algorithms on synthetic measure of disarray. Journal of the Royal Statistical Society. Series B(Methodological), pages262–268, 1977. data. The synthetic data is generated based on what we call a [8] Liviu PDinu andRadu Ionescu. Anefficient rank basedapproach for two-levelMallowsmodel:First, we generatethe permutations closeststringandclosestsubstring. PLoSOne,7(6):e37576, 2012. {σ1,...,σC} independentlybased on the Mallows distribution [9] Cynthia Dwork, Ravi Kumar, Moni Naor, and Dandapani Sivakumar. P(σk) ∝ φdτ(σk,e) [13]. Then, for each class k ∈ [C], we Rank aggregation methods for the web. In Proceedings of the 10th 1 international conference on World Wide Web, pages 613–622. ACM, generate m permutations σk,...,σk independently accord- 2001. ingtotheMkallowsdistributio1nP(σikm)k∝φd2τ(σik,σk).Wesetthe [10] REroinkaVldeeF.aCgoinm,pRaraivnigKanudmaagr,grMegoahtianmgmraandkiMngashwdiiatnh,tiDes.SIinvaPkruomceaer,dianngds number of classes to C = 3, fix φ = 0.7 and let each class of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on 2 contain m = 10 permutations. To control the distance be- Principles ofdatabasesystems,pages47–58.ACM,2004. k [11] JohnGKemeny. Mathematics withoutnumbers. Daedalus,88(4):577– tweendifferentclasses,wechooseφ1from{0.5,0.7,0.9,1.0}. 591,1959. The objective function values for 100 independent samples, [12] Claire Kenyon-Mathieu and Warren Schudy. How to rank with few errors. In Proceedings of the thirty-ninth annual ACM symposium on obtained by different algorithms, are shown Table I. Theoryofcomputing, pages95–103.ACM,2007. Our next test example comes from evolutionary biology, [13] ColinLMallows. Non-nullrankingmodels.i.Biometrika,44(1/2):114– and is concernedwith MitochondrialDNA (mtDNA) genome 130,1957. [14] RichardPStanley. Enumerativecombinatorics. Number49.Cambridge aggregation. The aggregate in this case corresponds to an university press,2011. ancestral genome. The most common used rearrangement [15] Glenn Tesler. Efficient algorithms for multichromosomal genome distance between two nuclear genomes is based on rever- rearrangements. JournalofComputer andSystem Sciences, 65(3):587– 609,2002. sals [15], but mitochondrialDNA rearrangementstudies have [16] Anke Van Zuylen and David P Williamson. Deterministic algorithms also involved the Kendall τ distance [6]. In the latter case, for rank aggregation and other ranking and clustering problems. In the authors only considered the median problem C = 1, InternationalWorkshoponApproximationandOnlineAlgorithms,pages 260–273.Springer, 2007. although the min-max problem is equally relevant [8], [3].

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.