Strategic aspects of the probabilistic serial rule for the allocation of goods HARISAZIZandSERGEGASPERSandNICKMATTEIandTOBYWALSH,NICTAand UniversityofNewSouthWales,Australia andNINANARODYTSKA,UniversityofToronto,Canada Theprobabilisticserial(PS)ruleisoneofthemostprominentrandomizedrulesfortheassignmentproblem. Itiswell-knownforitssuperiorfairnessandwelfareproperties.However,PSisnotimmunetomanipulative behaviourbythe agents. We examine computationaland non-computational aspects of strategisingunder the PS rule. Firstly, we study the computational complexity of an agent manipulating the PS rule. We 4 presentpolynomial-timealgorithmsforoptimalmanipulation.Secondly,weshowthatexpectedutilitybest 1 responsescancycle.Thirdly,weexaminetheexistenceandcomputationofNashequilibriumprofilesunder 0 thePSrule.WeshowthatapureNashequilibriumisguaranteedtoexistunderthePSrule.Fortwoagents, 2 weidentifytwodifferenttypesofpreferenceprofilesthatarenotonlyinNashequilibriumbutcanalsobe n computed in linear time. Finally, we conduct experiments to check the frequency of manipulability of the a PSruleunderdifferentcombinationsofthenumberofagents,objects,andutilityfunctions. J CategoriesandSubjectDescriptors:F.2.2[AnalysisofAlgorithmsandProblemComplexity]:Nonnu- 5 mericalAlgorithmsandProblems—Computationsondiscretestructures;I.2.11[Artificial Intelligence]: 2 DistributedArtificialIntelligence—MultiagentSystems;J.4[ComputerApplications]:SocialandBehav- ioralSciences—Economics ] GeneralTerms:Algorithms,Economics,Theory T G AdditionalKeyWordsandPhrases:fairdivision,strategyproofness,randomassignment,probabilisticserial rule,Nashdynamics,bestresponses. . s c 1. INTRODUCTION [ The assignment problem is one of the most fundamental and important problems in 1 economics and computer science [see e.g., Bogomolnaia and Moulin 2001; Ga¨rdenfors v 1973;HyllandandZeckhauser1979;Azizetal.2013b;SabanandSethuraman2013a]. 3 Agents express preferences over objects and, based on these preferences, the objects 2 areallocatedtotheagents.Arandomizedorfractionalassignmentruletakesthepref- 5 erences of the agents into account in order to allocate each agent a fraction of the 6 object. If the objects are indivisible, the fraction can also be interpreted as the proba- . 1 bilityofreceivingtheobject.Randomizationiswidespreadinresourceallocationsince 0 it is one of the most natural ways to ensure procedural fairness [Budish et al. 2013]. 4 Randomizedassignmentshavebeenusedtoassignpublicland,radiospectratobroad- 1 castingcompanies,andUSpermanentvisastoapplicants[Footnote1in Budishetal. : v 2013]. i Typical criteria for randomized assignment being desirable are fairness and wel- X fare.Theprobabilisticserial(PS)ruleisanordinalrandomized/fractionalassignment r a rule that fares better on both counts than any other random assignment rule [Bogo- molnaia and Heo 2012; Bogomolnaia and Moulin 2001; Budish et al. 2013; Katta and Sethuraman2006;Kojima2009;Yilmaz2010;SabanandSethuraman2013b].Inpar- ticular, it satisfies strong envy-freeness and efficiency with respect to both stochastic dominance(SD)anddownwardlexicographic(DL)relations[BogomolnaiaandMoulin 2001;SchulmanandVazirani2012;Kojima2009].SDisoneofthemostfundamental relations between fractional allocations because one allocation is SD-preferred over another iff for any utility representation consistent with the ordinal preferences, the Emails: [email protected], [email protected], [email protected], [email protected],[email protected] Technical Report:January2014. 2 H.Azizetal. formeryieldsatleastasmuchexpectedutilityasthelatter.DLisarefinementofSD and based on lexicographic comparisons between fractional allocations. Generaliza- tions of the PS rule have been recommended in many settings [see e.g., Budish et al. 2013]. The PS rule also satisfies some desirable incentive properties. If the number of objects is not more than the number of agents, then PS is weak strategyproof with respect to stochastic dominance [Bogomolnaia and Moulin 2001]. However, PS is not immunefrommanipulation.1 PS works as follows. Each agent expresses linear orders over the set of houses (we use the term house throughout the paper though we stress any object could be allo- cated with these mechanisms). Each house is considered to have a divisible probabil- ity weight of one, and agents simultaneously and with the same speed consume the probability weight of their most preferred house. Once a house has been consumed, theagentproceedstoeatthenextmostpreferredhousethathasnotbeencompletely consumed. The procedure terminates after all the houses have been consumed. The randomallocationofanagentbyPSistheamountofeachobjecthehaseaten.2 Weexaminethefollowingnaturalquestionsforthefirsttime:whatisthecomputa- tional complexity of an agent computing a different preference to report so as to get a better PS outcome? How often is a preference profile manipulable under the PS rule?. 3 The complexity of manipulation of the PS rule has bearing on another issue that hasrecentlybeenstudied—preferenceprofilesthatareinNashequilibrium.Ekiciand Kesten [2012] showed that when agents are not truthful, the outcome of PS may not satisfy desirable properties related to efficiency and envy-freeness. Because the PS rule is manipulable it is important to understand how hard, computationally, it is for an agent to compute a beneficial misreporting as this may make it difficult in prac- tice to exploit the mechanism. It is also interesting to identify preference profiles for which no agent has an incentive to unilaterally deviate to gain utility with respect to hisactualpreferences.Hence,weconsiderthefollowingproblem:forapreferencepro- file,doesa(pure)Nashequilibriumexistornotandifitexistshowefficientlycanitbe computed? In order to compare random allocations, an agent needs to consider relations be- tween random allocation. We consider three well-known relations between lotter- ies [see e.g., Bogomolnaia and Moulin 2001; Schulman and Vazirani 2012; Saban and Sethuraman2013b;Cho2012]:(i)expectedutility(EU),(ii)stochasticdominance(SD), and (iii) downward lexicographic (DL). For EU, an agent seeks a different allocation that yields more expected utility. For SD, an agent seeks a different allocation that yieldsmoreexpectedutilityforallcardinalutilitiesconsistentwiththeordinalprefer- ences.ForDL,anagentseeksanallocationthatgivesahigherprobabilitytothemost preferred alternative that has different probabilities in the two allocations. Through- out the paper, we assume that agents express strict preferences, i.e., they are not in- differentbetweenanytwohouses. Contributions. We initiate the study of computing best responses and checking for Nash equilibrium for the PS mechanism — one of the most established randomized rulesfortheassignmentproblem.Wepresentapolynomial-timealgorithmtocompute 1Anotherwell-establishedrulerandomserialdictator(RSD)isstrategyproofbutitisnotenvy-freeandnot asefficientasPS[BogomolnaiaandMoulin2001].Moreover,incontrasttoPS,thefractionalallocations underRSDare#P-completetocompute[Azizetal.2013a]. 2AlthoughPSwasoriginallydefinedforthesettingwherethenumberofhousesisequaltothenumberof agents,itcanbeusedwithoutanymodificationforfewerormorehousesthanagents[seee.g.,Bogomolnaia andMoulin2001;Kojima2009]. 3Thisproblemofcomputingtheoptimalmanipulationhasalreadybeenstudiedingreatdepthforvoting rules[seee.g.,FaliszewskiandProcaccia2010;Faliszewskietal.2010]. Technical Report:January2014. Strategicaspectsoftheprobabilisticserialrule 3 the DL best response for multiple agents and houses. The algorithm works by care- fully simulating the PS rule for a sequence of partial preference lists. For the case of twoagents4,wepresentapolynomial-timealgorithmtocomputeanEUbestresponse for any utilities consistent with the ordinal preferences. The result for the EU best response relies on an interesting connection between the PS rule and the sequential allocation rule for discrete objects. We leave open the problem of computing the ex- pectedutilityresponseforarbitrarynumberofagents.Thefactthatasimilarproblem hasalsoremainedopenforsequentialallocation[BouveretandLang2011]givessome indicationofthechallengeoftheproblem. We then examine situations in which all agents are strategic. We first show that expectedutilitybestresponsescancycle.Nashdynamicsinmatchingtheoryhasbeen activeareaofresearchespeciallyforthestablematchingproblem[seee.g.,Ackermann et al. 2011]. We then prove that a (pure) Nash equilibrium exists for any number of agentsandhouses.Tothebestofourknowledge,thisisthefirstproofoftheexistence ofaNashequilibriumforthePSrule.Forthecaseoftwoagentswepresenttwodiffer- entlinear-timealgorithmstocomputeapreferenceprofilethatisinNashequilibrium with respect to the original preferences. One type of equilibrium profile results in the sameassignmentastheonebyoriginalprofile. Finally,weperformanexperimentalstudyofthefrequencyofmanipulabilityofthe PS mechanism. We investigate, under a variety of utility functions and preference distributions,thelikelihoodthatsomeagentinaprofilehasanincentivetomisreport hispreference.TheexperimentsidentifysettingsandutilitymodelsinwhichPSisless susceptibletomanipulation. 2. PRELIMINARIES An assignment problem (N,H,(cid:31)) consists of a set of agents N = {1,...,n}, a set of housesH ={h ,...,h }andapreferenceprofile(cid:31)=((cid:31) ,...,(cid:31) )inwhich(cid:31) denotes 1 m 1 n i a complete, transitive and strict ordering on H representing the preferences of agent i over the houses in H. Since each (cid:31) will be strict throughout the paper, we will also i refertoitsimplyas(cid:31) . i Afractionalassignmentisa(n×m)matrix[p(i)(j)]suchthatforalli∈N,andh ∈ j (cid:80) H, 0 ≤ p(i)(j) ≤ 1; and for all j ∈ {1,...,n}, p(i)(j) = 1 The value p(i)(j) is the i∈N fractionofhouse h thatagentigets.Eachrow p(i) = (p(i)(1),...,p(i)(m)) represents j theallocationofagenti.Afractionalassignmentcanalsobeinterpretedasarandom assignment where p(i)(j) is the probability of agent i getting house h . We will also j denotep(i)(j)byp(i)(h ). j Relations between random allocations. A standard method to compare lotter- ies is to use the SD (stochastic dominance) relation. Given two random as- signments p and q, p(i) (cid:31)SD q(i) i.e., a player i SD prefers allocation p(i) (cid:80) i (cid:80) to q(i) if p(i)(h ) ≥ q(i)(h ) for all h ∈ H and (cid:80) hj∈{hk:hk(cid:31)ih}(cid:80) j hj∈{hk:hk(cid:31)ih} j p(i)(h )> q(i)(h )forsomeh∈H. hj∈{hk:hk(cid:31)ih} j hj∈{hk:hk(cid:31)ih} j Given two random assignments p and q, p(i) (cid:31)DL q(i) i.e., a player i DL prefers i allocationp(i)toq(i)ifp(i)(cid:54)=q(i)andforthemostpreferredhousehsuchthatp(i)(h)(cid:54)= q(i)(h),wehavethatp(i)(h)>q(i)(h). When agents are considered to have cardinal utilities for the objects, we denote by u (h)theutilitythatagentigetsfromhouseh.Wewillassumethattotalutilityofan i agent equals the sum of the utilities that he gets from each of the houses. Given two 4Thetwo-agentcaseisalsoofspecialimportancesincevariousdisputesarisebetweentwoparties. Technical Report:January2014. 4 H.Azizetal. randomassignmentspandq,p(i) (cid:31)EU q(i)i.e.,aplayeriEU(expectedutility)prefers (cid:80) i (cid:80) allocationp(i)toq(i)iff u (h)p(i)(h)> u (h)q(i)(h). h∈H i h∈H i Since for all i ∈ N, agent i compares assignment p with assignment q only with respect to his allocations p(i) and q(i), we will sometimes abuse the notation and use p (cid:31)SD q for p(i) (cid:31)SD q(i). A random assignment rule takes as input an assignment i i problem(N,H,(cid:31))andreturnsarandomassignmentwhichspecifieshowmuchfraction orprobabilityofeachhouseisallocatedtoeachagent. 3. THEPROBABILISTICSERIALRULEANDITSMANIPULATION Recall that the Probabilistic Serial (PS) rule is a random assignment algorithm in whichweconsidereachhouseasinfinitelydivisible.Ateachpointintime,eachagent is consuming his most preferred house that has not completely been consumed and each agent has the same unit speed. Hence all the houses are consumed at time m/n andeachagentreceivesatotalofm/nunitofhouses.Theprobabilityofhouseh being j allocatedtoiisthefractionofhouseh thatihaseaten.ThePSfractionalassignment j canbecomputedintimeO(mn).Wereferthereaderto[BogomolnaiaandMoulin2001] or[Kojima2009]foralternativedefinitionsofPS.Thefollowingexampleadaptedfrom [Section7, BogomolnaiaandMoulin2001]showshowPSworks. Example3.1 (PSrule). Consider an assignment problem with the following pref- erenceprofile. (cid:31) : h ,h ,h (cid:31) : h ,h ,h (cid:31) : h ,h ,h 1 1 2 3 2 2 1 3 3 2 3 1 Agents2and3starteatingh simultaneouslywhereasagent1eatsh .When2and3 2 1 finishh ,agent3hasonlyeatenhalfofh .Thetimingoftheeatingcanbeseenbelow. 2 1 h h h Agent1 1 1 3 h h h Agent2 2 1 3 h h h Agent3 2 3 3 0 1 3 1 2 4 Time (cid:32)3/4 0 1/4(cid:33) ThefinalallocationcomputedbyPSisPS((cid:31) ,(cid:31) ,(cid:31) )= 1/4 1/2 1/4 . 1 2 3 0 1/2 1/2 Consider the assignment problem in Example 3.1. If agent 1 misreports his pref- (cid:32)1/3 1/2 1/6(cid:33) erences as follows: (cid:31)(cid:48): h ,h ,h , then PS((cid:31)(cid:48),(cid:31) ,(cid:31) ) = 1/3 1/2 1/6 . Then, if 1 2 1 3 1 2 3 1/3 0 2/3 u (h ) = 7, u (h ) = 6, and u (h ) = 0, then agent 1 gets more expected utility when 1 1 1 2 1 3 he reports (cid:31)(cid:48). In the example, although truth-telling is a DL best response, it is not 1 necessarilyanEUbestresponseforagent1. Examples1and2of[Kojima2009]showthatmanipulatingthePSmechanismcan lead to an SD improvement when each agent can be allocated more than one house. In light of the fact that the PS rule can be manipulated, we examine the complex- ity of a single agent computing a manipulation, in other words, the best response for the PS rule.5 We then study the existence and computation of Nash equilibria. For E∈{SD,EU,DL},wedefinetheproblemEBESTRESPONSE:given(N,H,(cid:31))andagent 5Notethatifanagentisrisk-averseanddoesnothaveinformationabouttheotheragent’spreferences,then hismaximinstrategyistobetruthful.Thereasonisthatifallallagentshavethesamepreferences,then theoptimalstrategyistobetruthful. Technical Report:January2014. Strategicaspectsoftheprobabilisticserialrule 5 i ∈ N, compute a preference (cid:31)(cid:48) for agent i such that there exists no preference (cid:31)(cid:48)(cid:48) i i such that PS(N,H,((cid:31)(cid:48)(cid:48),(cid:31) )) (cid:31)E PS(N,H,((cid:31)(cid:48),(cid:31) )). For a constant m, the problem i −i i i −i EBESTRESPONSE cancanbesolvedbybruteforcebytryingouteachofthem!prefer- ences.Hencewewon’tassumethatmisaconstant. We establish some more notation and terminology for the rest of the paper. We will often refer to the PS outcomes for partial lists of houses and preferences. We will de- note by PS((cid:31)L,(cid:31) )(i), the allocation that agent i receives when his preferences are i −i restrictedtothelistLwhereLisanorderedlistofasubsetofhouses.Whenanagent runsoutofhousesinhispreferencelist,hedoesnoteatanyotherhouses.Thelength of a list L is denoted |L|, and we refer to the kth house in L as L(k). In the PS rule, theeatingstarttimeofahouseisthetimepointatwhichthehousestartstobeeaten bysomeagent.InExample3.1,theeatingstarttimesofh ,h andh are0,0and0.5, 1 2 3 respectively. 4. LEXICOGRAPHICBESTRESPONSE In this section, we present a polynomial-time algorithm for DLBESTRESPONSE. Lex- icographic preferences are well-established in the assignment literature [see e.g., Sa- ban and Sethuraman 2013b; Schulman and Vazirani 2012; Cho 2012]. Let (N,H,(cid:31)) be an assignment problem where N = {1,...,n} and H = {h ,...,h }. We will show 1 m how to compute a DL best response for agent 1 ∈ N. It has been shown that when m ≤ n, then truth-telling is the DL best response but if m > n, then this need not be thecase[SabanandSethuraman2013b;SchulmanandVazirani2012;Kojima2009]. Recallthatapreference(cid:31)(cid:48) isaDLbestresponseforagent1ifthefractionalalloca- 1 tionagent1receivesbyreporting(cid:31)(cid:48) isDLpreferredtoanyfractionalallocationagent 1 1 receives by reporting another preference. That is, there is no preference (cid:31)(cid:48)(cid:48) such 1 that his share of a house h when reporting (cid:31)(cid:48)(cid:48) is strictly larger than when reporting 1 (cid:31)(cid:48) whiletheshareofallhouseshepreferstoh(accordingtohistruepreference(cid:31) )is 1 1 thesamewhetherreporting(cid:31)(cid:48) or(cid:31)(cid:48)(cid:48). 1 1 Our algorithm will iteratively construct a partial preference list for the i most pre- ferredhousesofagent1.Withoutlossofgenerality,denote(cid:31) :h ,h ,...,h . 1 1 2 m For any i,1 ≤ i ≤ m, denote H = {h ,...,h }. A (partial) preference of agent 1 i 1 i restrictedtoH isapreferenceoverasubsetofH .NotethatapreferenceforH need i i i notlistallthehousesinH .Forthepreferenceofagent1restrictedtoH ,thePSrule i i computesanallocationwherethepreferenceofagent1isreplacedwiththispreference and the preferences of all other agents remain unchanged. Recall that agent 1 can only be allocated a non-zero fraction of a house if this house is in the preference list hesubmits.ThenotionsofDLbestresponseandDLpreferredfractionalassignments withrespecttoasubsetofhousesH aredefinedaccordinglyforrestrictedpreferences i ofagent1. For a house h ∈ H, let PS1(L,h) denote the fraction of house h that the PS rule assignstoagent1whenhereportsthe(partial)preferenceL. We start with a simple lemma showing that a DL best response for agent 1 for the wholesetH canbenobetterandnoworseonH thanaDLbestresponseforH . i i LEMMA 4.1. Leti∈{1,...,m}.ADLbestresponseforagent1onH givesthesame fractionalassignmenttothehousesinH asaDLbestresponseforagent1onH . i i PROOF. Wehavethatapreferenceforagent1onHicanbeextendedtoapreference forallhousesthatgivesthesamefractionalallocationtoagent1forthehousesinH . i Namely,theremaininghousesH\H canbeappendedtotheendofhispreferencelist, i givingthesameallocationtothehousesinH asbefore. i Technical Report:January2014. 6 H.Azizetal. On the other hand, consider a DL best response (cid:31)(cid:48) for agent 1 on H, giving a frac- 1 tional allocation p to agent 1. Restricting this preference to H gives a fractional allo- i cation q for H . If q is DL preferred to p , i.e., the fractional allocation p restricted i |Hi to H , then q = p , otherwise we would have a contradiction to (cid:31)(cid:48) being a DL best i |Hi 1 responseasperthepreviousargumentthatwecanextendanypreferenceforH toH i givingthesamefractionalallocationtoagent1forthehousesinH . i Our algorithm will compute a list L such that L ⊆ H .6 The list L will be a DL best i i i i response for agent 1 with respect to H . Suppose the algorithm has computed L . i i−1 Then,whenconsideringH =H ∪{h },itneedstomakesurethatthenewfractional i i−1 i allocation restricted to the houses in H remains the same (due to Lemma 4.1). For i−1 the preference to be optimal with respect to H , the algorithm needs to maximize the i fractionalallocationofh toagent1underthepreviousconstraint. i OuralgorithmwillcomputeacanonicalDLbestresponsethathasseveraladditional properties. Definition 4.2. A preference L for H is no-0 if L contains no house h with i i i PS1(L ,h)=0. i AnyDLbestresponseforagent1forH canbeconvertedintoano-0DLbestresponse i byremovingthehousesforwhichagent1obtainsafractionof0. Definition 4.3. Forano-0preferenceL forH ,thestingyorderingforapositionj is i i determinedbyrunningthePSrulewiththepreferenceL (1)⊕···⊕L (j−1)foragent i i 1 where ⊕ denotes concatenation. It orders the houses from (cid:83)|Li| L (k) by increasing k=j i eatingstarttimes,andwhen2housesh,h(cid:48) havethesameeatingstarttime,weorder hbeforeh(cid:48) iffh(cid:31) h(cid:48). 1 Intuitively, houses occurring early in this ordering are the most threatened by the otheragentsatthetimepointwhenagent1comestopositionj.Thefollowingdefini- tiontakesintoaccountthattheeatingstarttimesoflaterhousesmaychangedepend- ingonagent1’sorderingofearlierhouses. Definition 4.4. A preference L for H is stingy if it is a no-0 DL best response for i i agent1onH ,andforeveryj ∈{1,...,i},L (j)isthefirsthouseinthestingyordering i i for this position such that there exists a DL best response starting with L (1)⊕···⊕ i L (j). i Wenotethat,duetoLemma4.1,thereisauniquestingypreferenceforeachH . i Example 4.5. Considerthefollowingassignmentproblem. (cid:31) :h ,h ,h ,h ,h ,h (cid:31) :h ,h ,h ,h ,h ,h 1 1 2 3 4 5 6 2 3 6 4 5 1 2 Thepreferencesh ,h ,h ,h andh ,h ,h ,h arebothno-0DLbestresponsesforagent 3 1 4 2 3 2 4 1 1 with respect to H , allocating h (1),h (1),h (1/2),h (1/2) to agent 1. When running 4 1 2 3 4 the PS rule with h as the preference list, h ’s eating start time comes first among 3 4 {h ,h ,h }.However,thereisnoDLbestresponseforH startingwithh ,h .Thenext 1 2 4 4 3 4 houseinthestingyorderingish .Thepreferenceh ,h ,h ,h isthestingypreference 1 3 1 4 2 forH . 4 Thenextlemmashowsthatwhenagent1receivesahousepartially(afractiondiffer- ent from 0 and 1) in a DL best response, a stingy preference would not order a less preferredhousebeforethathouse. 6Whenwetreatalistasasetwerefertothesetofallelementsoccurringinthelist. Technical Report:January2014. Strategicaspectsoftheprobabilisticserialrule 7 LEMMA 4.6. Let Li be a stingy preference for Hi. Suppose there is a hj ∈ Hi such that0<PS1(L ,h )<1.Then,P ⊆H ,whereL =P ⊕h ⊕S. i j j i j PROOF. Forthesakeofcontradiction,assumeP containsahousehksuchthathj (cid:31)1 h (i.e., j < k). Let K denote all houses h in P such that h (cid:31) h . Since L is no-0, k k j 1 k i PS1(L ,h ) > 0 for all h ∈ K. But then, removing the houses in K from L gives a i k k i preferencethatisstrictlyDLpreferredtoL sincethisincreasesagent1’sshareofh i j whileonlythesharesoflesspreferredhousesdecrease.ThiscontradictsL beingaDL i bestresponseforH ,andthereforeprovesthelemma. i Thenextlemmashowshowthehousesallocatedcompletelytoagent1areorderedin astingypreference. LEMMA 4.7. LetLi beastingypreferenceforHi.Ifhj,hk ∈Hi aretwohousessuch that PS1(L ,h ) = PS1(L ,h ) = 1, with L = P ⊕h ⊕M ⊕h ⊕S, then either the i j i k i j k eatingstarttimeofh issmallerthanh ’seatingstarttimewhenagent1reportsP,or j k itisthesameandh (cid:31) h . j 1 k PROOF. Supposenot.Butthen,Liisnotstingysinceswappinghj andhk inLigives thesamefractionalallocationtoagent1. WenowshowthatwheniteratingfromasetofhousesH toH ,theprevioussolution i−1 i canbereuseduptothelasthousethatagent1receivespartially. LEMMA 4.8. Let Li−1 and Li be stingy preferences for Hi−1 and Hi, respectively. Suppose there is a h ∈ H such that 0 < PS1(L ,h) < 1. Then the prefixes of L i−1 i−1 i−1 andL coincideuptoh. i PROOF. Suppose not. By Lemma 4.1, PS1(Li,h) = PS1(Li−1,h). Let Pi−1 = Pi denote a maximum common prefix of L and L , and write L = P ⊕ x ⊕ i−1 i i−1 i−1 i−1 M ⊕h⊕S and L = P ⊕x ⊕M ⊕h⊕S . By Lemma 4.6, h (cid:31) h , and there- i−1 i−1 i i i i i 1 i fore, h ∈ S . Since L and L are no-0, we have that PS1(L ,x ) > 0 and i i i−1 i i−1 i−1 PS1(L ,x ) > 0. Now, if PS1(L ,x ) < 1, then since at least one other agent eats i i i−1 i−1 x concurrently with agent 1 when he reports L , he loses a non-zero fraction of i−1 i−1 x when instead he reports L and eats x after having exhausted P , we have that i−1 i i i PS1(L ,x ) < PS1(L ,x ), a contradiction to Lemma 4.1. Similarly, we obtain i i−1 i−1 i−1 a contradiction when PS1(L ,x ) < 1. Therefore, PS1(L ,x ) = PS1(L ,x ) = 1. i i i−1 i−1 i i Now,byLemma4.1,wealsohavethatPS1(L ,x )=PS1(L ,x )=1.Butonlyone i−1 i−1 i i of x ,x can come earlier in the stingy ordering. The other one contradicts Lemma i i−1 4.7. We are now ready to describe how to obtain L from L . See Algorithm 1 for the i i−1 pseudocode. The subroutine EST(N,H,(cid:31)) executes the PS rule for (N,H,(cid:31)) and for eachitem,recordsthefirsttimepointwheresomeagentstartseatingit.Itreturnsthe eatingstarttimesest(h)foreachhouseh∈H. Let p be the last position in L such that the house L (p) is partially allocated i−1 i−1 toagent1.Incaseagent1receivesnohousepartially,setp:=0andinterpretL (p) i−1 as an imaginary house before the first house of L . By Lemma 4.8, we have that i−1 L (s) = L (s) for all s ≤ p. By Lemma 4.1, we have that the fractional assignment i−1 i resultingfromL mustwhollyallocateallhousesL (p+1),...,L (|L |)toagent i i−1 i−1 i−1 1,andallocateashareof0toallhousesinH \L . i−1 i−1 It remains to find the right ordering for {L (s) : p+1 ≤ s ≤ |L |}∪{h }. By i−1 i−1 i Lemmas4.6and4.7,theprefixesofL andL coincideuptoh.Wewilldescribeinthe i−1 i next paragraph how to determine the position q where h should be inserted. Having i determined this position one may then need to re-order the subsequent houses. This isbecauseinsertingh inthelistmaychangetheeatingstarttimesofthesubsequent i Technical Report:January2014. 8 H.Azizetal. Input: (N,H,(cid:31)) Output: DLBestresponseofagent1 1 L1←h1 //Bestresponseforagent1w.r.t.H1={h1} 2 fori=2tondo //Computeabestresponsew.r.t.H2,...,Hn 3 p←0 4 if∃q∈{1,...,i−1}suchthat0<PS1(Li−1,Li−1(q))<1then 5 p←max{q∈{1,...,i−1}:0<PS1(Li−1,Li−1(q))<1} 6 endif 7 forq←p+1to|Li|+1do //Newhousehiinsertedafterpositionp 8 Lqi ←Li−1(1)⊕···⊕Li−1(q−1)⊕hi 9 while|Lqi|≤|Li−1|do //Completethelistaccordingtothestingyordering 10 est←EST(N,H,(Lqi,(cid:31)2,...,(cid:31)n)) 11 S←{h∈Li−1\Lqi :est(h)isminimum} 12 hs←firsthouseamongSin(cid:31)1 13 Lqi ←Lqi ⊕hs 14 endwhile 15 ifPS1(Lqi,hi)=0then 16 Lqi ←Li−1 17 endif 18 endfor 19 q←p //DeterminewhichLqisstingy i 20 worse[p−1]←true 21 finished←false 22 whilefinished=falsedo 23 if∃h∈Hi−1suchthatPS1(Lqi,h)(cid:54)=PS1(Li−1,h)then 24 worse[q]←true 25 q←q+1 26 else 27 worse[q]←false 28 ifPS1(Lqi,h1)>0andPS1(Lqi,h1)<1then 29 ifworse[q−1]=falsethen 30 q←q−1 31 endif 32 finished←true 33 elseifPS1(Lqi,h1)=1then 34 est←EST(N,H,(Lqi(1)⊕···⊕Lqi(q−1),(cid:31)2,...,(cid:31)n)) 35 if∃h∈{Lqi(q+1),...,Lqi(|Lqi|)}suchthatest(h)≤est(hi)then 36 q←q+1 37 else 38 finished=true 39 endif 40 endif 41 endif 42 endwhile 43 Li←Lqi 44 endfor 45 return Ln Algorithm1:DLbestresponsefornagents houses. This leads us to the following insertion procedure. The list Lq obtained from i L byinsertingh atpositionq,withp<q ≤|L |+1,isdeterminedasfollows.Start i−1 i i with Lq := L (1)⊕···⊕L (q−1)⊕h . While |Lq| ≤ |L |, we append to the end i i−1 i−1 i i i−1 ofLq thefirsthouseamongL \Lq inthestingyorderingforthisposition.Afterthe i i−1 i while-loopterminates,runthePSrulefortheresultinglistLq.Incaseweobtainthat i PS1(Lq,h )=0,weremoveh againfromthislist(andactuallyobtainLq =L ). i i i i i−1 The position q where h is inserted is determined as follows. Start with q := p. We i haveanarrayworsekeepingtrackofwhetherthelistsLp,...,Li produceaworseout- i i come for agent 1 than the list L . Set worse[p−1] := true. As long as the list L has i−1 i Technical Report:January2014. Strategicaspectsoftheprobabilisticserialrule 9 not been determined, proceed as follows. Obtain Lq from L by inserting h at po- i i−1 i sition q, as described earlier. Consider the allocation of agent 1 when he reports Lq. i If this allocation is not the same for the houses in H as when reporting L , then i−1 i−1 set worse[q] := true, otherwise set worse[q] := false. If worse[q], then increment q. This is because, by Lemma 4.1, this preference would not be a DL best response with re- spect to H . Otherwise, if 0 < PS1(Lq,h ) < 1, then we can determine h ’s position. If i i i i worse[q −1], then set L := Lq, otherwise set L := Lq−1. This position for h is opti- i i i i i mal since moving h later in the list would decrease its share to agent 1. Otherwise, i we have that worse[q] = false and PS1(Lq,h ) ∈ {0,1}. This will be the share agent 1 i i receives of h . If PS1(Lq,h ) = 0, then set L := L . Otherwise (PS1(Lq,h ) = 1), it i i i i i−1 i i stillremainstocheckwhetherthecurrentpositionforh givesastingypreference.For i this,runthePSrulewiththepreferenceLq(1)⊕···⊕Lq(q−1)foragent1.Ifh ’seating i i i start time is smaller than the eating start time of each house Lq(r) with r > q, then i setL :=Lq,otherwiseincrementq. i i Thus,givenL ,thepreferenceL canbecomputedbyexecutingthePSruleO(m) i−1 i times. The DL best response computed by the algorithm is L . Since the PS rule can m beimplementedtoruninlineartimeO(nm),therunningtimeofthisDLbestresponse algorithmisO(nm3). THEOREM 4.9. DLBESTRESPONSE canbesolvedinO(nm3)time. Example 4.10. Considerthefollowinginstance. (cid:31) :h ,h ,h ,h ,h ,h ,h ,h ,h ,h 1 1 2 3 4 5 6 7 8 9 10 (cid:31) :h ,h ,h ,h ,h ,h ,h ,h ,h ,h 2 8 3 5 2 10 1 6 7 4 9 (cid:31) :h ,h ,h ,h ,h ,h ,h ,h ,h ,h 3 9 4 7 1 2 6 5 3 8 10 After having computed L = h ,h , the algorithm is now to consider H . Since 2 1 2 3 PS1(L ,h ) = PS1(L ,h ) = 1, the algorithm first considers L1 = h ,h ,h . Note 2 1 2 2 3 3 2 1 that h and h have been swapped with respect to L since agent 2 starts eating h 1 2 2 2 before agent 3 starts eating h when agent 1 reports the preference list consisting 1 of only h . It turns out that PS1(L1,h ) = PS1(L1,h ) = PS1(L1,h ) = 1. Thus, 3 3 1 3 2 3 3 worse[1] = false. Since h does not come first in the stingy ordering, the algorithm 3 needstoverifywhethermovingh laterwillstillgiveaDLbestresponsewithrespect 3 toH .ItthenconsidersL2 =h ,h ,h .However,thisallocatesonlyhalfofh toagent 3 3 1 3 2 3 1, implying worse[2] = true. Since worse[1] = false, the algorithm sets L = L1. The DL 3 3 bestresponsecomputedbythealgorithmisL =h ,h ,h ,h . 10 3 2 1 6 Agent1 h1 h2 h2 h4 h4 h3 h3 Agent2 h5 h5 h6 h6 h7 h7 h4 Agent3 h1 h8 h8 h9 h9 h10 h10 Agent4 h1 h11 h11 h12 h12 h13 h13 0 1 1 4 2 7 9 10 3 3 3 3 3 (cid:31) : h ,h ,h ,h ,... (cid:31) : h ,h ,h ,h ,h ,h ... 1 1 2 3 4 2 5 6 7 2 4 14 (cid:31) : h ,h ,h ,h ,h ,... (cid:31) : h ,h ,h ,h ,... 3 1 8 9 10 3 4 1 11 12 13 Fig. 1: Illustration of constructing a DL best response for agent 1 for the preference profilespecifiedabove. Technical Report:January2014. 10 H.Azizetal. Example 4.11. Figure 1 depicts how the DL best response of agent 1 looks like. Afterh isinserted,thestartingeatingtimeh isbeforeh .Butafterh isinsertedin 1 3 4 2 to form L , then the starting eating time of h comes before h because agent 2 won’t 2 4 3 be able to eat h . After h is inserted to build L , it turns out that agent 2 will not be 2 4 4 able to eat h at all. That is why h is shaded in the eating line of agent 2 because it 4 2 willalreadybeeatenbythetimeagent2considerseatingitattime10/3. The DL optimal best response algorithm carefully builds up the DL optimal prefer- enceslistwhileensuringitisstingy. We note that a DL best response is also an SD best response. A best response was defined as a response that is not dominated. Hence a DL-best response is one which nootherresponseDL-dominates.ThismeansthatnootherresponseSD-dominates(as DLisarefinementofSD)it.Hence,aDLbestresponseisalsoaSDbestresponse.One may wonder whether an algorithm to compute the DL best response also provides us withanalgorithmtocomputeanEUbestresponse.However,aDLbestresponsemay notbeanEUbestresponseforthreeormoreagents.Considerthepreferenceprofilein Example 3.1. Since the number of houses is equal to the number of agents, reporting thetruthfulpreferenceisaDLbestresponse[SchulmanandVazirani2012].However, wehaveshownadifferentpreferenceforagent1wherehemayobtainhigherutility. 5. EXPECTEDUTILITYBESTRESPONSE InthissectionwepresentanalgorithmtocomputeanEUbestresponsefortwoagents for the PS rule. First, we reveal a tight connection between a well-known mechanism for sequential allocation of indivisible houses and the PS mechanism (Section 5.1). Thenwedemonstratehowtheexpectedutilitybestresponsealgorithmforthesequen- tial allocation of indivisible houses proposed by Kohler and Chandrasekaran [1971] canbeusedtobuildabestresponseforthe PS algorithm(Section5.2). 5.1. Aconnectionbetweenallocationmechanismsfordivisibleandindivisiblehouses We can obtain the same allocation given by the PS algorithm using the alternation policy,whichisasimplemechanismfordividingdiscretehousesbetweenagents.The alternatingpolicyletstheagentstaketurnsinpickingthehousethattheyvaluemost: the first agent takes his most preferred house, then the second agent takes his most preferredhousefromtheremaininghouses,andsoon.Weusethenotation1212...to denote the alternation policy. To obtain the allocation of the PS algorithm using the alternationpolicywesplitourhousesintohalvesandtreatthemasindivisiblehouses andadjustagents’preferencesoverthesehalvesinanaturalway. Recall that H = {h ,...,h } is the set of houses. Assume (cid:31) : h ,...,h and the 1 m 2 1 m preference of agent 1 is a permutation of h ,...,h as follows (cid:31) : h ,...,h . We 1 m 1 π(1) π(m) denote (cid:31)(k) the kth preferred house of the agent i, and by (cid:31)−1(h ) we mean the i 1 i positionofh in(cid:31) . i 1 We split each house h , i = 1,...,m, into halves and treat these halves as indi- i visible houses. Given h , we say that h1 and h2 are two halves of h . Given the set i i i i of houses H, we denote HCLONED the set of all halves of all houses in H, so that HCLONED = {h1,h2,...,h1 ,h2 }. Given (cid:31) and (cid:31) , we introduce profiles (cid:31)CLONED and 1 1 m m 1 2 1 (cid:31)CLONED thatareobtainedbystraightforwardsplittingofhousesintohalvesin(cid:31) and 2 1 (cid:31) : (cid:31)CLONED= h1 ,h2 ,...,h1 ,h2 and (cid:31)CLONED= h1,h2,...,h1 ,h2 . We call this 2 1 π(1) π(1) π(m) π(m) 2 1 1 m m transformationtheorder-preservingbisection. Definition 5.1. Let (cid:31) be a preference over a subset of half-houses S ⊆ HCLONED. s Thepreference(cid:31) hastheconsecutivitypropertyifandonlyif (cid:31)−1(h1)+1=(cid:31)−1(h2) s s i s i Technical Report:January2014.