LARGE-SCALE OPTIMIZATION FOR DATA PLACEMENT PROBLEM

LAZIMA ANSARI
Bachelor of Science, Military Institute of Science and Technology, 2010

A Thesis
Submitted to the School of Graduate Studies
of the University of Lethbridge
in Partial Fulfillment of the
Requirements for the Degree

MASTER OF SCIENCE

Department of Mathematics and Computer Science
University of Lethbridge
LETHBRIDGE, ALBERTA, CANADA

© Lazima Ansari, 2017

Date of Defence: August 17, 2017

Dr. Daya Gaur, Supervisor (Professor, Ph.D.)
Dr. Shahadat Hossain, Committee Member (Professor, Ph.D.)
Dr. Robert Benkoczi, Committee Member (Associate Professor, Ph.D.)
Dr. Howard Cheng, Chair, Thesis Examination Committee (Associate Professor, Ph.D.)

Dedication

I dedicate this thesis to my loving parents. I cannot express how lucky I am to have parents who love endlessly. Thank you so much for believing in me.

Abstract

Large-scale optimization of combinatorial problems is one of the most challenging areas. These problems are characterized by large sets of data (variables and constraints). In this thesis, we study large-scale optimization of the data placement problem with zero storage cost. The goal in the data placement problem is to find a placement of data objects in a set of fixed-capacity caches in a network that optimizes the latency of access. The data placement problem arises naturally in the design of content distribution networks. We report on an empirical study of the upper bound and the lower bound of this problem for large instances. We also study a semi-Lagrangean relaxation of the closely related k-median problem. In this thesis, we study the theory and practice of approximation algorithms for the data placement problem and the k-median problem.

Acknowledgments

I would like to express my sincere gratitude to my supervisor Dr. Daya Gaur for his guidance and support throughout the whole learning process. I would also like to thank my supervisory committee members Dr. Robert Benkoczi and Dr. Shahadat Hossain for their precious advice and inspiration.
I am thankful to all the members of the optimization research group of the University of Lethbridge for their support. I am grateful to Nabi and Umair for their help. I also want to thank Anamay Sarkar, with whom I worked on some important parts of my thesis.

A very special thanks goes to my husband and best friend Imtiaz, who has been a constant source of support and encouragement during all the challenges of my life. This accomplishment would not have been possible without him. Thanks for making me realize that dreams can actually come true. Thanks for everything.

I would also like to thank my family for the support they provided me through my entire life. I must acknowledge my parents, sister, and parents-in-law. Without their unconditional love, sacrifices, and prayers, I would not have finished this thesis.

I want to thank my friends Fatema, Tasnuba and Jeeshan, who made my life easier in Canada.

Contents

List of Tables
List of Figures
1 Introduction
    1.1 Contributions
    1.2 Organization of the thesis
2 Preliminaries and Related Concepts
    2.1 Optimization problem
    2.2 Linear programming
        2.2.1 Simplex method
        2.2.2 Other methods
    2.3 Integer programming (IP)
    2.4 Integer programming solution methods
    2.5 Lagrangean relaxation
        2.5.1 Basic Formulation
        2.5.2 Subgradient optimization
        2.5.3 Semi-Lagrangean relaxation (SLR)
    2.6 Local search heuristic
        2.6.1 General algorithm
3 k-median problem
    3.1 Problem definition
    3.2 Related research
    3.3 Computing the upper bound
        3.3.1 Local search method
        3.3.2 A simpler analysis of local search method for the k-median problem
    3.4 Computing the lower bound
        3.4.1 Semi-Lagrangean relaxation for k-median
        3.4.2 The Algorithm
    3.5 Experiments and Results
    3.6 Discussion
4 Data placement problem
    4.1 Problem Definition
    4.2 Related research
    4.3 Computing the upper bound
        4.3.1 Local search method
        4.3.2 A simpler analysis of local search method for the uncapacitated facility location problem
    4.4 Computing the lower bound
        4.4.1 Lagrangean relaxation 1 (LR1-DP)
        4.4.2 Lagrangean relaxation 2 (LR2-DP)
    4.5 Experiments and Results
        4.5.1 Generation of test instances
        4.5.2 Results
    4.6 Discussion
5 Conclusion and Future works
    5.1 Summary
    5.2 Future work
Bibliography

List of Tables

2.1 Notations used in subgradient procedure
3.1 Notations used in the analysis of local search method for the k-median problem
3.2 Experimental results for k-median problem
4.1 Notations used in the analysis of local search method for the uncapacitated facility location problem
4.2 Experimental results for the data placement problem

List of Figures

3.1 An example mapping η : F∗ → F
3.2 Assignment of clients
4.1 Assignment of client to facility
4.2 Assignment of client to facility
4.3 A comparison of duality gap for two Lagrangean relaxations while varying the number of caches
4.4 A comparison of time for two Lagrangean relaxations while varying the number of caches
4.5 A comparison of duality gap for two Lagrangean relaxations while varying the cache capacity
4.6 A comparison of time for two Lagrangean relaxations while varying the cache capacity
4.7 A comparison of duality gap for two Lagrangean relaxations while varying the number of clients
4.8 A comparison of time for two Lagrangean relaxations while varying the number of clients
4.9 A comparison of duality gap for two Lagrangean relaxations while varying the number of objects
4.10 A comparison of time for two Lagrangean relaxations while varying the number of objects

Chapter 1
Introduction

Facility location problems have occupied a major place in Operations Research since the early 1960s. They model situations such as the placement of warehouses [35], factories, fire stations, hospitals, and so on. Every location problem consists of four basic components [13].

• A set of locations where the facilities are placed. For each such location, we are given a cost of opening (or storing) the facilities.
• A set of demand points or clients who need certain services from the facilities and have to be assigned to a facility such that their requirements are fulfilled.
• A list of requirements to be met by the facilities and the assignment of clients to facilities.
• A cost function associated with the assignment of the demand points to the facilities.

A typical objective is to select a set of facilities to open in order to optimize the given cost function. Various types of facility location problems can be obtained by varying the features mentioned above.

The location problem that we study in this thesis is the data placement problem [3]. The data placement problem has been studied extensively in the areas of database management [24] and cooperative caching in networks [18, 4, 33]. This problem also arises naturally in the modeling and operation of Content Distribution Networks [10, 9, 50, 19]. In such a network, the computer systems need to optimize the distribution of Internet packets to the users by replicating and caching the data at multiple locations in the network. This reduces the load on the server and also helps to eliminate network congestion. Due to the

