Technische Universität München
Fakultät für Informatik
Lehrstuhl für Effiziente Algorithmen

Algorithmic Methods for Lowest Common Ancestor Problems in Directed Acyclic Graphs

Johannes Nowak

Complete reprint of the dissertation approved by the Fakultät für Informatik of the Technische Universität München for the award of the academic degree of Doktor der Naturwissenschaften (Dr. rer. nat.).

Chair: Univ.-Prof. A. Kemper, Ph.D.
Examiners of the dissertation:
1. Univ.-Prof. Dr. E. W. Mayr
2. Univ.-Prof. Dr. F. J. Esparza Estaun

The dissertation was submitted to the Technische Universität München on 04.02.2008 and accepted by the Fakultät für Informatik on 15.09.2009.

Abstract

Lowest common ancestor (LCA) problems in directed acyclic graphs (dags) have attracted scientific interest in recent years. Directed acyclic graphs are powerful tools for modeling causality systems or other kinds of entity dependencies, and efficient solutions to the respective lowest common ancestor problems are indispensable computational tools for the proper analysis of these systems. Similar problems in trees are well understood; however, the generalizations to dags fall short of achieving comparable efficiencies.

In this work, we develop and analyze algorithmic techniques for solving LCA problems in dags. The main focus is on all-pairs LCA problems, i.e., problems that require the answers to the respective LCA queries for all vertex pairs in the dag. In particular, the basic problems studied are finding one arbitrary (representative) LCA for all vertex pairs and listing all LCAs for all vertex pairs. We identify and describe in depth three basic algorithmic approaches that lead to efficient solutions to these problems, namely dynamic programming, matrix multiplication, and a path cover approach.
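To make the two all-pairs problems above concrete, the following is a naive brute-force sketch in Python. It is not one of the algorithms developed in this thesis; it only illustrates what is being computed. It rests on several assumptions that the abstract does not spell out: the dag is given as a dictionary mapping every vertex to a list of its children, every vertex counts as an ancestor of itself, and a common ancestor is "lowest" if no other common ancestor is one of its proper descendants. The names `ancestors` and `all_pairs_all_lca` are hypothetical.

```python
from itertools import combinations


def ancestors(dag):
    """Map each vertex to the set of its ancestors (assumed to include itself).

    dag: dict mapping every vertex to a list of its children.
    """
    parents = {v: set() for v in dag}
    for u, children in dag.items():
        for c in children:
            parents[c].add(u)

    anc = {}

    def visit(v):
        # Memoized upward traversal; terminates because the graph is acyclic.
        if v not in anc:
            anc[v] = {v}
            for p in parents[v]:
                anc[v] |= visit(p)
        return anc[v]

    for v in dag:
        visit(v)
    return anc


def all_pairs_all_lca(dag):
    """Brute force: return the set of all LCAs for every unordered vertex pair."""
    anc = ancestors(dag)
    result = {}
    for x, y in combinations(sorted(dag), 2):
        common = anc[x] & anc[y]
        # Keep z only if no other common ancestor c is a proper descendant of z.
        result[(x, y)] = {
            z for z in common
            if not any(c != z and z in anc[c] for c in common)
        }
    return result


# Illustrative input: r has children a and b, which both have children c and d.
example = {"r": ["a", "b"], "a": ["c", "d"], "b": ["c", "d"], "c": [], "d": []}
lcas = all_pairs_all_lca(example)
```

In this example the pair (c, d) has two LCAs, a and b. A representative LCA for a pair would be any single element of its set. The sketch makes no attempt at efficiency, which is the subject of the techniques named above.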
The dynamic programming method, combined with transitive reduction, yields algorithms that are efficient in the average case and, as an experimental study also described in this thesis shows, in practice. However, the running times depend on the number of edges in the transitive reduction of the input dags and are hence vulnerable to special constructs such as dense bipartite graphs.

Matrix multiplication approaches lead to general upper time bounds for the two problems that improve upon the best previously known solutions. More specifically, we prove that representative LCAs for all vertex pairs can be computed in time $O(n^{2.575})$, and all LCAs for all vertex pairs can be computed in time $O(n^{3.334})$ for a dag with $n$ vertices. We note here that any improvement of the matrix multiplication exponent automatically improves these bounds for the LCA problems.

The third algorithmic approach, a path cover technique, yields efficient solutions for the two problems in dags $G$ with small width $w(G)$, namely $O(n^2 w(G) \log n)$ for computing representative LCAs and $O(n^2 w(G)^2 \log^2 n)$ for computing all LCAs. However, perhaps the most important result connected with the path cover technique is an improved algorithm for finding representative LCAs in general dags that significantly restricts the class of dags for which the upper bound $O(n^{2.575})$ is actually attained. Further research in this direction might ultimately improve this bound.

Additionally, we review algorithmic applications of the presented techniques, namely problem variants in dynamic settings, in weighted dags, and under space-efficiency considerations. Although some of the upper time bounds that we present in this work might turn out to be tight, further study of these and similar problems seems to be a promising direction for future research.

Finally, we present the results of an experimental study of some of the algorithms described in this work, in particular, the algorithms that are based on dynamic programming.
As a consequence, we conclude that both representative and all LCAs for all vertex pairs can be computed reasonably fast in practice, i.e., with a running time close to $O(n^2)$.

Acknowledgments

First of all, I thank my advisor Ernst W. Mayr for his helpful, encouraging guidance and generous support throughout my doctoral studies. Furthermore, I am indebted to all the current and former members of the Efficient Algorithms Group for providing a stimulating and motivating working environment. In particular, I am thankful to Stefan Eckhardt, Moritz Maaß, and Hanjo Täubig, who have contributed significantly to my academic progress. Special thanks go to Arno Buchner (and his wife) for providing food on the weekends. Finally, I thank my parents exceptionally, not only for their contribution to this thesis but for my education in general.

CONTENTS

1 Introduction
  1.1 Motivation
  1.2 Results
  1.3 Publications
2 Preliminaries
  2.1 Elementary Concepts
  2.2 Analysis of Algorithms
  2.3 Common Ancestor Problems in Directed Acyclic Graphs
3 Dynamic Programming Algorithms
  3.1 Introduction
  3.2 Preliminaries
  3.3 All-Pairs Representative LCA
  3.4 All-Pairs All LCA
  3.5 Summary
4 Matrix-Multiplication-Based Algorithms
  4.1 Introduction
  4.2 Preliminaries
  4.3 Representative LCA
  4.4 All LCA
    4.4.1 Fixed-Vertex LCA Variants
    4.4.2 All-Pairs All LCA
  4.5 Conclusion
5 Path Cover Techniques
  5.1 Introduction
  5.2 Related Work
  5.3 A First Non-Trivial All-Pairs All LCA Solution
  5.4 Generalized Path Cover Approach
  5.5 Combining Small Width and Low Depth
  5.6 Conclusion
6 Algorithmic Applications
  6.1 Introduction
  6.2 (L)CA Problems in Weighted Dags
    6.2.1 Common Ancestor Problem Variants
    6.2.2 Lowest Common Ancestor Problem Variants
  6.3 Dynamic Algorithms
    6.3.1 Fully Dynamic Lowest Common Ancestors
    6.3.2 Incremental LCA Algorithm
  6.4 Space-Efficient LCA Computations
  6.5 Summary
7 Experimental Analysis
  7.1 Introduction
  7.2 Experimental Setup
    7.2.1 Implementation
    7.2.2 Test Data
    7.2.3 Evaluation
  7.3 Results
    7.3.1 All-Pairs Representative LCA
    7.3.2 All-Pairs All LCA
Summary of Notation
Bibliography

LIST OF FIGURES

2.1 Transitive Closure and Reduction
2.2 (Lowest) Common Ancestors
2.3 Lower Bound for all LCA
3.1 Dynamic Programming for Representative LCAs
3.2 Simple Algorithm for all LCAs
3.3 Dynamic Programming for all LCAs
3.4 Naive Merging
4.1 Common Ancestor Existence to Reachability
4.2 Matrix Sampling
4.3 Maximum Witnesses by Rectangular Matrix Multiplication
4.4 Matrix Multiplication to All-Pairs Fixed Vertex LCA
4.5 Matrix Multiplication to Fixed-Vertex-Pairs All LCA
5.1 Special DFS Tree
5.2 Example of the Combined Algorithm
6.1 Shortest Distance CA
6.2 All-Pairs Shortest Distance CA to All-Pairs Shortest Distances
6.3 Matrix Sampling
6.4 Problem Relationships
7.1 Real World Densities
7.2 Average Vertex Degrees
7.3 Comparison of Representative LCA Algorithms
7.4 Asymptotic Evaluation of RMQ
7.5 Asymptotic Evaluation of DP-LCA
7.6 Asymptotic Evaluation of RMQ and DP-LCA (Real World)
7.7 Comparison of all LCA Algorithms
7.8 Asymptotic Evaluation of DP-APA-lazy
7.9 Asymptotic Evaluation of DP-APA-it
7.10 Asymptotic Evaluation of TC-APA and PC-APA
7.11 Evaluation of LCA Set Sizes

LIST OF ALGORITHMS

1 Greedy Chain Cover Construction
2 DP-Algorithm for Representative LCA
3 Simple Algorithm for all LCA
4 DP-Algorithm for all LCA
5 Merge with Forbidden Sets
6 All LCA Using Representative LCAs
7 All LCA Using Representative LCAs (Improved)
8 Preprocessing for Combined Algorithm
9 Algorithm for Representative LCA
10 DP-Algorithm for Shortest Distance CA
Description: