Data Structures and Algorithm Analysis
Edition 3.2 (C++ Version)

Clifford A. Shaffer
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061

September 15, 2011
Update 3.2.0.2
For a list of changes, see http://people.cs.vt.edu/~shaffer/Book/errata.html

Copyright © 2009-2011 by Clifford A. Shaffer.
This document is made freely available in PDF form for educational and other non-commercial use. You may make copies of this file and redistribute in electronic form without charge. You may extract portions of this document provided that the front page, including the title, author, and this notice are included. Any commercial use of this document requires the written consent of the author. The author can be reached at [email protected]. If you wish to have a printed version of this document, print copies are published by Dover Publications (see http://store.doverpublications.com/048648582x.html). Further information about this text is available at http://people.cs.vt.edu/~shaffer/Book/.

Contents

Preface xiii

I Preliminaries 1

1 Data Structures and Algorithms 3
  1.1 A Philosophy of Data Structures 4
    1.1.1 The Need for Data Structures 4
    1.1.2 Costs and Benefits 6
  1.2 Abstract Data Types and Data Structures 8
  1.3 Design Patterns 12
    1.3.1 Flyweight 13
    1.3.2 Visitor 13
    1.3.3 Composite 14
    1.3.4 Strategy 15
  1.4 Problems, Algorithms, and Programs 16
  1.5 Further Reading 18
  1.6 Exercises 20

2 Mathematical Preliminaries 25
  2.1 Sets and Relations 25
  2.2 Miscellaneous Notation 29
  2.3 Logarithms 31
  2.4 Summations and Recurrences 32
  2.5 Recursion 36
  2.6 Mathematical Proof Techniques 38
    2.6.1 Direct Proof 39
    2.6.2 Proof by Contradiction 39
    2.6.3 Proof by Mathematical Induction 40
  2.7 Estimation 46
  2.8 Further Reading 47
  2.9 Exercises 48

3 Algorithm Analysis 55
  3.1 Introduction 55
  3.2 Best, Worst, and Average Cases 61
  3.3 A Faster Computer, or a Faster Algorithm? 62
  3.4 Asymptotic Analysis 65
    3.4.1 Upper Bounds 65
    3.4.2 Lower Bounds 67
    3.4.3 Θ Notation 68
    3.4.4 Simplifying Rules 69
    3.4.5 Classifying Functions 70
  3.5 Calculating the Running Time for a Program 71
  3.6 Analyzing Problems 76
  3.7 Common Misunderstandings 77
  3.8 Multiple Parameters 79
  3.9 Space Bounds 80
  3.10 Speeding Up Your Programs 82
  3.11 Empirical Analysis 85
  3.12 Further Reading 86
  3.13 Exercises 86
  3.14 Projects 90

II Fundamental Data Structures 93

4 Lists, Stacks, and Queues 95
  4.1 Lists 96
    4.1.1 Array-Based List Implementation 100
    4.1.2 Linked Lists 103
    4.1.3 Comparison of List Implementations 112
    4.1.4 Element Implementations 114
    4.1.5 Doubly Linked Lists 115
  4.2 Stacks 120
    4.2.1 Array-Based Stacks 121
    4.2.2 Linked Stacks 124
    4.2.3 Comparison of Array-Based and Linked Stacks 125
    4.2.4 Implementing Recursion 125
  4.3 Queues 129
    4.3.1 Array-Based Queues 129
    4.3.2 Linked Queues 134
    4.3.3 Comparison of Array-Based and Linked Queues 134
  4.4 Dictionaries 134
  4.5 Further Reading 145
  4.6 Exercises 145
  4.7 Projects 149

5 Binary Trees 151
  5.1 Definitions and Properties 151
    5.1.1 The Full Binary Tree Theorem 153
    5.1.2 A Binary Tree Node ADT 155
  5.2 Binary Tree Traversals 155
  5.3 Binary Tree Node Implementations 160
    5.3.1 Pointer-Based Node Implementations 160
    5.3.2 Space Requirements 166
    5.3.3 Array Implementation for Complete Binary Trees 168
  5.4 Binary Search Trees 168
  5.5 Heaps and Priority Queues 178
  5.6 Huffman Coding Trees 185
    5.6.1 Building Huffman Coding Trees 186
    5.6.2 Assigning and Using Huffman Codes 192
    5.6.3 Search in Huffman Trees 195
  5.7 Further Reading 196
  5.8 Exercises 196
  5.9 Projects 200

6 Non-Binary Trees 203
  6.1 General Tree Definitions and Terminology 203
    6.1.1 An ADT for General Tree Nodes 204
    6.1.2 General Tree Traversals 205
  6.2 The Parent Pointer Implementation 207
  6.3 General Tree Implementations 213
    6.3.1 List of Children 214
    6.3.2 The Left-Child/Right-Sibling Implementation 215
    6.3.3 Dynamic Node Implementations 215
    6.3.4 Dynamic “Left-Child/Right-Sibling” Implementation 218
  6.4 K-ary Trees 218
  6.5 Sequential Tree Implementations 219
  6.6 Further Reading 223
  6.7 Exercises 223
  6.8 Projects 226

III Sorting and Searching 229

7 Internal Sorting 231
  7.1 Sorting Terminology and Notation 232
  7.2 Three Θ(n²) Sorting Algorithms 233
    7.2.1 Insertion Sort 233
    7.2.2 Bubble Sort 235
    7.2.3 Selection Sort 237
    7.2.4 The Cost of Exchange Sorting 238
  7.3 Shellsort 239
  7.4 Mergesort 241
  7.5 Quicksort 244
  7.6 Heapsort 251
  7.7 Binsort and Radix Sort 252
  7.8 An Empirical Comparison of Sorting Algorithms 259
  7.9 Lower Bounds for Sorting 261
  7.10 Further Reading 265
  7.11 Exercises 265
  7.12 Projects 269

8 File Processing and External Sorting 273
  8.1 Primary versus Secondary Storage 273
  8.2 Disk Drives 276
    8.2.1 Disk Drive Architecture 276
    8.2.2 Disk Access Costs 280
  8.3 Buffers and Buffer Pools 282
  8.4 The Programmer’s View of Files 290
  8.5 External Sorting 291
    8.5.1 Simple Approaches to External Sorting 294
    8.5.2 Replacement Selection 296
    8.5.3 Multiway Merging 300
  8.6 Further Reading 303
  8.7 Exercises 304
  8.8 Projects 307

9 Searching 311
  9.1 Searching Unsorted and Sorted Arrays 312
  9.2 Self-Organizing Lists 317
  9.3 Bit Vectors for Representing Sets 323
  9.4 Hashing 324
    9.4.1 Hash Functions 325
    9.4.2 Open Hashing 330
    9.4.3 Closed Hashing 331
    9.4.4 Analysis of Closed Hashing 339
    9.4.5 Deletion 344
  9.5 Further Reading 345
  9.6 Exercises 345
  9.7 Projects 348

10 Indexing 351
  10.1 Linear Indexing 353
  10.2 ISAM 356
  10.3 Tree-based Indexing 358
  10.4 2-3 Trees 360
  10.5 B-Trees 364
    10.5.1 B+-Trees 368
    10.5.2 B-Tree Analysis 374
  10.6 Further Reading 375
  10.7 Exercises 375
  10.8 Projects 377

IV Advanced Data Structures 379

11 Graphs 381
  11.1 Terminology and Representations 382
  11.2 Graph Implementations 386
  11.3 Graph Traversals 390
    11.3.1 Depth-First Search 393
    11.3.2 Breadth-First Search 394
    11.3.3 Topological Sort 394
  11.4 Shortest-Paths Problems 399
    11.4.1 Single-Source Shortest Paths 400
  11.5 Minimum-Cost Spanning Trees 402
    11.5.1 Prim’s Algorithm 404
    11.5.2 Kruskal’s Algorithm 407
  11.6 Further Reading 409
  11.7 Exercises 409
  11.8 Projects 411

12 Lists and Arrays Revisited 413
  12.1 Multilists 413
  12.2 Matrix Representations 416
  12.3 Memory Management 420
    12.3.1 Dynamic Storage Allocation 422
    12.3.2 Failure Policies and Garbage Collection 429
  12.4 Further Reading 433
  12.5 Exercises 434
  12.6 Projects 435

13 Advanced Tree Structures 437
  13.1 Tries 437
  13.2 Balanced Trees 442
    13.2.1 The AVL Tree 443
    13.2.2 The Splay Tree 445
  13.3 Spatial Data Structures 448
    13.3.1 The K-D Tree 450
    13.3.2 The PR quadtree 455
    13.3.3 Other Point Data Structures 459
    13.3.4 Other Spatial Data Structures 461
  13.4 Further Reading 461
  13.5 Exercises 462
  13.6 Projects 463

V Theory of Algorithms 467

14 Analysis Techniques 469
  14.1 Summation Techniques 470
  14.2 Recurrence Relations 475
    14.2.1 Estimating Upper and Lower Bounds 475
    14.2.2 Expanding Recurrences 478
    14.2.3 Divide and Conquer Recurrences 480
    14.2.4 Average-Case Analysis of Quicksort 482
  14.3 Amortized Analysis 484
  14.4 Further Reading 487
  14.5 Exercises 487
  14.6 Projects 491

15 Lower Bounds 493
  15.1 Introduction to Lower Bounds Proofs 494
  15.2 Lower Bounds on Searching Lists 496
    15.2.1 Searching in Unsorted Lists 496
    15.2.2 Searching in Sorted Lists 498
  15.3 Finding the Maximum Value 499
  15.4 Adversarial Lower Bounds Proofs 501
  15.5 State Space Lower Bounds Proofs 504
  15.6 Finding the ith Best Element 507
  15.7 Optimal Sorting 509
  15.8 Further Reading 512
  15.9 Exercises 512
  15.10 Projects 515

16 Patterns of Algorithms 517
  16.1 Dynamic Programming 517
    16.1.1 The Knapsack Problem 519
    16.1.2 All-Pairs Shortest Paths 521
  16.2 Randomized Algorithms 523
    16.2.1 Randomized algorithms for finding large values 523
    16.2.2 Skip Lists 524
  16.3 Numerical Algorithms 530
    16.3.1 Exponentiation 531
    16.3.2 Largest Common Factor 531
    16.3.3 Matrix Multiplication 532
    16.3.4 Random Numbers 534
    16.3.5 The Fast Fourier Transform 535
  16.4 Further Reading 540
  16.5 Exercises 540
  16.6 Projects 541

17 Limits to Computation 543
  17.1 Reductions 544
  17.2 Hard Problems 549
    17.2.1 The Theory of NP-Completeness 551
    17.2.2 NP-Completeness Proofs 555
    17.2.3 Coping with NP-Complete Problems 560
  17.3 Impossible Problems 563
    17.3.1 Uncountability 564
    17.3.2 The Halting Problem Is Unsolvable 567
  17.4 Further Reading 569
  17.5 Exercises 570
  17.6 Projects 572

VI APPENDIX 575

A Utility Functions 577

Bibliography 579

Index 585