APPROXIMATE DYNAMIC PROGRAMMING
POLICIES AND PERFORMANCE BOUNDS FOR
AMBULANCE REDEPLOYMENT
A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
by
Matthew Scott Maxwell
May 2011
© 2011 Matthew Scott Maxwell
ALL RIGHTS RESERVED
APPROXIMATE DYNAMIC PROGRAMMING POLICIES AND
PERFORMANCE BOUNDS FOR AMBULANCE REDEPLOYMENT
Matthew Scott Maxwell, Ph.D.
Cornell University 2011
Ambulance redeployment is the practice of dynamically relocating idle ambulances based upon real-time information to reduce expected response times for future emergency calls. Ambulance redeployment performance is often measured by the fraction of “lost calls,” or calls with response times larger than a given threshold time. This dissertation is a collection of four papers detailing results for designing ambulance redeployment policies and bounding the performance of an optimal ambulance redeployment policy.
In the first paper, ambulance redeployment is modeled as a Markov decision process, and an approximate dynamic programming (ADP) policy is formulated for this model. Computational results show that the ADP policy outperforms benchmark policies in two different case studies based on real-life data. Results of practical concern, including how the ADP policy performs with varying call arrival rates and varying ambulance fleet sizes, are also included.
In the second paper we discuss ADP tuning procedures, i.e., the process of selecting policy parameters to improve performance. We highlight limitations present in many ADP tuning procedures and propose direct-search tuning methods to overcome these limitations. To facilitate direct-search tuning for ambulance redeployment, we reformulate the ADP policy using the so-called “post-decision state” formulation. This reformulation allows policy decisions to be computed without computationally expensive simulations and makes direct-search tuning computationally feasible.
In the third paper we prove that many ADP policies are equivalent to a simpler class of policies called nested compliance table (NCT) policies, which assign ambulances to bases according to the total number of available ambulances. Furthermore, we show that if ambulances are not assigned to the bases dictated by the NCT policy, the ADP-based policies will restore compliance to the NCT policy without dispatcher intervention.
In the fourth paper we derive a computationally tractable lower bound on the minimum fraction of lost calls and propose a heuristic bound based upon simulation data from a reference policy, i.e., a policy we believe to be close to optimal. In certain circumstances both bounds can be quite loose, so we introduce a stylized model of ambulance redeployment and show empirically that for this model the lower bound is quite tight.
BIOGRAPHICAL SKETCH
Matthew Maxwell was born in St. George, Utah and survived his childhood and teenage years by skimboarding in the Virgin River, making forts near said river, and reading MS-DOS textbooks as his leisure vacation reading. He also attempted an underground miniature golf course with one of his brothers, but the big hole they dug eventually got swallowed up when his parents put in a swimming pool. Matt still can’t decide if he’d rather have the unfinished miniature golf course or the swimming pool, but he’s leaning towards the pool.
Before going to college, Matt served a two-year mission in Japan for the Church of Jesus Christ of Latter-day Saints. Upon his return, he began his studies at Brigham Young University, graduating with honors in 2006 with a B.S. in Computer Science and minors in Japanese and Mathematics.
Matt and his wife, Caroline, have a two-year-old daughter named Madeline. In June of 2011, after completing his Ph.D. at Cornell University, Matt and his family will be moving to Cary, North Carolina, where Matt will begin his job as an Operations Research Specialist working on hotel revenue management for SAS.
To my wife Caroline.
And my parents Leon and Elizabeth.
ACKNOWLEDGEMENTS
First and foremost, I acknowledge my advisors Shane Henderson and Huseyin Topaloglu for their immeasurable support in this research. They have been a pleasure to work with.
Much of my research builds upon the previous work of Mateo Restrepo, who deserves credit for laying the foundation of approximate dynamic programming policies for ambulance redeployment. I also appreciate Armann Ingolfsson, Andrew Mason, and Oddo Zhang for advice, insight, and data that they have contributed to my research.
I thank Mark Lewis and David Ruppert for serving on my committee and assisting me in my dissertation work. I also thank the rest of the faculty and staff of the School of Operations Research and Information Engineering, especially Kathy King, who possesses an inexhaustible wealth of information.
Reaching back to my distant undergraduate past, I feel an immense sense of gratitude to my undergraduate advisor Sean Warnick. His inspiration and untiring enthusiasm are intoxicating. It is no overstatement to say that Sean had a large impact on shaping the trajectory of my life. I also acknowledge Roger Hansen at the U.S. Bureau of Reclamation for being a superlative supervisor. My undergraduate research revolved heavily around my work with the Bureau, and Roger was the one who facilitated it.
And most importantly, I acknowledge my wife for marrying me—even after I told her I planned on graduate school—and for always supporting me. I also acknowledge my daughter, who gave me the most adorable type of motivation imaginable.
This research was partially supported by National Science Foundation grants DMI 0400287, DMI 0422133, and CMMI 0758441, and by a U.S. Department of Homeland Security Graduate Fellowship.
TABLE OF CONTENTS

Biographical Sketch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Approximate Dynamic Programming for Ambulance Redeployment . . . . . 3
1.3 Tuning Approximate Dynamic Programming Policies for Ambulance Redeployment via Direct Search . . . . . 5
1.4 Equivalence Results for Approximate Dynamic Programming and Compliance Table Policies for Ambulance Redeployment . . . . . 6
1.5 Performance Bounds for Ambulance Redeployment . . . . . . . . 7
2 Approximate Dynamic Programming for Ambulance Redeployment 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Ambulance Redeployment as a Markov Decision Process . . . . . 14
2.2.1 State Space . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.2 Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.3 Fundamental Dynamics . . . . . . . . . . . . . . . . . . . . 19
2.2.4 Transition Costs . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.5 Objective Function and Optimality Equation . . . . . . . . 21
2.3 Approximate Dynamic Programming . . . . . . . . . . . . . . . . 23
2.4 Basis Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.5 Computational Results . . . . . . . . . . . . . . . . . . . . . . . . 34
2.5.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . 34
2.5.2 Baseline Performance . . . . . . . . . . . . . . . . . . . . . 38
2.5.3 Contributions of Different Basis Functions . . . . . . . . . 41
2.5.4 Comparison with Random Search . . . . . . . . . . . . . . 43
2.5.5 Additional Redeployments . . . . . . . . . . . . . . . . . . 45
2.5.6 Varying Fleet Sizes . . . . . . . . . . . . . . . . . . . . . . . 46
2.5.7 Varying Call Arrival Rates . . . . . . . . . . . . . . . . . . 48
2.5.8 Effect of Turn-out Time . . . . . . . . . . . . . . . . . . . . 50
2.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3 Tuning Approximate Dynamic Programming Policies for Ambulance Redeployment via Direct Search 52
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2 Approximate Dynamic Programming . . . . . . . . . . . . . . . . 56
3.2.1 ADP Policies . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2.2 Tuning ADP Policies . . . . . . . . . . . . . . . . . . . . . . 59
3.3 Limitations of Common ADP Tuning Approaches . . . . . . . . . 60
3.3.1 Limitations of Regression-Based Approaches . . . . . . . . 61
3.3.2 Limitations of LP-Based Approaches . . . . . . . . . . . . . 63
3.4 Ambulance Redeployment . . . . . . . . . . . . . . . . . . . . . . 66
3.4.1 Ambulance Redeployment as an MDP . . . . . . . . . . . . 68
3.4.2 ADP Policy for Ambulance Redeployment . . . . . . . . . 75
3.5 Simulation Optimization Tuning Results . . . . . . . . . . . . . . 78
3.6 Post-Decision State Formulation . . . . . . . . . . . . . . . . . . . 83
3.6.1 Truncated Microsimulation Policy . . . . . . . . . . . . . . 84
3.6.2 Limiting Behavior of the Truncated Microsimulation Value Function Approximation . . . . . 87
3.6.3 Computational Results . . . . . . . . . . . . . . . . . . . . . 90
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4 Equivalence Results for Approximate Dynamic Programming and Compliance Table Policies for Ambulance Redeployment 94
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.2 Problem formulation . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.1 NCT policy formulation . . . . . . . . . . . . . . . . . . . . 98
4.2.2 ADP-CT policy formulation . . . . . . . . . . . . . . . . . . 98
4.3 Equivalence of ADP-CT and NCT policies . . . . . . . . . . . . . . 101
4.3.1 NCT and TO policy transformations . . . . . . . . . . . . . 102
4.3.2 ADP-CT and TO transformations . . . . . . . . . . . . . . . 103
4.3.3 Equivalence results . . . . . . . . . . . . . . . . . . . . . . . 104
4.4 Additional results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.4.1 Interpreting $\phi_b^+$ values . . . . . . . . . . . . . . . . . 105
4.4.2 Necessity of non-increasing $\phi_b^+$ . . . . . . . . . . . . 105
4.4.3 Convex functions $\phi_b$ satisfy non-increasing $\phi_b^+(n_b)$ . . . . . 106
4.4.4 Out-of-compliance robustness . . . . . . . . . . . . . . . . 107
4.5 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5 Performance Bounds for Ambulance Redeployment 111
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . 116
5.3 Stochastic Lower Bound on the Response Time . . . . . . . . . . . 119
5.4 Lower Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.5 Heuristic Lower Bound . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.6 Stylized Model Bounds . . . . . . . . . . . . . . . . . . . . . . . . . 131

A Appendix for Chapter 3 136
A.1 Microsimulation Value Function Derivation . . . . . . . . . . . . . 136
A.2 Truncated Microsimulation Value Function Derivation . . . . . . 136
A.3 Proof of Theorem 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . 138