Scenario Aggregation using Binary Decision Diagrams for Stochastic Programs with Endogenous Uncertainty Utz-Uwe Haus · Carla Michini · Marco Laumanns 7 1 0 2 Abstract Modeling decision-dependent scenario probabilities in stochastic pro- n grams is difficult and typically leads to large and highly non-linear MINLPs that a are very difficult to solve. In this paper, we develop a new approach to obtain a J compactrepresentationoftherecoursefunctionusingasetofbinarydecisiondia- 7 grams(BDDs)thatencodeanestedcoverofthescenarioset.TheresultingBDDs 1 can then be used to efficiently characterizethe decision-dependent scenario prob- ] abilitiesby a set of linear inequalities,which essentiallyfactorizes the probability C distribution and thus allows to reformulate the entire problem as a small mixed- O integer linear program. The approach is applicable to a large class of stochastic . programswithmultivariatebinaryscenariosets,suchasstochasticnetworkdesign, h networkreliability,orstochasticnetworkinterdictionproblems.Computationalre- t a sults show that the BDD-based scenario representation reduces the problem size, m and hence the computationtime, significantcompared to previous approaches. [ Keywords multistage stochastic optimization, exact reformulation, scenario 2 aggregation,reliabilityoptimization v 5 5 1 Introduction 0 4 0 When modeling a stochastic optimization problem as a stochastic program, one . usuallyassumesthatthescenarioprobabilitiesarefixedandgiveninputtheprob- 1 lem. In this setup, decisions can influence the outcome in each scenario but not 0 7 their probability of realizing. This standard way of reflecting the effect of uncer- 1 tainty in the model has been referred to as exogenous uncertainty, while the term : endogenous uncertainty has been introduced to describe situationswheredecisions v can actuallyinfluence the stochasticprocess itself and not only its outcomes [28]. i X While endogenous uncertainty is straightforwardto express in the framework r of Markov decision processes, its use in stochastic programming has remained a very rare, because the resulting models appear very cumbersome to formulate CrayEMEAResearchLab,CraySwitzerland,Hochbergerstr.60C,4057BaselSwitzerlandE- mail:[email protected]·WisconsinInstituteforDiscovery,UniversityofWisconsin–Madison, 330NorthOrchardStreet,MadisonWI53715,USAE-mail:[email protected]·IBMResearch, 8803Rueschlikon,Switzerland,E-mail:[email protected] 2 Utz-UweHausetal. and notoriouslyhard to solve. [19,20] distinguish two types of endogenous uncer- tainty, according to whether the decisions can influence the temporal unfolding of events or the probability distribution. Models of the first type are usually for- mulated by enumerating the various scenario trees that correspond to the differ- ent possible sequencing of events and couple them via disjunctiveformulationsof non-anticipativity constraints (see e.g. [12]. Solution approaches include implicit enumerationschemes [28], branch-and-bound coupled with Lagrangeduality[20], branch-and-cut[11],Lagrangerelaxationofthenon-anticipativityconstraints[22], decompositiontechniques [21] or heuristics [26]. In this paper we focus on problems of the second type of endogenous uncer- tainty,where decisions can influence the probabilitydistribution of the scenarios. The straightforward way to model this involves products or other non-linear ex- pressionsofdecisionvariablesthusleadstohighlynon-linearmodelsthatarevery hard to solve [37,29]. [1] as well as [17] have used convexification techniques to deal with these polynomials,while [37] have relied on linear approximations.[42] solve the non-linear stochastic program directly with a new constraint program- mingapproachusingnewtypeofextremeresourceconstraintincombinationwith an efficient propagationalgorithm. In order to deal with the usually exponentially large scenario sets, most ap- proaches had to resort to scenario sampling and work with a representative sub- set of scenarios. Recently [38] and [39] proposed an exact scenario bundling ap- proach, where scenarios that are equivalent with respect to the recourse function aremerged.In[42]scenariosaregroupedaccordingtotheshortestsurvivingpaths in the underlying network. An importantclass of stochastic optimizationproblems that typically include endogenousuncertaintycomesfromtherelatedareasofstochasticnetworkdesign, network reliability, and network interdiction. Common to these problems are an underlying graph whose edges or nodes are subject to random failures. The deci- sions to be made are in order to change their failure (or survival) probabilities of edgesornodessuchthattheresultingfailureprocesshassomefavorableproperties, suchasconnectivity,shortestpathlengthsorotherperformanceindicatorsrelated tothefunctionorprocesswhichthegraphexpresses.Suchproblemscannaturally be formulated as two-stagestochastic programs, where those tactical or “design” decisionsconstitutethefirststage,thefailureprocess is representedby a (usually exponentially)largesetofscenarios,andtheresultingperformanceofthenetwork in each failure scenario is captured by a recourse function (which might involve auxiliarysecond-stagedecisions, e.g., for determiningflows or shortest paths). We consider a general setting where scenarios can be formally described as randombinaryvectors.Thisholds inparticularforstochasticnetworkdesignand network reliability problems described above. In many such cases, the recourse function is monotonous with respect to the scenario. Take for example the short- est path length between a given pair of nodes in a graph: if scenarios indicate a set of surviving edges after some disaster, then the shortest path length can never increase when fewer edges survive. Our main idea is exploit this structure for scenario aggregation,and our main contributionis a new method for scenario aggregationbymeansofbinarydecisiondiagrams.Theapproachnotonlyenables a substantial reduction of the (initially exponentially large) scenario set without any accuracy loss, but also to efficiently characterize the decision-dependent sce- nario probabilities by a set of linear inequalities. This essentially factorizes the ScenarioAggregationusingBDDs 3 probability distribution and thus allows to reformulate the entire problem as a small mixed-integerlinear program. Ourmotivationandrunningexamplewillbetheshortestpathproblemhinted at above: Example 1 LetG=(V,E)beagraphwhoseedgeshavesome(independent)failure probabilitiespe ∈[0,1](e∈E).Afteradisastrouseventsomeedgeswillhavefailed, and each failure scenario is described by a set of surviving edges ξ ⊆ E. We are interested answering the followingtwo questions: 1. What is the expected value of the shortest path lengths between a set of dis- tinguished nodes? 2. If we can take some actions to modify the failure probabilitiespe, how should this be done optimally in order to to minimize the expected shortest path length? If the graph is a road network, and edges fail due to an earthquake, this is the settingof[37].Thesetofallfailurescenarioscanbetotallyorderedbycomparing the lengths of the shortest path between two (fixed) nodes: If in failure scenario there exists an st-path of length at most α, then in all scenarios where only a subset of these edges fails, this path still exists. Scenarios with the same shortest path length will be considered equal in the total order. Of course, depending on the application, many other scenario orderings are conceivable, e.g., according to longest st-path, number of disjoint st-paths, or size of the largest connected component of the graph. Thesuccessiveshortestpathproblemiscloselyrelatedtotheclassicst-reliability problem introduced by [33,34] and [46] and has been studied by many authors in numerous variantssince then. Here the networkis a system of circuitswith edges corresponding to components which have some probabilityof failing.The system asawholeoperatesifthereexistssomest-path,andtheprobabilityofhavingsuch apathiscalledst-reliability.Thisfundamentalproblemanditsvariousextensions have a vast number of applicationsin reliabilityengineering [6]. Thequestionalsonaturallyarisesinthesettingofinterdictionproblems,where typically one considers decisions to be actions of an attacker to weaken the net- work structure (to increase failure probability of edges, reduce capacity, etc.) in ordertohamperthenetwork’soperationalcapability.Forasurveyontypicalnet- work interdiction problems and corresponding modeling and solution approaches see [13]. Theapproachwedevelopinthispaperisnotlimitedtocombinatorialstochas- ticoptimizationproblems.Itcanalsobeusedincomputingtheexpectedobjective functionvalueforlinearprogramswithvaryingright-hand-sidecoefficients,where each coefficient independently attains one of two values with a given probability. Consideramaximizationproblemwithlessthanorequalconstraints,andassume that: (i) in the nominal problem all right-hand-side coefficients are at the larger of the twovalues;(ii) thescenarios aregivenby sets of rows wherethe coefficient takes the smaller value. Then the objective function is monotonously decreasing when takingsupersets amongthescenarios.In a similarway,irreducibleinconsis- tent linear systems and maximalconsistent subsets of LPs [36,2] can be studied. 4 Utz-UweHausetal. 2 Problem Setting We consider a two-stagestochastic optimizationproblem minE [f(ξ)] ξ|x Cx≤d (1) xe ∈{0,1} e∈E ξ =(ξe)e∈E ∈{0,1}|E| where we minimize the expected value (over all realized failure scenarios) of the recourse function f(ξ) : 2E → R conditioned on first-level decisions xe, where e ∈ E has an associated elementary failure event ξe whose probability may be influencedbydecisionxe.Theprobabilitydistributionofthescenariodistribution thus changes depending on the first-level decisions. The first-level decisions are to be taken subject to some set of linear constraints, e.g., a budget restriction e∈Exe ≤ β. Typically the evaluation of the recourse function f(ξ) will itself amount to solving an optimizationproblem. We will interpretscenarios as sets of P ‘positiveevents’,e.g.,thesets of survivingedges in a networkaftersomedisaster. In order to make our assumption on the problem structure more precise, we define the followingproperties and objects related to the recourse function. Definition 1 (aggregable recourse function) The recourse function f of a 2- stage stochasticoptimizationproblem is called aggregable if 1. f does not depend on the first-stagedecision x, 2. f is counter-monotonewith taking subsets of scenarios, ξ1 ⊆ξ2 ⇒f(ξ1)≥f(ξ2) (2) for all ξ1,ξ2 ∈2E, 3. the probabilitiesof events e∈E are independent. We denote the critical values of f, i.e. the different values f takes, by C(f) = {α : α=f(ξ),ξ∈2E}.MoreoverwedenotebyTMα(f)={ξ¯∈2E :f(ξ)>α, ξ= 1E −ξ¯} the sets of edge failures that forces f to take values strictly greater than α. Finally wedefine the set of minimalsurvivablescenarios for each criticalvalue by Mα(f)={ξ ∈2E : f(ξ)=α,∀ξ′ ⊂ξ:f(ξ′)>f(ξ)}, with α∈C(f). TheelementsofMα(f)arethetransversalofthecluttergivenbytheminimal members of TMα(f). Example 1 (continued) Inourrunningexample,computationoff amountstocom- puting a shortest path length f in a graph whose edge set depends on the sce- SP nario:theedgesthatfailedaremissinginthescenariocomparedtotheinitialgraph G. Clearly, shortest path length is counter-monotone with removing supersets of edges, i.e., considering subsets of the surviving edges of some other scenario. An- otherwayoflookingatthisistosaythatthesublevelsetsL−α(f)={ξ : f(ξ)≤α} of f are co-monotone (wrt. inclusion) with the natural ordering of real numbers αi <αj. Each set Mα(f) consists of the simple paths of length α in the graph. ScenarioAggregationusingBDDs 5 The set C(f) allowsus to rewrite the objectivefunction as follows: E [f(ξ)]= αProb[f(ξ)=α|x]. ξ|x α∈XC(f) WearethusinterestedincomputingtheprobabilitiesProb[f(ξ)=α],firstwithout taking into account possible decision variables xe, and then integratingthem. 3 Aggregation of Nested Scenario Covers In this section, we develop our approach to provide an effective scenario aggre- gation by using nested scenario covers. We first describe how scenario sets can be encoded as BDDs in such a way that these sets represent the sublevel sets of the recourse function. We then proceed to show how scenario probabilities can be represented as a flow of probability on the DAGs represented by the BDDs. Finally, we describe how those probability flows can be shaped using the binary first-stagedecisionvariablesvialinearinequalitiesand aderivetheresultingMIP formulation. 3.1 Scenario Aggregationusing BDDs Wewillfromnowonconsidertheset C(f)asan orderedset(α0,...,αN)withthe naturalordering< of thereal numbers. This induces an orderingof the set M(f) as (Mα0(f),...,MαN(f)). EachMα(f) induces amonotoneBooleanfunctionΦ≤α on thescenarioswhose minimal true points are the members of Mα(f) by “Φ≤α(ξ) = 1 if and only if f(ξ)≤α.”Itiswell-definedbecauseof (2).NotethatΦ≤α(ξ)describesthesupport of the cumulative distribution function Prob[f(ξ)≤ α]. We refer to [14] for more details on Boolean functions. In the reliability analysis setting the function Φ≤α is called system state function, and we can assume that the system is a coherent binary system. Then Mα(f) are the path sets of the system and Prob[Φ≤α(ξ)=1] is the reliability,see [4]. WeproposetoencodeeachfunctionΦ≤α forα∈C(f)intheformofa(reduced, ordered) binary decision diagram (BDD), see [30,8]. Intuitively, a BDD can be understood as a directed acyclic graph with a single source and a single sink, in which each source-sink path encodes one or more feasible points of a Boolean function. The graph is layered, with the layers indexed by the variables of the function. The key property is that every node has at most 2 children, and for eachsub-dagthatincludesasinktherearenoisomorphiccopiesintheBDD.This canmakethemmuchmorecompactthanconventionaldecisiontreesencodingthe same feasible points. Although BDDs can be of exponential size compared to the function they encode, for various classes of functions one can find compact BDD encodings. Notethat finding a minimalsize BDD is NP-hard, even for monotone Booleanfunctions[45],andnoteveryBDDconstructionalgorithmwillbeoutput- polynomial. Despite all of these caveats, BDDs have been highly successful in many applications,see [32] for an overview.Furthermore,good softwarefor BDD 6 Utz-UweHausetal. handlingexists,whichprovidesvariousheuristicBDDsizeminimizationstrategies, e.g, cudd of [44]. Recently,itwas shownthatBDDs encoding independencesystems canalways be constructed in output-polynomial time, if equivalence of minors in the asso- ciated circuit system can be checked efficiently by a top-down compilation rule of [25]. We will use this to prove Lemma 3. Definition 2 (BDD) Let E = (e1,...,e|E|) be a finite linearly ordered set. A binary decision diagram (BDD)B=(E,V,A⊆V ×V,ǫ:V →{1,...,|E|},l((u,v)): A → {0,1}) is either degenerate, B = (E,{⊥},∅,ǫ,l) or B = (E,{⊤},∅,ǫ,l), or consists of a directed acyclic graph on node set V with exactly one root u∗, two leaves ⊤,⊥, arc set A, arc label function l:A→{0,1} and a layering function ǫ: V →{1,...,|E|}. Unless B is degenerate, it must satisfy the followingconditions: – B is a layered digraph, i.e. its node set partitions into layers Li = {u ∈ V : ǫ(u)=i}, i=1,...,|E| and L|E|+1 ={⊤}, such that |L1|=1 (the root layer) and L is the terminal layer. |E|+1 – Arcs only extend to layers with higher index, i.e. ∀(u,v)∈A, ǫ(u)<ǫ(v). – Every node u ∈ V \{⊥,⊤} is the tail of exactly two differently labeled arcs, i.e. ∀u ∈ V \{⊥,⊤}∃v0,v1 ∈ V : v0 6= v1,(u,v0) ∈ A,(u,v1) ∈ A,l((u,v0)) 6= l((u,v1)). – For any two distinct nodes u,v of V, the sub-BDDs rooted at u and v are not isomorphic,i.e. Bu 6=∼Bv. HereBudenotesthesub-BDDofBdefinedbythesub-dagof(V,A)rootedatu, with node and arc label functions obtained by restriction of ǫ and l to V(Bu) and A(Bu), respectively. BDD-isomorphism =∼ is defined as isomorphism of directedgraphswithidenticalvariablesetandidenticallyevaluatingfunctions ǫ and l. Foreachnodeu∈V,theoutgoingarclabeledby1iscalledtheTrue-arc,andthe outgoingarc labeled by 0 is called the False-arc. NotethatBDDsaccordingtothisdefinitionarecalledreduced, ordered BDD inthe classical literature [30,8,32]. (In drawing BDDs one often suppresses the ⊥-node and all arcs δin(⊥), since these can be easily reconstructed.) Fori=1,...,|E|,thelayerwidthwioflayerLiisdefinedaswi =|Li|.TheBDD- width isthenw=maxi=1,...,|E|wi.Thetotal size ofaBDD is|V|+1= |iE=|1+1wi. Each BDD B represents a Boolean function: P ΦB(x)= xǫ(u)∧ (¬xǫ(u)). (3) u∗-P⊤_apnath(Pu:,lv(()^ue,vd)g)e=o1f P(u,:lv()(^ue,dvg)e)=o0f Furthermore, every Boolean function has a BDD representation (which is es- sentiallyunique given the ordering of E). Remark 1 Let Φ(x) be a monotone Boolean function and ΦD(x) = ¬Φ(¬x) its dual. If B = (E,V,A,ǫ,l) is a BDD representing Φ(x), then B′ = (E,V,A,ǫ,l′) with l′((u,v)) =1−l((u,v)) represents ¬ΦD(x), i.e. except for the labeling, Φ(x) and ΦD have the same BDD. ScenarioAggregationusingBDDs 7 This is a direct consequence of the fact that the BDD not only encodes all feasiblepointsof the monotoneBoolean function but also all infeasiblepoints (as paths from u∗ to ⊥) and formula(3). TheisomorphismbetweenBDDsforamonotoneBooleanfunctionanditsdual function is useful if the set Mα are given as a list of minimal true elements: this list provides a covering-type formulation for the dual function and the output- polynomialtimetop-down BDD constructionrule using matrixminors of [25]. 3.2 IntegratingScenario Probabilities Lemma 1 Given a BDD B encoding a Boolean function Φ : 2E → {0,1} and a scenario distribution such that the events e∈E are independently distributed random variables, the value Prob[B]:=Prob[Φ(ξ)=1] can be computed in linear time. NotethatlineartimeherereferstolinearinthesizeoftheBDDB(and|E|);in theproofweonlyhavetoshowthatwecanobtainasystemofequationscomputing the probability that is shaped like the BDD and does not have excessively large subexpressions. Proof The construction is similar to the classical computation of probabilities in a scenario tree, with the difference being that recurring computationsin subtrees areavoidedbecausetheBDD structure(‘noisomorphicsub-BDDs’) mergesthese cases. Let pi ∈ [0,1] be the probability of event ei ∈ E. We proceed by induction: If B is the BDD consisting of only the ⊤-terminal,it encodes the entire cube {0,1}|E| and Prob[Φ(ξ)=1]=1. Similarly,if it consistsonly of the ⊥ terminal,in encodes the empty set, so Prob[Φ(ξ)=1]=0. Assume now that the statementhas been proven for BDDs with k−1>0 layers. LetBbeaBDDwithklayers.Theuniquerootnodeu∗ hasexactly2childrenthat we’llcallTrue-child and FALSE-child;lettheeventfortherootlayerbe ǫ(u∗)=e1. Initiallyassumethatthesub-BDDshavetheirrootsinthelayerL2 directlyfollow- inge1. Then,denotingtheprobabilitiescomputedforthesub-BDDs by pTrue-child and pFalse-child, Prob[Φ(ξ)=1]=p1pTrue-child+(1−p1)pFalse-child, (4) since scenarios contain/don’t contain event e1 with these probabilities and the probabilitiesofthecompletionshavebeencomputedinductivelyforthesub-BDDs of size at most k−1. If a sub-BDD, say the one for ξ1 = 1, has its root in a layer l > 2, then this sub-BDD encodes the set of solutions (1,⋆,...,⋆,ξ′) where ⋆ denotes that 0 or 1 l−2 can be chosen arbitrarily for every (ξ′) ∈ {0,1}|E|−l.For simplicity assume that | {z } only the True-child of u∗ starts at layer l. Then l−1 Prob[Φ(ξ)=1]=p1 (pi+(1−pi)) pTrue-child+(1−p1)pFalse-child ! (5) i=2 Y =p1pTrue-child+(1−p1)pFalse-child, 8 Utz-UweHausetal. since the scenarios do not depend on events on layer 2,...,l−1 for the choice of ξ1 =1. 3.3 Shaping the Scenario Distribution Equations (4) and (5) givea recursive set of linear equations (startingwith p = ⊤ 1,p⊥ =0 for the leafs) that can be used to directly turn a BDD encoding Mα(f) intoasetoflinearconstraintstocomputeProb[Φ≤α(ξ)=1]usingasmanyvariables andequationsastheBDDhasnodes(eachnodehasexactlyonedefiningequation). We now show how decisions influencing the probabilities pi can be incorporated into these formulasas well. Lemma 2 Given a BDD B encoding a Boolean function Φ : 2E → {0,1} and a scenario distribution such that the events ei∈E are independently distributed random variables depending on decisions (xi)i=1,...,|E|, the value Prob{ξ|x}[Φ(ξ) = 1] can be computed in linear time. Proof This follows immediately from (5) because the events are assumed to be independent: Let pi(x) denote the conditioned probabilities, and pTrue-sub(x), pFalse-sub(x) the conditionedsub-BDD probabilities.Then Prob[Φ(ξ)=1]=p1(x)pTrue-sub(x)+(1−p1(x))pFalse-sub(x), (6) just as in (4). Again, equation (6) gives a recursive set of equations (starting with p = ⊤ 1,p⊥ = 0 for the leafs), but depending on how pi(x) is defined they may not be linear. In the next section we will show that if the decisions are binary (or can takeonly a fixed number of values), the problem can be formulatedas a compact mixed-integerprogrammingproblem. 3.4 MIP Model for AggregatedScenario Probabilities Let us now consider for ease of presentation the case where |E| binary decision variables xi ∈{0,1} influence the nominalevent probabilitype such that pi if xi=0 pi(x)= (pi+∆i if xi=1 where ∆i ∈[−pi,1−pi] is the boost or decrease of the probabilityof event ei ∈E when taking decision xi =1. With this definition, and in light of equation (6), we define for each arc (u,v) in the BDD Bα≤ =(E,A,ǫ,l) pi(x) if (u,v)∈A,ǫ(u)=i,l((u,v))=1 p(u,v)(x)= ((1−pi(x)) if (u,v)∈A,ǫ(u)=i,l((u,v))=0, that is, depending on the decision xi all arcs in the BDD that start at node u associatedwithevent ei areassignedtheappropriateprobabilitydepending on xi ScenarioAggregationusingBDDs 9 and whether they correspond to the event being in the scenario (l((u,v))=1) or not (l((u,v))=0). Introducing intermediatevariables for each BDD node we can thus formulate the following MIP constraints to compute the probability of all scenarios of Bα≤ subject to decision variables x=(xi)i=1,...,|E|: p≤α(u)=1 u=⊤∈V(Bα≤) p≤α(u)=0 u=⊥∈V(Bα≤) Pα=(p≤α,x) : p≤α(u)=p(u,v)(x)p≤α(v)+p(u,w)(x(u)p,≤αv)(,w(u),w)∈A(Bα≤),v6=w Itiseasytoseethpa≤αtxt∈∈he[{00c,,o11n]}VsEt(rBaα≤in)tsofPα canbewrittenaslinearinequaliti(e7s), which we avoided above to unclutter the formulation. For example, in the case (u,v),(u,w)∈A(Bα≤),v6=w,l((u,v))=1,l((u,w))=0 where ǫ(u)=i: p≤α(u)=p(u,v)(x)p≤α(v)+p(u,w)(x)p≤α(w) can be written using inequalities p≤α(u)≤(pi+∆i)p≤α(v)+(1−pi−∆i)p≤α(w)+(1−xi) p≤α(u)≤pip≤α(v)+(1−pi)p≤α(w)+xi p≤α(u)≥(pi+∆i)p≤α(v)+(1−pi−∆i)p≤α(w)−(1−xi) p≤α(u)≥pip≤α(v)+(1−pi)p≤α(w)−xi. ThentheformulationofPα contains|V(Bα≤)|+|E|variablesand4|V(Bα≤)|+1 constraints(plus box constraintsfor all variables). Clearly, similar models can be built if xi is allowed to take a finite number of discrete values, using 2 constraints per possible value of xi, or disjunctive, or constraintprogrammingtechniques. ToobtainaMIPmodelforaggregableproblemsoftheform(1)wecansimply combine blocks of the form (7) using inclusion-exclusion on the scenario sets, exploiting: min αp=α α∈C(f) p=α0 =p≤α0(u∗) u∗ root node of Bα≤0 p=αi =p≤αi(u∗)−p=αi−1 u∗ root node of Bα≤i,αi ∈C(f)\{α0} (8) Cx≤d (p≤αi,x)∈Pαi αi ∈C(f) Note that the x variables will be shared between Pαi and Pαj for αi,αj ∈C(f). When canweexpecta polynomial-sizeMIP formulationusingaggregatedsce- narios? This depends on two prerequisites: We need to be sure that the set C(f) and the BDDs Bα≤ are small. The first condition may be satisfied automatically forsomeproblemclasses,ormaybea consequenceofthewaythattheinstanceis 10 Utz-UweHausetal. encoded, e.g., giving an explicit list of relevant objective values. The second con- ditionishardtocontrolingeneral,butoftenonecanidentifyparametricproblem subclassessuchthatthewidthofBα≤ isappropriatelyboundedforeachfixedvalue of the parameter: Bounds on the BDD width have been studied widely. For our purposes we will use the results of [25] and [35]. Toactuallyconstructtheformulationwealsoneedtoensurethattheconstruc- tionofeachBα≤ canbeperformedin(output-)polynomialtime.Theeasiestwayto ensurethisistospecifyatop-down-compilationrule,suchthateachBDD nodeis touched only once in the construction,and decisions about whether to merge the child subtrees are made immediately in the node in polynomial time. For mono- tone Boolean functions, which encode the members of an independence system, this can be achieved if equivalence of minors of the associated circuit system can be checked efficiently, see [25]. A prominent example is that of the graphic ma- troid[43],i.e.whenscenariosareforestsorsimplecyclesofagraph.By Remark1 this is also always the case if the sets M≤α (or the dual sets) are explicitly given. Thus if M≤α is explicitlygiven or can be enumerated in polynomialtime,efficient BDDconstructionreducestothequestionwhethertheBDDwidthispolynomially bounded. In light of this reformulationtechnique we make the followingdefinition. Definition 3 (polynomiallyaggregable2-stagestochasticoptimizationprob- lem) Anaggregable2-stagestochasticoptimizationproblemiscalledpolynomially aggregable if – the number of criticalvalues C(f) of f is polynomial,and – all BDDs Bα≤(f) can be constructed in polynomialtime. Itfollowsfrom[25,Thm.6]thatwheneverthematrixcontainingtheincidence vectorsoftheelementsofTMα(f)hasbandwidthk,then Bα≤(f)has sizeatmost n2k−1, which is polynomial if k ∈ O(logn). Hence 2-stage stochastic optimiza- tion problems with bandwidth-limitedsets TMα(f) and a polynomialnumber of criticalvalues C(f) are polynomiallyaggregable: Lemma 3 2-stage stochastic optimization problems with input size σ, a polynomial number of critical values C(f), and an explicitly given sets TMα(f) whose incidence matrix has bandwidth k∈log(σ) are polynomially aggregable. Note that our definition of (8) contains as a special case the union of products (UPP) problem (this is the case where the decisions xe have no influence on the probabilities). [4] show that UPP is NP-hard, even when the sets M(f) are ex- plicitly given – thus unless P =NP the BDD construction will be exponential in these cases. 4 Applications 4.1 Pre-disaster Planning Problem The pre-disaster planning problem we consider first is an instance of the running example1weusedthroughout.Asmentionedbefore,thealgorithmicchallengesare